I have a file of 100 lines; each line contains 5 elements, such as
Aberdeen,City,59122,57.14369,-2.09814
corresponding to place, type, population, latitude and longitude respectively.
I split them into 5 arrays and want to use sorted(population) to rewrite this file ordered by increasing population size.
However, when I call sorted(population) I can't keep the other fields matched to the right rows; I just end up with the populations in increasing order.
Because the items in your list are dicts rather than simple values (I'm assuming you used csv.DictReader to read in your file, and I'm going to call it list_of_dicts for clarity), there's no reliable automatic sorting. What you need is a "key function" that tells sorted how to determine what the order is.
You need to either:
- define a function:
def getPopulation(record): return record["population"]
and then call sorted with this as the named parameter "key":
sortedValues = sorted(list_of_dicts, key=getPopulation)
- provide an anonymous function (lambda) as the key parameter:
sortedValues = sorted(list_of_dicts, key=lambda x: x["population"])
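For completeness, here is a self-contained sketch of the key-function approach (the sample rows and field names are my own, embedded via io.StringIO so it runs as-is; note the int() conversion, since csv hands every field back as a string):

```python
import csv
import io

# Two sample rows in the shape of the question's data (second row invented)
raw = io.StringIO(
    "Aberdeen,City,59122,57.14369,-2.09814\n"
    "Armagh,City,14777,54.34990,-6.65490\n"
)
fields = ["place", "type", "population", "latitude", "longitude"]
list_of_dicts = list(csv.DictReader(raw, fieldnames=fields))

def getPopulation(record):
    # csv reads everything as strings, so convert before comparing
    return int(record["population"])

sortedValues = sorted(list_of_dicts, key=getPopulation)
print([d["place"] for d in sortedValues])  # Armagh (14777) comes first
```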
Ah wait, sorry... I see what's happening here...
"I split them into 5 arrays"
You don't want to do that. Each row of your file is a single entity and should not be split.
You should read each row as a single array, tuple or dictionary. If you read them as dictionaries, use the solution I supplied above. If you're reading them as arrays or tuples, modify the key in the code above to use the index 2 (the position of the population field) rather than "population".
Check out the Python docs for the csv module to read your files in -- that does most of the string handling work for you.
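Since the end goal in the question is to rewrite the file in population order, a minimal end-to-end sketch might look like the following (the filenames and sample rows are assumptions; the sketch writes its own tiny places.csv so it runs as-is):

```python
import csv

# Create a tiny sample input (invented rows); in practice you would
# already have places.csv with your 100 lines
with open("places.csv", "w", newline="") as f:
    f.write("Aberdeen,City,59122,57.14369,-2.09814\n")
    f.write("Armagh,City,14777,54.34990,-6.65490\n")

with open("places.csv", newline="") as f:
    rows = list(csv.reader(f))  # each row: [place, type, population, lat, lon]

# Population is the third field (index 2); convert to int so that
# e.g. "9000" doesn't sort after "10000" lexicographically
rows.sort(key=lambda row: int(row[2]))

with open("places_sorted.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)
```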
Sorting a list of CSV lines is easier if you make every line a tuple (doesn't the csv module give you the rows as tuples already?) and put the tuples in a list. Then you can sort the list with a key to sort by a specific tuple element:

geodata = ...  # List of rows (tuples) from the CSV file

# Make a sorted copy of geodata, sorted by the third entry
# in each row (the population)
sorted_geodata = sorted(geodata, key=lambda row: row[2])

# Or sort geodata in place (this overwrites geodata with the sorted version)
geodata.sort(key=lambda row: row[2])
If you don't know what lambda means, it's a shorthand for defining a function. You could also write:

def get_pop_from_row(row):
    return row[2]

sorted_geodata = sorted(geodata, key=get_pop_from_row)

Here you can also see what get_pop_from_row does. For example, if you run get_pop_from_row(geodata[0]) in the console (after running your script), it returns 59122 (assuming that Aberdeen is first in the list). The sorted function does this for every row and compares the values returned. But for very short functions, like sort keys, lambda is more convenient.
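One pitfall worth flagging with all of the above: the csv module returns every field as a string, and strings compare character by character, so sorting by the raw population field gives text order rather than numeric order. A small illustration (the second row's population is invented):

```python
# One row from the question plus an invented second row
rows = [
    ("Aberdeen", "City", "59122", "57.14369", "-2.09814"),
    ("Armagh", "City", "100000", "54.34990", "-6.65490"),
]

# Comparing strings puts "100000" before "59122", because '1' < '5'
by_string = sorted(rows, key=lambda row: row[2])

# Converting to int gives the numeric order the question asks for
by_number = sorted(rows, key=lambda row: int(row[2]))
```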
I think a small problem is the assumption that @Kangaroo is using csv to write his files. He doesn't actually say he is; it's a comma-delimited file, but he may not be writing it out with csv. I also had a quick look at his problem, but soon decided that without knowing how he writes the file out it's difficult to help. So the real question is whether the problem is the sort itself, or loading the file into a list/variable that is sortable. Of course it's a different problem if there are a million lines rather than a hundred.
Anyway, just saying.
Thank you guys! I solved my problem by importing numpy! Thanks again!
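For anyone landing here later, a hedged sketch of what the numpy route might look like (the field names and sample rows are my own, and the data is embedded via io.StringIO; with a real file you would pass the filename to np.genfromtxt instead). np.genfromtxt builds a structured array, which np.sort can order by a named field:

```python
import io
import numpy as np

# Sample data embedded for illustration (second row invented)
raw = io.StringIO(
    "Aberdeen,City,59122,57.14369,-2.09814\n"
    "Armagh,City,14777,54.34990,-6.65490\n"
)
data = np.genfromtxt(
    raw, delimiter=",", dtype=None, encoding="utf-8",
    names=["place", "type", "population", "latitude", "longitude"],
)
# Sort the structured array by the population field (inferred as int,
# so this is numeric, not lexicographic, order)
data_sorted = np.sort(data, order="population")
print(data_sorted["place"])  # Armagh first
```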