Possible Duplicate:
sorted() using Generator Expressions Rather Than Lists
We all know using generators instead of instantiating lists all the time saves time and memory, especially if we use comprehensions a lot.
Here’s a question though, consider the following code:
output = SomeExpensiveCallEgDatabase()
results = [result[0] for result in output]
return sorted(results)
The call to sorted will return a sorted list of the results. Would it be better or worse to declare results as below and then call sorted?
results = (result[0] for result in output)
My guess is the call to sorted() would traverse the generator and instantiate a list itself in order to run quicksort or mergesort on it. So there would be no advantage in using the generator here. Is this assumption correct?
I believe your assumption to be true, since there is no easy way of ordering the collection without first having the whole list in memory (at least certainly not with the default sorting algorithm, TimSort if I’m not mistaken).
Check this out:
sorted() using Generator Expressions Rather Than Lists
To create the new List, the builtin sorted method uses
PySequence_List:Pros and cons of both approaches:
Memory-wise:
The returned list is the one used for the sorted version, so this would mean that in this case, only one list is stored completely in memory at any given time, using the generator version.
This makes the generator version more efficient memory-wise.
Speed:
Here the version with the whole list wins.
To create a new list based on a generator, an empty list must be created (or at best with the first element), and each following element appended to the list, with the possible redimensioning steps this may provoke.
To create a new list based on a previous list, the size of the list is known beforehand, and thus can be allocated at once and each of the entries assigned (possibly, there are other optimizations at work here, but I can’t back that up).
So regarding speed, the list wins.
The answer to “what’s the best”, comes down to the most common answer in any field of engineering… it depends….