I know that the built-in sorted() function works on any iterable. But can someone explain this huge (10x) performance difference between anylist.sort() and sorted(anylist)? Also, please point out if I am doing anything wrong with the way this is measured.
"""
Example Output:
$ python list_sort_timeit.py
Using sort method: 20.0662879944
Using sorted builtin method: 259.009809017
"""
import random
import timeit
print 'Using sort method:',
x = min(timeit.Timer("test_list1.sort()","import random;test_list1=random.sample(xrange(1000),1000)").repeat())
print x
print 'Using sorted builtin method:',
x = min(timeit.Timer("sorted(test_list2)","import random;test_list2=random.sample(xrange(1000),1000)").repeat())
print x
As the title says, I was interested in comparing list.sort() vs sorted(list). The snippet above showed something interesting: Python's sort performs very well on already sorted data. As Anurag pointed out, in the first case the sort method is working on already sorted data after its first call, while in the second case sorted() receives the unsorted list each time and has to redo the work again and again.
So I wrote this version, which sorts both lists in the setup, to test that, and yes, the two are very close.
"""
Example Output:
$ python list_sort_timeit.py
Using sort method: 19.0166599751
Using sorted builtin method: 23.203567028
"""
import random
import timeit
print 'Using sort method:',
x = min(timeit.Timer("test_list1.sort()","import random;test_list1=random.sample(xrange(1000),1000);test_list1.sort()").repeat())
print x
print 'Using sorted builtin method:',
x = min(timeit.Timer("sorted(test_list2)","import random;test_list2=random.sample(xrange(1000),1000);test_list2.sort()").repeat())
print x
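The side effect described above can be checked directly. This is only a rough sketch in Python 3 syntax (the snippets above are Python 2); the names data, original, sort_in_place and calls_on_unsorted_input are made up for illustration. It counts how many of the timed calls actually receive unsorted input when the same list object is sorted in place on every call.

```python
import random
import timeit

# Illustrative names only; a fixed seed keeps the run repeatable.
rng = random.Random(0)
data = list(range(1000))
rng.shuffle(data)
original = list(data)          # untouched copy, kept for comparison

calls_on_unsorted_input = 0

def sort_in_place():
    """Sort the shared list, noting whether it arrived unsorted."""
    global calls_on_unsorted_input
    if any(a > b for a, b in zip(data, data[1:])):
        calls_on_unsorted_input += 1
    data.sort()                # mutates the one shared list object

# timeit calls the function 100 times against the same list.
timeit.Timer(sort_in_place).timeit(number=100)

print("calls that saw unsorted input:", calls_on_unsorted_input)
```

Only the first call should see a shuffled list; every later call is handed the result of the previous one, which is why the first measurement mostly times sorting an already sorted list.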
Oh, I see Alex Martelli posted a response while I was typing this. (I shall leave the edit in, as it might be useful.)
Your error in measurement is as follows: after your first call of test_list1.sort(), that list object IS sorted, and Python’s sort, aka timsort, is wickedly fast on already sorted lists!!! That’s the most frequent error in using timeit: inadvertently getting side effects and not accounting for them.
Here’s a good set of measurements, using timeit from the command line as it’s best used:
As you see, y.sort() and sorted(x) are neck and neck, but x.sort() thanks to the side effects gains over an order of magnitude’s advantage, just because of your measurement error, though: this tells you nothing about sort vs sorted per se! -)
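For anyone who wants to repeat this kind of comparison from a script, here is a hedged sketch using the timeit module in Python 3 syntax. The names x and y follow the wording above, but the exact statements, list size and loop counts are assumptions rather than the commands that were actually run, and the timings will vary by machine.

```python
import timeit

# Assumed setup: a shuffled list of 1000 integers, rebuilt for each repeat.
setup = "import random; x = list(range(1000)); random.shuffle(x)"

statements = [
    "x.sort()",               # sorts x in place, so x stays sorted afterwards
    "y = list(x); y.sort()",  # sorts a fresh copy; x stays shuffled
    "sorted(x)",              # builds a new sorted list; x stays shuffled
]

results = {}
for stmt in statements:
    # Take the best of a few repeats, as is usual with timeit.
    results[stmt] = min(timeit.repeat(stmt, setup=setup, repeat=3, number=200))
    print("%-24s %.6f s per 200 loops" % (stmt, results[stmt]))
```

The second statement pays for a list copy on every loop, which keeps the comparison with sorted(x) fair since sorted also builds a new list. The same statements can be passed to the command-line form, python -m timeit -s SETUP STATEMENT.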