I was reading this post and came across the following code:
jokes=range(1000000)
domain=[(0,(len(jokes)*2)-i-1) for i in range(0,len(jokes)*2)]
I thought wouldn’t it be better to calculate the value of len(jokes) once outside the list comprehension?
I tried it and timed three versions of the code:
jv@Pioneer:~$ python -m timeit -s 'jokes=range(1000000);domain=[(0,(len(jokes)*2)-i-1) for i in range(0,len(jokes)*2)]'
10000000 loops, best of 3: 0.0352 usec per loop
jv@Pioneer:~$ python -m timeit -s 'jokes=range(1000000);l=len(jokes);domain=[(0,(l*2)-i-1) for i in range(0,l*2)]'
10000000 loops, best of 3: 0.0343 usec per loop
jv@Pioneer:~$ python -m timeit -s 'jokes=range(1000000);l=len(jokes)*2;domain=[(0,l-i-1) for i in range(0,l)]'
10000000 loops, best of 3: 0.0333 usec per loop
The marginal difference of 2.55% between the first and the second made me wonder: is the first list comprehension
domain=[(0,(len(jokes)*2)-i-1) for i in range(0,len(jokes)*2)]
optimized internally by Python? Or is 2.55% a big enough optimization (given that len(jokes) is 1000000)?
If it is optimized, what other implicit/internal optimizations does Python perform?
What are the developer's rules of thumb for optimization in Python?
Edit1: Most of the answers are ‘don’t optimize, do it later if it’s slow’, and I got some tips and links from Triptych and Ali A for the do’s, so I will change the question a bit and ask for the don’ts.
Can we have some experiences from people who faced the ‘slowness‘: what was the problem and how was it corrected?
Edit2: For those who haven’t here is an interesting read
Edit3: The usage of timeit in the question is incorrect; please see dF’s answer for the correct usage and the resulting timings for the three versions.
You’re not using timeit correctly: the argument to -s (setup) is a statement to be executed once initially, so you’re really just testing an empty statement. You want to pass only the setup code to -s and give the code to be timed as the main statement.
Measured that way, the speedup is still not dramatic, but it’s more significant (16% and 25% respectively). So since it doesn’t make the code any more complicated, this simple optimization is probably worth it.
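As a rough sketch of the setup/statement split described above, the same comparison can be run through the `timeit` module from Python code. The list size, the `number` and `repeat` values, and the names `SETUP`, `STATEMENTS`, and `time_variants` are assumptions made for this illustration; `list(range(...))` is used so that `jokes` is a real list on Python 3 as well. On the command line, the equivalent would be to keep only `jokes=range(1000000)` in the `-s` argument and pass the `domain=[...]` assignment as the statement.

```python
import timeit

# Setup runs once per timing run and is NOT included in the measurement.
# A small list (an arbitrary choice for this sketch) keeps the run short.
SETUP = "jokes = list(range(1000))"

# The three variants from the question, passed as the timed statement.
STATEMENTS = {
    "len_in_loop": (
        "domain = [(0, (len(jokes) * 2) - i - 1) "
        "for i in range(0, len(jokes) * 2)]"
    ),
    "len_hoisted": (
        "l = len(jokes); "
        "domain = [(0, (l * 2) - i - 1) for i in range(0, l * 2)]"
    ),
    "len_times_two_hoisted": (
        "l = len(jokes) * 2; "
        "domain = [(0, l - i - 1) for i in range(0, l)]"
    ),
}


def time_variants(number=200, repeat=3):
    """Return the best total time (in seconds) for each variant."""
    results = {}
    for name, stmt in STATEMENTS.items():
        timings = timeit.repeat(stmt, setup=SETUP, number=number, repeat=repeat)
        results[name] = min(timings)
    return results


if __name__ == "__main__":
    for name, seconds in time_variants().items():
        print(f"{name}: {seconds:.4f} s for 200 runs")
```

The absolute numbers depend on the machine and the Python version, so only the relative ordering of the three variants is worth reading from the output.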
To address the actual question… the usual rule of thumb in Python is to
Favor straightforward and readable code over optimization when coding.
Profile your code (profile / cProfile and pstats are your friends) to figure out what you need to optimize (usually things like tight loops).
As a last resort, re-implement these as C extensions, which is made much easier with tools like pyrex and cython.
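The profiling step in the list above might look something like this minimal sketch. `build_domain` and `profile_report` are hypothetical names introduced only for this example, and the workload is simply the question's list comprehension at a smaller size; in real code you would profile your own entry point instead.

```python
import cProfile
import io
import pstats


def build_domain(n):
    """Hypothetical workload modelled on the question's list comprehension."""
    jokes = list(range(n))
    return [(0, (len(jokes) * 2) - i - 1) for i in range(0, len(jokes) * 2)]


def profile_report(n=10000, top=5):
    """Profile build_domain(n) and return the top entries as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    build_domain(n)
    profiler.disable()

    # Send the report to a string buffer instead of stdout so it can be
    # returned, logged, or inspected.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_report())
```

In the report, the row for the built-in `len` shows how many times it was called, which is the kind of hint that points to a tight loop worth optimizing.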
One thing to watch out for: compared to many other languages, function calls are relatively expensive in Python, which is why the optimization in your example made a difference even though len is O(1) for lists.
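To see that the call really is repeated on every iteration, here is a small sketch that counts `len()` calls with a list subclass. `CountingList` and `count_len_calls` are illustrative helpers, not part of any library, and the subclass adds overhead of its own, so it is useful for counting calls rather than for timing them.

```python
class CountingList(list):
    """A list that counts how often len() is called on it (illustrative helper)."""

    len_calls = 0  # class-level default; shadowed per instance on first increment

    def __len__(self):
        self.len_calls += 1
        return super().__len__()


def count_len_calls(n):
    """Run the question's comprehension and report how often len(jokes) ran."""
    jokes = CountingList(range(n))
    jokes.len_calls = 0  # ignore any calls made while building the list
    domain = [(0, (len(jokes) * 2) - i - 1) for i in range(0, len(jokes) * 2)]
    return jokes.len_calls, len(domain)


if __name__ == "__main__":
    calls, size = count_len_calls(5)
    print(f"len(jokes) was called {calls} times for {size} elements")
```

For a list of n items, `len(jokes)` runs once when the `range` is built and once for each of the 2n elements, so the interpreter does not hoist the call out of the comprehension for you.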