In Python, is there any difference between creating a generator object through a generator expression versus using the yield statement?
Using yield:
    def Generator(x, y):
        for i in xrange(x):
            for j in xrange(y):
                yield (i, j)
Using generator expression:
    def Generator(x, y):
        return ((i, j) for i in xrange(x) for j in xrange(y))
Both functions return generator objects, which produce tuples, e.g. (0,0), (0,1) etc.
Any advantages of one or the other? Thoughts?
There are only slight differences between the two. You can use the dis module to examine this sort of thing for yourself.
Edit: My first version decompiled the generator expression created at module scope in the interactive prompt. That’s slightly different from the OP’s version, where it is used inside a function. I’ve modified this to match the actual case in the question.
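For anyone who wants to reproduce the comparison, here is a minimal sketch of pointing the dis module at both versions. It assumes Python 3 (so range instead of xrange), and the names gen_yield, gen_expr and opnames are mine, not from the question. Exact opcodes differ between interpreter versions, so treat whatever it prints as illustrative rather than definitive.

```python
import dis


def gen_yield(x, y):
    # "yield" version: x and y are plain locals of the generator function.
    for i in range(x):
        for j in range(y):
            yield (i, j)


def gen_expr(x, y):
    # Generator-expression version: y is captured from the enclosing function.
    return ((i, j) for i in range(x) for j in range(y))


def opnames(gen):
    """Return the set of opcode names used by a generator object's code."""
    return {ins.opname for ins in dis.get_instructions(gen.gi_code)}


if __name__ == "__main__":
    # gi_code is the code object that actually runs when the generator is iterated.
    print("yield version:")
    dis.dis(gen_yield(2, 2).gi_code)
    print("generator expression:")
    dis.dis(gen_expr(2, 2).gi_code)
    # Opcodes that appear in only one of the two versions.
    print(sorted(opnames(gen_yield(2, 2)) ^ opnames(gen_expr(2, 2))))
```

Disassembling gi_code rather than the wrapper function means the generator expression's own bytecode is shown, not just the code that creates it.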
In the disassembly of the two functions, the “yield” generator (first case) has three extra instructions in the setup, but from the first FOR_ITER they differ in only one respect: the “yield” approach uses a LOAD_FAST in place of a LOAD_DEREF inside the loop. LOAD_DEREF is “rather slower” than LOAD_FAST, so the “yield” version is slightly faster than the generator expression for large enough values of x (the outer loop), because the value of y is loaded slightly faster on each pass. For smaller values of x it would be slightly slower because of the extra overhead of the setup code.
It might also be worth pointing out that the generator expression would usually be used inline in the code, rather than wrapped in a function like that. That would remove a bit of the setup overhead and keep the generator expression slightly faster for smaller loop values, even if LOAD_FAST gave the “yield” version an advantage otherwise.
In neither case would the performance difference be enough to justify choosing one over the other. Readability counts far more, so use whichever feels most readable for the situation at hand.
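If you would rather measure than read bytecode, a rough timing sketch along these lines should do. It assumes Python 3, and gen_yield, gen_expr and best_time are hypothetical names I picked for illustration. The loop sizes are arbitrary, and the numbers will vary by machine and interpreter version.

```python
import collections
import timeit


def gen_yield(x, y):
    for i in range(x):
        for j in range(y):
            yield (i, j)


def gen_expr(x, y):
    return ((i, j) for i in range(x) for j in range(y))


def best_time(factory, x, y, repeat=5, number=20):
    """Best-of-`repeat` time in seconds to create and exhaust the generator `number` times."""
    def run():
        # deque with maxlen=0 consumes the iterator without storing anything.
        collections.deque(factory(x, y), maxlen=0)
    return min(timeit.repeat(run, repeat=repeat, number=number))


if __name__ == "__main__":
    # A small and a larger outer loop, to see whether setup cost or per-pass cost dominates.
    for x, y in [(1, 10), (1000, 10)]:
        print(x, y, "yield:", best_time(gen_yield, x, y),
              "genexpr:", best_time(gen_expr, x, y))
```

Taking the minimum over several repeats reduces noise from other processes, but differences this small can still be swamped by it.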