I was fiddling around with Python’s generators and iterable classes, just for fun. Basically, I wanted to test something that I’ve never been too sure about: that classes in Python have significant overhead, and that it’s better to rely on functions that use yield instead of classes that implement the iterator protocol, if you can.
I couldn’t find a satisfying explanation of this topic on Google, so I decided to test it on my own using these two simple scripts: func_iter.py and class_iter.py.
Here’s func_iter.py:
#!/usr/bin/env python
import time

x = 0

def create_generator(num):
    mylist = range(num)
    for i in mylist:
        yield i

t = time.time()
gen = create_generator(100000)
for i in gen:
    x = x + i
print "%.3f" % (time.time() - t)
And here’s class_iter.py:
#!/usr/bin/env python
import time

x = 0

class Generator(object):
    def __init__(self, num):
        self.start = 0
        self.end = num

    def __iter__(self):
        return self

    def next(self):
        if self.start == self.end:
            raise StopIteration
        else:
            self.start = self.start + 1
            return self.start

t = time.time()
gen = Generator(100000)
for i in gen:
    x = x + i
print "%.3f" % (time.time() - t)
I then ran each of them 10 times using this in bash (for class_iter.py, for example):
for i in {1..10}; do ./class_iter.py; done
And here are the average running times for each of them, in seconds:
class_iter.py: 0.0864
func_iter.py: 0.0307
Now, my questions are:
- Are my methods correct? Is my comparison fair?
- If so, why the big difference? Why did class_iter.py take almost three times as long as func_iter.py to run?
- If not, how can I improve my methods or come up with a better comparison?
EDIT: As Dacav suggested, I also tried running func_iter.py using xrange instead of range. This decreases its average running time to 0.0263 seconds.
The class version spends a lot of time accessing its own instance variables: each self.whatever lookup costs cycles. If you define your __iter__ as a generator and minimize the use of instance variables, the difference between the class and function versions becomes negligible. In my timings, a second generator class written that way (Generator2) was almost as fast as the function version.
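The listing and timings behind this comparison are not reproduced here, so what follows is only a rough sketch of the idea, not the original code. The body of Generator2 and the bench helper are my own assumptions, and the code is written to run on both Python 2 and Python 3 (hence range and the next = __next__ alias). Like the scripts in the question, the function version yields 0..num-1 while the classes yield 1..num. No timings are shown because they depend on the machine and the interpreter.

```python
import timeit


def create_generator(num):
    # Function version, as in the question: yields 0 .. num-1.
    for i in range(num):
        yield i


class Generator(object):
    # Hand-written iterator, as in the question: every step reads and
    # writes attributes on self. Yields 1 .. num.
    def __init__(self, num):
        self.start = 0
        self.end = num

    def __iter__(self):
        return self

    def __next__(self):
        if self.start == self.end:
            raise StopIteration
        self.start = self.start + 1
        return self.start

    next = __next__  # Python 2 spelling of the same method


class Generator2(object):
    # Assumed reconstruction: __iter__ is itself a generator, so the loop
    # state lives in local variables and self.num is read once per pass.
    def __init__(self, num):
        self.num = num

    def __iter__(self):
        for i in range(1, self.num + 1):
            yield i


def bench(factory, num=100000, repeat=5):
    # Hypothetical helper: best-of-N wall-clock time for one full pass.
    return min(timeit.repeat(lambda: sum(factory(num)), number=1, repeat=repeat))


if __name__ == "__main__":
    for factory in (create_generator, Generator, Generator2):
        print("%-16s %.4f" % (factory.__name__, bench(factory)))
```

Taking the minimum over several in-process repeats with timeit tends to be less noisy than averaging ten separate runs from the shell.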
Please do note that Generator and Generator2 in the example are not fully equivalent; there are cases where you cannot simply replace a “plain” iterator with a generator (e.g. marshaling).
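To illustrate that caveat, here is a small sketch in which I read "marshaling" as serialization with the standard pickle module (that reading is an assumption on my part). CountUp and count_up are hypothetical names made up for this example. A plain iterator keeps its position in ordinary attributes that can be saved and restored, while a generator object is refused by pickle.

```python
import pickle


class CountUp(object):
    # Hypothetical "plain" iterator: all of its state is two integers.
    def __init__(self, num):
        self.current = 0
        self.end = num

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.end:
            raise StopIteration
        value = self.current
        self.current += 1
        return value

    next = __next__  # Python 2 spelling of the same method


def count_up(num):
    # Generator equivalent: its position lives in a suspended frame.
    for i in range(num):
        yield i


# Consume two items, then save the iterator's state and resume from it.
it = CountUp(5)
next(it)
next(it)
# Only the attribute dict is pickled here so the sketch does not depend on
# how the module is loaded; pickle.dumps(it) also works when the class is
# importable at module level.
saved = pickle.dumps(it.__dict__)
resumed = CountUp(0)
resumed.__dict__.update(pickle.loads(saved))
remaining = list(resumed)  # continues where `it` left off

# The same attempt on a half-consumed generator raises TypeError.
gen = count_up(5)
next(gen)
next(gen)
try:
    pickle.dumps(gen)
    generator_pickled = True
except TypeError:
    generator_pickled = False
```

So if the iterator ever has to be saved mid-iteration, the attribute-based class may be worth its extra per-item cost.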