Wondering about the performance impact of doing one iteration vs many iterations. I work in Python — I’m not sure if that affects the answer or not.
Consider trying to perform a series of data transformations to every item in a list.
def one_pass(my_list):
for i in xrange(0, len(my_list)):
my_list[i] = first_transformation(my_list[i])
my_list[i] = second_transformation(my_list[i])
my_list[i] = third_transformation(my_list[i])
return my_list
def multi_pass(my_list):
range_end = len(my_list)
for i in xrange(0, range_end):
my_list[i] = first_transformation(my_list[i])
for i in xrange(0, range_end):
my_list[i] = second_transformation(my_list[i])
for i in xrange(0, range_end):
my_list[i] = third_transformation(my_list[i])
return my_list
Now, apart from issues with readability, strictly in performance terms, is there a real advantage to one_pass over multi_pass? Assuming most of the work happens in the transformation functions themselves, wouldn’t each iteration in multi_pass only take roughly 1/3 as long?
The difference will be how often the values and code you’re reading are in the CPU’s cache.
If the elements of
my_listare large, but fit into the CPU cache, the first version may be beneficial. On the other hand, if the (byte)code of the transformations is large, caching the operations may be better than caching the data.Both versions are probably slower than the way more readable:
Timing it yields: