I’m trying to create a generator function:
    def combinations(iterable, r, maxGapSize):
        maxGapSizePlusOne = maxGapSize+1
        pool = tuple(iterable)
        n = len(pool)
        if r > n:
            return
        indices = list(range(r))
        while True:
            for i in reversed(range(r)):
                if indices[i] != i + n - r:
                    break
            else:
                return
            indices[i] += 1
            for j in range(i+1, r):
                indices[j] = indices[j-1] + 1
            previous = indices[0]
            for k in indices[1:]:
                if k-previous>maxGapSizePlusOne:
                    isGapTooBig = True
                    break
                previous = k
            else:
                isGapTooBig = False
            if not isGapTooBig:
                print(indices)

    combinations(("Aa","Bbb","Ccccc","Dd","E","Ffff",),2,1)
I’m printing out the indices that I wish to use to select the elements from the argument called ‘iterable’ for debugging purposes. This gives me:
[0, 2] [1, 2] [1, 3] [2, 3] [2, 4] [3, 4] [3, 5] [4, 5]
Ignoring [0,1] as this is produced elsewhere…
This is exactly what I want, but I’m guessing my code is overcomplicated and inefficient. The size of iterable is likely to be in the thousands, and it’s likely that maxGapSize < 5.
Any tips to help me do this better?
Much of your code looks exactly like the Python-equivalent code for itertools.combinations. The CPython implementation of itertools.combinations is written in C; its documentation shows Python-equivalent code.
You can speed up the function by simply using itertools.combinations instead of the Python-equivalent code, and you can use timeit to compare the relative speed of the original version against one using itertools.combinations.
Code written this way produces all combinations, including the initial combination, range(r). I think it is more beautiful to leave it that way. But if you really want to remove the first combination, you could skip the first item the generator yields.
By the way, the function combinations does not really depend on iterable. It only depends on the length of iterable. Therefore, it would be better to make the call signature
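A length-based version built on itertools.combinations might look roughly like the sketch below. The name gapped_index_combinations and its parameters (n, r, max_gap_size) are hypothetical, chosen here for illustration; they are not taken from the answer. The sketch assumes the same gap rule as the question's code: neighbouring indices may differ by at most maxGapSize + 1.

```python
from itertools import combinations as index_combinations


def gapped_index_combinations(n, r, max_gap_size):
    """Yield r-tuples of indices from range(n) whose neighbours skip at
    most max_gap_size positions (hypothetical helper, for illustration)."""
    max_step = max_gap_size + 1  # largest allowed difference between neighbours
    for indices in index_combinations(range(n), r):
        # Keep the tuple only if every adjacent pair is close enough.
        if all(b - a <= max_step for a, b in zip(indices, indices[1:])):
            yield indices


words = ("Aa", "Bbb", "Ccccc", "Dd", "E", "Ffff")
for indices in gapped_index_combinations(len(words), 2, 1):
    # The caller maps indices back to elements, so the generator
    # itself only needs the length of the sequence.
    print(indices, [words[i] for i in indices])
```

Unlike the question's output, this sketch also yields the initial tuple (0, 1); wrapping the generator in itertools.islice(gen, 1, None) would be one way to skip it. Note that the filter still walks every r-combination of range(n), so with n in the thousands and r larger than 2 it may remain slow even though the enumeration itself runs in C.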