I was recently trying to solve a task in Python and found a solution that seems to have a complexity of O(n log n), but I believe it is very inefficient for some inputs (such as the first parameter being 0 and values being a very long list of zeros).
It also has three levels of for loops. I believe it can be optimized, but at the moment I cannot optimize it any further; I am probably just missing something obvious 😉
So, basically, the problem is as follows:
Given a list of integers (values), the function needs to return the number of index pairs that meet the following criteria:
- let's assume a single index pair is a tuple like (index1, index2),
- then values[index1] == complementary_diff - values[index2] is true.
Example:
If given a list like [1, 3, -4, 0, -3, 5] as values and 1 as complementary_diff, the function should return 4 (which is the length of the following list of index pairs: [(0, 3), (2, 5), (3, 0), (5, 2)]).
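To make the criteria concrete, here is a minimal brute-force sketch that checks every ordered index pair directly. The name count_pairs_brute_force is hypothetical (it is not part of the original code), and the sketch assumes that a pair with index1 == index2 counts whenever it satisfies the condition, which matches what the set-based code in this question does. It is quadratic and only meant as a reference for checking faster versions.

```python
def count_pairs_brute_force(complementary_diff, values):
    """Count ordered pairs (index1, index2) where
    values[index1] == complementary_diff - values[index2].

    Hypothetical reference helper: O(n^2), for checking results only.
    """
    count = 0
    for left in values:          # value at index1
        for right in values:     # value at index2 (may be the same index)
            if left == complementary_diff - right:
                count += 1
    return count


print(count_pairs_brute_force(1, [1, 3, -4, 0, -3, 5]))  # prints 4
```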
This is what I have so far. It should work correctly most of the time, but, as I said, in some cases it could run very slowly despite my O(n log n) estimate (it looks like the worst-case complexity is O(n^2)).
def complementary_pairs_number(complementary_diff, values):
    value_key = {}  # dictionary storing indexes indexed by values
    for index, item in enumerate(values):
        try:
            value_key[item].append(index)
        except KeyError:  # the item has not been found in value_key's keys
            value_key[item] = [index]
    key_pairs = set()  # key pairs are unique by nature
    for pos_value in value_key:  # iterate through keys of value_key dictionary
        sym_value = complementary_diff - pos_value
        if sym_value in value_key:  # checks if the symmetric value has been found
            for i1 in value_key[pos_value]:  # iterate through pos_values' indexes
                for i2 in value_key[sym_value]:  # as above, through sym_values
                    # add indexes' pairs or ignore if already added to the set
                    key_pairs.add((i1, i2))
                    key_pairs.add((i2, i1))
    return len(key_pairs)
For the given example it behaves like that:
>>> complementary_pairs_number(1, [1, 3, -4, 0, -3, 5])
4
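The slow case mentioned above (complementary_diff equal to 0 and values being all zeros) can be illustrated with a small sketch. The helper name stored_pairs_for_all_zeros is hypothetical; it only mimics what the set of pairs ends up holding for that input, to show that every one of the n * n index pairs qualifies, so any approach that stores each pair has to do quadratic work.

```python
def stored_pairs_for_all_zeros(n):
    """Return how many index pairs a pair-collecting approach must store
    when values is n zeros and complementary_diff is 0 (hypothetical helper)."""
    values = [0] * n
    complementary_diff = 0
    pairs = {
        (i1, i2)
        for i1 in range(n)
        for i2 in range(n)
        if values[i1] == complementary_diff - values[i2]
    }
    return len(pairs)


for n in (10, 100):
    # every pair matches, so the set grows as n squared
    print(n, stored_pairs_for_all_zeros(n))
```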
If you see how the code could be “flattened” or “simplified”, please let me know.
I am not sure if just checking for complementary_diff == 0 etc. is the best approach – if you think it is, please let me know.
EDIT: I have corrected the example (thanks, unutbu!).
I think this improves the complexity to O(n):
- value_key.setdefault(item, []).append(index) is faster than using the try..except blocks. It is also faster than using a collections.defaultdict(list). (I tested this with ipython %timeit.)
- For each pos_value in value_key, there is a unique sym_value associated with pos_value. There are solutions when sym_value is also in value_key. But when we iterate over the keys in value_key, pos_value is eventually assigned the value of sym_value, which makes the code repeat a calculation it has already done. So you can cut the work in half if you can stop pos_value from equaling an old sym_value. I implemented that with a seen = set() to keep track of seen sym_values.
- The code only cares about len(key_pairs), not the key_pairs themselves. So instead of keeping track of the pairs (with a set), we can simply keep track of the count (with num_pairs). So we can replace the two inner for-loops with an update that adds 2 * len(value_key[pos_value]) * len(value_key[sym_value]) to num_pairs, or half that in the “unique diagonal” case, pos_value == sym_value.
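Putting the three points together, the function might look roughly like the sketch below. This is a reconstruction from the description above, not necessarily the exact code that was originally posted. The names seen and num_pairs come from the text, while the function name complementary_pairs_number_fast is a hypothetical one chosen to avoid clashing with the original.

```python
def complementary_pairs_number_fast(complementary_diff, values):
    """Count index pairs without storing them (sketch of the approach above)."""
    value_key = {}  # maps each value to the list of indexes where it occurs
    for index, item in enumerate(values):
        value_key.setdefault(item, []).append(index)

    num_pairs = 0
    seen = set()  # sym_values already handled from the other side
    for pos_value in value_key:
        if pos_value in seen:
            continue  # this combination was counted when its partner was visited
        sym_value = complementary_diff - pos_value
        seen.add(sym_value)
        if sym_value in value_key:
            n = len(value_key[pos_value]) * len(value_key[sym_value])
            if pos_value == sym_value:
                # "unique diagonal" case: (i1, i2) and (i2, i1) come from the same list
                num_pairs += n
            else:
                # count both (i1, i2) and (i2, i1)
                num_pairs += 2 * n
    return num_pairs


print(complementary_pairs_number_fast(1, [1, 3, -4, 0, -3, 5]))  # prints 4
```

With the all-zeros input and complementary_diff of 0, this does a single multiplication instead of building a set of n * n tuples.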