So I wrote this function that is given a list of possible numbers, and it has to find the two numbers in that list that add up to a given number. However, I am still learning Python (a very wonderful language) so I can only use a limited set of functions.
I created this function:
def sumPair(theList, n):
    theList = charCount(theList) #charCount is a function i made to convert the list into a dictionary
    for i in theList:
        for a,b in theList.iteritems():
            print a,b
            if a + i == n:
                if theList[b] > 1:
                    return [i, b]
                if a != i:
                    return [i, b]
    return "[]"
print sumPair([6,3,6,8,3,2,8,3,2], 11)
Like I said, it finds the two numbers that add up to the given number. charCount is a function I wrote that adds the array into a dictionary.
In this program, I make sure that the value is bigger than one in case the numbers that are being added are the same. Otherwise, if it checks for a sum of 10 and you give it a single 5, it will just add the 5 to itself and return 10. That’s why the if theList[b] > 1: check is there.
Why am I here? My instructor wasn’t happy with two loops. I spent 5 hours troubleshooting and got nowhere. I need to convert this program into a single loop program.
I spent all day on this, I’m not trying to make you do my homework, I’m just really stuck and I need your help. I’ve heard I’m supposed to check if a key exists to get this done.
It always helps to think about how you would solve the problem by hand, with pencil and paper, or even just by looking at the row of numbers on the paper. However, the better solutions may look overcomplicated at first, and their advantage may not be obvious at first glance; see gnibbler’s solution (his answer is my personal winner, see below).
First of all, you need to compare one number against all of the rest, then the second number against the rest, and so on. With the naive approach, there is no way to avoid two nested loops when using a single processor, so the time complexity is always O(n^2), where n is the length of the sequence. The truth is that some of the loops may be hidden in operations like the in operator or list.index(), which does not make the solution better in principle.
Imagine the Cartesian product of the numbers with themselves: it consists of pairs of the numbers. There are n^2 such pairs, but n of them pair a number with itself, and half of the remainder are duplicates because addition is commutative. It means that you need to check only (n^2 - n) / 2 pairs. It is much better to avoid looping through the unnecessary pairs than to test later whether they qualify: use slicing to get the rest of theList from the checked element on, and use enumerate() in the first (and possibly also in the second) loop to know the index.
It is always a good idea to minimize the operations inside loops; keep in mind that the inner loop body is executed the most times. For example, you can compute the searched number before entering the inner loop: searched = sum - first. Then the second loop plus the if can be replaced by a single membership test, that is, if searched is in the rest of theList.
[Edited after more full solutions appeared here]
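To make the pair counting concrete, here is a small sketch of the enumerate() plus slicing idea. The helper name unordered_pairs is made up for illustration, and the code is written for Python 3 (the question uses Python 2 print statements).

```python
def unordered_pairs(the_list):
    """Yield each pair of elements at distinct positions exactly once."""
    for index, first in enumerate(the_list):
        # Slicing from the next position on means (a, b) is never repeated
        # as (b, a), and an element is never paired with itself.
        for second in the_list[index + 1:]:
            yield first, second


numbers = [6, 3, 6, 8, 3, 2, 8, 3, 2]
pairs = list(unordered_pairs(numbers))
n = len(numbers)
print(len(pairs) == (n * n - n) // 2)  # prints True
```

This is still two nested loops; it only skips the pairs that never need to be checked.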
Here is the O(n^2) solution to find the first occurrence or None (pure Python, simple, no libraries, built-in functions and slicing only, few lines):
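A minimal sketch of such a function, assuming the name sum_pair and Python 3 syntax (both are choices made here for illustration). It follows the description above: enumerate() for the index, a slice for the rest of the list, and the in operator in place of an explicit inner loop.

```python
def sum_pair(the_list, n):
    """Return the first pair [first, searched] with first + searched == n, or None."""
    for index, first in enumerate(the_list):
        searched = n - first  # computed once per element of the outer loop
        # The `in` test is the hidden inner loop; the slice skips pairs
        # that were already checked and avoids pairing an element with itself.
        if searched in the_list[index + 1:]:
            return [first, searched]
    return None


print(sum_pair([6, 3, 6, 8, 3, 2, 8, 3, 2], 11))  # prints [3, 8]
```

Only one loop is visible, but the membership test still scans the slice, so the time complexity stays O(n^2).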
[added after gnibbler’s comment on slicing and copying]
gnibbler is right about slicing: a slice is a copy. (The question is whether slicing is optimized using a “copy on write” technique; I do not know. If it is, slicing would be a cheap operation for this purpose.) To avoid copying, the test can be done using the list.index() method, which accepts a starting index. The only strange thing is that it raises the ValueError exception when the item is not found. This way the if complement... test must be replaced by try ... except:
Gnibbler’s comment made me think more about the problem. The truth is that a set can be close to O(1) for testing whether it contains an element and O(n) to construct. It is not that clear for non-numeric elements (where the set type cannot be implemented as a bit array). When hash tables come into play and possible collisions must be resolved using other techniques, the quality depends on the implementation.
When in doubt, measure. Here gnibbler’s solution was slightly modified to match the other solutions:
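As a rough sketch of the two ideas discussed here (not the exact code from either answer): sum_pair_index uses list.index() with a start position inside try ... except, and sum_pair_counts uses a dictionary lookup in the spirit of the set-based approach. Both function names and the counting dictionary are assumptions made for illustration; the code is written for Python 3.

```python
def sum_pair_index(the_list, n):
    """Search in place with list.index() instead of copying a slice."""
    for index, first in enumerate(the_list):
        try:
            # Raises ValueError when n - first does not occur after position index.
            the_list.index(n - first, index + 1)
        except ValueError:
            continue
        return [first, n - first]
    return None


def sum_pair_counts(the_list, n):
    """One dictionary lookup per element, after a counting pass over the list."""
    counts = {}
    for value in the_list:  # preparation phase: O(n) time, O(n) extra memory
        counts[value] = counts.get(value, 0) + 1
    for first in the_list:
        searched = n - first
        # A value may pair with itself only if it occurs at least twice.
        if searched in counts and (searched != first or counts[first] > 1):
            return [first, searched]
    return None
```

The dictionary version has two loops, but they run one after the other rather than nested, so the expected running time is roughly linear in the length of the list.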
The original numbers from the question were used for the first timing test. Setting n = 1 causes the worst case, when the solution cannot be found:
It produces the following output on my console:
It is impossible to say anything about the quality in terms of time complexity from such a short sequence of numbers and one special case. Anyway, gnibbler’s solution won.
Gnibbler’s solution uses the most memory when the sequence contains unique values. Let’s try a much longer sequence containing 0, 1, 2, …, 9999. The values n = 11 and n = 3000 represent tasks that have a solution. For n = 30000, no pair of numbers can be found, so all elements must be checked (the worst case):
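A measurement of this kind could be set up roughly as follows with the standard timeit module. The function names, the benchmark helper, and its parameters are assumptions for illustration, and no timings are shown because they depend on the machine.

```python
import timeit


def sum_pair_slice(the_list, n):
    """Naive O(n^2) variant: membership test on a slice."""
    for index, first in enumerate(the_list):
        if n - first in the_list[index + 1:]:
            return [first, n - first]
    return None


def sum_pair_counts(the_list, n):
    """Dictionary variant: counting pass first, then one lookup per element."""
    counts = {}
    for value in the_list:
        counts[value] = counts.get(value, 0) + 1
    for first in the_list:
        searched = n - first
        if searched in counts and (searched != first or counts[first] > 1):
            return [first, searched]
    return None


def benchmark(size=10000, targets=(11, 3000, 30000), number=100):
    """Return {(function name, target): total seconds for `number` calls}."""
    sequence = list(range(size))  # 0, 1, 2, ..., size - 1
    results = {}
    for func in (sum_pair_slice, sum_pair_counts):
        for target in targets:
            elapsed = timeit.timeit(lambda: func(sequence, target), number=number)
            results[(func.__name__, target)] = elapsed
    return results
```

Calling benchmark() with the defaults runs the worst case of the naive variant 100 times, which can take a while; smaller values of size or number are enough for a quick check.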
Notice that the sequence is much longer. The test is repeated only 100 times to get the results in reasonable time. (The times cannot be compared with the previous test unless you divide them by the number of repetitions.) It displays the following on my console:
Here gnibbler’s solution seems to be slow for the non-worst case. The reason is that it needs a preparation phase that goes through the whole sequence. The naive solutions found the numbers in about one third of the first pass. What is telling is the worst case: gnibbler’s solution is about 1000 times faster, and the difference would increase for longer sequences. Gnibbler’s solution is the clear winner.