Problem 14 on Project Euler describes a certain puzzle that many people have asked about here. My question is not how to solve the problem or how to fix other people’s errors. After thinking about the puzzle, the following “solution” was written but appears to be wrong. Could someone explain my error?

def main():
    # start has all candidate numbers; found has known sequence numbers
    start, found = set(range(1, 1000000)), set()
    # if there are numbers in start, then there are still unfound candidates
    while start:
        # pick an arbitrary starting number to test in the sequence generator
        number = start.pop()
        # define the set of numbers that the generator created for study
        result = set(sequence(number, found))
        # remove them from the candidates since another number came first
        start -= result
        # record that these numbers are part of an already found sequence
        found |= result
    # whatever number was used last should yield the longest sequence
    print(number)


def sequence(n, found):
    # generate all numbers in the sequence defined by the problem
    while True:
        # since the first number begins the sequence, yield it back
        yield n
        # since 1 is the last sequence number, stop if we yielded it
        if n == 1:
            break
        # generate the next number in the sequence with binary magic
        n = 3 * n + 1 if n & 1 else n >> 1
        # if the new number was already found, this sequence is done
        if n in found:
            break


if __name__ == '__main__':
    main()

The documentation was added later and is hopefully clear enough to explain why I thought it would work.
After explaining the proposed solution to a colleague, the answer came to me: this solution does not take into consideration the length of sequences generated outside of the range of numbers being tested. Therefore, a new solution will need to be devised that considers the length of complete sequences.
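
To make that observation concrete, here is a minimal sketch that uses a hypothetical limit of 10 in place of 1,000,000. The helper name `collatz_chain` and the variable `limit` are illustrative assumptions and are not part of the program above. It shows that most of the chain starting at 9 lies outside the tested range, so any method that only reasons about in-range numbers cannot see the chain's true length.

```python
def collatz_chain(n):
    """Return the full Collatz chain that starts at n and ends at 1."""
    chain = [n]
    while n != 1:
        # same step as in the question: 3n + 1 for odd n, n / 2 for even n
        n = 3 * n + 1 if n & 1 else n >> 1
        chain.append(n)
    return chain


limit = 10  # hypothetical small range, standing in for 1,000,000
chain = collatz_chain(9)
# values the chain visits that are not valid starting numbers below the limit
outside = [value for value in chain if value >= limit]
print(len(chain), max(chain), len(outside))
```

The chain from 9 has 20 terms and climbs as high as 52, well past the limit, before it falls back to 1.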
To test the algorithm, the following program was written. The method works only when every sequence stays within the closed range being tested. Collatz sequences routinely leave that range, so the code fails.
The corrected version of the code follows a similar pattern in its design but keeps track of values outside of the original range. Execution time has been found to be similar to other Python solutions to the puzzle.
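
As a minimal sketch of that idea (an illustration under stated assumptions, not the corrected program itself), the following keeps a dictionary of chain lengths for every value encountered, including values outside the range of starting numbers. The function name `longest_chain_start` and the `lengths` dictionary are hypothetical names chosen for this example.

```python
def longest_chain_start(limit):
    """Return (start, length) of the longest Collatz chain with 1 <= start < limit.

    Length counts the number of terms in the chain, including the final 1.
    """
    lengths = {1: 1}  # known chain lengths, for values in or out of range
    best_start, best_length = 1, 1
    for start in range(2, limit):
        n, path = start, []
        # walk forward until reaching a value whose chain length is known
        while n not in lengths:
            path.append(n)
            n = 3 * n + 1 if n & 1 else n >> 1
        length = lengths[n]
        # unwind the path, recording a length for every value visited
        for value in reversed(path):
            length += 1
            lengths[value] = length
        # strict comparison keeps the smallest start when lengths tie
        if lengths[start] > best_length:
            best_start, best_length = start, lengths[start]
    return best_start, best_length


if __name__ == '__main__':
    print(longest_chain_start(10))
```

With a limit of 10 this reports 9 as the best start with a chain of 20 terms. Caching out-of-range values costs memory, but it means no part of any chain is ever walked twice.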