For concreteness, how would you read text line, and select and print one random line, when you don’t know the number of lines in advance?
Yes this is a problem from the programming pearl which I get confused.
The solution choose the 1st element, then select the second with probability 1/2, the third with 1/3, and so forth.
An algorithm:
i = 0
while more input lines
with probability 1.0/++i
choice = this input line
print choice
Suppose the final choice is the 3rd element, the probability is
1 x 1/2 x 1/3 x 3/4 x … x n-2/n-1 x n-1/n == 1/2n ? But 1/n should be correct.
Your algorithm is correct, but the analysis is not. You can prove it by induction. Loosely: It works for N = 1 of course. If it works up to N-1, then what happens at N? The chance that the Nth element is chosen and overwrites the last choice is 1/N — good. The chance that it isn’t is (N-1)/N. In which case the choice from the previous step is used. But at that point all elements had an 1/(N-1) chance of being chosen. Now it’s 1/N. Good.