Consider the following algorithm.
function Rand():
    return a uniformly random real between 0.0 and 1.0

function Sieve(n):
    assert(n >= 2)
    for i = 2 to n
        X[i] = true
    for i = 2 to n
        if (X[i])
            for j = i+1 to n
                if (Rand() < 1/i)
                    X[j] = false
    return X[n]
What is the probability that Sieve(k) returns true, as a function of k?
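For experimenting with the question numerically, the pseudocode can be transcribed into JavaScript and run many times. This is only a sketch: the names `sieve` and `estimate`, the injectable `rand` parameter, and the default trial count are assumptions made for illustration, and `Math.random` (which returns values in [0, 1)) stands in for Rand().

```javascript
// Direct transcription of Sieve(n). `rand` is a hypothetical hook so the
// random source can be swapped out; it defaults to Math.random.
function sieve(n, rand = Math.random) {
  if (!Number.isInteger(n) || n < 2) {
    throw new RangeError("n must be an integer >= 2");
  }
  const X = new Array(n + 1).fill(true); // indices 0 and 1 are unused
  for (let i = 2; i <= n; i++) {
    if (X[i]) {
      for (let j = i + 1; j <= n; j++) {
        if (rand() < 1 / i) X[j] = false;
      }
    }
  }
  return X[n];
}

// Fraction of `trials` independent runs in which sieve(k) returned true.
function estimate(k, trials = 100000, rand = Math.random) {
  let hits = 0;
  for (let t = 0; t < trials; t++) {
    if (sieve(k, rand)) hits++;
  }
  return hits / trials;
}
```

For example, `estimate(3)` should come out near 0.5, because X[3] can only be cleared by the single coin flip made in iteration i = 2.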
Let's define a family of random variables recursively.

Let X_{k,r} denote the indicator variable that takes value 1 iff X[k] == true at the end of the iteration in which the loop variable i took value r. To have fewer symbols, and since it matches the code more intuitively, we'll just write X_{k,i}. That is valid, although it would have been confusing in the definition itself: "i taking value i" is hard to read when the first i refers to the loop variable and the second to its value.

Now we note that:
P(X_{k,i} ~ 0) = P(X_{k,i-1} ~ 0) + P(X_{k,i-1} ~ 1) * P(X_{k-1,i-1} ~ 1) * 1/i
(~ is used in place of = inside P(...) just to keep it readable, since = would otherwise carry two separate meanings on the same line and look confusing.)
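This equality holds because X[k] can be false at the end of iteration i in exactly two ways: either it was already false at the end of iteration i-1, or it was true at that point, the test if (X[i]) passed so we entered the inner loop, and X[k] was changed with probability 1/i. The two cases are mutually exclusive, so there is no intersection and their probabilities simply add.

Two remarks on the middle factor. The code tests X[i], not X[k-1]; the formula can still use P(X_{k-1,i-1} ~ 1) because the first i-1 iterations treat every index above i-1 identically, so X[i] and X[k-1] are true with the same probability at that point. Also, multiplying P(X_{k,i-1} ~ 1) by that factor treats the two events as independent. They are not quite independent, since both depend on which earlier entries survived, so the recursion is exact for k <= 4 and an approximation from k = 5 on (it gives 215/576 for k = 5, while enumerating the cases gives 107/288).

The base of the recursion is simply the fact that P(X_{k,1} ~ 1) = 1 (nothing has been cleared before iteration 2) and P(X_{2,i} ~ 1) = 1 (X[2] is never cleared).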
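As a quick sanity check, the recursion can be unrolled by hand for the smallest non-trivial cases, starting from the base case. The block below is just that arithmetic written out; it uses an ordinary = inside P(...) where the text writes ~.

```latex
\begin{align*}
P(X_{3,2} = 0) &= P(X_{3,1} = 0) + P(X_{3,1} = 1)\,P(X_{2,1} = 1)\cdot\tfrac{1}{2}
                = 0 + 1\cdot 1\cdot\tfrac{1}{2} = \tfrac{1}{2},\\
P(X_{4,2} = 0) &= P(X_{4,1} = 0) + P(X_{4,1} = 1)\,P(X_{3,1} = 1)\cdot\tfrac{1}{2}
                = 0 + 1\cdot 1\cdot\tfrac{1}{2} = \tfrac{1}{2},\\
P(X_{4,3} = 0) &= P(X_{4,2} = 0) + P(X_{4,2} = 1)\,P(X_{3,2} = 1)\cdot\tfrac{1}{3}
                = \tfrac{1}{2} + \tfrac{1}{2}\cdot\tfrac{1}{2}\cdot\tfrac{1}{3} = \tfrac{7}{12}.
\end{align*}
```

So Sieve(3) returns true with probability 1/2 and Sieve(4) with probability 1 - 7/12 = 5/12, which matches what one gets by listing the coin flips for these two cases directly.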
Lastly, we note simply that P(X[k] == true) = P(X_{k,k-1} ~ 1).

This can be programmed rather easily. A JavaScript implementation with memoisation is straightforward (you can benchmark whether nested indices are better than string concatenation for the dictionary index; you could also redesign the calculation to keep the same runtime complexity but not run out of stack space by building bottom-up instead of top-down). Naturally the implementation will have a runtime complexity of O(k^2), so it's not practical for arbitrarily large numbers.

More interesting would be how the 1/i probability changes things, i.e. whether the probability converges to 0 or to some other value as k grows, and if so, how changing the 1/i affects that.
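One way the bottom-up variant might look is sketched below. It is not the original snippet, only an illustration: the function name `sieveProbability` and the two-row array layout are assumptions, and the code evaluates the recursion exactly as written above, including the step that multiplies the two probabilities together.

```javascript
// Bottom-up evaluation of
//   P(X_{m,i} ~ 0) = P(X_{m,i-1} ~ 0) + P(X_{m,i-1} ~ 1) * P(X_{m-1,i-1} ~ 1) * 1/i
// Only the previous row (i-1) is kept, so memory is O(k) and time is O(k^2).
function sieveProbability(k) {
  if (!Number.isInteger(k) || k < 2) {
    throw new RangeError("k must be an integer >= 2");
  }
  // prev[m] = P(X_{m,i-1} ~ 1); base case: everything is true before iteration 2.
  let prev = new Array(k + 1).fill(1);
  for (let i = 2; i <= k - 1; i++) {
    const cur = prev.slice(); // entries with index <= i no longer change
    for (let m = i + 1; m <= k; m++) {
      const pFalse = (1 - prev[m]) + (prev[m] * prev[m - 1]) / i;
      cur[m] = 1 - pFalse;
    }
    prev = cur;
  }
  return prev[k]; // P(X_{k,k-1} ~ 1)
}
```

Because it loops instead of recursing, this version avoids the stack-depth concern, and no dictionary key (string or nested) is needed. As a check, `sieveProbability(4)` evaluates to 5/12 up to floating-point rounding.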
Of course if you ask on mathSE you might get a better answer – this answer is pretty simplistic, I’m sure there is a way to manipulate it to acquire a direct formula.