I’m looking for the most efficient algorithm to randomly choose a set of n distinct integers, where all the integers are in some range [0..maxValue].
Constraints:
- maxValue is larger than n, and possibly much larger
- I don’t care if the output list is sorted or not
- all integers must be chosen with equal probability
My initial idea was to construct a list of the integers [0..maxValue] then extract n elements at random without replacement. But that seems quite inefficient, especially if maxValue is large.
Any better solutions?
For small values of `maxValue`, where it is reasonable to generate an array of all the integers in memory, you can use a variation of the Fisher-Yates shuffle that performs only the first `n` steps.

If `n` is much smaller than `maxValue` and you don’t wish to generate the entire array, you can use this algorithm:

1. Keep a sorted list `l` of the numbers picked so far, initially empty.
2. Pick a random number `x` between 0 and `maxValue` - (number of elements in `l`), inclusive.
3. For each number in `l`, in ascending order, if it is smaller than or equal to `x`, add 1 to `x`.
4. Insert the adjusted value of `x` into the sorted list and repeat until `l` has `n` elements.

If `n` is very close to `maxValue`, you can randomly pick the elements that aren’t in the result and then take the complement of that set.

Here is another algorithm that is simpler but has potentially unbounded execution time:

1. Keep a set `s` of the elements picked so far, initially empty.
2. Pick a number at random between 0 and `maxValue`.
3. If the number is not in `s`, add it to `s`.
4. Repeat from step 2 until `s` has `n` elements.

In practice, if `n` is small and `maxValue` is large, this will be good enough for most purposes.
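
To make the three approaches above concrete, here is a minimal Python sketch of each: the partial Fisher-Yates shuffle, the sorted-list method, and the simple retry method. The function names, the `rng` parameter, and the `_check` helper are my own illustrative choices, not part of the answer; the range is treated as inclusive, i.e. [0..maxValue].

```python
import bisect
import random


def _check(n, max_value):
    # Hypothetical helper: there are only max_value + 1 distinct values to draw from.
    if not 0 <= n <= max_value + 1:
        raise ValueError("n must be between 0 and max_value + 1")


def sample_partial_shuffle(n, max_value, rng=random):
    """Partial Fisher-Yates: build the full array, run only the first n swap steps."""
    _check(n, max_value)
    pool = list(range(max_value + 1))
    for i in range(n):
        j = rng.randrange(i, len(pool))  # choose from the not-yet-fixed tail
        pool[i], pool[j] = pool[j], pool[i]
    return pool[:n]


def sample_sorted_skip(n, max_value, rng=random):
    """Sorted-list method: draw from the shrinking range, then skip earlier picks."""
    _check(n, max_value)
    picked = []  # kept in ascending order
    for _ in range(n):
        x = rng.randint(0, max_value - len(picked))  # both bounds inclusive
        for value in picked:  # ascending order is required for correctness
            if value <= x:
                x += 1
            else:
                break  # every later value is larger still
        bisect.insort(picked, x)
    return picked


def sample_rejection(n, max_value, rng=random):
    """Retry method: draw until n distinct values are collected (unbounded worst case)."""
    _check(n, max_value)
    seen = set()
    while len(seen) < n:
        seen.add(rng.randint(0, max_value))  # duplicates are silently ignored
    return list(seen)
```

As written, the sorted-list version costs O(n) per pick for the scan and insertion, so roughly O(n²) overall, but it never allocates more than `n` values. If you are working in Python anyway, `random.sample(range(max_value + 1), n)` from the standard library does the same job without materializing the range.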