I have 1000 unique objects in a java.util.List, each referring to a unique image. I’d like to shuffle them so that I can take the first 20 objects and present them to the website user.
The user can then click a “Shuffle” button, at which point I retrieve the 1000 images again from scratch and call shuffle() again.
However, it seems that out of 1000 image objects, I very often see the same image again and again across the 20-image selections.
Something seems to be wrong. Any better suggestions or advice?
My code is very simple:
List<String> imagePaths = get1000Images();
Collections.shuffle(imagePaths);
int i = 0;
for (String path : imagePaths) {
    ... do something with the path ...
    i++;
    if (i >= 20) break;
}
I know that Collections.shuffle() is well distributed:
see for instance http://blog.ryanrampersad.com/2012/03/03/more-on-shuffling-an-array-correctly/
However, I just have the feeling that the probability of seeing the same image over and over again in a set of 20 images out of 1000 should be much less…
Any input is highly appreciated.
If you’re showing 20 images out of 1000, the probability of seeing at least one of those 20 repeated in the next iteration is approximately 0.34, so you shouldn’t be surprised to see images repeating.
The chance of a specific image landing in any given slot is still one in a thousand, but with twenty slots to fill and twenty previous images to match against, the chance of some repeat is much higher.
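We can calculate the probability of none of the previous 20 images repeating as:

$$\frac{980}{1000} \times \frac{979}{999} \times \cdots \times \frac{961}{981} = \prod_{i=0}^{19} \frac{980 - i}{1000 - i} \approx 0.665$$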
And so the probability of seeing a repeat is one minus this, or approximately 0.34.
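And the probability of seeing an image from the first set repeated in either of the next two iterations is:

$$1 - \left( \prod_{i=0}^{19} \frac{980 - i}{1000 - i} \right)^{2} \approx 1 - 0.665^{2} \approx 0.56$$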
In other words, it’s more likely than not that you’ll see a repeated image over the two following cycles. (And this isn’t including images repeated from the second cycle in the third, which will only make it more likely.)
For what it’s worth, here’s some Java code to do the above calculation:
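The original listing is not reproduced here, so what follows is only a minimal sketch of the calculation described above. The class name RepeatProbability and its method names are made up for illustration. It assumes each round is an independent, uniform shuffle of the same 1000 images, and it only counts repeats of images from the first set of 20.

```java
class RepeatProbability {

    // Probability that none of the `shown` images from one round appear
    // among the `shown` images of a single later, independent shuffle.
    // This is the product (total-shown)/total * (total-shown-1)/(total-1) * ...
    static double probabilityOfNoRepeat(int total, int shown) {
        double p = 1.0;
        for (int i = 0; i < shown; i++) {
            p *= (double) (total - shown - i) / (total - i);
        }
        return p;
    }

    // Probability that at least one image from the first round shows up
    // again in any of the next `rounds` independent shuffles.
    static double probabilityOfRepeat(int total, int shown, int rounds) {
        return 1.0 - Math.pow(probabilityOfNoRepeat(total, shown), rounds);
    }

    public static void main(String[] args) {
        System.out.printf("Repeat in the next round:      %.3f%n",
                probabilityOfRepeat(1000, 20, 1));
        System.out.printf("Repeat in the next two rounds: %.3f%n",
                probabilityOfRepeat(1000, 20, 2));
    }
}
```

Changing the arguments (say, 10 images shown instead of 20) gives a quick feel for how fast the repeat probability grows with the number of images displayed per round.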