I need to explain to the client why dupes are showing up between 2

Question

Asked: May 10, 2026 (2026-05-10T16:38:36+00:00)

I need to explain to the client why dupes are showing up between 2 supposedly different exams. It’s been 20 years since Prob and Stats.

I have a generated multiple-choice exam. There are 192 questions in the database, and 100 are chosen at random for each exam (no dupes within an exam).

Obviously, there is a 100% chance of there being at least 8 dupes between any two exams so generated. (Pigeonhole principle)
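The figure of 8 can be checked with inclusion-exclusion. As a sketch, write A and B for the two exams' question sets (symbols introduced here for illustration, not from the original post); their union cannot exceed the 192 questions in the database:

```latex
|A \cap B| = |A| + |B| - |A \cup B| \ge 100 + 100 - 192 = 8
```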

How do I calculate the probability of there being 25 dupes? 50 dupes? 75 dupes?

Edit after the fact: I ran this through Excel, summing the probabilities from n to 100. For this particular problem, the probabilities were:

n    P(n+ dupes)
40   97.5%
52   ~50%
61   ~0

1 Answer

score 0 · Answer 1 · 2026-05-10T16:38:36+00:00

Erm, this is really really hazy for me. But there are (192 choose 100) possible exams, right?

And there are (100 choose N) ways of picking N dupes, each with (92 choose 100-N) ways of picking the rest of the questions, no?

So isn’t the probability of picking N dupes just:

(100 choose N) * (92 choose 100-N) / (192 choose 100)

EDIT: So if you want the chances of N or more dupes instead of exactly N, you have to sum the top half of that fraction over every dupe count from N up to 100, then divide the total by (192 choose 100).

Errrr, maybe…
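For anyone who wants to check the numbers, here is a minimal Python sketch of the formula above and of the "N or more" sum. It assumes the two exams are drawn independently and uniformly from the same 192 questions. The names `POOL`, `EXAM`, `p_exact` and `p_at_least` are made up for this illustration.

```python
from math import comb

POOL = 192  # questions in the database (from the question)
EXAM = 100  # questions drawn per exam (from the question)


def p_exact(n, pool=POOL, exam=EXAM):
    """Probability that two exams share exactly n questions."""
    if n < 0 or n > exam:
        return 0.0
    # comb() returns 0 when asked for more items than are available,
    # so impossible counts (e.g. fewer than 8 dupes) come out as 0.
    ways = comb(exam, n) * comb(pool - exam, exam - n)
    return ways / comb(pool, exam)


def p_at_least(n, pool=POOL, exam=EXAM):
    """Probability that two exams share n or more questions."""
    start = max(n, 0)
    # Sum the numerators as exact integers, then divide once.
    ways = sum(
        comb(exam, k) * comb(pool - exam, exam - k)
        for k in range(start, exam + 1)
    )
    return ways / comb(pool, exam)


# The thresholds asked about in the question.
for n in (25, 50, 75):
    print(f"P({n} or more dupes) = {p_at_least(n):.6g}")
```

Summing the integer numerators before dividing keeps the arithmetic exact until the final step, which matters here because the individual terms are very large.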
