This question was asked in a Microsoft interview. I'm very curious to know why interviewers ask such strange questions on probability.
Given rand(N), a random generator that returns a random number from 0 to N-1:
int A[N]; // An array of size N
for (int i = 0; i < N; i++)
{
    int m = rand(N);   // first index, 0..N-1
    int n = rand(N);   // second index, 0..N-1 (may equal m)
    swap(A[m], A[n]);
}
EDIT: Note that the seed is not fixed.
What is the probability that array A remains the same?
Assume that the array contains unique elements.
Well, I had a little fun with this one. The first thing I thought of when I read the problem was group theory (the symmetric group S_n, in particular). The for loop simply builds a permutation σ in S_n by composing transpositions (i.e. swaps) on each iteration. My math is not all that spectacular and I’m a little rusty, so if my notation is off, bear with me.
Overview
Let A be the event that our array is unchanged after permutation. We are ultimately asked to find the probability of event A, Pr(A).
My solution works through the numbered steps below.
1) Possible Outcomes
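Notice that each iteration of the for loop creates a swap (or transposition) that results in one of two things (but never both): either two distinct elements are swapped (m ≠ n), or an element is swapped with itself (m = n), which leaves the array unchanged.
We label the second case: let’s define an identity transposition as a swap of an element with itself, that is, an iteration of the loop above in which m = n.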
For any given run of the listed code, we compose N transpositions. There can be 0, 1, 2, ... , N of the identity transpositions appearing in this “chain”.
For example, consider an N = 3 case in which one transposition swaps two distinct elements and the other two are identity transpositions. Note that there is an odd number of non-identity transpositions (1) and the array is changed.
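To make the N = 3 case concrete, here is a minimal sketch that replays one such chain. The particular swaps (0, 1), (1, 1), (2, 2) and the helper names apply_swaps and count_identities are my own illustration, not part of the problem statement.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// A chain of swaps, each stored as the index pair (m, n) drawn in one iteration.
using Chain = std::vector<std::pair<std::size_t, std::size_t>>;

// Apply every swap in the chain to a copy of the array and return the result.
// A pair with m == n is an identity transposition and leaves the array as it is.
std::vector<int> apply_swaps(std::vector<int> a, const Chain& chain) {
    for (const auto& s : chain) {
        assert(s.first < a.size() && s.second < a.size());
        std::swap(a[s.first], a[s.second]);
    }
    return a;
}

// Count the identity transpositions (m == n) in the chain.
std::size_t count_identities(const Chain& chain) {
    std::size_t k = 0;
    for (const auto& s : chain) {
        if (s.first == s.second) ++k;
    }
    return k;
}
```

Starting from [0, 1, 2], the chain (0, 1), (1, 1), (2, 2) contains two identity transpositions and one real swap, and apply_swaps returns [1, 0, 2].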
2) Partitioning Based On the Number of Identity Transpositions
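Let K_i be the event that i identity transpositions appear in a given permutation. Note this forms an exhaustive partition of all possible outcomes: no permutation can contain two different numbers of identity transpositions at once, and every permutation contains between 0 and N identity transpositions.
Thus we can apply the Law of Total Probability:

$$\Pr(A) = \sum_{i=0}^{N} \Pr(A \mid K_i)\,\Pr(K_i)$$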
Now we can finally take advantage of the partition. Note that when the number of non-identity transpositions, N - i, is odd, there is no way the array can go unchanged*. Thus:

$$\Pr(A) = \sum_{i\,:\,N-i \text{ even}} \Pr(A \mid K_i)\,\Pr(K_i)$$
*From group theory, a permutation is even or odd but never both. Therefore an odd permutation cannot be the identity permutation (since the identity permutation is even).
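As a worked instance of the parity argument (written out by me for illustration): with N = 4, only i = 0, 2, 4 leave an even number of non-identity transpositions, so the sum would reduce to three terms.

```latex
\Pr(A) = \Pr(A \mid K_0)\,\Pr(K_0) + \Pr(A \mid K_2)\,\Pr(K_2) + \Pr(A \mid K_4)\,\Pr(K_4)
```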
3) Determining Probabilities
So we now must determine two probabilities for N - i even: Pr(K_i) and Pr(A | K_i).
The First Term
The first term, Pr(K_i), represents the probability of obtaining a permutation with i identity transpositions. This turns out to be binomial since for each iteration of the for loop the outcome is independent of the other iterations, and the probability of creating an identity transposition is always the same, namely 1/N.
Thus for N trials, the probability of obtaining i identity transpositions is:

$$\Pr(K_i) = \binom{N}{i} \left(\frac{1}{N}\right)^{i} \left(1 - \frac{1}{N}\right)^{N-i}$$

The Second Term
So if you’ve made it this far, we have reduced the problem to finding Pr(A | K_i) for N - i even. This represents the probability of obtaining an identity permutation given i of the transpositions are identities. I use a naive counting approach to determine the number of ways of achieving the identity permutation over the number of possible permutations.
First consider the permutations (n, m) and (m, n) equivalent. Then, let M be the number of non-identity permutations possible. We will use this quantity frequently.

$$M = \binom{N}{2} = \frac{N(N-1)}{2}$$

With i identity transpositions in the chain, the number of possible permutations (the denominator of Pr(A | K_i)) is C(N, i) ways to position the identities, times N^i choices for the identities, times M^(N - i) choices for the remaining transpositions:

$$\binom{N}{i}\, N^{i}\, M^{N-i}$$

The goal here is to determine the number of ways a collection of transpositions can be combined to form the identity permutation. I will try to construct the general solution alongside an example of N = 4.

Let’s consider the N = 4 case with all identity transpositions (i.e. i = N = 4). Let X represent an identity transposition. For each X, there are N possibilities (they are: n = m = 0, 1, 2, ... , N - 1). Thus there are N^i = 4^4 possibilities for achieving the identity permutation. For completeness, we add the binomial coefficient, C(N, i), to consider ordering of the identity transpositions (here it just equals 1). I’ve tried to depict this below with the physical layout of elements above and the number of possibilities below:

X   X   X   X
N * N * N * N * C(4, 4)

Now without explicitly substituting N = 4 and i = 4, we can look at the general case. Combining the above with the denominator found previously, we find:

$$\Pr(A \mid K_N) = \frac{\binom{N}{N}\, N^{N}}{\binom{N}{N}\, N^{N}} = 1$$

This is intuitive. In fact, any value other than 1 should probably alarm you. Think about it: we are given the situation in which all N transpositions are said to be identities. What’s the probability that the array is unchanged in this situation? Clearly, 1.

Now, again for N = 4, let’s consider 2 identity transpositions (i.e. i = N - 2 = 2). As a convention, we will place the two identities at the end (and account for ordering later). We know now that we need to pick two transpositions which, when composed, will become the identity permutation. Let’s place any element in the first location, call it t1. As stated above, there are M possibilities supposing t1 is not an identity (it can’t be as we have already placed two).

The only element left that could possibly go in the second spot is the inverse of t1, which is in fact t1 (and this is the only one by uniqueness of inverse). We again include the binomial coefficient: in this case we have 4 open locations and we are looking to place 2 identity permutations. How many ways can we do that? 4 choose 2.

Again looking at the general case, this all corresponds to:

$$\Pr(A \mid K_{N-2}) = \frac{\binom{N}{N-2}\, N^{N-2} \cdot M \cdot 1}{\binom{N}{N-2}\, N^{N-2}\, M^{2}} = \frac{1}{M}$$
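As a numeric check for N = 4 and i = 2 (so M = 6), and assuming the number of possible chains with i identities is C(N, i) * N^i * M^(N - i), the counting above works out as follows. This is my own arithmetic, offered as a sanity check rather than part of the derivation.

```latex
\Pr(A \mid K_2)
  = \frac{\binom{4}{2} \cdot 4^{2} \cdot 6 \cdot 1}{\binom{4}{2} \cdot 4^{2} \cdot 6^{2}}
  = \frac{576}{3456}
  = \frac{1}{6}
```

This agrees with the 6 / 36 entry for i = 2 in the table in step 4.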
Finally we do the N = 4 case with no identity transpositions (i.e. i = N - 4 = 0). Since there are a lot of possibilities, it starts to get tricky and we must be careful not to double count. We start similarly by placing a single element in the first spot and working out possible combinations. Take the easiest first: the same transposition 4 times, t1 t1 t1 t1, which gives M possibilities.

Let’s now consider two unique elements t1 and t2. There are M possibilities for t1 and only M - 1 possibilities for t2 (since t2 cannot be equal to t1). If we exhaust all arrangements, we are left with the following patterns, each with M * (M - 1) possibilities:

(1) t1 t1 t2 t2
(2) t1 t2 t1 t2
(3) t1 t2 t2 t1

Now let’s consider three unique elements, t1, t2, t3. Let’s place t1 first and then t2. As usual, we have M possibilities for t1.

We can’t say yet how many possible t2s there can be, and we will see why in a minute.

We now place t1 in the third spot. Notice, t1 must go there since if it were to go in the last spot, we would just be recreating the (3)rd arrangement above (t1 t2 t2 t1). Double counting is bad! This leaves the third unique element t3 to the final position.

So why did we have to take a minute to consider the number of t2s more closely? The transpositions t1 and t2 cannot be disjoint permutations (i.e. they must share one (and only one since they also cannot be equal) of their n or m). The reason for this is that if they were disjoint, we could swap the order of permutations. This means we would be double counting the (1)st arrangement.

Say t1 = (n, m). t2 must be of the form (n, x) or (y, m) for some x and y in order to be non-disjoint. Note that x may not be n or m and y may not be n or m. Thus, the number of possible permutations that t2 could be is actually 2 * (N - 2).

So, coming back to our layout:

t1    t2             t1    t3
M  *  2 * (N - 2)  *  1  *  ?
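Now t3 must be the inverse of the composition of t1 t2 t1. Let’s do it out manually, taking t1 = (n, m) and t2 = (n, x):

$$t_1 t_2 t_1 = (n\ m)(n\ x)(n\ m) = (m\ x)$$

Thus t3 must be (m, x), since a transposition is its own inverse. Note this is not disjoint to t1 and not equal to either t1 or t2 so there is no double counting for this case.

Finally, putting all of these together for N = 4:

$$\Pr(A \mid K_0) = \frac{M + 3M(M-1) + 2M(N-2)}{M^{4}} = \frac{6 + 90 + 24}{1296} = \frac{120}{1296}$$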
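The N = 4 total can be cross-checked by brute force. The sketch below (the names transpositions, extend and count_identity_chains are mine) enumerates every chain of k non-identity transpositions, treating (n, m) and (m, n) as the same, and counts the chains that restore the array. It checks only the overall total, not the individual patterns counted above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

using Transposition = std::pair<int, int>;

// All M = N(N-1)/2 non-identity transpositions on indices 0..N-1,
// with (n, m) and (m, n) counted once.
std::vector<Transposition> transpositions(int N) {
    std::vector<Transposition> ts;
    for (int a = 0; a < N; ++a)
        for (int b = a + 1; b < N; ++b)
            ts.emplace_back(a, b);
    return ts;
}

// Try every transposition in the next slot; once the chain is complete,
// report 1 if the array is back in its starting order and 0 otherwise.
static std::uint64_t extend(std::vector<int>& a,
                            const std::vector<Transposition>& ts,
                            int remaining) {
    if (remaining == 0) {
        for (std::size_t k = 0; k < a.size(); ++k)
            if (a[k] != static_cast<int>(k)) return 0;
        return 1;
    }
    std::uint64_t total = 0;
    for (const auto& t : ts) {
        std::swap(a[t.first], a[t.second]);
        total += extend(a, ts, remaining - 1);
        std::swap(a[t.first], a[t.second]);  // undo before trying the next one
    }
    return total;
}

// Number of chains of k non-identity transpositions that compose to the identity.
std::uint64_t count_identity_chains(int N, int k) {
    assert(N >= 2 && k >= 0);
    std::vector<int> a(static_cast<std::size_t>(N));
    std::iota(a.begin(), a.end(), 0);  // distinct elements 0..N-1
    return extend(a, transpositions(N), k);
}
```

For N = 4, count_identity_chains(4, 2) gives 6 (out of 6^2 = 36 chains) and count_identity_chains(4, 4) gives 120 (out of 6^4 = 1296), while odd k gives 0, as the parity argument predicts.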
4) Putting it all together
So that’s it. Work backwards, substituting what we found into the original summation given in step 2. I computed the answer to the N = 4 case below. It matches the empirical number found in another answer very closely!

N = 4
M = 6

        | Pr(K_i) | Pr(A | K_i) | Product |
  i = 0 |  0.316  | 120 / 1296  |  0.029  |
  i = 2 |  0.211  |   6 / 36    |  0.035  |
  i = 4 |  0.004  |   1 / 1     |  0.004  |
                           Sum: |  0.068  |

Correctness
It would be cool if there was a result in group theory to apply here, and maybe there is! It would certainly help make all this tedious counting go away completely (and shorten the problem to something much more elegant). I stopped working at N = 4. For N > 5, what is given only gives an approximation (how good, I’m not sure). It is pretty clear why that is if you think about it: for example, given N = 8 transpositions, there are clearly ways of creating the identity with four unique elements which are not accounted for above. The number of ways becomes seemingly more difficult to count as the permutation gets longer (as far as I can tell…).

Anyway, I definitely couldn’t do something like this within the scope of an interview. I would get as far as the denominator step if I was lucky. Beyond that, it seems pretty nasty.
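For small N, the whole calculation can also be checked exactly by walking through every possible run of the loop. This is only a sketch under the assumption that rand(N) is uniform and that draws are independent, so all N^(2N) runs are equally likely. The names Tally, walk, enumerate_runs and probability_unchanged are mine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

// Counts over all N^(2N) runs, indexed by the number i of identity
// transpositions (iterations with m == n) in the run.
struct Tally {
    std::vector<std::uint64_t> runs;       // runs[i]: runs with exactly i identities
    std::vector<std::uint64_t> unchanged;  // unchanged[i]: those that restore the array
    std::uint64_t total = 0;               // all runs
};

// Depth-first walk over every choice of (m, n) in every iteration.
static void walk(std::vector<int>& a, int step, int identities, Tally& t) {
    const int N = static_cast<int>(a.size());
    if (step == N) {
        bool same = true;
        for (int k = 0; k < N; ++k) same = same && (a[k] == k);
        ++t.total;
        ++t.runs[identities];
        if (same) ++t.unchanged[identities];
        return;
    }
    for (int m = 0; m < N; ++m) {
        for (int n = 0; n < N; ++n) {
            std::swap(a[m], a[n]);
            walk(a, step + 1, identities + (m == n ? 1 : 0), t);
            std::swap(a[m], a[n]);  // undo before trying the next pair
        }
    }
}

Tally enumerate_runs(int N) {
    assert(N >= 1 && N <= 5);  // 65536 runs for N = 4, 9765625 for N = 5
    Tally t;
    t.runs.assign(static_cast<std::size_t>(N) + 1, std::uint64_t{0});
    t.unchanged.assign(static_cast<std::size_t>(N) + 1, std::uint64_t{0});
    std::vector<int> a(static_cast<std::size_t>(N));
    std::iota(a.begin(), a.end(), 0);  // distinct elements 0..N-1
    walk(a, 0, 0, t);
    return t;
}

// Exact Pr(A) as a ratio of counts.
double probability_unchanged(const Tally& t) {
    const std::uint64_t hits =
        std::accumulate(t.unchanged.begin(), t.unchanged.end(), std::uint64_t{0});
    return static_cast<double>(hits) / static_cast<double>(t.total);
}
```

For N = 4 this finds 4480 unchanged runs out of 65536, i.e. 0.068359375, split as 1920 (i = 0), 2304 (i = 2) and 256 (i = 4), with none for odd i. Dividing each by 65536 reproduces the Product column of the table, and runs[i] / total reproduces the Pr(K_i) column.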