Suppose you have an array A[1..n] of size n whose elements all come from the set {1..n}. However, two of the values in {1..n} are missing from the array (so some of the array elements are repeated). Find the missing elements.
E.g. if n = 5, A may be {1,2,1,3,2}, and so the missing elements are {4,5}.
The approach I used was:
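int flag[n];                       /* flag[v-1] becomes 1 once value v is seen */
int i;
memset(flag, 0, sizeof flag);      /* a variable-length array cannot take an initializer; needs <string.h> */
for (i = 0; i < n; i++) {
    flag[A[i] - 1] = 1;
}
for (i = 0; i < n; i++) {
    if (!flag[i]) {
        printf("missing: %d\n", i + 1);
    }
}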
The space complexity comes to O(n). I feel this is a very naive and inefficient approach. Could you please suggest a better algorithm with better space and time complexity?
Theoretically, it is possible to do this in O(1) space (in the RAM model, i.e. O(1) words) and O(n) time, even with a read-only array.
Algorithm
Assume the missing numbers are x and y.
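There are two possibilities for the array: either one element appears three times (and every other element present appears once), or two elements appear twice each.
Case 1: one element appears three times. For this case, the bucketed XOR trick will work.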
Do a XOR of all elements of the array with 1,2,…,n.
You end up with z = x XOR y.
Since x and y are distinct, there is at least one bit of z which is non-zero.
Now split both the array elements and the numbers 1,2,…,n into two buckets based on that bit, and do a XOR pass again, XORing each bucket separately.
The two bucket results are x and y.
Once you have the x and y, you can confirm if these are indeed the missing elements.
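The XOR steps above, including the confirmation pass, could be sketched in C roughly as follows. The function name find_missing_xor and its return convention (1 on success, 0 if the confirmation fails) are assumptions made for this illustration, and the array is taken to be 0-indexed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (name and signature are assumptions for this sketch).
   A holds n values from 1..n with two values missing. If the bucketed XOR
   candidates pass the confirmation scan, stores them in *x < *y and
   returns 1; otherwise returns 0. Uses O(1) extra space and O(n) time. */
int find_missing_xor(const int *A, int n, int *x, int *y)
{
    unsigned z = 0, bit, u = 0;
    int i, cx, cy;

    assert(A != NULL && n >= 3);

    /* First pass: XOR every array element with 1..n. */
    for (i = 0; i < n; i++)
        z ^= (unsigned)A[i] ^ (unsigned)(i + 1);
    if (z == 0)
        return 0;                 /* no distinguishing bit available */

    bit = z & (~z + 1u);          /* lowest set bit of z */

    /* Second pass: XOR only the bucket whose members have that bit set. */
    for (i = 0; i < n; i++) {
        if ((unsigned)A[i] & bit)
            u ^= (unsigned)A[i];
        if ((unsigned)(i + 1) & bit)
            u ^= (unsigned)(i + 1);
    }
    cx = (int)u;
    cy = (int)(u ^ z);            /* the other bucket's result */

    /* Confirmation: both candidates must lie in 1..n and be absent from A. */
    if (cx < 1 || cx > n || cy < 1 || cy > n)
        return 0;
    for (i = 0; i < n; i++)
        if (A[i] == cx || A[i] == cy)
            return 0;

    *x = cx < cy ? cx : cy;
    *y = cx < cy ? cy : cx;
    return 1;
}
```

For A = {1,1,1,2,3} with n = 5 this reports 4 and 5. For the question's example {1,2,1,3,2} the confirmation fails and the function returns 0, which is the signal to fall back to the other case.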
If it so happens that the confirmation step fails, then we must have the second case, where two elements appear twice each.
Case 2: let the two repeated elements be a and b (x and y are the missing ones).
Let
S_k = 1^k + 2^k + ... + n^k
For instance
S_1 = n(n+1)/2, S_2 = n(n+1)(2n+1)/6
etc. Now we compute seven things:
T_k = A[1]^k + A[2]^k + ... + A[n]^k = S_k + a^k + b^k - x^k - y^k, for k = 1, 2, ..., 7
Note, we can use O(1) words (instead of one) to deal with the overflow issues. (I estimate 8-10 words will be enough).
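Let
Ci = T_i - S_i = a^i + b^i - x^i - y^i
Now assume that a, b, x, y are the roots of the 4th degree polynomial
P(z) = z^4 + pz^3 + qz^2 + rz + s
Now we try to transform the above seven equations into four linear equations in p, q, r, s.
For instance, if we do
4th Eqn + p * 3rd Eqn + q * 2nd Eqn + r * 1st Eqn
we get
C4 + p*C3 + q*C2 + r*C1 = 0
(the left-hand side equals P(a) + P(b) - P(x) - P(y), in which the constant terms s cancel). Similarly we get
C5 + p*C4 + q*C3 + r*C2 + s*C1 = 0
C6 + p*C5 + q*C4 + r*C3 + s*C2 = 0
C7 + p*C6 + q*C5 + r*C4 + s*C3 = 0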
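As a sanity check, here is the first of these equations worked out on the question's example A = {1,2,1,3,2} with n = 5, so that a = 1, b = 2, x = 4, y = 5. This is only an illustration with the roots already known. The algorithm itself goes the other way and recovers p, q, r, s from the C values.

```latex
\begin{align*}
C_1 &= 1 + 2 - 4 - 5 = -6, &
C_2 &= 1 + 4 - 16 - 25 = -36,\\
C_3 &= 1 + 8 - 64 - 125 = -180, &
C_4 &= 1 + 16 - 256 - 625 = -864,\\
P(z) &= (z-1)(z-2)(z-4)(z-5) = z^4 - 12z^3 + 49z^2 - 78z + 40,\\
C_4 + pC_3 + qC_2 + rC_1 &= -864 + 2160 - 1764 + 468 = 0.
\end{align*}
```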
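These are four linear equations in p, q, r, s which can be solved by Linear Algebra techniques like Gaussian Elimination.
Note that p, q, r, s will be rationals and so can be computed only with integer arithmetic.
Now suppose we are given a solution p, q, r, s to the above set of equations. Consider
P(z) = z^4 + pz^3 + qz^2 + rz + s
What the above equations are saying is basically
P(a) + P(b) - P(x) - P(y) = 0
aP(a) + bP(b) - xP(x) - yP(y) = 0
a^2 P(a) + b^2 P(b) - x^2 P(x) - y^2 P(y) = 0
a^3 P(a) + b^3 P(b) - x^3 P(x) - y^3 P(y) = 0
Now the matrix
1     1     -1     -1
a     b     -x     -y
a^2   b^2   -x^2   -y^2
a^3   b^3   -x^3   -y^3
has the same determinant as the Vandermonde matrix and thus is invertible, if a, b, x, y are distinct. Thus we must have that
P(a) = P(b) = P(x) = P(y) = 0
Now check which of 1, 2, 3, ..., n are roots of z^4 + pz^3 + qz^2 + rz + s = 0. Those roots are a, b, x, y, and the two that do not occur in the array are the missing elements. Thus this is a linear time constant space algorithm.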
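For illustration only, the power-sum steps above might look like this in C. This is a sketch under loud assumptions: the name find_missing_power_sums is made up here, the array is 0-indexed, and double-precision floating point is swapped in for the exact rational arithmetic and multi-word integers described above, so it is only trustworthy for small n where the 7th powers are represented exactly.

```c
#include <assert.h>
#include <stddef.h>

static double dabs(double v) { return v < 0 ? -v : v; }

/* Hypothetical helper (name and signature are assumptions for this sketch).
   Handles the case where two values a, b each appear twice. Stores the
   missing values in *x < *y and returns 1, or returns 0 on failure.
   ASSUMPTION: doubles replace exact arithmetic, so small n only. */
int find_missing_power_sums(const int *A, int n, int *x, int *y)
{
    double C[8] = {0.0};   /* C[k] = T_k - S_k; C[0] stays 0 */
    double M[4][5];        /* augmented system for p, q, r, s */
    double c[4];           /* solved p, q, r, s */
    int out[2], found = 0;
    int i, j, k, col, z;

    assert(A != NULL && n >= 4);

    /* One pass: accumulate A[i]^k - (i+1)^k for k = 1..7. */
    for (i = 0; i < n; i++) {
        double pa = 1.0, pw = 1.0;
        for (k = 1; k <= 7; k++) {
            pa *= A[i];
            pw *= i + 1;
            C[k] += pa - pw;
        }
    }

    /* Row j: C[4+j] + p*C[3+j] + q*C[2+j] + r*C[1+j] + s*C[j] = 0. */
    for (j = 0; j < 4; j++) {
        M[j][0] = C[3 + j];
        M[j][1] = C[2 + j];
        M[j][2] = C[1 + j];
        M[j][3] = C[j];
        M[j][4] = -C[4 + j];
    }

    /* Gauss-Jordan elimination with partial pivoting. */
    for (col = 0; col < 4; col++) {
        int piv = col;
        for (j = col + 1; j < 4; j++)
            if (dabs(M[j][col]) > dabs(M[piv][col]))
                piv = j;
        if (dabs(M[piv][col]) < 1e-9)
            return 0;      /* singular system: not this case */
        for (k = 0; k < 5; k++) {
            double t = M[col][k];
            M[col][k] = M[piv][k];
            M[piv][k] = t;
        }
        for (j = 0; j < 4; j++) {
            double f;
            if (j == col)
                continue;
            f = M[j][col] / M[col][col];
            for (k = col; k < 5; k++)
                M[j][k] -= f * M[col][k];
        }
    }
    for (j = 0; j < 4; j++)
        c[j] = M[j][4] / M[j][j];

    /* Roots of P among 1..n are a, b, x, y; keep those absent from A. */
    for (z = 1; z <= n; z++) {
        double zz = z;
        double P = (((zz + c[0]) * zz + c[1]) * zz + c[2]) * zz + c[3];
        int present = 0;
        if (dabs(P) >= 0.5)
            continue;      /* P is an integer at true roots, so 0.5 is a safe cutoff */
        for (i = 0; i < n; i++)
            if (A[i] == z)
                present = 1;
        if (!present) {
            if (found == 2)
                return 0;
            out[found++] = z;
        }
    }
    if (found != 2)
        return 0;
    *x = out[0];
    *y = out[1];
    return 1;
}
```

On the question's example {1,2,1,3,2} with n = 5 this yields 4 and 5. The membership scan runs only for the at most four roots, so the whole routine stays at O(n) time and O(1) extra space. A version meant for large n would need the wider exact integers mentioned above rather than doubles.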
Code
I wrote the following C# (.Net 4.0) code and it seems to work for the few samples I tried… (Note: I didn’t bother catering to case 1 above).
The output is