Processing math: 100%

Monday, December 1, 2014

Solution to find the three missing numbers

The puzzle (posted earlier):

You have an array of 2n+3 positive integers, such that each element appears exactly twice in the array, except for three of them (say a,b,c) which appear exactly once.

Can you give an O(n) time and O(1) space (assuming RAM model) algorithm to find out what the three unique integers a,b,c are?

Solution

If we had just one number missing, this would be the classic puzzle whose solution is to XOR (bitwise exclusive or, denoted ) all the numbers. The result will be the unique number.

A similar solution works in the two unique case (missing a,b), where you XOR all the number together, and use a set bit position in the result (which is ab) to differentiate between the two unique numbers by making another pass and bucketing based on that bit. [See Collins' comments on the question post for more details.]

When you try to apply that to the three unique case, we compute S=abc, but this might well be 0. Even if some bit of S was set to 1, the corresponding bits of a,b,c could all be 1 and we cannot be sure of using that position to distinguish one of them.

But, consider this: If S differed from a at some bit position, then that bit position could be used as a distinguishing bit for a,b,c. Because if a,b,c were all the same at that position, then S would be too, and cannot differ from a. S will always differ from some bit of a as otherwise we will S=a and thus b=c.

But, we don't really know what a,b,c are, to find out a differing bit.

That does not matter though, if we consider the smallest position where they might differ.

If f(x,y) is the smallest bit position (rightmost bit position being 0) where x and y differ, then we cannot have that f(S,a)=f(S,b)=f(S,c), as otherwise a,b,c would agree at that position and thus would not differ from S at that position. [Note that f(x,y) can be computed in O(1) time and space by using the classic bithack: x&(x1).]

Thus the algorithm becomes:

Compute S=abc by doing an XOR of all ai.

Now make another pass through the array and compute the XOR T of all 2f(S,ai), where ai are the elements of the array.

Note that 2f(S,ai) can be computed in O(1) time, and would fit in the underlying register (so O(1) space). We also ignore ai for which S=ai.

Now, T cannot be 0 as it is the bitwise XOR of three powers of 2. Thus, a set bit of T can now be used to distinguish one of a,b,c, and we can then fall back to the two unique case to find the other two.

No comments:

Post a Comment