Algorithms from "100 Interview Questions for Software Developers" : Part 3

Find repeating element in array

I felt this  question deserves a post of its own. So here it is :

 Q5. In an array from 1 to n, one number is present twice. How to do you determine which one?

There are multiple answers - all correct. We would, obviously, aim for the optimized one - benchmark parameters being time & space complexity

Source : Algorithms from 100 Interview Questions for Software Developers

10 comments:

SV said...

The best solution I could get had the worst case time comparison of n(n-1)/2

I tried applying the mergesort kinda logic as well. Anything better?

Madhur Kumar Tanwani said...

Solution :
Find summation of the elements in the array - lets say this is 'x'

Since we know the numbers are from 1 to n, the summation of first 'n' natural numbers is n(n+1)/2 - lets say this is 'y'.

The number that is repeating is
x - y

Time complexity : O(n)
Space complexity : O(1)

Madhur Kumar Tanwani said...

A tougher variant of this problem has been solved on Inder's blog : http://inder-gnu.blogspot.com/2007/03/finding-missing-integer.html

SV said...

From the problem statement I did not realize that the values were constrained from 1 to n.

My bad!

Madhur Kumar Tanwani said...

I think its a fair thing.

I assumed that "In an array from 1 to n" meant and array containing elements from '1' to 'n'.

This problem and its variants in multiple places, wee of this type :

* Suppose you have an array of 1001 integers. The integers are in random order, but you know each of the integers is between 1 and 1000 (inclusive). In addition, each number appears only once in the array, except for one number, which occurs twice. Assume that you can access each element of the array only once. Describe an algorithm to find the repeated number. If you used auxiliary storage in your algorithm, can you find an algorithm that does not require it?

Your variation is also interesting. I feel following solutions should work in that case :
Solution 1:
1.> Sort the array - (n log n)
2.> Iterate over the array and find the repeating element - O(n)
Overall order O(n log n)
In this case, we might be able to optimize and find the repeating element at the time of sorting itself (while sorting / placing the repeating element, if we encounter the same element, we know the repeating element)

Solution 2:
1.> Iterate over the array
2.> For each element, find if the element repeats in the remaining array
Overall order O(n^2)

Anything better that's possible??

Madan Kumar said...

Use a hash. When hash collision happens, it is a duplicate. Time O(n).

spsneo said...

Using xor operator is a better solution. Using sum, may cause overflow.

Madhur Kumar Tanwani said...

@spsneo
I can think of XOR being helpful if the elements are in sorted order. But that would make the solution a O(n log(n)) solution for unsorted array.

Can you please explain how XOR will help for unsorted arrays?

spsneo said...

xor operation is associative and commutative. So it doesn't matter whether the array is sorted or not. It works.

Prateek said...

How will XOR give you answer? It will only be correct if all the numbers occur twice except for one number. In that case applying XOR will 'cancel' equal numbers giving us the 'lonely' number... Please tell me if this is wrong.

 
Stats