What
I have written a small piece of code that finds and removes any duplicate numbers in an integer array. I used a List<int> for this.
The Code
static int[] RemoveDuplicate(int[] input)
{
    List<int> correctedList = new List<int>();
    for (int i = 0; i < input.Length; i++)
    {
        if (!correctedList.Contains(input[i]))
        {
            correctedList.Add(input[i]);
        }
        else
        {
            // skip
        }
    }
    return correctedList.ToArray();
}
My Difficulty
I need to know how to find the time complexity of this small piece of code and, if possible, how to optimize it.
What I have tried
I have done some reading on the Internet about how to calculate the time and space complexity of an algorithm, and below is what I think the answer is. Since I am new to this, I thought it would be better to consult some experts than to go ahead with a wrong assumption.
Below is what I tried.
List<int> correctedList = new List<int>(); -> This will be executed 1 time
int i = 0; -> This will be executed 1 time
i < input.Length -> This will be executed N times
i++ -> This will be executed N times
if (!correctedList.Contains(input[i])) -> This may be executed N times
correctedList.Add(input[i]); -> This may be executed N times
So, the total number of operations = 1 + 1 + N + N + N + N = 4N+2
Is this equal to O(N)?
And is my method of calculating time complexity correct?
Thanks in advance.
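It would be O(N) if every operation you call were itself O(1), but Contains on a List<int> is not: it performs a linear search, so a single call is O(N) in the size of the list. Since Contains runs once per input element, your algorithm is O(N^2) in the worst case, which occurs when most of the elements are distinct. Your approach of counting how many times each statement runs is sound; the mistake is counting the Contains call as a single operation.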
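To see where the quadratic bound comes from, here is a rough count of the worst case. It assumes all N inputs are distinct and that Contains compares the value against every element already in the list. When input[i] is processed (counting i from 0), the list holds i elements, so that call makes i comparisons, and the total is:

```latex
\sum_{i=0}^{N-1} i \;=\; \frac{N(N-1)}{2} \;=\; O(N^2)
```

For N = 1000 distinct values this is 499,500 comparisons, versus the roughly 4N + 2 = 4002 steps suggested by counting Contains as one operation.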
Options for optimization:
1) The best you can possibly do is O(N), which requires a contains check that is O(1). A hash table (or hash set, such as HashSet<int> in .NET) gives you this: insertion and lookup are O(1) on average.
2) You can insert into a structure that maintains sorted order (e.g. a balanced tree). Each lookup and insertion is O(log N), so the overall solution is O(N log N).
3) Probably the most common and simplest algorithm is to sort the array, which is O(N log N), and then scan it once for adjacent duplicates. Note that this does not preserve the original order of the elements.
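Options 1 and 3 above could be sketched as follows. This is a minimal illustration, not a definitive implementation: the class name Dedup and the method names RemoveDuplicatesHashed and RemoveDuplicatesSorted are made up for this example, and the O(1) cost of the hash set is an average-case figure.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class; the names are illustrative only.
static class Dedup
{
    // Option 1: hash set. Keeps the first occurrence of each value,
    // in the original input order. Expected O(N) overall.
    public static int[] RemoveDuplicatesHashed(int[] input)
    {
        var seen = new HashSet<int>();
        var result = new List<int>(input.Length);
        foreach (int value in input)
        {
            // HashSet<T>.Add returns false if the value is already present,
            // so it doubles as the "contains" check.
            if (seen.Add(value))
            {
                result.Add(value);
            }
        }
        return result.ToArray();
    }

    // Option 3: sort a copy, then keep each value that differs from
    // its predecessor. O(N log N); output is in ascending order.
    public static int[] RemoveDuplicatesSorted(int[] input)
    {
        if (input.Length == 0)
        {
            return new int[0];
        }
        int[] sorted = (int[])input.Clone(); // leave the caller's array untouched
        Array.Sort(sorted);
        var result = new List<int> { sorted[0] };
        for (int i = 1; i < sorted.Length; i++)
        {
            if (sorted[i] != sorted[i - 1])
            {
                result.Add(sorted[i]);
            }
        }
        return result.ToArray();
    }
}
```

The hashed version is a drop-in replacement for the original method when the order of first occurrences matters; the sorted version avoids the extra hash set but returns the values in ascending order instead.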