I’m relatively new to determining algorithm runtimes using big-O notation, and I have a question about the runtime of a sorting algorithm. Let’s say I have a set of pairs (a, b) in an array and I sort the data using a known sorting algorithm that runs in O(n log n). Next, I take a subset of the n data points and run the same sorting algorithm on that subset (so in the worst case I could sort the entire array twice: the first sort would compare a’s and the second sort would compare b’s). In other words, my code is
pairArray[n];
Sort(pairArray); // runs in O(n log n)
subsetArray[subset]; // where subset <= n
for (int i = 0; i < subset; i++) {
    subsetArray[i] = pairArray[i];
}
Sort(subsetArray); // runs in O(n log n), since subset <= n
Is the runtime of this code still O(n log n)? I guess I have two questions: does running an O(something) sort twice increase complexity from the original “something”, and does the iteration to reassign to a different array increase complexity? I’m more worried about the first one as the iteration can be eliminated with pointers.
Constant factors are ignored in big-O notation, so sorting twice is still O(n log n): 2 · n log n differs from n log n only by the constant 2.
The loop with the assignment you are doing is an O(n) operation (O(subset), and subset <= n). This is also ignored, because only the fastest-growing term is kept in big-O notation: O(n log n) + O(n) + O(n log n) = O(n log n).
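To make the constant factor visible, here is a minimal C++ sketch that counts comparisons instead of reasoning about them. It assumes `std::sort` stands in for your `Sort`, and the names `Pair`, `countComparisons`, and `makePairs` are hypothetical helpers made up for this illustration. It also sorts the prefix in place, so the copy loop disappears.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

using Pair = std::pair<int, int>;  // (a, b); hypothetical alias for this sketch

// Hypothetical helper: sorts all pairs by a, then the first `subset` pairs by b,
// and returns the total number of comparisons made by both sorts.
// Precondition: subset <= pairs.size().
std::size_t countComparisons(std::vector<Pair> pairs, std::size_t subset) {
    std::size_t comparisons = 0;

    // First sort: the whole array, comparing a's.
    std::sort(pairs.begin(), pairs.end(),
              [&comparisons](const Pair& x, const Pair& y) {
                  ++comparisons;
                  return x.first < y.first;
              });

    // Second sort: only the first `subset` elements, comparing b's.
    // Sorting the prefix in place means no separate array and no copy loop.
    std::sort(pairs.begin(), pairs.begin() + static_cast<std::ptrdiff_t>(subset),
              [&comparisons](const Pair& x, const Pair& y) {
                  ++comparisons;
                  return x.second < y.second;
              });

    return comparisons;
}

// Hypothetical helper: builds n reproducible pseudo-random pairs.
std::vector<Pair> makePairs(std::size_t n, unsigned seed = 1) {
    std::mt19937 rng(seed);
    std::vector<Pair> pairs(n);
    for (Pair& p : pairs) {
        p.first = static_cast<int>(rng() % 100000);
        p.second = static_cast<int>(rng() % 100000);
    }
    return pairs;
}
```

Calling `countComparisons(makePairs(n), 0)` and `countComparisons(makePairs(n), n)` for growing n should show the second count staying within a small constant multiple of the first, which is exactly the factor that big-O hides.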
If you want to decide which of two algorithms is better but their big-O is the same, you can use performance measurements on realistic data. When measuring actual performance you can see whether one algorithm is typically twice as slow as another. This cannot be seen from the big-O notation.
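A rough way to take such a measurement in C++ is sketched below. It assumes plain `int` keys and wall-clock timing with `std::chrono::steady_clock`; `Timing`, `timeSort`, and `makeInput` are hypothetical names for this example, and a serious benchmark would repeat each run several times on your real data.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

// Result of one timed run: elapsed milliseconds and whether the output is sorted.
struct Timing {
    double ms;
    bool sorted;
};

// Hypothetical helper: runs `sortFn` on a private copy of `data` and times it.
template <typename SortFn>
Timing timeSort(std::vector<int> data, SortFn sortFn) {
    const auto start = std::chrono::steady_clock::now();
    sortFn(data.begin(), data.end());
    const auto stop = std::chrono::steady_clock::now();
    const double ms =
        std::chrono::duration<double, std::milli>(stop - start).count();
    // Checking the result also keeps the sorted data observable.
    return Timing{ms, std::is_sorted(data.begin(), data.end())};
}

// Hypothetical helper: builds reproducible pseudo-random input of size n.
std::vector<int> makeInput(std::size_t n, unsigned seed = 42) {
    std::mt19937 rng(seed);
    std::vector<int> data(n);
    for (int& x : data) {
        x = static_cast<int>(rng() % 1000000);
    }
    return data;
}
```

For example, compare `timeSort(input, [](auto first, auto last) { std::sort(first, last); })` with the same call using `std::stable_sort`. Both are O(n log n) when enough memory is available, yet the measured times can differ noticeably.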