I’ve seen many mergesort implementations. Here is the version in Data Structures and Algorithms in Java (2nd Edition) by Robert Lafore:
private void recMergeSort(long[] workSpace, int lowerBound,int upperBound)
{
if(lowerBound == upperBound) // if range is 1,
return; // no use sorting
else
{ // find midpoint
int mid = (lowerBound+upperBound) / 2;
// sort low half
recMergeSort(workSpace, lowerBound, mid);
// sort high half
recMergeSort(workSpace, mid+1, upperBound);
// merge them
merge(workSpace, lowerBound, mid+1, upperBound);
} // end else
} // end recMergeSort()
private void merge(long[] workSpace, int lowPtr,
int highPtr, int upperBound)
{
int j = 0; // workspace index
int lowerBound = lowPtr;
int mid = highPtr-1;
int n = upperBound-lowerBound+1; // # of items
while(lowPtr <= mid && highPtr <= upperBound)
if( theArray[lowPtr] < theArray[highPtr] )
workSpace[j++] = theArray[lowPtr++];
else
workSpace[j++] = theArray[highPtr++];
while(lowPtr <= mid)
workSpace[j++] = theArray[lowPtr++];
while(highPtr <= upperBound)
workSpace[j++] = theArray[highPtr++];
for(j=0; j<n; j++)
theArray[lowerBound+j] = workSpace[j];
} // end merge()
One interesting thing about the merge method is that almost all the implementations didn’t pass the mid parameter to the merge method. mid is calculated in the merge. This is strange, since highPtr is assigned to mid + 1 from the calling method.
Why did the author not pass mid to merge like merge(workSpace, lowerBound,mid, mid+1, upperBound);? If we write it like , we can easily see that [lowerBound,mid] is the lower range ,[mid+1,upperBound] is higher range.
I think there must be a reason, otherwise I can’t understand why all implementations of an algorithm older than half a century are coincident in the such a little detail.
Basically we’re talking about two intervals that are adjacent
[a..b-1]and[b..n](bounds are inclusive), and you’re asking why it’s presented as(a, b, n)instead of(a, b-1, b, n)?You said it yourself: it’s “such a little detail”.
It doesn’t affect correctness, and any performance gained in not recomputing
b-1may be offset by the cost of passing it as an additional parameter. Is it worth the effort of profiling to investigate one way or another? No. It has no effect on the algorithm’s asymptotic behavior, and any difference in performance is negligible; it’s not worth fussing about little things like this.By the way, it is worth noting that the way of computing average of two number as above is now considered broken (see: Google Research blog: Extra, Extra – Read All About It: Nearly All Binary Searches and Mergesorts are Broken). Josh Bloch recommends:
Going back to the original question, generally speaking, if a parameter can be derived from another, then often it’s best to omit them; it simplifies calling it, and it also simplifies analysis due to obviously smaller domain.
A function that takes 10 parameters, 5 them of derivable, is much harder to analyze than one that simply takes 5. Of course, if the derivable parameters are expensive and/or non-trivial to compute, and you’ve already computed their values before you call the function, then you may consider passing them on.
In fact, in the merge sort example,
(a, n)suffices, sincebis derivable from both. That computation is not trivial, however (the bug Bloch mentions escapes detection for 2 decades), so it’s decided to simply pass it on.b-1, by contrast, is way too trivial and is best omitted.