I am reading a quicksort implementation that uses an explicit stack, at the following link.
My question is about the following paragraph.
The policy of putting the larger of the two subfiles on the stack
ensures that each entry on the stack is no more than one-half of the
size of the one below it, so that the stack needs to contain room for
only about lg N entries. This maximum stack usage occurs when the
partition always falls at the center of the file. For random files,
the actual maximum stack size is much lower; for degenerate files it
is likely to be small.

This technique does not necessarily work in a truly recursive
implementation, because it depends on end- or tail-recursion removal.
If the last action of a procedure is to call another procedure, some
programming environments will arrange things such that local variables
are cleared from the stack before, rather than after, the call.
Without end-recursion removal, we cannot guarantee that the stack size
will be small for quicksort.
- What does the author mean by “that each entry on the stack is no more than one-half of the size of the one below it”? Could you please give an example of this?
- How did the author come to the conclusion that the stack needs space for only about lg N entries?
- What does the author mean by “Without end-recursion removal, we cannot guarantee that the stack size will be small for quicksort”?
Thanks for your time and help.
That is not quite true. Suppose you want to sort a 100-element array, and the first pivot goes right in the middle. Then the stack holds the 50-element part at the bottom and the 49-element part on top.
Then you pop the 49-element part off the stack, partition it, and push the two parts on the stack. Let’s say the choice of pivot was not quite as good this time, and there were 20 elements not larger than the pivot. Then the stack holds, from bottom to top, parts of 50, 28, and 20 elements,
and each stack entry is more than half the size of the one below it.
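To make the push order in this example concrete, here is a minimal sketch of a stack-based quicksort in Python. It is not the code from the book: the names `partition` and `quicksort_with_stack`, the choice of the last element as pivot, and the returned depth counter are all assumptions made for illustration. After each partition the larger part is pushed first, so the smaller part sits on top and is handled next.

```python
def partition(a, lo, hi):
    """Partition a[lo..hi] (inclusive) around a[hi]; return the pivot's final index."""
    pivot = a[hi]  # assumption: last element as pivot, any other choice works too
    i = lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return i


def quicksort_with_stack(a):
    """Sort a in place; return the largest number of entries the stack ever held."""
    stack = [(0, len(a) - 1)]  # each entry is an inclusive (lo, hi) range
    max_depth = 1
    while stack:
        lo, hi = stack.pop()
        if hi - lo + 1 <= 1:
            continue  # subarrays of size 0 or 1 are already sorted
        p = partition(a, lo, hi)
        left, right = (lo, p - 1), (p + 1, hi)
        # Decide which part is larger by comparing the two sizes.
        if p - lo >= hi - p:
            larger, smaller = left, right
        else:
            larger, smaller = right, left
        stack.append(larger)   # stays lower on the stack
        stack.append(smaller)  # on top, so it is popped and processed next
        max_depth = max(max_depth, len(stack))
    return max_depth
```

Calling `quicksort_with_stack` on a list sorts it in place and reports how many stack entries were needed at most, which can be compared with lg N for inputs of different shapes.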
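But that cannot continue forever, and we have the following invariant: during the entire process, if stack level k is occupied, the subarray stored there has at most total_size / 2^k elements.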
That is obviously true when the sorting begins, since then there is only one element on the stack, at level 0, which is the entire array of size total_size.

Now, assume the stated property holds on entering the loop (while(!stack.empty())).

A subarray of length s is popped from stack level m. If s <= 1, nothing else is done before the next loop iteration, and the invariant continues to hold. Otherwise, if s >= 2, after partitioning it, there are two new subarrays to be pushed on the stack, with s-1 elements together. The smaller of those two then has a size smaller_size <= (s-1)/2, and the larger has a size larger_size <= s-1. Stack level m will be occupied by the larger of the two and level m+1 by the smaller, and since s <= total_size / 2^m, we have larger_size <= s-1 < total_size / 2^m and smaller_size <= (s-1)/2 < total_size / 2^(m+1) for the stack levels m and m+1 respectively at the end of the loop body. The invariant holds for the next iteration.

Since at most one subarray of size 0 is ever on the stack (it is then immediately popped off in the next iteration), there are never more than lg total_size + 1 stack levels occupied.

Regarding the question about end-recursion removal:
In a recursive implementation, you can have deep recursion, and when the stack frame is not reused for the end-call, you may need linear stack space. Consider a stupid pivot selection, always choosing the first element as pivot, and an already sorted array, say [0,1,2,3,4].
After the partition, the pivot goes in position 0, and the smaller subarray is empty. The recursive call for the larger subarray [1,2,3,4] allocates a new stack frame (so there are now two stack frames). By the same principle, the next recursive call with the subarray [2,3,4] allocates a third stack frame, etc.

If one has end-recursion removal, i.e. the stack frame is reused, one has the same guarantees as with the manual stack above.
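The difference can be sketched in Python, which does not remove end-recursion itself, so the second function below does by hand what end-recursion removal would do: it recurses only into the smaller part and loops on the larger one. The function names, the `depth` parameter, and the first-element pivot are assumptions chosen to match the sorted-array scenario above, not code from the book.

```python
def partition_first(a, lo, hi):
    """Partition a[lo..hi] (inclusive) around its first element; return the pivot's final index."""
    pivot = a[lo]
    i = lo
    for j in range(lo + 1, hi + 1):
        if a[j] < pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[lo], a[i] = a[i], a[lo]
    return i


def quicksort_plain(a, lo, hi, depth=1):
    """Recurse into both parts; return the deepest recursion level reached."""
    if lo >= hi:
        return depth
    p = partition_first(a, lo, hi)
    left_depth = quicksort_plain(a, lo, p - 1, depth + 1)
    right_depth = quicksort_plain(a, p + 1, hi, depth + 1)
    return max(left_depth, right_depth)


def quicksort_small_first(a, lo, hi, depth=1):
    """Recurse only into the smaller part and loop on the larger one,
    reusing the current frame the way end-recursion removal would."""
    deepest = depth
    while lo < hi:
        p = partition_first(a, lo, hi)
        if p - lo < hi - p:
            # Left part is smaller: recurse into it, keep looping on the right.
            deepest = max(deepest, quicksort_small_first(a, lo, p - 1, depth + 1))
            lo = p + 1
        else:
            # Right part is smaller (or equal): recurse into it, loop on the left.
            deepest = max(deepest, quicksort_small_first(a, p + 1, hi, depth + 1))
            hi = p - 1
    return deepest
```

On an already sorted 200-element list, `quicksort_plain` reaches recursion depth 200, one frame per element, while `quicksort_small_first` never goes deeper than 2, because the only recursive calls are for the empty smaller part.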