I am trying to implement the simple variant of quicksort defined in Wikipedia. However, it seems to me that local variables in an older recursive call are leaking to a later call. (My current interpretation.). Is this the case? Any workarounds?
Here’s the sample code
# Quick sort - Simple variant. Requires 0(n) extra store
# Divide and conquer. Pick a pivot, compare elements to
# pivot generating two sublists; one greater than pivot
# and the other less than pivot. Sort these two recursively.
# See http://en.wikipedia.org/wiki/Quicksort#Simple_version
function dump_array(arr_arg){
arr_arg_len = length(arr_arg)
RSEP = ORS;
ORS=" ";
dctr = 1;
# Do not use length(arr_arg) in place of arr_arg
# It fails. The following doesn't work
#while (dctr <= arr_arg_len){
while (dctr <= arr_arg_len){
print arr_arg[dctr];
dctr++;
};
ORS = RSEP;
print "\n";
}
function simple_quicksort(unsorted_str)
{
# Unpack from the str - space separated
print "******************************"
print "Called with "unsorted_str
# Split the space separated string into an array
split(unsorted_str, unsorted_array, " ");
array_len = length(unsorted_array);
# No more sorting to be done. Break recursion
if (array_len <= 1){
print "Ending recursion with "unsorted_str
return unsorted_str
}
# Pick a random value as pivot
# index must not be 0
idx = 0
while (idx == 0){
srand()
idx = int(rand() * array_len)
}
pivot = unsorted_array[idx]
if (debug >= 1){
print "idx:"idx" pivot is: "pivot
}
num = 1;
# we don't use the zero'th element,
# this helps us declare an empty array
# dunno any other method
# we'll remove it anyway
less_arr[0] = 0
less_ctr = 1
more_arr[0] = 0
more_ctr = 1
while (num <= array_len){
# Skip pivot
if (idx != num){
if (unsorted_array[num] <= pivot){
if (debug >= 1){
print "Element less than pivot: "unsorted_array[num]
}
less_arr[less_ctr] = unsorted_array[num]
less_ctr++;
}else{
if (debug >= 1){
print "Element more than pivot: "unsorted_array[num]
}
more_arr[more_ctr] = unsorted_array[num]
more_ctr++;
}
}
num++
};
# strip out the holder in idx 0
delete less_arr[0]
delete more_arr[0]
if (debug >= 1){
print "Less than pivot:"
print dump_array(less_arr)
print "More than pivot:"
print dump_array(more_arr)
}
# Marshal array back to a string
less_str=""
less_length = length(less_arr)
num = 1
print "Less length: "less_length
while (num <= less_length){
less_str = less_str" "less_arr[num]
num++;
}
# same thing for more
more_str=""
more_length = length(more_arr)
num = 1
while (num <= more_length){
more_str = more_str" "more_arr[num]
num++;
}
if (debug >= 1){
print "Going for a recursive call with elements < pivot: "less_str
print "Going for a recursive call with elements > pivot: "more_str
print "pivot was: "pivot
}
# Tried to delete the local variables
# Coz it seems like local vars are visible to recursed functions
# Is this why it fails?
delete less_arr
delete more_arr
delete unsorted_array
print "^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^"
print ""
return simple_quicksort(less_str) " "pivot" "simple_quicksort(more_str)
}
BEGIN{
print "-- quick sort --"
}
{
# print the unsorted objects
print "Unsorted "NF" objects:\n"$0;
# We'll use a slightly different method,
# Pass the $0 string to the sorter, let it split
# it into an array, qsort that array, generate sub-
# strings and recursively qsort them
#Simple version
sorted = simple_quicksort($0)
}
END{
print "Sorted "NF" objects";
print "Sorted >>"sorted
}
A sample run:
echo 5 12 7 2 13719 28019 21444 30578 30647 | awk -f devel/andorian-blog/awk/sorting/quick_sort.awk -v debug=1
-- quick sort --
Unsorted 9 objects:
5 12 7 2 13719 28019 21444 30578 30647
******************************
Called with 5 12 7 2 13719 28019 21444 30578 30647
idx:1 pivot is: 5
Element more than pivot: 12
Element more than pivot: 7
Element less than pivot: 2
Element more than pivot: 13719
Element more than pivot: 28019
Element more than pivot: 21444
Element more than pivot: 30578
Element more than pivot: 30647
Less than pivot:
2
More than pivot:
12 7 13719 28019 21444 30578 30647
Less length: 1
Going for a recursive call with elements < pivot: 2
Going for a recursive call with elements > pivot: 12 7 13719 28019 21444 30578 30647
pivot was: 5
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
******************************
Called with 2
Ending recursion with 2
******************************
Called with 12 7 13719 28019 21444 30578 30647
idx:1 pivot is: 12
Element less than pivot: 7
Element more than pivot: 13719
Element more than pivot: 28019
Element more than pivot: 21444
Element more than pivot: 30578
Element more than pivot: 30647
Less than pivot:
7
More than pivot:
13719 28019 21444 30578 30647
Less length: 1
Going for a recursive call with elements < pivot: 7
Going for a recursive call with elements > pivot: 13719 28019 21444 30578 30647
pivot was: 12
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
******************************
Called with 7
Ending recursion with 7
******************************
Called with 13719 28019 21444 30578 30647
idx:3 pivot is: 21444
Element less than pivot: 13719
Element more than pivot: 28019
Element more than pivot: 30578
Element more than pivot: 30647
Less than pivot:
13719
More than pivot:
28019 30578 30647
Less length: 1
Going for a recursive call with elements < pivot: 13719
Going for a recursive call with elements > pivot: 28019 30578 30647
pivot was: 21444
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
******************************
Called with 13719
Ending recursion with 13719
******************************
Called with 28019 30578 30647
idx:1 pivot is: 28019
Element more than pivot: 30578
Element more than pivot: 30647
Less than pivot:
More than pivot:
30578 30647
Less length: 0
Going for a recursive call with elements < pivot:
Going for a recursive call with elements > pivot: 30578 30647
pivot was: 28019
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
******************************
Called with
Ending recursion with
******************************
Called with 30578 30647
idx:1 pivot is: 30578
Element more than pivot: 30647
Less than pivot:
More than pivot:
30647
Less length: 0
Going for a recursive call with elements < pivot:
Going for a recursive call with elements > pivot: 30647
pivot was: 30578
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
******************************
Called with
Ending recursion with
******************************
Called with 30647
Ending recursion with 30647
Sorted 9 objects
Sorted >> 2 5 7 12 13719 21444 28019 30578 30647
That run looks good. Compare it with the next one:
$ echo 5 12 7 2 13719 28019 21444 30578 30647 | awk -f devel/andorian-blog/awk/sorting/quick_sort.awk -v debug=1
-- quick sort --
Unsorted 9 objects:
5 12 7 2 13719 28019 21444 30578 30647
******************************
Called with 5 12 7 2 13719 28019 21444 30578 30647
idx:6 pivot is: 28019
Element less than pivot: 5
Element less than pivot: 12
Element less than pivot: 7
Element less than pivot: 2
Element less than pivot: 13719
Element less than pivot: 21444
Element more than pivot: 30578
Element more than pivot: 30647
Less than pivot:
5 12 7 2 13719 21444
More than pivot:
30578 30647
Less length: 6
Going for a recursive call with elements < pivot: 5 12 7 2 13719 21444
Going for a recursive call with elements > pivot: 30578 30647
pivot was: 28019
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
******************************
Called with 5 12 7 2 13719 21444
idx:4 pivot is: 2
Element more than pivot: 5
Element more than pivot: 12
Element more than pivot: 7
Element more than pivot: 13719
Element more than pivot: 21444
Less than pivot:
More than pivot:
5 12 7 13719 21444
Less length: 0
Going for a recursive call with elements < pivot:
Going for a recursive call with elements > pivot: 5 12 7 13719 21444
pivot was: 2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
******************************
Called with
Ending recursion with
******************************
Called with 5 12 7 13719 21444
idx:3 pivot is: 7
Element less than pivot: 5
Element more than pivot: 12
Element more than pivot: 13719
E lement more than pivot: 21444
Less than pivot:
5
More than pivot:
12 13719 21444
Less length: 1
Going for a recursive call with elements < pivot: 5
Going for a recursive call with elements > pivot: 12 13719 21444
pivot was: 7
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
******************************
Called with 5
Ending recursion with 5
******************************
Called with 12 13719 21444
idx:2 pivot is: 13719
Element less than pivot: 12
Element more than pivot: 21444
Less than pivot:
1 2
More than pivot:
21444
Less length: 1
Going for a recursive call with elements < pivot: 12
Going for a recursive call with elements > pivot: 21444
pivot was: 13719
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
******************************
Called with 12
Ending recursion with 12
******************************
Called with 21444
Ending recursion with 21444
******************************
Called with 21444
Ending recursion with 21444
Sorted 9 objects
Sorted >> 2 5 7 12 13719 21444 13719 21444
In AWK, all the variables you reference to in your functions are global variables. There is no such thing as “local variables”.
However, function arguments in AWK behave like local variables in the sense they are “hidden” when a nested or recursive function is called.
So you simply need to take all the “local variables” in your function and add them as extra arguments of your function.
When you call the function you can omit these extra arguments.
It is conventional to place some extra space between the arguments and the “local variables”, in order to document how your function is supposed to be used.
This behavior and convention are documented on GAWK documentation: Function Definition Syntax.