I’m trying to implement the rselect algorithm that I just learnt in class, but I can’t figure out where I’m going wrong in the implementation.
EDIT: I tried using the info provided in the answer by David, but my code still acts weird. Here’s the revised code:
def rselect(seq, length, i):  # i is the i'th order statistic.
    if len(seq) <= 1: return seq
    lo, pi, hi, loc_pi = random_partition(seq)
    if loc_pi == i: return pi
    if loc_pi > i: return rselect(lo, loc_pi-1, i)
    elif loc_pi < i: return rselect(hi, length-loc_pi, i-loc_pi)

from random import choice

def random_partition(seq):
    pi = choice(seq)
    #print 'pi',pi
    loc_pi = seq.index(pi)
    print 'Location', loc_pi
    lo = [x for x in seq if x <= pi]
    hi = [x for x in seq if x > pi]
    return lo, pi, hi, len(lo)+1  #--A

def test_rselect(seq, i):
    print 'Sequence', seq
    l = len(seq)
    print 'Statistic', rselect(seq, l, i)
However, the output differs from run to run, and is sometimes even correct! I’m new to both algorithms and Python, so any help on where I’m going wrong would be much appreciated.
Edit: I’m getting a different value for the ith order statistic each time I run the code, which is my issue.
For instance, two runs of the code above give:
Revised Output:
/py-scripts$ python quicksort.py
Sequence [54, -1, 1000, 565, 64, 2, 5]
Statistic Location 1
-1
@ubuntu:~/py-scripts$ python quicksort.py
Sequence [54, -1, 1000, 565, 64, 2, 5]
Statistic Location 5
Location 1
Location 0
-1
Expected output: I’m expecting to find the ith order statistic here.
Therefore,
test_rselect([54,-1,1000,565,64,2,5],2) should return 5 as the Statistic every time.
Any help on where I’m going wrong with this implementation would be appreciated. Thanks!!
EDIT 2: From trying to analyse the algorithm, I believe the error lies in how I’m returning the pivot location (loc_pi) in the line marked A. Consider the following sequence of events for the above program.
test_rselect( [ 55, 900, -1,10, 545, 250], 3) // call to input array
calls rselect ([ 55, 900, -1,10, 545, 250],6,3)
1st call to random_partition:
pi=545 and loc_pi=4
lo=[55,-1,10,250,545]
hi=[900]
return to rselect function (lo,545,hi,6)
here loc_pi>i: so rselect(lo,5,3)// and discard the hi part
2nd recursive call to rselect:
2nd recursive call to random_partition:
call random_partition on (lo) // as 'hi' is discarded
pi=55 loc_pi=0
lo=[-1,10,55]
hi=[250,545]
return to rselect(lo,55,hi,4)
here loc_pi>i: rselect(lo,3,3)// The pivot element is lost already as it is in 'hi' here!!
Any help on how I can deal with returning the location of the pivot element, in order to get the correct output, would be helpful. I’m setting a bounty for an answer that clearly explains where I’m going wrong and how I could correct it (great tips are welcome, since I’m looking forward to learning :)). Looking forward to great answers!
I don’t think there is any fundamental error (in how you are returning the pivot or otherwise); it’s just a lot of off-by-one (or even two) confusion. I also think you mean to compare with i on the first line of rselect, not 1.
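To make the off-by-one concrete, here is a small check in Python 3 syntax. It compares the value returned on the line marked A with the pivot's actual 0-based position in sorted order, using the first step of the trace from the question. The helper pivot_rank is hypothetical (it is not part of the question's code) and assumes distinct elements.

```python
def pivot_rank(seq, pi):
    """0-based rank of pi in sorted(seq), assuming distinct elements."""
    return sum(1 for x in seq if x < pi)


seq = [55, 900, -1, 10, 545, 250]
pi = 545

lo = [x for x in seq if x <= pi]   # the question's partition keeps the pivot in lo
returned = len(lo) + 1             # what the line marked A returns
actual = pivot_rank(seq, pi)       # where the pivot really sits, counting from 0

print(returned, actual)            # prints: 6 4
```

With a 0-based i, the returned location is two too large here: one because the pivot is counted inside lo, and one more from the extra + 1.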
Here’s my take on it, with as little change as possible:
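A minimal sketch of such a fix, in Python 3 syntax and not necessarily the code originally posted with this answer. It assumes distinct elements and a 0-based i (so i=2 asks for the third-smallest value), and it drops the length argument, which the recursion does not need.

```python
from random import choice


def random_partition(seq):
    """Split seq around a random pivot; assumes all elements are distinct."""
    pi = choice(seq)
    lo = [x for x in seq if x < pi]   # strictly smaller, pivot excluded
    hi = [x for x in seq if x > pi]   # strictly larger
    return lo, pi, hi, len(lo)        # len(lo) is the pivot's 0-based rank


def rselect(seq, i):
    """Return the i-th smallest element of seq, counting from 0."""
    lo, pi, hi, loc_pi = random_partition(seq)
    if loc_pi == i:
        return pi
    if loc_pi > i:
        return rselect(lo, i)
    # Skip the loc_pi smaller elements and the pivot itself.
    return rselect(hi, i - loc_pi - 1)


print(rselect([54, -1, 1000, 565, 64, 2, 5], 2))  # prints: 5
```

Because the pivot is kept out of both lo and hi, each recursive call works on a strictly shorter list, and the index passed down always stays within range as long as the original i was valid.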
Edit: Here’s a version that should work if there are duplicate elements. Now, I had to change some more, so I took out some stuff I found confusing in order to make it easier for myself.
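One way to handle duplicates is a three-way split that collects every copy of the pivot in its own list. The sketch below uses Python 3 syntax and a 0-based i, and is not necessarily the code originally posted. The name eq and the range check are introduced here for illustration.

```python
from random import choice


def rselect(seq, i):
    """Return the i-th smallest element of seq (0-based); duplicates allowed."""
    if not 0 <= i < len(seq):
        raise IndexError("order statistic out of range")
    pi = choice(seq)
    lo = [x for x in seq if x < pi]
    eq = [x for x in seq if x == pi]   # every copy of the pivot
    hi = [x for x in seq if x > pi]
    if i < len(lo):
        return rselect(lo, i)
    if i < len(lo) + len(eq):
        return pi                      # i falls inside the run of pivots
    return rselect(hi, i - len(lo) - len(eq))


print(rselect([5, 2, 5, -1, 2, 5], 3))  # prints: 5
```

Since eq always contains at least the pivot, lo and hi are both strictly shorter than seq, so the recursion terminates even when every element is the same.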