I have two heapsort implementations. I wrote the first one myself; the second is taken from a website. As far as I can tell, both use the same logic, but the second one performs far better than the first. Why is this happening? The only difference I can see is that mine uses recursion, while the other one is iterative. Can that alone be the differentiating factor?
My code:
def heapify(arr,i,n):
    pivot = arr[i]   #the value of the root node
    left,right = (i<<1)+1,(i<<1)+2  #indices of the left and right subtree root nodes
    if right <= n-1:  #if right is within the array, so is left
        if arr[left] <= pivot and arr[right] <= pivot:
            return  #if both are less than the root node, it's already heapified
        maximum = left if arr[left] >= arr[right] else right  #else find which child has a higher value
        arr[maximum],arr[i] = arr[i],arr[maximum]  #swap the root node with that child
        return heapify(arr,maximum,n)  #make the changed child the new root and recurse
    else:
        if left <= n-1:  #right is outside the array, so check for left only
            if arr[left] <= pivot:
                return
            arr[i],arr[left] = arr[left], arr[i]  #same logic as above
            return heapify(arr,left,n)
        else:
            return

def heapit(array,n):
    for i in range((len(array)-1)/2,-1,-1):  #all elements after (len(array)-1)/2 are the leaf nodes, so we have to heapify earlier nodes
        heapify(array,i,n)

def heapsort(array):
    n = len(array)
    for i in range(n,0,-1):
        heapit(array,i)  #make the array a heap
        array[0],array[i-1] = array[i-1],array[0]  #swap the root node with the last element
The other code:
def HeapSort(A):
    def heapify(A):
        start = (len(A) - 2) / 2
        while start >= 0:
            siftDown(A, start, len(A) - 1)
            start -= 1

    def siftDown(A, start, end):
        root = start
        while root * 2 + 1 <= end:
            child = root * 2 + 1
            if child + 1 <= end and A[child] < A[child + 1]:
                child += 1
            if child <= end and A[root] < A[child]:
                A[root], A[child] = A[child], A[root]
                root = child
            else:
                return

    heapify(A)
    end = len(A) - 1
    while end > 0:
        A[end], A[0] = A[0], A[end]
        siftDown(A, 0, end - 1)
        end -= 1
Even for a small array of about 100,000 elements, the difference becomes substantial. I invoke either version by simply passing the array to be sorted to the function: HeapSort(list) or heapsort(list).
Edit:
I have replaced the heapsort function by this one:
def heapsort(array):
    n = len(array)
    heapit(array,n)
    array[n-1],array[0] = array[0],array[n-1]
    for i in range(n-1):
        heapify(array,0,n-1-i)
        array[n-i-2],array[0] = array[0],array[n-i-2]
This gives comparable performance, but it is still slower. For an array of 1 million elements, the running times are roughly 20 seconds versus 4 seconds. What else can be done?
EDIT: my remarks below might explain a major slowdown, but the most important thing is that your algorithm is not heapsort.
Inside the function heapsort, you perform a loop for i in range(n,0,-1). That’s n iterations, where n is the size of your input. Inside that loop, you call heapit, which loops for i in range((len(array)-1)/2,-1,-1); that’s roughly n//2 iterations. n * (n // 2) = Θ(n²). In other words, you have an algorithm that takes at least quadratic time, while the second algorithm implements true heapsort, which runs in O(n lg n) time. /EDIT
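To make the count above concrete, here is a small sketch that tallies only the top-level sift-down calls made by each approach (each call can itself cost up to O(lg n)). The function names are hypothetical and exist only for this illustration; the code uses Python 3 integer division (//), whereas the code in the question relies on Python 2's /.

```python
def count_top_level_calls_rebuild(n):
    """Top-level heapify calls when the heap is rebuilt on every pass.

    Mirrors the original heapsort: n passes, and on each pass heapit
    iterates over range((n - 1) // 2, -1, -1), i.e. (n - 1) // 2 + 1 calls.
    """
    calls_per_pass = (n - 1) // 2 + 1
    return n * calls_per_pass


def count_top_level_calls_build_once(n):
    """Top-level sift-down calls when the heap is built once, then repaired.

    Mirrors HeapSort: one call per internal node to build the heap,
    then one call per extracted element to restore it.
    """
    build = (n - 2) // 2 + 1
    repair = max(n - 1, 0)
    return build + repair


if __name__ == "__main__":
    for n in (1000, 100000):
        print(n, count_top_level_calls_rebuild(n), count_top_level_calls_build_once(n))
```

For n = 1000 this gives 500000 calls for the rebuild-every-pass version against 1499 for the build-once version, and the gap widens quadratically as n grows.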
It’s very likely the recursion that’s killing performance, in combination with the calling of functions defined at the module level. Python (CPython at least) is not optimized for recursive programs, but for iterative ones. For every recursive call in heapify, CPython has to perform seven bytecode instructions (determined using dis). The final two instructions are performed after the recursive call has finished, because Python does not perform tail call optimization.

While this may not look expensive, a LOAD_GLOBAL has to do at least one hash table lookup just to find heapify, and the reference counts for heapify, arr, maximum and i have to be incremented. When the recursive call finishes, the reference counts have to be decremented again. Function calling is pretty expensive in Python.

As import this says, “flat is better than nested”: prefer iteration over recursion whenever possible.
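As an illustration of that advice, here is a minimal sketch of a heapsort whose sift-down step is a loop instead of a recursive call, with the helper nested inside the sorting function so it is not looked up as a module-level global. The names heapsort_iterative and sift_down are made up for this example, and it is written for Python 3 (// for integer division); it is one possible rewrite, not a tuned implementation.

```python
def heapsort_iterative(array):
    """Sort `array` in place in ascending order using an iterative sift-down."""

    def sift_down(root, end):
        # Push array[root] down until neither child at an index < end is larger.
        while True:
            child = 2 * root + 1              # left child
            if child >= end:
                return                        # root is a leaf within the heap
            # Pick the larger of the two children, if a right child exists.
            if child + 1 < end and array[child] < array[child + 1]:
                child += 1
            if array[root] >= array[child]:
                return                        # heap property already holds
            array[root], array[child] = array[child], array[root]
            root = child                      # continue from the swapped child

    n = len(array)
    # Build the max-heap once, starting from the last internal node.
    for start in range((n - 2) // 2, -1, -1):
        sift_down(start, n)
    # Repeatedly move the maximum to the end and repair the shrunken heap.
    for end in range(n - 1, 0, -1):
        array[0], array[end] = array[end], array[0]
        sift_down(0, end)


if __name__ == "__main__":
    data = [5, 2, 9, 1, 5, 6]
    heapsort_iterative(data)
    print(data)
```

The heap is built a single time and each extraction repairs it with one sift-down, so the total work stays at O(n lg n); how much faster this runs than the recursive version would need to be measured, for example with timeit.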