I have two heapsort algorithms. The first one is written by me, while the

Question

Asked: June 11, 2026 (2026-06-11T19:53:25+00:00)

I have two heapsort algorithms. I wrote the first one myself, while the second is taken from a website. As far as I can tell, both use the same logic, but the second one performs far better than the first. Is there a reason why this is happening? The only difference I can see is that mine uses recursion, while the other one works iteratively. Can that alone be the differentiating factor?

My code:

def heapify(arr,i,n):
    pivot = arr[i]   #the value of the root node
    left,right = (i<<1)+1,(i<<1)+2  #indices of the left and right subtree root nodes
    if right <= n-1:  #if right is within the array, so is left
        if arr[left] <= pivot and arr[right] <= pivot:
            return  #if both are less than the root node, it's already heapified
        maximum = left if arr[left] >= arr[right] else right #else find which child has a higher value
        arr[maximum],arr[i] = arr[i],arr[maximum]  #swap the root node with that child
        return heapify(arr,maximum,n)  #make the changed child the new root and recurse
    else:
      if left <= n-1:  #right is outside the array, so check for left only
        if arr[left] <= pivot:
            return
        arr[i],arr[left] = arr[left], arr[i]  #same logic as above
        return heapify(arr,left,n)
      else:
          return

def heapit(array,n):
    for i in range((len(array)-1)//2,-1,-1):  #all elements after (len(array)-1)//2 are leaf nodes, so only the earlier nodes need heapifying
        heapify(array,i,n)

def heapsort(array):
    n = len(array)
    for i in range(n,0,-1):
        heapit(array,i)  #make the array a heap
        array[0],array[i-1] = array[i-1],array[0]  #swap the root node with the last element

The other code:

def HeapSort(A):
     def heapify(A):
        start = (len(A) - 2) // 2
        while start >= 0:
            siftDown(A, start, len(A) - 1)
            start -= 1

     def siftDown(A, start, end):
        root = start
        while root * 2 + 1 <= end:
            child = root * 2 + 1
            if child + 1 <= end and A[child] < A[child + 1]:
                child += 1
            if child <= end and A[root] < A[child]:
                A[root], A[child] = A[child], A[root]
                root = child
            else:
                return

     heapify(A)
     end = len(A) - 1
     while end > 0:
        A[end], A[0] = A[0], A[end]
        siftDown(A, 0, end - 1)
        end -= 1

Even for a small array of about 100,000 elements, the difference becomes substantial. I invoke either version by simply passing the array to be sorted to the function: HeapSort(list) or heapsort(list).

Edit:

I have replaced the heapsort function by this one:

def heapsort(array):
     n = len(array)
     heapit(array,n)
     array[n-1],array[0] = array[0],array[n-1]
     for i in range(n-1):
       heapify(array,0,n-1-i)
       array[n-i-2],array[0] = array[0],array[n-i-2]

This gives comparable performance, but it is still slower. For a 1-million-element array, the timings are roughly 20 seconds for my code versus 4 seconds for the other. What else can be done?

1 Answer

Editorial Team · Answer 1 · 2026-06-11T19:53:26+00:00

EDIT: my remarks below might explain a major slowdown, but the most important thing is that your algorithm is not heapsort.

Inside the function heapsort, you perform a loop for i in range(n,0,-1). That’s n iterations where n is the size of your input. Inside that loop, you call heapit, which loops for i in range((len(array)-1)/2,-1,-1); that’s roughly n//2 iterations.

n * (n // 2) = Θ(n²) calls to heapify. In other words, your algorithm takes at least quadratic time, while the second algorithm implements true heapsort, which runs in O(n lg n) time.
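To make the gap concrete, here is a rough sketch that only counts the top-level heapify calls made by the two driver loops in the question (the original heapsort and the one from the edit). The function names are mine and do not appear in either listing, and the sketch ignores the up to O(lg n) work done inside each call.

```python
def calls_rebuilding_every_pass(n):
    """Top-level heapify calls when heapit is re-run on every pass."""
    calls = 0
    for i in range(n, 0, -1):                  # outer loop of the original heapsort
        # heapit always starts from (len(array)-1)//2, regardless of i
        for _ in range((n - 1) // 2, -1, -1):
            calls += 1
    return calls


def calls_building_once(n):
    """Top-level heapify calls when the heap is built once, then repaired."""
    calls = 0
    for _ in range((n - 1) // 2, -1, -1):      # a single heapit pass
        calls += 1
    for _ in range(n - 1):                     # one sift-down per extracted element
        calls += 1
    return calls


if __name__ == "__main__":
    for size in (1000, 2000, 4000):
        print(size, calls_rebuilding_every_pass(size), calls_building_once(size))
```

Doubling the input size roughly quadruples the first count but only doubles the second, which is the quadratic versus linear number of sift-downs described above.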

/EDIT

It’s very likely the recursion that’s killing performance, combined with the fact that the functions being called are defined at module level. Python (CPython at least) is optimized for iterative programs rather than recursive ones. For every recursive call in heapify, CPython has to execute the following seven bytecode instructions:

  9         158 LOAD_GLOBAL              0 (heapify)
            161 LOAD_FAST                0 (arr)
            164 LOAD_FAST                6 (maximum)
            167 LOAD_FAST                2 (n)
            170 CALL_FUNCTION            3
            173 RETURN_VALUE        
        >>  174 POP_TOP

(determined using dis). The final two instructions are performed after the recursive call has finished, because Python does not perform tail call optimization.
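If you want to inspect this yourself, something like the following should work on Python 3. The heapify here is a trimmed-down stand-in that contains nothing but a tail call, not the function from the question, and the exact opcode names and offsets vary between CPython versions, so do not expect the output to match the listing above byte for byte.

```python
import dis
import io


def heapify(arr, i, n):
    # Stand-in with only a recursive tail call, so the listing stays short.
    if i < n:
        return heapify(arr, i + 1, n)


# Capture the disassembly as text; dis.dis writes to stdout by default.
buffer = io.StringIO()
dis.dis(heapify, file=buffer)
listing = buffer.getvalue()

if __name__ == "__main__":
    print(listing)
```

Look for the instruction that loads the global name heapify followed by the loads of the three arguments and the call itself.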

While this may not look expensive, a LOAD_GLOBAL has to do at least one hash table lookup just to find heapify, and the reference counts for heapify, arr, maximum and n have to be incremented. When the recursive call finishes, the reference counts have to be decremented again. Function calls are fairly expensive in Python.

As import this says, “flat is better than nested”: prefer iteration over recursion whenever possible.
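As an illustration of that advice, here is one possible loop-based rewrite of your heapify, together with a driver that builds the heap only once. This is a sketch rather than a drop-in replacement: heapify_iter and heapsort_iter are names I made up, and I have not benchmarked it against the second listing.

```python
def heapify_iter(arr, i, n):
    """Sift arr[i] down within arr[:n] using a loop instead of recursion."""
    while True:
        left = 2 * i + 1
        if left >= n:                  # no children: arr[i] is a leaf
            return
        right = left + 1
        largest = left
        if right < n and arr[right] > arr[left]:
            largest = right            # pick the larger of the two children
        if arr[largest] <= arr[i]:     # heap property already holds here
            return
        arr[i], arr[largest] = arr[largest], arr[i]
        i = largest                    # continue from the child we swapped into


def heapsort_iter(array):
    """In-place heapsort: build a max-heap once, then extract repeatedly."""
    n = len(array)
    for i in range(n // 2 - 1, -1, -1):     # internal nodes only
        heapify_iter(array, i, n)
    for end in range(n - 1, 0, -1):
        array[0], array[end] = array[end], array[0]   # move the max to its final slot
        heapify_iter(array, 0, end)                   # repair the shrunken heap


if __name__ == "__main__":
    sample = [5, 3, 8, 1, 9, 2]
    heapsort_iter(sample)
    print(sample)
```

The loop reassigns i where the recursive version called itself, so each sift-down costs a single function call no matter how deep it goes.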
