The Archive Base


Editorial Team
Asked: May 27, 2026 (2026-05-27T03:02:01+00:00)

I rewrote the original radix sort algorithm for Python from Wikipedia using arrays from SciPy to gain performance and to reduce code length, which I managed to accomplish. Then I took the classic (in-memory, pivot based) quick sort algorithm from Literate Programming and compared their performance.

I expected radix sort to beat quick sort beyond a certain threshold, but it did not. I then found a post on Erik Gorset’s blog asking the question “Is radix sort faster than quick sort for integer arrays?”. The answer given there is:

“... the benchmark shows the MSB in-place radix sort to be consistently over 3 times faster than quicksort for large arrays.”

Unfortunately, I could not reproduce the result; the differences are that (a) Erik chose Java and not Python and (b) he uses the MSB in-place radix sort, whereas I just fill buckets inside a Python dictionary.
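
As a point of reference for difference (b), here is a minimal sketch of an in-place, most-significant-digit-first radix sort in Python 3. It is a generic American-flag-style partitioning, not Erik Gorset's Java implementation. The names `msd_radix_sort_inplace` and `_sort_range` and the 8-bit digit width are assumptions made for illustration, and the sketch handles non-negative integers only.

```python
def msd_radix_sort_inplace(a, bits=8):
    """Sort a list of non-negative ints in place, most significant digit first."""
    if len(a) > 1:
        # Shift that isolates the most significant `bits`-wide digit of the largest key.
        top_shift = max(0, (max(a).bit_length() - 1) // bits * bits)
        _sort_range(a, 0, len(a), top_shift, bits)
    return a


def _sort_range(a, lo, hi, shift, bits):
    radix = 1 << bits
    mask = radix - 1

    # Pass 1: count how many keys fall into each bucket for this digit.
    counts = [0] * radix
    for i in range(lo, hi):
        counts[(a[i] >> shift) & mask] += 1

    # Bucket b occupies a[starts[b]:ends[b]]; nxt[b] is its next unfilled slot.
    starts, ends, pos = [], [], lo
    for c in counts:
        starts.append(pos)
        pos += c
        ends.append(pos)
    nxt = list(starts)

    # Pass 2: swap every key into its bucket without using a second array.
    for b in range(radix):
        while nxt[b] < ends[b]:
            value = a[nxt[b]]
            d = (value >> shift) & mask
            if d == b:
                nxt[b] += 1  # already in the right bucket
            else:
                a[nxt[b]], a[nxt[d]] = a[nxt[d]], value
                nxt[d] += 1

    # Recurse into each bucket on the next, less significant digit.
    if shift > 0:
        for b in range(radix):
            if counts[b] > 1:
                _sort_range(a, starts[b], ends[b], shift - bits, bits)
```

Unlike the dictionary-of-buckets approach, this version allocates only the small per-call counter lists and moves keys by swapping inside the original list.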

According to theory, radix sort (linear) should be faster than quick sort, but apparently this depends a lot on the implementation. So where is my mistake?

Here is the code comparing both algorithms:

from sys   import argv
from time  import clock

from pylab import array, vectorize
from pylab import absolute, log10, randint
from pylab import semilogy, grid, legend, title, show

###############################################################################
# radix sort
###############################################################################

def splitmerge0 (ls, digit): ## python (pure!)

    seq = map (lambda n: ((n // 10 ** digit) % 10, n), ls)
    buf = {0:[], 1:[], 2:[], 3:[], 4:[], 5:[], 6:[], 7:[], 8:[], 9:[]}

    return reduce (lambda acc, key: acc.extend(buf[key]) or acc,
        reduce (lambda _, (d,n): buf[d].append (n) or buf, seq, buf), [])

def splitmergeX (ls, digit): ## python & numpy

    seq = array (vectorize (lambda n: ((n // 10 ** digit) % 10, n)) (ls)).T
    buf = {0:[], 1:[], 2:[], 3:[], 4:[], 5:[], 6:[], 7:[], 8:[], 9:[]}

    return array (reduce (lambda acc, key: acc.extend(buf[key]) or acc,
        reduce (lambda _, (d,n): buf[d].append (n) or buf, seq, buf), []))

def radixsort (ls, fn = splitmergeX):

    return reduce (fn, xrange (int (log10 (absolute (ls).max ()) + 1)), ls)

###############################################################################
# quick sort
###############################################################################

def partition (ls, start, end, pivot_index):

    lower = start
    upper = end - 1

    pivot = ls[pivot_index]
    ls[pivot_index] = ls[end]

    while True:

        while lower <= upper and ls[lower] <  pivot: lower += 1
        while lower <= upper and ls[upper] >= pivot: upper -= 1
        if lower > upper: break

        ls[lower], ls[upper] = ls[upper], ls[lower]

    ls[end] = ls[lower]
    ls[lower] = pivot

    return lower

def qsort_range (ls, start, end):

    if end - start + 1 < 32:
        insertion_sort(ls, start, end)
    else:
        pivot_index = partition (ls, start, end, randint (start, end))
        qsort_range (ls, start, pivot_index - 1)
        qsort_range (ls, pivot_index + 1, end)

    return ls

def insertion_sort (ls, start, end):

    for idx in xrange (start, end + 1):
        el = ls[idx]
        for jdx in reversed (xrange(0, idx)):
            if ls[jdx] <= el:
                ls[jdx + 1] = el
                break
            ls[jdx + 1] = ls[jdx]
        else:
            ls[0] = el

    return ls

def quicksort (ls):

    return qsort_range (ls, 0, len (ls) - 1)

###############################################################################
if __name__ == "__main__":
###############################################################################

    lower = int (argv [1]) ## requires: >= 2
    upper = int (argv [2]) ## requires: >= 2
    color = dict (enumerate (3*['r','g','b','c','m','k']))

    rslbl = "radix sort"
    qslbl = "quick sort"

    for value in xrange (lower, upper):

        #######################################################################

        ls = randint (1, value, size=value)

        t0 = clock ()
        rs = radixsort (ls)
        dt = clock () - t0

        print "%06d -- t0:%0.6e, dt:%0.6e" % (value, t0, dt)
        semilogy (value, dt, '%s.' % color[int (log10 (value))], label=rslbl)

        #######################################################################

        ls = randint (1, value, size=value)

        t0 = clock ()
        rs = quicksort (ls)
        dt = clock () - t0

        print "%06d -- t0:%0.6e, dt:%0.6e" % (value, t0, dt)
        semilogy (value, dt, '%sx' % color[int (log10 (value))], label=qslbl)

    grid ()
    legend ((rslbl,qslbl), numpoints=3, shadow=True, prop={'size':'small'})
    title ('radix & quick sort: #(integer) vs duration [s]')
    show ()

###############################################################################
###############################################################################

And here is the result, comparing sorting durations in seconds (logarithmic vertical axis) for integer arrays of sizes from 2 to 1250 (horizontal axis); the lower curve belongs to quick sort:

[Figure: Radix vs Quick Sort Comparison]

Quick sort is smooth across the powers of ten (e.g. at 10, 100 or 1000), whereas radix sort jumps a little there but otherwise follows qualitatively the same path as quick sort, just much slower!

1 Answer

  1. Editorial Team
    Added an answer on May 27, 2026 at 3:02 am

    You have several problems here.

    First of all, as pointed out in the comments, your data set is far too small for the theoretical complexity to overcome the overheads in the code.
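
A rough operation count makes this concrete. Because the benchmark draws its keys from the range 1 to n, the number of decimal digits, and therefore the number of radix passes, grows like log10(n), so the radix sort does n times the digit count of work rather than a fixed multiple of n. The sketch below (Python 3; `radix_passes` and `rough_costs` are illustrative names, and the counts ignore all constant factors) compares that with the roughly n·log2(n) comparisons of a comparison sort.

```python
import math


def radix_passes(max_value, base=10):
    """Digit passes an LSD radix sort needs for non-negative keys up to max_value."""
    passes = 1
    while max_value >= base:
        max_value //= base
        passes += 1
    return passes


def rough_costs(n, max_value, base=10):
    """Return (bucket appends for radix sort, approximate comparisons for a comparison sort)."""
    radix_ops = radix_passes(max_value, base) * n
    compare_ops = n * math.log2(n) if n > 1 else 0.0
    return radix_ops, compare_ops


for n in (10, 100, 1250, 1_000_000):
    radix_ops, compare_ops = rough_costs(n, n)
    print(f"n={n:>9,}: radix ~{radix_ops:,} appends, "
          f"comparison sort ~{compare_ops:,.0f} comparisons")
```

With keys that grow along with n, the two counts differ only by a small constant factor (log2(10) is about 3.3 per decimal digit), which per-operation interpreter overhead can easily outweigh.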

    Next, your implementation is very inefficient, with all those unnecessary function calls and all the copying of lists. In Python, code written in a straightforward procedural manner will almost always be faster than a functional solution (other languages will differ here). You already have a procedural implementation of quicksort, so if you write your radix sort in the same style, it may turn out faster even for small lists.

    Finally, it may be that when you do try large lists, the overheads of memory management begin to dominate. That leaves only a limited window between small lists, where the efficiency of the implementation is the dominant factor, and large lists, where memory management is the dominant factor.

    Here’s some code that uses your quicksort together with a simple radix sort written procedurally, which tries to avoid so much copying of data. You’ll see that even for short lists it beats the quicksort. More interestingly, as the data size goes up, so does the ratio between quicksort and radix sort, and then it begins to drop again as memory management starts to dominate (simple things like freeing a list of 1,000,000 items take a significant time):

    from random import randint
    from math import log10
    from time import clock
    from itertools import chain
    
    def splitmerge0 (ls, digit): ## python (pure!)
    
        seq = map (lambda n: ((n // 10 ** digit) % 10, n), ls)
        buf = {0:[], 1:[], 2:[], 3:[], 4:[], 5:[], 6:[], 7:[], 8:[], 9:[]}
    
        return reduce (lambda acc, key: acc.extend(buf[key]) or acc,
            reduce (lambda _, (d,n): buf[d].append (n) or buf, seq, buf), [])
    
    def splitmerge1 (ls, digit): ## python (readable!)
        buf = [[] for i in range(10)]
        divisor = 10 ** digit
        for n in ls:
            buf[(n//divisor)%10].append(n)
        return chain(*buf)
    
    def radixsort (ls, fn = splitmerge1):
        return list(reduce (fn, xrange (int (log10 (max(abs(val) for val in ls)) + 1)), ls))
    
    ###############################################################################
    # quick sort
    ###############################################################################
    
    def partition (ls, start, end, pivot_index):
    
        lower = start
        upper = end - 1
    
        pivot = ls[pivot_index]
        ls[pivot_index] = ls[end]
    
        while True:
    
            while lower <= upper and ls[lower] <  pivot: lower += 1
            while lower <= upper and ls[upper] >= pivot: upper -= 1
            if lower > upper: break
    
            ls[lower], ls[upper] = ls[upper], ls[lower]
    
        ls[end] = ls[lower]
        ls[lower] = pivot
    
        return lower
    
    def qsort_range (ls, start, end):
    
        if end - start + 1 < 32:
            insertion_sort(ls, start, end)
        else:
            pivot_index = partition (ls, start, end, randint (start, end))
            qsort_range (ls, start, pivot_index - 1)
            qsort_range (ls, pivot_index + 1, end)
    
        return ls
    
    def insertion_sort (ls, start, end):
    
        for idx in xrange (start, end + 1):
            el = ls[idx]
            for jdx in reversed (xrange(0, idx)):
                if ls[jdx] <= el:
                    ls[jdx + 1] = el
                    break
                ls[jdx + 1] = ls[jdx]
            else:
                ls[0] = el
    
        return ls
    
    def quicksort (ls):
    
        return qsort_range (ls, 0, len (ls) - 1)
    
    if __name__=='__main__':
        for value in 1000, 10000, 100000, 1000000, 10000000:
            ls = [randint (1, value) for _ in range(value)]
            ls2 = list(ls)
            last = -1
            start = clock()
            ls = radixsort(ls)
            end = clock()
            for i in ls:
                assert last <= i
                last = i
            print("rs %d: %0.2fs" % (value, end-start))
            tdiff = end-start
            start = clock()
            ls2 = quicksort(ls2)
            end = clock()
            last = -1
            for i in ls2:
                assert last <= i
                last = i
            print("qs %d: %0.2fs %0.2f%%" % (value, end-start, ((end-start)/tdiff*100)))
    

    The output when I run this is:

    C:\temp>c:\python27\python radixsort.py
    rs 1000: 0.00s
    qs 1000: 0.00s 212.98%
    rs 10000: 0.02s
    qs 10000: 0.05s 291.28%
    rs 100000: 0.19s
    qs 100000: 0.58s 311.98%
    rs 1000000: 2.47s
    qs 1000000: 7.07s 286.33%
    rs 10000000: 31.74s
    qs 10000000: 86.04s 271.08%
    

    Edit:
    Just to clarify: the quicksort implementation here is very memory friendly. It sorts in place, so no matter how large the list is, it is just shuffling data around, not copying it. The original radixsort effectively copies the list twice for each digit: once into the smaller lists, and then again when you concatenate the lists. Using itertools.chain avoids that second copy, but there’s still a lot of memory allocation/deallocation going on. (Also, ‘twice’ is approximate, as list appending does involve extra copying even if it is amortized O(1), so I should maybe say ‘proportional to twice’.)
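
The listing above targets Python 2.7. On Python 3, `xrange`, the built-in `reduce`, tuple-unpacking lambdas and (since 3.8) `time.clock` are gone, so here is a minimal sketch of the same bucket-and-chain idea for Python 3. The names `split_merge` and `radix_sort_lsd` are illustrative, and the sketch assumes non-negative integers.

```python
from itertools import chain
from time import perf_counter
import random


def split_merge(values, digit, base=10):
    """Distribute values into buckets by one digit, then chain the buckets lazily."""
    buckets = [[] for _ in range(base)]
    divisor = base ** digit
    for n in values:
        buckets[(n // divisor) % base].append(n)
    # chain.from_iterable walks the buckets in order without building a merged list.
    return chain.from_iterable(buckets)


def radix_sort_lsd(values, base=10):
    """Return a new sorted list; least significant digit first."""
    result = list(values)
    if not result:
        return result
    largest = max(result)
    digit = 0
    # One pass per digit of the largest key.
    while base ** digit <= largest:
        result = split_merge(result, digit, base)
        digit += 1
    return list(result)


if __name__ == "__main__":
    n = 100_000
    data = [random.randint(1, n) for _ in range(n)]
    start = perf_counter()
    out = radix_sort_lsd(data)
    elapsed = perf_counter() - start
    assert out == sorted(data)
    print("radix sort of %d items: %0.2fs" % (n, elapsed))
```

Each pass still allocates one bucket list per digit value, so the copying described in the edit above is reduced but not eliminated.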

© 2021 The Archive Base. All Rights Reserved