I’m reading Introduction to Algorithms by Cormen et al., and in the section that describes radix sort they say:
Intuitively you might sort numbers on their most significant digit,
sort each of the resulting bins recursively and then combine the decks
in order. Unfortunately since the cards in 9 of the 10 bins must be
put aside to sort each of the bins, this procedure generates many
intermediate piles of cards that you’d have to keep track of.
What does this mean?
I don’t understand why sorting on the most significant digit first would be a problem.
They are referring to a useful property of LSD (least significant digit first) radix sort: as long as each sorting step is stable, you only have to run one step per digit, over the whole array, and you never have to sort any subsets individually.
To take Michael’s example data:
After 0 steps:
170, 045, 075, 090, 002, 024, 802, 066, 182, 332, 140, 144
After 1 step (sort on units):
170, 090, 140, 002, 802, 182, 332, 024, 144, 045, 075, 066
After 2 steps (sort on tens):
002, 802, 024, 332, 140, 144, 045, 066, 170, 075, 182, 090
After 3 steps (sort on hundreds):
002, 024, 045, 066, 075, 090, 140, 144, 170, 182, 332, 802
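The steps above can be reproduced with a short sketch. This is only an illustration, not the CLRS pseudocode: the function name `lsd_radix_sort` and its `digits` and `base` parameters are made up for this example, and it assumes non-negative integers with at most `digits` digits.

```python
def lsd_radix_sort(values, digits=3, base=10):
    """Return the list as it looks after each stable per-digit pass,
    starting from the least significant digit."""
    passes = []
    current = list(values)
    for d in range(digits):
        # One bucket per possible digit value.
        buckets = [[] for _ in range(base)]
        for v in current:
            # Appending keeps equal-digit items in their existing order,
            # which is what makes each pass stable.
            buckets[(v // base**d) % base].append(v)
        # Concatenate the buckets: one pass over the whole array, no sub-sorts.
        current = [v for bucket in buckets for v in bucket]
        passes.append(current)
    return passes


data = [170, 45, 75, 90, 2, 24, 802, 66, 182, 332, 140, 144]
for step, result in enumerate(lsd_radix_sort(data), start=1):
    print(step, result)
```

Each pass touches the whole list exactly once, and the only state carried between passes is the list itself.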
This property becomes especially useful if you’re radix-sorting in binary rather than base 10. Then each sorting step is just a stable partition into two, which is very simple. At least, it is until you want to do it without using any extra memory.
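As a rough sketch of the binary case, assuming non-negative integers that fit in `bits` bits (the helper names here are hypothetical, and this version happily uses extra memory for the two halves):

```python
def stable_partition_by_bit(values, bit):
    """Put items whose given bit is 0 before items whose bit is 1,
    keeping the relative order inside each group."""
    zeros = [v for v in values if not (v >> bit) & 1]
    ones = [v for v in values if (v >> bit) & 1]
    return zeros + ones


def binary_lsd_radix_sort(values, bits):
    """LSD radix sort in base 2: one stable partition per bit."""
    current = list(values)
    for bit in range(bits):
        current = stable_partition_by_bit(current, bit)
    return current


data = [170, 45, 75, 90, 2, 24, 802, 66, 182, 332, 140, 144]
print(binary_lsd_radix_sort(data, bits=10))  # 802 < 2**10, so 10 bits suffice
```

The two list comprehensions are where the extra memory goes; doing the same stable partition in place is the fiddly part.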
MSD radix sort works, of course; it just requires more book-keeping and/or a non-tail recursion. It’s only a “problem” in that CLRS (in common with other expert programmers) don’t like to do fiddly work until it’s necessary.
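For comparison, here is a minimal recursive MSD sketch under the same assumptions (non-negative integers, a known number of digits). The name `msd_radix_sort` and its parameters are invented for this example. The buckets that are alive at every level of the recursion are the "intermediate piles" from the quote.

```python
def msd_radix_sort(values, digit, base=10):
    """Sort on the given digit position (0 = units), then recursively
    sort each bucket on the next less significant digit."""
    if len(values) <= 1 or digit < 0:
        return list(values)
    # One pile per digit value; while one pile is being sorted
    # recursively, the others have to be set aside and remembered.
    buckets = [[] for _ in range(base)]
    for v in values:
        buckets[(v // base**digit) % base].append(v)
    result = []
    for bucket in buckets:
        # Non-tail recursion: each bucket is a separate sub-problem.
        result.extend(msd_radix_sort(bucket, digit - 1, base))
    return result


data = [170, 45, 75, 90, 2, 24, 802, 66, 182, 332, 140, 144]
print(msd_radix_sort(data, digit=2))  # start at the hundreds digit
```

It produces the same sorted result, but every recursive call works on its own subset, whereas the LSD version only ever deals with the whole array.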