A Beginner’s Guide to Radix Sort

5 min readFeb 18, 2021

Let’s say you were given 20 stuffed animals, all of different shapes and sizes. Now let’s say you were tasked to arrange these stuffed animals in order from smallest to largest. Simple, just look and compare one animal to the other and see which one is bigger. Now let’s say you were given the sizes in an array and you were told to sort them using a sorting algorithm. Since we all pay attention in our computer science classes and don’t just google our answers, we might come up with bubble sort or merge sort or another sorting method that relies on comparing values to each other in order to sort them. Now let’s say you are not allowed to compare one value to another when sorting your array. This is where non comparison sorting comes in.

Comparison Sorting vs Non Comparison Sorting Algorithms

Like the name implies, comparison sorting algorithms sort by comparing values while non comparison sorting algorithms do not. Students are taught comparison sorting algorithms at the start because it’s easy to tell a student to look at two values, compare them, and sort them. Then when students feel comfortable with their language of choice, they can increase the complexity of how those algorithms compare two values. Non comparison sorting algorithms avoid comparing values by relying on their own assumptions about the data, such as the size of the data or the range of the data.

Radix Sort

Radix sort is a sorting algorithm that dates back to the 19th century for sorting punched cards¹. This algorithm sorts numbers or letters by handling each digit or character separately from the whole. It would look at the integer keys in order that are in the range 0 to N where N is the highest value, and then place each item in its respective bucket. Then from 0 to N buckets, each bucket is emptied out into a new array. The buckets are like queues so the first item that was placed into the queue would be the first item that was removed from the bucket and placed into the new array². The simplest implementation is by using a radix or base of ten since we assume that every digit ranges from zero to nine. Since letters also have a numerical value, radix sort can sort strings of letters by using a radix of 26, the number of letters in the english alphabet. Radix sort, or sometimes called Bucket sort, sorts the keys by either starting at the most significant digit (MSD) or least significant digit (LSD). If the value in question is 105, the 1 would be the most significant digit while the 5 would be the least significant digit. Typically LSD radix sorts are used to sort integers while MSD radix sorts are used to sort strings.

If we were given the array [74, 127, 89, 73, 57, 139, 80, 14, 9]. We would use a base of 10 and start by looking at the LSD and placing them all in their respective buckets.

[74, 127, 89, 73, 57, 139, 80, 14, 9, 3]

Each bucket is a queue that then generates the new array [80, 73, 3, 74, 14, 127, 57, 89, 139, 9]. The process repeats with the 10s place with the newly generated array. Any numbers that don’t have a digit in the 10s place would just have a 0 as their value for the 10s place and get pushed into the zero bucket.

[80, 73, 03, 74, 14, 127, 57, 89, 139, 09]

Once the new queues get combined, the new array becomes [3, 9, 14, 127, 139, 57, 73, 74, 80, 89]. Most of the array is now in order but 127 and 139 are in the dead center. Only values less than 100 are sorted since the algorithm has only taken into account the ones and tens place. Since some of our values are less than 1000 but more than 100, this process only has to be repeated one more time. Again any numbers that do not have a digit in the hundreds place would also get pushed into the zero bucket.

After all the queues get combined, the final array is [3, 9, 14, 57, 73, 74, 80, 89, 127, 139]. In this case, it only took three iterations since the max number of digits in each item is three.

Complexity

The time and space complexity of radix sort is calculated by using the number of items to sort (n), the number of digits in each item(ℓ), and the number of values each digit can have(k). It would have a time complexity O(ℓ * (n+k)) and a space complexity O(n+k). It is usually assumed that the input(ℓ) either consists of 32 or 64 bit integers. In this case we used a radix(k) of 10. This simplifies to a time and space complexity O(n)³.

Use Case

Although this seems to be consistently faster than comparison sorting algorithms, with quicksort having an average time complexity of O(n log n), radix sort is at a disadvantage because it accesses memory in an arbitrary order instead of in a numerical order. This happens because after every iteration, the new array is in a different order than the old array. This means that it will not use the built in cache and will perform the same if not only insignificantly faster than optimized implementations of quicksort. Since vector supercomputers used to not have a cache, it would access the memory equally and run faster than quicksort¹. In parallel computing, radix sort could also assign each bucket to the next available processors so that each bin is sorted individually. If MSD radix sort is used, then it would start out w/ one processor and as it starts looking at the other digits it would start using more processors.

Resources:

CMU | Radix Sort for Vector Multiprocessors by Marco Zagha and Guy E. Blelloch.
NYU | Bucket-Sort and Radix-Sort
Interview Cake | Radix Sort Algorithm

A Beginner’s Guide to Radix Sort

Written by Teofilo Callanaupa