Non-Comparison Sorts
[!NOTE] This module explores the core principles of Non-Comparison Sorts, deriving solutions from first principles and hardware constraints to build world-class, production-ready expertise.
1. The Hook: 1 Million Exam Scores
Imagine you are designing the backend for a massive standardized testing system. You have 1,000,000 students who just took an exam scored strictly out of 100 points. You need to sort these scores to calculate percentiles.
We proved that any comparison-based sort (like QuickSort or MergeSort) requires Ω(N log N) comparisons. For 1,000,000 items, that’s roughly 20,000,000 operations.
But what if we don’t compare elements? What if we use their value to determine their address?
If we have information about the range of the data (e.g., integers from 0 to 100), we can sort them in O(N) time. We trade spatial constraints for a massive time advantage.
2. Counting Sort
Constraint: Integers in a small, known range [min, max].
Idea: Create an auxiliary array count of size max - min + 1. We don’t move the elements around yet; we just count how many times each number appears.
Interactive Counting Sort Anatomy
Step-by-Step Counting Sort
Input Array: [4, 2, 2, 8, 3, 3, 1]
Complexity: O(N + K) where K is the range.
The Trap: If K is huge (e.g., K = 109) but N is small, count array initialization dominates, and the algorithm crashes memory for no reason.
3. Radix Sort
Constraint: Integers or fixed-length strings. Idea: Use a stable Counting Sort as a subroutine.
- Sort by the Least Significant Digit (1s place).
- Sort by the 10s place.
- …
- Sort by the Most Significant Digit.
Wait, why LSD first?
Because the sort must be Stable. When we sort by the 10s column, the order relative to the 1s column must remain preserved if the 10s are equal. If we had numbers 24 and 21, sorting by the 1s place puts 21 before 24. When sorting by the 10s place (both have 2), stability guarantees 21 stays before 24.
Complexity: O(d ⋅ (N + b)) where d is digits and b is base (10). Basically O(N) for fixed-width integers.
4. Implementation: Radix Sort
Java
// Main function to do Radix Sort
public void radixSort(int[] arr) {
int max = getMax(arr);
// Apply counting sort to each digit place
for (int exp = 1; max / exp > 0; exp *= 10) {
countSort(arr, exp);
}
}
private void countSort(int[] arr, int exp) {
int n = arr.length;
int[] output = new int[n];
int[] count = new int[10]; // Base 10
// Store count of occurrences
for (int i = 0; i < n; i++) {
count[(arr[i] / exp) % 10]++;
}
// Change count[i] so that it contains position of digit
for (int i = 1; i < 10; i++) {
count[i] += count[i - 1];
}
// Build the output array (Go backwards for Stability!)
for (int i = n - 1; i >= 0; i--) {
output[count[(arr[i] / exp) % 10] - 1] = arr[i];
count[(arr[i] / exp) % 10]--;
}
// Copy output array to arr
System.arraycopy(output, 0, arr, 0, n);
}
Go
// Main function to do Radix Sort
func RadixSort(arr []int) {
if len(arr) == 0 {
return
}
max := getMax(arr)
// Apply counting sort to each digit place
for exp := 1; max/exp > 0; exp *= 10 {
countSort(arr, exp)
}
}
func countSort(arr []int, exp int) {
n := len(arr)
output := make([]int, n)
count := make([]int, 10) // Base 10
// Store count of occurrences
for i := 0; i < n; i++ {
count[(arr[i]/exp)%10]++
}
// Change count[i] so that it contains position of digit
for i := 1; i < 10; i++ {
count[i] += count[i-1]
}
// Build the output array (Go backwards for Stability!)
for i := n - 1; i >= 0; i-- {
digit := (arr[i] / exp) % 10
output[count[digit]-1] = arr[i]
count[digit]--
}
// Copy output array to arr
copy(arr, output)
}
5. Bucket Sort: The Floating Point Savior
What if the numbers aren’t integers? What if we have uniformly distributed floating-point numbers in the range [0.0, 1.0)? Counting Sort and Radix Sort both fail here.
Idea: Create N empty “buckets”. Scatter the array elements into the buckets based on their value. Sort individual buckets (usually with Insertion Sort), and then concatenate them.
- Create
nempty buckets. - Insert
arr[i]into bucket[n * arr[i]]. - Sort individual buckets.
- Concatenate all buckets into
arr.
Complexity: Best/Average Case is O(N). Worst Case is O(N2) if all elements fall into a single bucket.
6. Hardware Reality: Random Access Penalty
Counting sort looks fast (O(N)), but it involves Random Write Memory Access.
count[arr[i]]++ jumps to a random location in the count array.If the count array fits in L1 Cache, it's blazing fast. If the count array is large (e.g., sorting 64-bit integers by 16-bit chunks), it causes Cache Thrashing. Therefore, Radix Sort is often tuned to use a base b such that the table size fits in cache (e.g., base 256 or 65536).