How 12 comparisons can make integer sorting 30x faster
I spent a few weeks trying to beat ska_sort (often cited as the fastest non-SIMD sorting algorithm). Along the way I learned something interesting about algorithm selection.
The conventional wisdom is that radix sort is O(n) for fixed-width keys and beats comparison sorts on integers. That's true for random data. But real data isn't random.
Ages cluster in 0-100. Sensor readings are 12-bit. Network ports cluster around well-known values. When the value range is small relative to array size, counting sort is O(n + range) and destroys radix sort.
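For reference, a bare-bones counting sort over a known [lo, hi] range looks roughly like the sketch below. The function name and signature are illustrative, not my actual code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal counting-sort sketch for 32-bit ints whose min/max are known and
// whose range is small: one O(n) histogram pass plus an O(range) emit pass.
void counting_sort_range(std::vector<uint32_t>& a, uint32_t lo, uint32_t hi) {
    const uint64_t range = static_cast<uint64_t>(hi) - lo + 1;
    std::vector<size_t> counts(range, 0);
    for (uint32_t v : a) ++counts[v - lo];            // histogram: O(n)
    size_t out = 0;
    for (uint64_t b = 0; b < range; ++b)              // emit: O(range)
        for (size_t c = 0; c < counts[b]; ++c)
            a[out++] = lo + static_cast<uint32_t>(b);
}
```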
The problem: how do you know which algorithm to use without scanning the data first?
My solution was embarrassingly simple. Sample 64 values to estimate the range. If range <= 2n, use counting sort. Cost: 64 reads. Payoff: 30x speedup on dense data.
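A rough sketch of that selection logic, reusing the counting_sort_range routine from the sketch above. The small-n cutoff, the evenly spaced sampling, and the names are illustrative assumptions, not my actual implementation.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Sample 64 values to estimate the range; pick counting sort when range <= 2n.
void adaptive_sort(std::vector<uint32_t>& a) {
    const size_t n = a.size();
    if (n < 64) { std::sort(a.begin(), a.end()); return; }

    // 64 evenly spaced reads give a cheap estimate of the value range.
    uint32_t lo = a[0], hi = a[0];
    for (size_t i = 0; i < 64; ++i) {
        uint32_t v = a[i * (n - 1) / 63];
        lo = std::min(lo, v);
        hi = std::max(hi, v);
    }

    if (static_cast<uint64_t>(hi) - lo + 1 <= 2 * static_cast<uint64_t>(n)) {
        // Looks dense: confirm exact bounds with one full pass, then count.
        auto [mn, mx] = std::minmax_element(a.begin(), a.end());
        counting_sort_range(a, *mn, *mx);   // from the earlier sketch
    } else {
        std::sort(a.begin(), a.end());      // wide range: plain comparison sort
    }
}
```

Note the sample only estimates the range; before allocating the count array you still take the exact min/max in one pass, which is cheap relative to the sort itself.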
For sorted/reversed detection, I tried:
– Variance of differences (failed – too noisy)
– Entropy estimation (failed – threshold dependent)
– Inversion counting (failed – can't distinguish reversed from random)
What worked: check whether a few consecutive elements are in ascending order (arr[i] <= arr[i+1] <= arr[i+2] <= arr[i+3]) at three positions: head, middle, and tail. If all three windows agree, the data is likely sorted (and the same windows, all descending, flag a likely reversed input). 12 comparisons total.
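A rough sketch of the ascending probe; four comparisons per window is one way to land on 12 total, and the exact window size and constants here are illustrative.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Cheap order probe: short ascending runs at the head, middle, and tail.
bool probably_sorted(const std::vector<uint32_t>& a) {
    const size_t n = a.size();
    if (n < 16) return std::is_sorted(a.begin(), a.end());
    const size_t starts[3] = {0, n / 2, n - 5};
    for (size_t s : starts)
        for (size_t i = 0; i < 4; ++i)               // 4 comparisons per window
            if (a[s + i] > a[s + i + 1]) return false;
    return true;   // all three windows ascending => likely sorted overall
}
```

The reversed check is the mirror image: the same three windows, tested for descending order.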
Results on 100k integers:
– Random: 3.8x faster than std::sort
– Dense (0-100): 30x faster than std::sort
– vs ska_sort: 1.6x faster on random, 9x faster on dense
The lesson: detection is cheap. 12 comparisons and 64 samples cost maybe 100 CPU cycles. Picking the wrong algorithm costs millions of cycles.