
Counting Sort

2024-06-19

A sorting algorithm I discovered. It comes w/ the caveat that the values of the to-be-sorted array need to be within a reasonable range (since they are used as indices in the bookkeeping array).

But the principle is neat:

  • iterate over the to-be-sorted array
  • create a bookkeeping array whose
    • indices reflect the to-be-sorted array's values
    • values at specific indices denote how often a value appears in the unsorted array
  • based on the bookkeeping array, create a sorted version (a quick worked example follows)
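
To make the bookkeeping idea concrete, take the example array used in the code below, {1, 3, 6, 9, 9, 3, 5, 9}. Its bookkeeping array looks like this:

index: 0 1 2 3 4 5 6 7 8 9
count: 0 1 0 2 0 1 1 0 0 3

Reading the counts back out in index order gives 1 3 3 5 6 9 9 9.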

Quick implementation attempt:

#include <iostream>
#include <vector>

using namespace std;

int main()
{
    vector<int> v = {1, 3, 6, 9, 9, 3, 5, 9};
    vector<int> b = {};    // bookkeeping vector
    vector<int> s = {};    // sorted vector
    
    // bookkeeping vector:
    //  a) indices are original vector's values
    //  b) its values denote how often a number
    //     appears in the unsorted vector

    for (int i = 0, sz = v.size(); i < sz; ++i)
    {
        int val = v[i];

        if (val >= (int)b.size())
            b.resize(val + 1);

        b[val]++;
    }

    // based on bookkeeping vector
    // -> generate sorted vector

    for (int i = 0, sz = b.size(); i < sz; ++i)
    {
        int n_times = b[i];

        for (int j = 0; j < n_times; ++j)
        {
            s.push_back(i);
        }
    }

    // print sorted vector: 1 3 3 5 6 9 9 9

    for (int i = 0, sz = s.size(); i < sz; ++i)
        cout << s[i] << " ";
    cout << '\n';
}

It sorts in O(n + k), where n is the number of elements and k is the value range, so asymptotically it beats the general-purpose comparison sorts, but its use case seems highly specialized: sorting "smaller" data sets w/ a known, small value range seems an unlikely performance bottleneck. You'd also not want to use it if it leads to a lot of separate memory allocations, like the repeated resize calls above. The principle is neat though.
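
For what it's worth, here is a rough sketch of a variant that avoids those repeated allocations. It's an addendum with assumptions baked in (non-negative int values, non-empty input), not part of the attempt above: the bookkeeping vector is sized once via std::max_element and the output vector is reserved up front, so the whole sort needs only a couple of allocations.

#include <algorithm>
#include <iostream>
#include <vector>

using namespace std;

int main()
{
    vector<int> v = {1, 3, 6, 9, 9, 3, 5, 9};

    // size the bookkeeping vector once, based on the largest value
    // (assumes v is non-empty and holds only non-negative values)
    int max_val = *max_element(v.begin(), v.end());
    vector<int> b(max_val + 1, 0);

    for (int val : v)
        b[val]++;

    // reserve the output up front, then read the counts back out
    // in index order
    vector<int> s;
    s.reserve(v.size());

    for (int i = 0, sz = b.size(); i < sz; ++i)
        for (int j = 0; j < b[i]; ++j)
            s.push_back(i);

    for (int x : s)
        cout << x << " ";
    cout << '\n';    // prints: 1 3 3 5 6 9 9 9
}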