Roslyn Code Gems - Counting Bits

Thursday, February 19, 2015

Back in the early days of computing, when machines were powered by coal-fired steam boilers the size of Liechtenstein, even the simplest operations were optimized by programmers toiling in the basements of university science buildings around the world. Simple operations like counting the number of “on” bits in a word.

Back then you couldn’t afford to use a naive approach...

unsigned int v; // count the number of bits set in v
unsigned int c; // c accumulates the total bits set in v

for (c = 0; v; v >>= 1)
{
  c += v & 1;
}

... because each CPU cycle required 4.09 x 10^-5 kg of hard coal, and the total cost adds up quickly.

It’s not surprising then, to find papers on efficient bit counting algorithms that have been around for decades. Sean Eron Anderson has a selection of these algorithms, plus others, on the Bit Twiddling Hacks page.

The Roslyn compiler platform builds on this research with classes like BitArithmeticUtilities, which features code like the following.

public static int CountBits(ulong v)
{
    unchecked
    {
        const ulong MASK_01010101010101010101010101010101 = 0x5555555555555555UL;
        const ulong MASK_00110011001100110011001100110011 = 0x3333333333333333UL;
        const ulong MASK_00001111000011110000111100001111 = 0x0F0F0F0F0F0F0F0FUL;
        const ulong MASK_00000000111111110000000011111111 = 0x00FF00FF00FF00FFUL;
        const ulong MASK_00000000000000001111111111111111 = 0x0000FFFF0000FFFFUL;
        const ulong MASK_11111111111111111111111111111111 = 0x00000000FFFFFFFFUL;
        v = (v & MASK_01010101010101010101010101010101) + ((v >> 1) & MASK_01010101010101010101010101010101);
        v = (v & MASK_00110011001100110011001100110011) + ((v >> 2) & MASK_00110011001100110011001100110011);
        v = (v & MASK_00001111000011110000111100001111) + ((v >> 4) & MASK_00001111000011110000111100001111);
        v = (v & MASK_00000000111111110000000011111111) + ((v >> 8) & MASK_00000000111111110000000011111111);
        v = (v & MASK_00000000000000001111111111111111) + ((v >> 16) & MASK_00000000000000001111111111111111);
        v = (v & MASK_11111111111111111111111111111111) + ((v >> 32) & MASK_11111111111111111111111111111111);
        return (int)v;
    }
}

It’s not often you find code with visually stimulating patterns that is also environmentally friendly.

Coming up soon – how many type parameters is too many? 10? 20? 30?

All My Pluralsight Courses

OdeToCode by K. Scott Allen