Quantcast
Channel: What is the fastest/most efficient way to find the highest set bit (msb) in an integer in C? - Stack Overflow
Viewing all articles
Browse latest Browse all 36

Answer by VoidStar for What is the fastest/most efficient way to find the highest set bit (msb) in an integer in C?

$
0
0

Some overly complex answers here. The Debruin technique should only be used when the input is already a power of two, otherwise there's a better way. For a power of 2 input, Debruin is the absolute fastest, even faster than _BitScanReverse on any processor I've tested. However, in the general case, _BitScanReverse (or whatever the intrinsic is called in your compiler) is the fastest (on certain CPU's it can be microcoded though).

If the intrinsic function is not an option, here is an optimal software solution for processing general inputs.

u8  inline log2 (u32 val)  {    u8  k = 0;    if (val > 0x0000FFFFu) { val >>= 16; k  = 16; }    if (val > 0x000000FFu) { val >>= 8;  k |= 8;  }    if (val > 0x0000000Fu) { val >>= 4;  k |= 4;  }    if (val > 0x00000003u) { val >>= 2;  k |= 2;  }    k |= (val & 2) >> 1;    return k;}

Note that this version does not require a Debruin lookup at the end, unlike most of the other answers. It computes the position in place.

Tables can be preferable though, if you call it repeatedly enough times, the risk of a cache miss becomes eclipsed by the speedup of a table.

u8 kTableLog2[256] = {0,0,1,1,2,2,2,2,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7};u8 log2_table(u32 val)  {    u8  k = 0;    if (val > 0x0000FFFFuL) { val >>= 16; k  = 16; }    if (val > 0x000000FFuL) { val >>=  8; k |=  8; }    k |= kTableLog2[val]; // precompute the Log2 of the low byte    return k;}

This should produce the highest throughput of any of the software answers given here, but if you only call it occasionally, prefer a table-free solution like my first snippet.


Viewing all articles
Browse latest Browse all 36

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>