Channel: What is the fastest/most efficient way to find the highest set bit (msb) in an integer in C? - Stack Overflow

Answer by SPWorley for What is the fastest/most efficient way to find the highest set bit (msb) in an integer in C?

This amounts to computing an integer log2. There are bit-twiddling tricks for that, but I've made my own tool for it. The goal, of course, is speed.

My realization is that the CPU has an automatic bit-detector already, used for integer to float conversion! So use that.

    double ff = (double)(v | 1);
    return ((*(1 + (uint32_t *)&ff)) >> 20) - 1023;  // assumes x86 endianness

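This version casts the value to a double, then reads off the exponent, which tells you where the highest set bit was. The shift and subtract extract the proper parts from the IEEE value: the high 32-bit word holds the sign bit, the 11-bit biased exponent and the top 20 mantissa bits, so shifting right by 20 leaves the exponent, and subtracting the bias of 1023 gives the bit index. The v|1 makes an input of 0 behave like 1, so the result is 0 rather than whatever the all-zero encoding would produce.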

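It's slightly faster to use floats, but a float can only give you the first 24 bit positions because of its smaller precision: wider values get rounded during the conversion, and rounding up to the next power of two changes the exponent.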
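To make the 24-bit limit concrete, here is a minimal sketch of the float variant. The function name `msb_index_via_float` is made up for illustration, and the code assumes a 32-bit IEEE binary32 `float` and the default round-to-nearest mode. It copies the bits out with memcpy rather than a pointer cast.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32-bit IEEE binary32. */
static_assert(sizeof(float) == sizeof(uint32_t), "float is not 32 bits wide");

/* Hypothetical helper: exponent of (float)(v|1).
   Only equals the index of the highest set bit while v fits in 24 bits. */
int msb_index_via_float(uint32_t v)
{
    float f = (float)(v | 1u);   /* inexact once v needs more than 24 bits */
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    /* Sign bit is 0, so the top 9 bits are just the 8-bit biased exponent. */
    return (int)(bits >> 23) - 127;
}
```

With these assumptions, `msb_index_via_float(0x00FFFFFF)` gives 23 as expected, but `msb_index_via_float(0x01FFFFFF)` gives 25 even though the highest set bit is bit 24, because the conversion rounds the value up to 2^25.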


To do this safely, without undefined behaviour in C++ or C, use memcpy instead of pointer casting for type-punning. Compilers know how to inline it efficiently.

    // static_assert(sizeof(double) == 2 * sizeof(uint32_t), "double isn't 8-byte IEEE binary64");
    // and also static_assert something about FLT_ENDIAN?

    double ff = (double)(v | 1);

    uint32_t tmp;
    memcpy(&tmp, ((const char*)&ff) + sizeof(uint32_t), sizeof(uint32_t));
    return (tmp >> 20) - 1023;
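One way to sidestep the question of which 32-bit half holds the exponent is to copy the whole double into a 64-bit integer and shift by 52 instead. This is only a sketch of that variation, not the code above: the name `msb_index_u32` is invented here, and it assumes an 8-byte IEEE binary64 `double` whose byte order matches that of `uint64_t`.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

/* Assumption: double is 64-bit IEEE binary64. */
static_assert(sizeof(double) == sizeof(uint64_t), "double is not 64 bits wide");

/* Hypothetical helper: index (0..31) of the highest set bit of v.
   v == 0 is treated like v == 1 and yields 0, because of the v|1. */
int msb_index_u32(uint32_t v)
{
    double d = (double)(v | 1u);   /* exact: any uint32_t fits in 53 bits */
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);   /* well-defined type punning */
    /* Sign bit is 0, so bits >> 52 is the 11-bit biased exponent. */
    return (int)(bits >> 52) - 1023;
}
```

For example, `msb_index_u32(1000)` returns 9, since 512 <= 1000 < 1024.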

Or in C99 and later, use a union {double d; uint32_t u[2];};. But note that in C++, union type punning is only supported on some compilers as an extension, not in ISO C++.
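A rough sketch of the union form in C follows. The union tag `double_words` and the function name `msb_index_union` are placeholders, and, like the pointer-cast version, it assumes a little-endian target such as x86 where `u[1]` is the high word.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Assumption: double is 64-bit IEEE binary64. */
static_assert(sizeof(double) == 2 * sizeof(uint32_t), "double is not 64 bits wide");

union double_words {   /* hypothetical name */
    double   d;
    uint32_t u[2];
};

/* Hypothetical helper: index of the highest set bit of v, 0 for v == 0. */
int msb_index_union(uint32_t v)
{
    union double_words pun;
    pun.d = (double)(v | 1u);
    /* Assumes little-endian layout: u[1] holds sign, exponent, top mantissa bits. */
    return (int)(pun.u[1] >> 20) - 1023;
}
```

Compile this as C, not C++, for the reason given above.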


This will usually be slower than a platform-specific intrinsic for a leading-zeros counting instruction, but portable ISO C has no such function. Some CPUs also lack a leading-zero counting instruction, but some of those can efficiently convert integers to double. Type-punning an FP bit pattern back to integer can be slow, though (e.g. on PowerPC it requires a store/reload and usually causes a load-hit-store stall).
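For comparison, this is roughly what the platform-specific route looks like with the GCC/Clang builtin `__builtin_clz`, plus a plain shift loop for other compilers. The function name `msb_index_clz` is hypothetical, and the sketch assumes `unsigned int` is at least 32 bits wide.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>
#include <stdint.h>

/* Assumption: a uint32_t value fits in unsigned int, the builtin's argument type. */
static_assert(sizeof(unsigned int) >= sizeof(uint32_t), "unsigned int is too narrow");

/* Hypothetical helper: index of the highest set bit of v, 0 for v == 0. */
int msb_index_clz(uint32_t v)
{
#if defined(__GNUC__) || defined(__clang__)
    /* __builtin_clz is undefined for 0, hence the same v|1 guard. */
    int width = (int)(sizeof(unsigned int) * CHAR_BIT);
    return width - 1 - __builtin_clz((unsigned int)(v | 1u));
#else
    /* Portable fallback: count how many times v can be shifted right. */
    int n = 0;
    v |= 1u;
    while (v >>= 1)
        n++;
    return n;
#endif
}
```

Both branches return the same values as the double-based version for every 32-bit input, so they can be swapped in where the builtin is available.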

This algorithm could potentially be useful for SIMD implementations, because fewer CPUs have SIMD lzcnt. x86 only got such an instruction with AVX512CD.

