For a simple project I have to make large numbers (e.g. 4294967123) readable, so I’m writing only the first digits followed by a postfix (4294967123 -> 4.29G, 12345 -> 12.34K, etc.).
The code (simplified) looks like this:
const char* postfixes=" KMGT";
char postfix(unsigned int x)
{
    return postfixes[(int) floor(log10(x)) / 3]; /* one postfix per three decimal digits */
}
It works, but I think that there’s a more elegant/better solution than computing the full precision logarithm, rounding it and casting it down to an int again.
Other solutions I thought of:
int i=0;
for(; x >= 1000 ; ++i) x/=1000;
return postfixes[i];
(This is significantly slower, but easier to read)
The numbers are distributed according to Benford’s Law, and each number should be treated as an unsigned 64-bit number. There should be no rounding error near 10^x (e.g. in Python, math.log(1000, 10) returns 2.9999999999999996, which gets floored to 2).
Is there another fast, accurate way that I’m missing?
Your log10/floor code is perfectly readable, and its performance cost will likely be dwarfed by that of the string formatting you will subsequently be doing on your output.
However, suppose you were to really need the performance…
Note that log10(x) == log2(x) / log2(10) == log2(x) * 1/log2(10)
1/log2(10) is a constant (it equals log10(2), roughly 0.30103)
The integer part of log2(x) can usually be computed cheaply in the integer pipeline on modern architectures, using an instruction such as CLZ (count leading zeros) or a bit twiddling hack. This yields a number between 0 and 63 for a 64-bit integer. That fits in 6 bits, leaving us up to 58 bits after the radix point usable for fixed-point arithmetic in a 64-bit type.
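For platforms without a usable CLZ instruction or intrinsic, the bit twiddling route might look like the sketch below. The function name integer_log2_portable is made up for illustration, and the input is assumed to be non-zero.

```c
#include <stdint.h>

/* floor(log2(x)) for x > 0, using shifts and compares only.
   The name integer_log2_portable is hypothetical. */
unsigned integer_log2_portable(uint64_t x)
{
    unsigned result = 0;

    /* Binary search for the highest set bit: if anything is set above
       the low `shift` bits, drop those low bits and record the shift. */
    for (unsigned shift = 32; shift > 0; shift >>= 1) {
        if (x >> shift) {
            x >>= shift;
            result += shift;
        }
    }
    return result;
}
```

This takes six iterations for any 64-bit input, so it is slower than a single CLZ instruction but has no compiler dependency.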
So we can then use fixed-point arithmetic to find the log10:
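A minimal sketch of the idea, assuming GCC or Clang (for the __builtin_clzll intrinsic) and a non-zero input. Two details here are additions, not something stated above. First, the constant is held as 1233/4096, a 12-bit fraction, which is precise enough for 64-bit inputs. Second, because integer_log2 truncates, the fixed-point product can overshoot by one, so a power-of-ten table lookup corrects it. The postfix64 helper and its extended " KMGTPE" table are hypothetical, added only to tie the result back to the question.

```c
#include <stdint.h>

/* floor(log2(x)) for x > 0. Assumes GCC or Clang: __builtin_clzll counts
   leading zero bits and is undefined for x == 0. */
unsigned integer_log2(uint64_t x)
{
    return 63u - (unsigned)__builtin_clzll(x);
}

/* floor(log10(x)) for x > 0, using integer arithmetic only. */
unsigned integer_log10(uint64_t x)
{
    /* powers_of_10[i] == 10^i; 10^19 is the largest that fits in 64 bits. */
    static const uint64_t powers_of_10[20] = {
        1ULL,
        10ULL,
        100ULL,
        1000ULL,
        10000ULL,
        100000ULL,
        1000000ULL,
        10000000ULL,
        100000000ULL,
        1000000000ULL,
        10000000000ULL,
        100000000000ULL,
        1000000000000ULL,
        10000000000000ULL,
        100000000000000ULL,
        1000000000000000ULL,
        10000000000000000ULL,
        100000000000000000ULL,
        1000000000000000000ULL,
        10000000000000000000ULL
    };

    /* 1233/4096 approximates 1/log2(10) = 0.30103... in fixed point.
       Multiplying it by the bit length (log2 + 1) yields either
       floor(log10(x)) or that value plus one. */
    unsigned estimate = ((integer_log2(x) + 1u) * 1233u) >> 12;

    /* Step back by one when the estimate overshot. */
    return estimate - (x < powers_of_10[estimate]);
}

/* Hypothetical helper: one postfix per three decimal digits.
   The table is extended to cover the full 64-bit range; 0 maps to ' '. */
char postfix64(uint64_t x)
{
    static const char postfixes[] = " KMGTPE";
    return x == 0 ? ' ' : postfixes[integer_log10(x) / 3];
}
```

With this, postfix64(4294967123) gives 'G' and postfix64(12345) gives 'K'. No floating-point value is involved, so inputs such as 1000 cannot be rounded down to the wrong bucket.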
The implementation of integer_log2 is compiler/platform-dependent; e.g. on GCC/PowerPC, it’s
This approach can be generalised to find the logarithm in any base: simply calculate the appropriate constant as described above.