I’m writing a set of numeric type conversion functions for a database engine, and I’m concerned about the behavior of converting large integral floating-point values to integer types with greater precision.
Take for example converting a 32-bit int to a 32-bit single-precision float. The 24-bit significand of the float (23 stored bits plus an implicit leading 1) yields about 7 decimal digits of precision, so converting any int with more than about 7 digits will result in a loss of precision (which is fine and expected). However, when you convert such a float back to an int, you end up with artifacts of its binary representation in the low-order digits:
#include <iostream>
#include <iomanip>
using namespace std;

int main()
{
    int a = 2147483000;
    cout << a << endl;
    float f = (float)a;
    cout << setprecision(10) << f << endl;
    int b = (int)f;
    cout << b << endl;
    return 0;
}
This prints:
2147483000
2147483008
2147483008
The trailing 008 is beyond the precision of the float, and therefore seems undesirable to retain in the int, since in a database application, users are primarily concerned with decimal representation, and trailing 0’s are used to indicate insignificant digits.
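To make the source of the trailing 008 concrete, here is a minimal sketch that measures the gap between adjacent floats near this value. The helper name `ulp_above` is my own, not a standard function; it relies only on `std::nextafter` from `<cmath>`.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: distance from f to the next representable float above it.
inline float ulp_above(float f) {
    assert(std::isfinite(f));
    return std::nextafter(f, std::numeric_limits<float>::infinity()) - f;
}
```

For `2147483000.0f` the gap is 128, so the float can only hold multiples of 128 in this range, and 2147483008 is the nearest one to 2147483000.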
So my questions are: Are there any well-known existing systems that perform decimal significant digit rounding in float -> int (or double -> long long) conversions, and are there any well-known, efficient algorithms for doing so?
(Note: I’m aware that some systems have decimal floating-point types, such as those defined by IEEE 754-2008. However, they don’t have mainstream hardware support and aren’t built into C/C++. I might want to support them down the road, but I still need to handle binary floats intuitively.)
std::numeric_limits<float>::digits10 says you only get 6 precise digits for float. Pick an efficient algorithm for your language, processor, and data distribution to calculate the decimal length of an integer. Then subtract the number of digits that digits10 says are precise to get the number of digits to cull. Use that as an index to look up a power of 10 to use as a modulus, and so on.
One concern: Let’s say you convert a float to a decimal and perform this sort of rounding or truncation. Then convert that “adjusted” decimal to a float and back to a decimal with the same rounding/truncation scheme. Do you get the same decimal value? Hopefully yes.
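The steps above could be sketched roughly as follows. This is one possible reading of the suggestion, not a vetted implementation: the function names are hypothetical, the digit count uses a plain division loop, the power of 10 is computed with a loop instead of a lookup table, and rounding is half-up on the magnitude.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Number of decimal digits in v (0 counts as one digit).
inline int decimal_length(std::uint64_t v) {
    int n = 1;
    while (v >= 10) { v /= 10; ++n; }
    return n;
}

// Hypothetical helper: round v to at most `digits` significant decimal
// digits, rounding half up. May wrap around for values near the top of the
// uint64_t range when `digits` is very small.
inline std::uint64_t round_to_significant(std::uint64_t v, int digits) {
    assert(digits >= 1);
    int cull = decimal_length(v) - digits;   // low-order digits to discard
    if (cull <= 0) return v;                 // already within precision
    std::uint64_t m = 1;
    for (int k = 0; k < cull; ++k) m *= 10;  // a lookup table would also work
    std::uint64_t rem = v % m;
    v -= rem;
    if (rem >= m / 2) v += m;
    return v;
}

// Hypothetical conversion: float -> int64 keeping only digits10 (6) digits.
inline std::int64_t float_to_int64_decimal(float f) {
    // Converting NaN or an out-of-range float to an integer is undefined
    // behavior, so reject those inputs up front.
    assert(f > -9223372036854775808.0f && f < 9223372036854775808.0f);
    std::int64_t i = static_cast<std::int64_t>(f);  // truncates toward zero
    std::uint64_t mag = i < 0 ? 0 - static_cast<std::uint64_t>(i)
                              : static_cast<std::uint64_t>(i);
    std::uint64_t r =
        round_to_significant(mag, std::numeric_limits<float>::digits10);
    return i < 0 ? -static_cast<std::int64_t>(r) : static_cast<std::int64_t>(r);
}
```

With the question's value, `float_to_int64_decimal(2147483008.0f)` gives 2147480000, not 2147483000, because digits10 guarantees only 6 digits. Feeding 2147480000 back through a float and this function yields 2147480000 again, which is the round-trip property raised above, at least for this input.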
This isn’t really what you’re looking for but may be interesting reading: A Proposal to add a max significant decimal digits value to the C++ Standard Library Numeric limits