I’m looking for a way to force the computer to calculate a floating-point operation with a set number of significant digits. This is for pure learning reasons, so I don’t care about the loss of accuracy in the result.
For example, if I have:
float a = 1.67;
float b = 10.0;
float c = 0.01;
float d = a * b + c;
And I want every number represented with 3 significant digits, I’d like to see:
d = 16.7;
Not:
d = 16.71;
So far, I got this as a possible answer: Limit floating point precision?
But using that strategy would bloat my code: I would have to convert every floating-point variable to the precision I want, and then do the same with each result.
Is there an automatic way to fix the precision?
The floating-point data types are binary floating-point types, i.e., their precision is measured in binary digits, and in general it is impossible to represent decimal values exactly. As a result, you will have trouble truncating results to the correct number of decimal digits in the first place. What could work is to format the floating-point value after each operation with a precision of n digits (e.g. with n == 3) and convert the text back into a floating-point value. This won’t be particularly efficient, but it would work. To avoid littering the code with the corresponding truncation logic, you would encapsulate the operations you need in a class which performs the operation and appropriately truncates the result.
Alternatively, you could implement the necessary logic using a significand and a suitable base-10 exponent. The significand would be restricted to values between -999 and 999. It is probably more work to implement a class like this, but the result is likely to be more efficient.
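A minimal sketch of the format-and-convert-back approach, assuming C++11 or later. The names round_sig and Sig3 are hypothetical, chosen only for illustration, and the digit count is fixed at 3 to match the example in the question. Note that printf-style formatting rounds to nearest rather than truncating.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <iostream>

// Hypothetical helper: round x to `digits` significant decimal digits by
// formatting it in scientific notation and parsing the text back.
double round_sig(double x, int digits = 3) {
    assert(digits >= 1);
    char buf[64];
    // "%.*e" prints one digit before the decimal point and (digits - 1) after it.
    std::snprintf(buf, sizeof buf, "%.*e", digits - 1, x);
    return std::strtod(buf, nullptr);
}

// Hypothetical wrapper: every stored value and every arithmetic result
// is rounded to 3 significant digits.
class Sig3 {
public:
    Sig3(double v = 0.0) : value_(round_sig(v)) {}
    double value() const { return value_; }

    friend Sig3 operator+(Sig3 a, Sig3 b) { return Sig3(a.value_ + b.value_); }
    friend Sig3 operator-(Sig3 a, Sig3 b) { return Sig3(a.value_ - b.value_); }
    friend Sig3 operator*(Sig3 a, Sig3 b) { return Sig3(a.value_ * b.value_); }
    friend Sig3 operator/(Sig3 a, Sig3 b) { return Sig3(a.value_ / b.value_); }

private:
    double value_;
};

std::ostream& operator<<(std::ostream& os, Sig3 s) { return os << s.value(); }
```

With this in place, the code from the question only needs its type changed: Sig3 a = 1.67, b = 10.0, c = 0.01; Sig3 d = a * b + c; leaves d holding the double nearest to 16.7, because the intermediate product and the final sum are each rounded. The stored value is still a binary double, so it is only the closest representable number to the 3-digit decimal.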