Here’s the code: #include <stdio.h> #include <math.h> static double const x = 665857; static

Question

Asked: June 3, 2026

Here’s the code:

#include <stdio.h>
#include <math.h>

static double const x = 665857;
static double const y = 470832;

int main(){
    double z = x*x*x*x -(y*y*y*y*4+y*y*4);
    printf("%f \n",z);
    return 0;
}

Mysteriously (to me), this code prints “0.0” when compiled for 32-bit machines (or with the -m32 flag on a 64-bit machine, as in my case) with GCC 4.6. As far as I know, floating-point operations can overflow, underflow, or lose precision, but… a 0? How?

Thanks in advance.

1 Answer

Editorial Team · Answer 1 · 2026-06-03T17:14:31+00:00

This is a result of the way IEEE 754 represents floating-point numbers in normalized form. A float, a double, or any other IEEE 754 binary format is stored as:

1.xxxxxxxxxxxxxxxxxxx * 2^exp

where xxxxxxxxxxxxxxxxxxx is the fractional part of the mantissa, so the mantissa itself is always in the range [1, 2). The integer part, which is always 1, is not stored in the representation. The number of x bits defines the precision: 52 bits for a double. The exponent is stored in offset form (subtract 1023 to obtain its value), but that is not relevant here.

665857^4 in 64-bit IEEE 754 is:

0 10001001100 (1)0100110100000001111100111011101010000101110010100010
+ exponent    mantissa

(the first bit is the sign bit: 0 = positive, 1 = negative; the bit in parentheses is not really stored)

In 80-bit x86 extended precision (second row; the 64-bit form is repeated in the first row for comparison) it is:

0 10001001100    (1)0100110100000001111100111011101010000101110010100010
0 100000001001100 1 010011010000000111110011101110101000010111001010000111000111011

(here the integer part is explicitly part of the representation, unlike in the IEEE 754 double format, where it is implied; I’ve aligned the mantissas for clarity)

4*470832^4 in 64-bit IEEE 754 and 80-bit x86 extended precision is:

0 10001001100    (1)0100110100000001111100111011101001111111010101100111
0 100000001001100 1 010011010000000111110011101110100111111101010110011100100010000

4*470832^2 in 64-bit IEEE 754 and 80-bit x86 extended precision is:

0 10000100110    (1)1001110011101010100101010100100000000000000000000000
0 100000000100110 1 100111001110101010010101010010000000000000000000000000000000000

To add the last two numbers, the smaller value first has its exponent adjusted to match the larger value’s exponent, while its mantissa is shifted to the right to preserve the value. Since the two exponents differ by 38, the mantissa of the smaller number is shifted 38 bits to the right:

470832^2*4 in adjusted 64-bit IEEE 754 and 80-bit x86 extended precision:

 this bit came from 1.xxxx ------------------------------v
0 10001001100    (0)0000000000000000000000000000000000000110011100111010|1010
0 100000001001100 0 0000000000000000000000000000000000000110011100111010101001010101

Now both quantities have the same exponent, so their mantissas can be summed:

0 10001001100 (1)0100110100000001111100111011101001111111010101100111|0010
0 10001001100 (0)0000000000000000000000000000000000000110011100111010|1010
--------------------------------------------------------------------------
0 10001001100 (1)0100110100000001111100111011101010000101110010100001|1100

I kept some of the 80-bit precision bits to the right of the bar, because the x87 FPU performs the summation internally in the greater precision of 80 bits.

Now let’s perform the subtraction in 64-bit precision plus some bits of the 80-bit representation:

minuend    0 10001001100 (1)0100110100000001111100111011101010000101110010100001|1100
subtrahend 0 10001001100 (1)0100110100000001111100111011101010000101110010100001|1100
-------------------------------------------------------------------------------------
difference 0 10001001100 (0)0000000000000000000000000000000000000000000000000000|0000

A pure 0! If you perform the calculations in full 80-bit precision, you once again obtain a pure 0.

The real problem here is that 1.0 cannot be represented in 64-bit precision when the exponent is 77: the mantissa does not have 77 bits of precision. The same is true of 80-bit precision: its mantissa has only 63 fractional bits, 14 fewer than needed to represent 1.0 at an exponent of 77.

So that’s it! It’s just the wonderful world of scientific computing where nothing works the way you were taught in the math classes…
