yes i know, making bitwise ops on double values seems like a bad idea,


Editorial Team

Asked: June 8, 2026

yes i know, making bitwise ops on double values seems like a bad idea, but i actually need it.

You don't need to read the next paragraph to answer my question; it's only for the curious:

I'm trying out a special mod to Mozilla Tamarin (the ActionScript Virtual Machine). In it, every object has its low 3 bits reserved for its type (double is 7, for example). These bits reduce precision for primitive data types (int has only 29 bits, etc.). For my mod, I need to expand this area by 2 bits. This means that when you, for example, add 2 doubles, you need to set these low 5 bits to zero, do the math, then reset them on the result. So much for the why ^^

Now back to the code.
Here a minimal example which shows a very similar problem:

double *d = new double; 
*d = 15.25; 
printf("float: %f\n", *d);

//forced hex output of double
printf("forced bitwise of double: ");
unsigned char * c = (unsigned char *) d;
int i;
for (i = sizeof (double)-1; i >=0 ; i--) {
     printf ("%02X ", c[i]);
}
printf ("\n");

//cast to long long-pointer, so that bitops become possible
long long * l = (long long*)d;
//now the bitops: 
printf("IntHex: %016X, float: %f\n", *l, *(double*)l); //this output is wrong!
*l = *l | 0x07; 
printf("last 3 bits set to 1: %016X, float: %f\n", *l, *d);//this output is wrong!
*l = *l | 0x18; 
printf("2 bits more set to 1: %016X, float: %f\n", *l, *d);//this output is wrong!

When running this in Visual Studio 2008, the first output is correct, and so is the second. The 3rd yields 0 for both the hex and the float representation, which is obviously wrong. The 4th and 5th also show zero for the float, but the modified bits do show up in the hex value. So I thought maybe the typecast messed things up here. So, 2 more outputs:

printf("float2: %f\n", *(double*)(long long*)d); //almost right
printf("float3: %f\n", *d); //almost right

well, they show 15.25, but it should be 15.2500000000000550670620214078. so i thought, hey, it’s just the precision issue in the output. lets modify a bit further up:

*l = *l |= 0x10000000000;
printf("float4: %f\n", *d);

Again, the output is 15.25(0000), and not 15.2519531250000550670620214078. Weirdly enough, another forced hex output (see code above) shows no modification of d at all. So I tinkered a bit and realized that bit 31 (0x80000000) is the highest one I can set by hand. And holy moly, it actually has an effect on the output (15.250004)!

So, though I strayed slightly, there's still a lot of confusion. Is printf broken? Am I having a big/little-endian confusion here? Am I accidentally creating some kind of buffer overrun?

If anybody is interested, in the original problem (the tamarin thing, see above) it’s pretty much inverse. there, the last three bits are already set (which represents a double). setting them to zero works fine (which is the original implementation). setting 2 more to zero has the same effect as above (overall value gets floored to zero). which by the way is not output-specific, but also math-ops seem to work with those floored values (mul of 2 values obtained like that results in 0).

Any help would be appreciated.
Greetings.

1 Answer

Editorial Team · Answer 1 · 2026-06-08T11:30:55+00:00

well, they show 15.25, but it should be 15.2500000000000550670620214078

By default, %f displays 6 digits of precision, so you won’t see the difference. You also need to specify that the first argument is long long rather than int, using the ll modifier; otherwise the format specifier doesn't match the argument, which is undefined behaviour and might print garbage for both the hex value and the float that follows it. If you fix that and use a higher precision, such as %.30f, you should see the expected result:

printf("last 3 bits set to 1: %016llX, float: %.30f\n", *l, *d);
printf("2 bits more set to 1: %016llX, float: %.30f\n", *l, *d);

last 3 bits set to 1: 0000000000000007, float: 15.250000000000012434497875801753
2 bits more set to 1: 000000000000001F, float: 15.250000000000055067062021407764

lets modify a bit further up:

*l = *l |= 0x10000000000;
printf("float4: %f\n", *d);

You have a rogue = giving undefined behaviour, so the value may or may not end up being modified (and the program may or may not crash, phone out for pizza, or destroy the universe). Also, if your compiler isn’t C++11 compliant, the type of the integer literal might be no larger than long, which might only be 32 bits; in which case it will (probably) become zero.

Fixing those (and in my case, with your code as it is), I get the expected result:

*l = *l | 0x10000000000LL;  // just one assignment, and "LL" to force "long long"
printf("float4: %f\n", *d);


float4: 15.251953

Here is a demonstration.
