I know that the default implementation of Lua uses floating point numbers only, thus circumventing the problem of dynamically determining the subtype of a number before choosing which variant of math function to use.
My question is — if I try to emulate integers as doubles (or floats) in standard C99, is there a reliable (and simple) way to tell what is the maximum value representable precisely?
I mean, if I use 64-bit floats to represent integers, I certainly cannot represent all 64-bit integers (the pigeonhole principle applies here). How can I tell the maximum integer that is representable?
(Trying to list all values is not a solution: if, for example, I'm using doubles on a 64-bit architecture, I'd have to list 2^{64} numbers.)
Thanks!
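The largest integer n for which every integer from 0 to n is exactly representable is 2^{53} (9007199254740992) for a 64-bit double and 2^{24} (16777216) for a 32-bit float. These follow from the significand precision (53 and 24 bits respectively) listed on the Wikipedia page for IEEE floating point numbers.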
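Since the question asks about C99, these limits can also be derived from `<float.h>` instead of being hard-coded. The following is a minimal sketch, not a definitive implementation: the function names are made up for illustration, and the only standard pieces used are the macros `FLT_RADIX`, `DBL_MANT_DIG` and `FLT_MANT_DIG`.

```c
#include <float.h>

/* Hypothetical helper: raises FLT_RADIX to the given power by repeated
   multiplication. Every intermediate value is a power of the radix,
   so each step is exact. */
static double radix_pow(int digits) {
    double r = 1.0;
    for (int i = 0; i < digits; i++) {
        r *= FLT_RADIX;
    }
    return r;
}

/* Largest n such that every integer in [0, n] fits in a double exactly.
   With IEEE 754 doubles (radix 2, 53 digits) this is 2^53. */
double max_exact_int_double(void) {
    return radix_pow(DBL_MANT_DIG);
}

/* Same limit for float, returned as a double so it can be compared
   without further rounding. With IEEE 754 floats this is 2^24. */
double max_exact_int_float(void) {
    return radix_pow(FLT_MANT_DIG);
}
```

Relying on the macros rather than on literal constants keeps the code correct on a platform whose `double` is not an IEEE 754 binary64 value.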
Verifying this in Lua is pretty simple:
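The same check can be sketched in C99. The helper name `absorbs_one` is hypothetical, and the sketch assumes the default round-to-nearest mode; the `volatile` store is there to force the sum to be rounded to `double` on compilers that evaluate in extended precision.

```c
#include <stdbool.h>

/* Returns true when x + 1.0 rounds back to x, which means x + 1 is
   not exactly representable as a double. */
bool absorbs_one(double x) {
    volatile double sum = x + 1.0;  /* force rounding to double */
    return sum == x;
}
```

On an IEEE 754 platform, `absorbs_one(9007199254740992.0)` should be true, while `absorbs_one(9007199254740991.0)` should be false, because 2^{53} itself is still representable.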
If we don’t have the IEEE-defined field sizes handy, knowing what we know about the design of floating point numbers, we can determine these values using a simple loop over the possible values:
The output of the above code:
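A loop of this kind can be sketched in C99 as follows. It is an illustration under stated assumptions rather than the only way to do it: the function names are made up, the radix is assumed to be 2, and the rounding mode is assumed to be round-to-nearest. The candidate is doubled until adding 1 no longer changes it.

```c
/* Doubles x until x + 1 rounds back to x. Doubling is exact, and the
   first power of two that absorbs the +1 is the top of the range in
   which all integers are representable. */
double find_limit_double(void) {
    double x = 1.0;
    for (;;) {
        volatile double next = x + 1.0;  /* force rounding to double */
        if (next == x) {
            return x;
        }
        x *= 2.0;
    }
}

/* Same search using single precision. */
float find_limit_float(void) {
    float x = 1.0f;
    for (;;) {
        volatile float next = x + 1.0f;  /* force rounding to float */
        if (next == x) {
            return x;
        }
        x *= 2.0f;
    }
}
```

With IEEE 754 types, the two functions should return 9007199254740992 and 16777216 respectively, after only 53 and 24 iterations, so no exhaustive listing of values is needed.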
Of course, due to the way floating-point precision works, a 64-bit double can represent numbers much larger than 2^{64} as the exponent grows, although not every integer in that range is representable. The Wikipedia page on double-precision floating-point describes:
The absolute largest value a double can hold is listed further down that page: 0x7fefffffffffffff, which computes to (1 + (1 − 2^{-52})) * 2^{1023}, or roughly 1.7976931348623157e308.
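In C99 this value is available directly as `DBL_MAX` from `<float.h>`. As a small sketch, assuming a radix-2 `double`, the formula above can be rebuilt from `DBL_EPSILON` and `DBL_MAX_EXP`; the function name is hypothetical.

```c
#include <float.h>

/* Rebuilds (1 + (1 - 2^-52)) * 2^1023 from <float.h> macros.
   2.0 - DBL_EPSILON has every significand bit set, and each doubling
   is exact, so the loop ends on the largest finite double. */
double largest_finite_double(void) {
    double v = 2.0 - DBL_EPSILON;
    for (int i = 0; i < DBL_MAX_EXP - 1; i++) {
        v *= 2.0;
    }
    return v;
}
```

On an IEEE 754 platform the result should compare equal to `DBL_MAX`.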