The Archive Base

Editorial Team
Asked: May 12, 2026

Refreshing on floating points (also PDF ), IEEE-754 and taking part in this discussion

Refreshing my knowledge of floating point (also available as PDF) and IEEE-754, and taking part in this discussion on floating-point rounding when converting to strings, got me tinkering: given a floating-point number, how can I get the maximum and minimum decimal values that share its binary representation?

Disclaimer: for this discussion, I'd like to stick to 32-bit and 64-bit floating point as described by IEEE-754. I'm not interested in extended floating point (80 bits), quads (128 bits, IEEE-754-2008), or any other standard (IEEE-854).

Background: 0.1 has no exact binary floating-point representation, so it is stored as the nearest representable value. In C#, a float stores it as 3DCCCCCD internally (C# uses round-to-nearest) and a double as 3FB999999999999A. The same bit patterns are used for decimal 0.100000005 (float) and 0.1000000000000000124 (double), but not for 0.1000000000000000144 (double).

For convenience, the following C# code gives these internal representations:

string GetHex(float f)
{
    return BitConverter.ToUInt32(BitConverter.GetBytes(f), 0).ToString("X");
}

string GetHex(double d)
{
    return BitConverter.ToUInt64(BitConverter.GetBytes(d), 0).ToString("X");
}

// float: prints 3DCCCCCD
Console.WriteLine(GetHex(0.1F));

// double: prints 3FB999999999999A
Console.WriteLine(GetHex(0.1));
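
As a quick check of the bit patterns quoted in the background, here is a minimal sketch. `HexOfFloat` and `HexOfDouble` are renamed copies of `GetHex`, introduced here only so the snippet compiles as top-level statements (C# 9 or later), where local functions cannot be overloaded. It also assumes the compiler rounds decimal literals to the nearest representable value.

```csharp
using System;

// Renamed copies of GetHex (hypothetical names, see the note above).
string HexOfFloat(float f) => BitConverter.ToUInt32(BitConverter.GetBytes(f), 0).ToString("X");
string HexOfDouble(double d) => BitConverter.ToUInt64(BitConverter.GetBytes(d), 0).ToString("X");

// Different decimal literals, same stored bits as 0.1:
Console.WriteLine(HexOfFloat(0.100000005F) == HexOfFloat(0.1F));            // True
Console.WriteLine(HexOfDouble(0.1000000000000000124) == HexOfDouble(0.1));  // True

// Slightly larger literal: rounds to the next double instead.
Console.WriteLine(HexOfDouble(0.1000000000000000144) == HexOfDouble(0.1));  // False
```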

In the case of 0.1, there is no lower decimal number that is represented with the same bit pattern; any 0.99...99 will yield a different bit representation (e.g., the float for 0.999999937 yields 3F7FFFFF internally).

My question is simple: how can I find the lowest and highest decimal value for a given float (or double) that is internally stored in the same binary representation?

Why: (I know you’ll ask) to find the error in rounding in .NET when it converts to a string and when it converts from a string, to find the internal exact value and to understand my own rounding errors better.

My guess is something like: take the mantissa, remove the rest, get its exact value, get one (mantissa-bit) higher, and calculate the mean: anything below that will yield the same bit pattern. My main problem is: how to get the fractional part as an integer (bit manipulation is not my strongest asset). Jon Skeet's DoubleConverter class may be helpful.
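
The field extraction this guess needs could look roughly like the following. This is a minimal sketch for a finite double; the variable names are illustrative only, and NaN and infinity (exponent field 0x7FF) are not handled.

```csharp
using System;

long bits = BitConverter.DoubleToInt64Bits(0.1);

bool negative = bits < 0;                          // sign bit
int exponentField = (int)((bits >> 52) & 0x7FF);   // 11 biased exponent bits
long fractionField = bits & 0xFFFFFFFFFFFFFL;      // 52 stored fraction bits

// Normal numbers carry an implicit leading 1 bit; subnormals (exponent field 0) do not.
long significand = exponentField == 0 ? fractionField : fractionField | (1L << 52);

// Remove the bias (1023) and the 52 fraction bits so that the stored
// value is exactly significand * 2^exponent.
int exponent = (exponentField == 0 ? 1 : exponentField) - 1075;

Console.WriteLine($"{(negative ? "-" : "+")}{significand} * 2^{exponent}");
// prints +7205759403792794 * 2^-56
```

With the value in this integer form, "one mantissa bit higher" is simply `significand + 1` with the same exponent, as long as the significand does not overflow into the next exponent.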

1 Answer

    Editorial Team
    Added an answer on May 12, 2026 at 8:30 pm

    One way to approach your question is to find the size of an ULP (Unit in the Last Place) of your floating-point number. Simplifying a little, this is the distance between a given floating-point number and the next larger one. Simplifying again, given a representable floating-point value x, any decimal string whose value lies between (x - 1/2 ulp) and (x + 1/2 ulp) will be rounded to x when converted to a floating-point value.

    The trick is that (x +/- 1/2 ulp) is not a representable floating-point number in the same format, so actually calculating its value requires a wider floating-point type (if one is available) or an arbitrary-precision big-decimal or similar type.
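
For the arbitrary-precision route in C#, one possible sketch uses `System.Numerics.BigInteger`. `ExactDecimal` is a hypothetical helper, not a library function: it prints the exact decimal expansion of `significand * 2^exponent` for a non-negative significand. The numbers passed in are the integer significand and exponent of 0.1f (bit pattern 3DCCCCCD), hardcoded here for illustration.

```csharp
using System;
using System.Numerics;

// Hypothetical helper: exact decimal digits of significand * 2^exponent.
// Uses the identity m * 2^-k == (m * 5^k) / 10^k, so no rounding happens.
// Assumes significand >= 0.
string ExactDecimal(BigInteger significand, int exponent)
{
    if (exponent >= 0)
        return (significand << exponent).ToString();

    int scale = -exponent;
    string digits = (significand * BigInteger.Pow(5, scale))
        .ToString()
        .PadLeft(scale + 1, '0');                    // keep at least one integer digit
    string text = digits.Insert(digits.Length - scale, ".");
    return text.TrimEnd('0').TrimEnd('.');           // drop trailing zeros
}

// 0.1f is stored as 13421773 * 2^-27.
Console.WriteLine(ExactDecimal(13421773, -27));
// prints 0.100000001490116119384765625

// x + 1/2 ulp needs one extra bit: (2 * 13421773 + 1) * 2^-28.
Console.WriteLine(ExactDecimal(2 * 13421773 + 1, -28));
// prints 0.1000000052154064178466796875
```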

    How do you find the size of an ulp? One relatively easy way is roughly what you suggested, written here in C-ish pseudocode because I don't know C#:

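    // Work on |x| so that incrementing the bit pattern moves away from zero.
    float absX = absoluteValue(x);
    // Reinterpret the float's bits as an unsigned integer.
    uint32_t bitPattern = getRepresentationOfFloat(absX);
    // The next larger float has the next larger bit pattern.
    bitPattern++;
    float nextFloatNumber = getFloatFromRepresentation(bitPattern);
    // This difference is exact (see below).
    float ulpOfX = (nextFloatNumber - absX);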
    

    This works because adding one to the bit pattern of x corresponds exactly to adding one ulp to the value of x. No floating-point rounding occurs in the subtraction because the values involved are so close (in particular, there is a theorem of IEEE-754 floating-point arithmetic, known as Sterbenz's lemma, that if two numbers x and y satisfy y/2 <= x <= 2y, then x - y is computed exactly). The only caveats here are:

    1. If x happens to be the largest finite floating-point number, this won't work (it will return inf, which is clearly wrong).
    2. If your platform does not correctly support gradual underflow (say, an embedded device running in flush-to-zero mode), this won't work for very small values of x.

    It sounds like you’re not likely to be in either of those situations, so this should work just fine for your purposes.
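
A C# rendering of the pseudocode above might look like this. It is a minimal sketch: `UlpOf` is a name chosen here for illustration, and it reuses the `BitConverter.GetBytes` round trip from the question. It assumes x is finite and not `float.MaxValue`, per the first caveat.

```csharp
using System;

// Distance from |x| to the next larger float, via the bit-pattern increment.
// Assumes x is finite and smaller in magnitude than float.MaxValue.
float UlpOf(float x)
{
    float absX = Math.Abs(x);
    uint bits = BitConverter.ToUInt32(BitConverter.GetBytes(absX), 0);
    float next = BitConverter.ToSingle(BitConverter.GetBytes(bits + 1), 0);
    return next - absX; // exact: the two operands are adjacent floats
}

// 0.1f lies in [2^-4, 2^-3), so its ulp is 2^-4 * 2^-23 = 2^-27.
Console.WriteLine(UlpOf(0.1f) == 1.0f / (1 << 27)); // True

// 1.0f lies in [2^0, 2^1), so its ulp is 2^-23.
Console.WriteLine(UlpOf(1.0f) == 1.0f / (1 << 23)); // True
```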

    Now that you know what an ulp of x is, you can find the interval of values that rounds to x. You can compute ulp(x)/2 exactly in floating-point, because floating-point division by 2 is exact (again, barring underflow). Then you need only compute the value of x +/- ulp(x)/2 in a suitably larger floating-point type (double will work if you're interested in float) or in a Big Decimal type, and you have your interval.
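
For a float, the interval computation can be sketched in C# with double as the wider type. This is an illustration under the same simplifying assumptions as the text (x positive, finite, not the largest float); the variable names are chosen here and are not from any library.

```csharp
using System;

float x = 0.1f;

// ulp(x) via the bit-pattern increment (x assumed positive and finite).
uint bits = BitConverter.ToUInt32(BitConverter.GetBytes(x), 0);
float next = BitConverter.ToSingle(BitConverter.GetBytes(bits + 1), 0);
float halfUlp = (next - x) / 2;   // division by 2 is exact, barring underflow

// A float has 24 significant bits and the midpoints need 25,
// so both sums are exact in double (53 significant bits).
double lower = (double)x - halfUlp;
double upper = (double)x + halfUlp;

Console.WriteLine(lower.ToString("R"));
Console.WriteLine(upper.ToString("R"));

// A value strictly inside the interval converts back to x.
Console.WriteLine((float)((x + upper) / 2) == x); // True
```

Whether the endpoints themselves convert back to x depends on the round-to-nearest tie rule (ties go to the neighbour with an even significand), so it is safest to treat the interval as open unless you check that case. Also note that when x is an exact power of two the gap below x is half the gap above it, which this sketch does not account for.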

    I made a few simplifying assumptions through this explanation. If you need this to really be spelled out exactly, leave a comment and I’ll expand on the sections that are a bit fuzzy when I get the chance.


    One other note: the following statement in your question

    In the case of 0.1, there is no lower
    decimal number that is represented
    with the same bit pattern

    is incorrect. You just happened to be looking at the wrong values (0.999999… instead of 0.099999… — an easy typo to make).

© 2021 The Archive Base. All Rights Reserved