The Archive Base


Editorial Team
Asked: June 10, 2026

If the size of a float is 4 bytes then shouldn’t it be able to hold digits from 8,388,607 to -8,388,608?

If the size of a float is 4 bytes, then shouldn’t it be able to hold values from -8,388,608 to 8,388,607, or somewhere around there? I probably calculated the range wrong.

Why does f display the extra 15 at the end, given that the value of f (0.1) is still between -8,388,608 and 8,388,607?

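#import <Foundation/Foundation.h>

int main(int argc, const char * argv[])
{
    @autoreleasepool {
        float f = .1;
        // NSLog adds the timestamp prefix seen in the output below.
        // sizeof yields a size_t, which is formatted with %zu.
        NSLog(@"%zu", sizeof(float));
        NSLog(@"%.10f", f);
    }
    return 0;
}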

2012-08-28 20:53:38.537 prog[841:403] 4
2012-08-28 20:53:38.539 prog[841:403] 0.1000000015

1 Answer

    Editorial Team
    Added an answer on June 10, 2026 at 11:30 am

    The values -8,388,608 ... 8,388,607 lead me to believe that you think floats use two’s complement, which they don’t. In any case, the range you have indicates 24 bits, not the 32 that you’d get from four bytes.

    On virtually every current platform, a C float uses the IEEE 754 single-precision format, which has three parts:

    • the sign (1 bit).
    • the exponent (8 bits, sort of a scale).
    • the fraction (23 bits, the actual digits of the number).

    You get a fixed amount of precision (roughly 7 significant decimal digits for a float), and the exponent dictates whether those digits describe a number like 0.000000001234567 or one like 123456700000.

    You get those extra digits at the end of your 0.1 because that number cannot be represented exactly in IEEE 754. See this answer for a treatise explaining why that is the case.

    Numbers are only representable exactly if they can be built by adding inverse powers of two (like 1/2, 1/16, 1/65536 and so on) within the number of bits of precision (i.e., the number of bits in the fraction), subject to scaling.

    So, for example, a number like 0.5 is okay since it’s 1/2. Similarly 0.8125 is okay since that can be built from 1/2, 1/4 and 1/16.

    There is no way, within 23 fraction bits or any other finite number of bits, to build 0.1 from inverse powers of two, because its binary expansion repeats forever. The float therefore holds the nearest value it can represent.
