Editorial Team
Asked: May 31, 2026

Is there any way to get correct rounding with the i387 fsqrt instruction?…

…aside from changing the precision mode in the x87 control word – I know that’s possible, but it’s not a reasonable solution because it has nasty reentrancy-type issues where the precision mode will be wrong if the sqrt operation is interrupted.

The issue I’m dealing with is as follows: the x87 fsqrt opcode performs a correctly rounded (per IEEE 754) square root operation in the precision of the FPU registers, which I’ll assume is extended (80-bit) precision. However, I want to use it to implement efficient single and double precision square root functions with the results correctly rounded (per the current rounding mode). Since the result has excess precision, the second step of converting the result to single or double precision rounds again, possibly leaving a result that is not correctly rounded.

With some operations it’s possible to work around this with biases. For instance, I can avoid excess precision in the results of addition by adding a bias in the form of a power of two that forces the 52 significant bits of a double precision value into the last 52 bits of the 63-bit extended-precision mantissa. But I don’t see any obvious way to do such a trick with square root.

Any clever ideas?

(Also tagged C because the intended application is implementation of the C sqrt and sqrtf functions.)

1 Answer

    Editorial Team
    Added an answer on May 31, 2026 at 8:42 am

    First, let’s get the obvious out of the way: you should be using SSE instead of x87. The sqrtss (SSE) and sqrtsd (SSE2) instructions do exactly what you want, are supported on all modern x86 systems, and are significantly faster as well.
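As a minimal sketch of the SSE route, the two instructions can be reached from C through compiler intrinsics. The function names `sqrt_sse2` and `sqrtf_sse` are made up for this illustration, and the `__SSE2__` check is an assumption about GCC/Clang-style compilers; when it is not defined the sketch simply falls back to the C library.

```c
#include <math.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics; also pulls in the SSE ones */
#endif

/* Hypothetical helper: double-precision square root via sqrtsd. */
double sqrt_sse2(double x) {
#if defined(__SSE2__)
    __m128d v = _mm_set_sd(x);                 /* x in the low lane */
    return _mm_cvtsd_f64(_mm_sqrt_sd(v, v));   /* sqrt of the low lane */
#else
    return sqrt(x);                            /* fallback: C library */
#endif
}

/* Hypothetical helper: single-precision square root via sqrtss. */
float sqrtf_sse(float x) {
#if defined(__SSE2__)
    return _mm_cvtss_f32(_mm_sqrt_ss(_mm_set_ss(x)));
#else
    return sqrtf(x);                           /* fallback: C library */
#endif
}
```

Both instructions round once, directly to the destination format, so no second rounding step is involved.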

    Now, if you insist on using x87, I’ll start with the good news: you don’t need to do anything for float. Rounding a correctly rounded square root a second time, down to a p-bit format, is harmless whenever the intermediate format has at least 2p + 2 significand bits. What counts is the 64-bit significand of the 80-bit extended format, not its total width; because 64 > 2*24 + 2 = 50, the additional rounding to single precision always rounds correctly, and you get a correctly rounded square root.

    Now the bad news: the 64-bit significand of the extended format falls short for double, since 64 < 2*53 + 2 = 108, so no such luck for double precision. I can suggest several workarounds; here’s a nice easy one off the top of my head.

    1. let y = round_to_double(x87_square_root(x));
    2. use a Dekker (head-tail) product to compute a and b such that y*y = a + b exactly.
    3. compute the residual r = x - a - b.
    4. if (r == 0) return y
    5. if (r > 0), let y1 = y + 1 ulp, and compute a1, b1 s.t. y1*y1 = a1 + b1. Compare r1 = x - a1 - b1 to r, and return either y or y1, depending on which has the smaller residual (or the one with zero low-order bit, if the residuals are equal in magnitude).
    6. if (r < 0), do the same thing for y1 = y - 1 ulp.

    This procedure only handles the default rounding mode (round to nearest); in the directed rounding modes, however, simply rounding to the destination format does the right thing.
