The Archive Base


Editorial Team
Asked: May 15, 2026

Assume this much: I’m using a 16.16 fixed point system. System is 32 bit.


Assume this much:
I’m using a 16.16 fixed point system.
System is 32 bit.
CPU has no floating point processor.
A plain 32-bit multiply overflows for anything larger than roughly 1.0 * 0.4999, because the raw product of two 16.16 values needs up to 64 bits.

To make one last assumption… let's say the values I’m working with will not be so high as to cause overflow in this operation…

//assume that in practical application
//these assignments wouldn't be here as 2 fixed values would already exist...
int fixed1 = (int)(1.2341 * 65536);
int fixed2 = (int)(0.7854 * 65536);

int mask1 = fixed1 & 0xFF; //keep the lower 8 bits in a separate variable

fixed1 >>= 8; //keep upper 24 bits... assume value here isn't too large...

//high part contributes (fixed2 * hi) >> 8, low part contributes (fixed2 * lo) >> 16
int answer = (((fixed2 * fixed1) >> 8) + ((fixed2 * mask1) >> 16));

So the question is… is this a stroke of genius (not to say it hasn’t already been thought of or anything) or a complete waste of time?


1 Answer

  1. Editorial Team
    Added an answer on May 15, 2026 at 3:10 am

    Re-edit – because I was wrong 🙂

    Looks like you are trying to get higher precision by using an extra var?

    If you are indeed trying to increase precision, then this would work, but why not use the whole int instead of just 8-bits?

    Ok, from your comments, you wanted to know how to do 64-bit precision muls on a 32-bit processor. The easiest way is if the processor underneath you has a long multiply op (a 32 x 32 multiply that produces a 64-bit result). Many ARM cores have one (SMULL/UMULL), so if it's an ARM, you are probably in luck: use long long to do your mul, then shift away the 16 extra low bits and be done.

    If it does not, you can still do a long long multiply and let the compiler writer do the heavy lifting of building the 64-bit product for you. These are the easiest methods.
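As a minimal sketch of the long long route, assuming a C99 compiler with `<stdint.h>`: the type name `fix16_t` and the function `fix16_mul` are made up for illustration, and the code assumes the result fits back into 16.16 and that right-shifting a negative value is an arithmetic shift (true on most compilers, but implementation-defined in C).

```c
#include <stdint.h>

/* Hypothetical name for a signed 16.16 fixed-point value. */
typedef int32_t fix16_t;

/* Multiply two 16.16 values through a 64-bit intermediate.
   The 32.32 intermediate cannot overflow; the caller must still make
   sure the final result fits in 16.16. */
static fix16_t fix16_mul(fix16_t a, fix16_t b)
{
    int64_t wide = (int64_t)a * (int64_t)b;  /* full 64-bit product */
    /* Drop the 16 extra fraction bits. Assumes >> on a negative
       int64_t is an arithmetic shift (implementation-defined). */
    return (fix16_t)(wide >> 16);
}
```

On a core with a long multiply instruction the compiler will typically emit that single instruction for the cast-and-multiply; otherwise it falls back to a library routine or an inline sequence.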

    Failing that, you get to do 4 16-bit multiplies and a bunch of adds and shifts:

    
    // The idea is to break the 32-bit multiply into 4 16-bit
    // parts to prevent any overflow.  You can break any
    // multiply into factors and additions (all math here is unsigned):
          (ahi16)(alo16)
    X     (bhi16)(blo16)
    --------------------
          (blo16)(alo16)      - First  32-bit product var
      (blo16)(ahi16)<<16      - Second 32-bit product var (Don't shift here)
      (bhi16)(alo16)<<16      - Third  32-bit product var (Don't shift here)
    + (bhi16)(ahi16)<<32      - Fourth 32-bit product var (Don't shift here)
    --------------------
    Final Value.  Here we add using add and add
    with carry techniques to allow overflow.
    
    

    Basically, we have a low product and a high product, each 32 bits wide. The low product gets assigned the first partial product. You then add in the low 16 bits of each of the 2 middle products, shifted up 16. For each overflow (a carry out of the low product), you add 1 to the high product and continue. Then add the upper 16 bits of each middle product into the high product. Finally, add the last product as is into the high product.

    A big pain in the butt, but it works for any arbitrary precision of values.
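The four-multiply scheme above could be sketched roughly like this, using only 32-bit unsigned arithmetic. The names `mul32x32` and `ufix16_mul` are hypothetical, the code handles unsigned values only (as the diagram assumes), and it assumes `int` is no wider than 32 bits so that `uint32_t` arithmetic wraps as expected.

```c
#include <stdint.h>

/* Full 32 x 32 -> 64-bit unsigned multiply built from four 16 x 16
   products. The result comes back as two 32-bit halves. */
static void mul32x32(uint32_t a, uint32_t b, uint32_t *hi, uint32_t *lo)
{
    uint32_t ahi = a >> 16, alo = a & 0xFFFFu;
    uint32_t bhi = b >> 16, blo = b & 0xFFFFu;

    uint32_t p0 = alo * blo;   /* first product,  weight 2^0  */
    uint32_t p1 = ahi * blo;   /* second product, weight 2^16 */
    uint32_t p2 = alo * bhi;   /* third product,  weight 2^16 */
    uint32_t p3 = ahi * bhi;   /* fourth product, weight 2^32 */

    uint32_t low  = p0;        /* low product starts as the first partial */
    uint32_t high = p3;        /* last product goes into the high word as is */
    uint32_t t;

    t = low + (p1 << 16);      /* low 16 bits of a middle product, shifted up */
    if (t < low) high++;       /* wrapped around: carry into the high product */
    low = t;

    t = low + (p2 << 16);
    if (t < low) high++;
    low = t;

    high += (p1 >> 16) + (p2 >> 16);  /* upper 16 bits of each middle product */

    *hi = high;
    *lo = low;
}

/* Unsigned 16.16 multiply: take bits 16..47 of the 64-bit product.
   Assumes the true result fits in 32 bits. */
static uint32_t ufix16_mul(uint32_t a, uint32_t b)
{
    uint32_t hi, lo;
    mul32x32(a, b, &hi, &lo);
    return (hi << 16) | (lo >> 16);
}
```

Signed inputs would need their signs stripped first and reapplied to the result; that step is left out here to keep the sketch close to the diagram.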
