How can I round a floating point number to the nearest integer? I am looking for the algorithm in terms of binary since I have to implement the code in assembly.
UPDATED with method for proper rounding to even.
Here "exponent" means the unbiased exponent, and 23 is the number of mantissa bits in a single-precision float. Store the (exponent+1)th bit (after the decimal point). Next, zero out the (23-exponent) least significant bits. Then use the stored bit to round. If the stored bit is 1, add one to the LSB of the non-truncated part and normalize if necessary. If the stored bit is 0, do nothing.
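The basic steps above can be sketched in C, which maps fairly directly onto shifts and masks in assembly. This is only a sketch under stated assumptions: IEEE 754 single precision, a hypothetical function name `round_basic`, and handling for |x| < 1 and for exponents of 23 or more that the steps above do not spell out.

```c
#include <stdint.h>
#include <string.h>

/* Sketch of the basic algorithm (ties round away from zero).
   round_basic is a hypothetical name, not part of the original answer. */
float round_basic(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);            /* reinterpret the float as raw bits */

    int e = (int)((bits >> 23) & 0xFF) - 127;  /* unbiased exponent */

    if (e >= 23)                               /* no fractional bits left (or Inf/NaN) */
        return x;

    if (e < 0) {                               /* assumption: |x| < 1 handled separately */
        uint32_t sign = bits & 0x80000000u;
        /* e == -1 means 0.5 <= |x| < 1, which rounds to 1; anything smaller rounds to 0 */
        bits = sign | (e == -1 ? 0x3F800000u : 0u);
        memcpy(&x, &bits, sizeof x);
        return x;
    }

    uint32_t frac_mask = (1u << (23 - e)) - 1u;    /* the (23 - e) least significant bits */
    uint32_t saved     = (bits >> (22 - e)) & 1u;  /* the (e + 1)th mantissa bit */

    bits &= ~frac_mask;                        /* zero out the fraction */
    if (saved)
        bits += 1u << (23 - e);                /* add one to the integer LSB */

    memcpy(&x, &bits, sizeof x);
    return x;
}
```

Because the exponent field sits directly above the mantissa, a carry out of the mantissa increments the exponent, so the "normalize if necessary" step needs no extra code in this sketch.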
To round to even instead, modify the basic algorithm as follows:
Before zeroing out the (23-exponent) least significant bits, OR together the (22-exponent) least significant bits. Call the result of that OR the sticky bit.
The stored (exponent+1)th bit (after the decimal point) will be called the guard bit.
Then zero out the (23-exponent) least significant bits.
If the guard bit is zero, do nothing.
If the guard bit is 1 and the sticky bit is 0 (an exact tie), add one to the LSB only if the LSB is 1, so that the result is even.
If the guard bit is 1 and the sticky bit is 1, add one to the LSB.
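The guard and sticky rules above might look like this in C. Again a sketch, assuming IEEE 754 single precision; `round_even` is a hypothetical name, and the treatment of |x| < 1 (where an exact 0.5 ties to 0) is an assumption added here, since the rules above only cover exponents from 0 to 22.

```c
#include <stdint.h>
#include <string.h>

/* Sketch of round-half-to-even using a guard bit and a sticky bit.
   round_even is a hypothetical name, not part of the original answer. */
float round_even(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);

    int e = (int)((bits >> 23) & 0xFF) - 127;  /* unbiased exponent */

    if (e >= 23)                               /* already an integer (or Inf/NaN) */
        return x;

    if (e < 0) {                               /* assumption: |x| < 1 handled separately */
        uint32_t sign = bits & 0x80000000u;
        /* only 0.5 < |x| < 1 rounds to 1; exactly 0.5 is a tie and goes to 0 (even) */
        int up = (e == -1) && ((bits & 0x007FFFFFu) != 0);
        bits = sign | (up ? 0x3F800000u : 0u);
        memcpy(&x, &bits, sizeof x);
        return x;
    }

    uint32_t frac_mask = (1u << (23 - e)) - 1u;            /* the (23 - e) LSBs */
    uint32_t guard     = (bits >> (22 - e)) & 1u;          /* first fractional bit */
    uint32_t sticky    = (bits & (frac_mask >> 1)) != 0;   /* OR of the (22 - e) LSBs */

    bits &= ~frac_mask;                                    /* zero out the fraction */

    /* LSB of the integer part; for e == 0 it is the implicit leading 1 */
    uint32_t lsb = (e == 0) ? 1u : ((bits >> (23 - e)) & 1u);

    if (guard && (sticky || lsb))
        bits += 1u << (23 - e);                            /* carry may bump the exponent */

    memcpy(&x, &bits, sizeof x);
    return x;
}
```

With this sketch, 2.5 rounds to 2 and 3.5 rounds to 4, whereas the basic algorithm would round 2.5 up to 3.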
Here are some examples using the basic algorithm:
x = 62.3
Step 1: Store the exponent+1’th bit (after the decimal point)
62.3 lies between 32 and 64, so exponent = 5 and exponent+1 = 6th bit
savedbit = 0
Step 2: Zero out 23-exponent least significant bits
23-exponent = 18, so we zero out the 18 LSBs
Step 3: Use the next bit to round
Since the stored bit is 0, we do nothing, and the floating point number has been rounded to 62.
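To check the numbers in this example, the exponent and the stored bit can be read straight out of the float's bit pattern. The helpers below are hypothetical (the names `unbiased_exponent` and `saved_bit` are made up for illustration) and assume IEEE 754 single precision with an exponent from 0 to 22.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration; the names are not from the answer. */

/* Unbiased exponent of a normal single-precision value. */
int unbiased_exponent(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return (int)((bits >> 23) & 0xFF) - 127;
}

/* The (exponent+1)th mantissa bit, i.e. the first fractional bit.
   Only meaningful when 0 <= exponent < 23. */
unsigned saved_bit(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    int e = unbiased_exponent(x);
    return (bits >> (22 - e)) & 1u;
}
```

For 62.3f, `unbiased_exponent` gives 5 and `saved_bit` gives 0, matching the steps above.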
Another example:
x = 21.9
Step 1: Store the exponent+1’th bit (after the decimal point)
21.9 lies between 16 and 32, so exponent = 4 and exponent+1 = 5th bit
savedbit = 1
Step 2: Zero out 23-exponent least significant bits
23-exponent = 19, so we zero out the 19 LSBs
Step 3: Use the next bit to round
Since the stored bit is 1, we add one to the LSB of the truncated part and get 22, which is the correct number:
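We start with the truncated value 21 (sign, exponent, mantissa):
0 10000011 0101 0000 0000 0000 0000 000
Add one at this location (the 4th mantissa bit, which is the LSB of the integer part):
0 00000000 0001 0000 0000 0000 0000 000
And we get 22:
0 10000011 0110 0000 0000 0000 0000 000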
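The addition in this example can also be checked on the raw bits. The sketch below assumes IEEE 754 single precision; the helper names are hypothetical, and the constant 19 is specific to this example (23 minus the exponent of 4).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration only. */
uint32_t float_to_bits(float x)
{
    uint32_t b;
    memcpy(&b, &x, sizeof b);
    return b;
}

float bits_to_float(uint32_t b)
{
    float x;
    memcpy(&x, &b, sizeof x);
    return x;
}

/* Repeat the 21.9 example by hand: truncate, then add one at the integer LSB. */
float round_21_9_by_hand(void)
{
    uint32_t bits = float_to_bits(21.9f);
    bits &= ~((1u << 19) - 1u);   /* zero the 19 LSBs: the value is now 21 (0x41A80000) */
    bits += 1u << 19;             /* add one at mantissa bit 19: 0x41B00000 */
    return bits_to_float(bits);   /* 22.0f */
}
```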