fma(a,b,c) is equivalent to a*b+c, except that it doesn’t round the intermediate result.
Could you give me some examples of algorithms that non-trivially benefit from avoiding this rounding?
It’s not obvious, since the rounding after the multiplication, which we avoid, tends to be less problematic than the rounding after the addition, which we don’t.
taw hit on one important example; more generally, FMA allows library writers to efficiently implement many other floating-point operations with correct rounding.
For example, a platform that has an FMA can use it to implement correctly rounded divide and square root (PPC and Itanium took this approach), which lets the FPU be basically a single-purpose FMA machine. Peter Tang and John Harrison (Intel), and Peter Markstein (HP) have some papers that explain this use if you’re curious.
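To give a rough feel for the divide case, here is a minimal sketch in Python. It is not the actual PPC or Itanium instruction sequence, and it makes no claim of correct rounding in every case; the names fma, reciprocal and divide are my own, and fma is emulated with exact rational arithmetic (finite inputs only) as a stand-in for a hardware instruction. It refines a crude reciprocal guess by Newton-Raphson and then applies one residual correction to the quotient, using only multiplies and FMAs.

```python
from fractions import Fraction


def fma(a, b, c):
    """Emulated fused multiply-add: a*b + c with one rounding at the end.

    Stand-in for a hardware FMA (assumption for this sketch): the product
    and sum are formed exactly as rationals, then rounded once to a double.
    Finite inputs only; the sign of a zero result is not preserved.
    """
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def reciprocal(b, steps=6):
    """Approximate 1/b for 1 <= b < 2 using only multiplies and FMAs."""
    # Crude linear first guess, standing in for a hardware estimate.
    y = 1.5 - 0.5 * b
    for _ in range(steps):
        e = fma(-b, y, 1.0)  # residual 1 - b*y, rounded only once
        y = fma(y, e, y)     # Newton-Raphson update y + y*e
    return y


def divide(a, b):
    """Approximate a/b for 1 <= b < 2 with one FMA-based correction step."""
    y = reciprocal(b)
    q = a * y            # first quotient estimate (rounded)
    r = fma(-b, q, a)    # residual a - b*q, computed without losing q's error
    return fma(r, y, q)  # corrected quotient q + r*y


print(divide(3.0, 1.5))  # 2.0
```

The point of the sketch is the residual steps: without an FMA, 1 - b*y and a - b*q would be dominated by the rounding error of the product, and the correction would be useless.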
The example taw gave is more broadly useful than just in tracking error bounds. It allows you to represent the product of two floating point numbers as a sum of two floating point numbers without any rounding error; this is quite useful in implementing correctly-rounded floating-point library functions. Jean-Michel Muller’s book or the papers on crlibm would be good starting places to learn more about these uses.
FMA is also broadly useful in argument reduction in math-library style routines for certain types of arguments; when one is doing argument reduction, the goal of the computation is often a term of the form (x - a*b), where (a*b) is very nearly equal to x itself; in particular, the result is often on the order of the rounding error in the (a*b) term, if this is computed without an FMA. I believe that Muller has also written some about this in his book.
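Both of these uses can be shown in a few lines. The following is a small illustrative sketch in Python, not code from crlibm or from Muller's book; the helper names fma and two_prod are my own, and fma is emulated with exact rational arithmetic (finite inputs only) as a stand-in for a hardware instruction. The last part is a toy cancellation case rather than a real argument-reduction routine.

```python
from fractions import Fraction


def fma(a, b, c):
    """Emulated fused multiply-add: a*b + c with one rounding at the end.

    Stand-in for a hardware FMA (assumption for this sketch): the product
    and sum are formed exactly as rationals, then rounded once to a double.
    Finite inputs only; the sign of a zero result is not preserved.
    """
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def two_prod(a, b):
    """Split a*b into a rounded product p and its rounding error e.

    Barring overflow and underflow, p + e equals a*b exactly.
    """
    p = a * b          # rounded product
    e = fma(a, b, -p)  # exact product minus p: the rounding error of p
    return p, e


# Error-free product: the pair (p, e) carries the full product.
p, e = two_prod(0.1, 0.3)
assert Fraction(p) + Fraction(e) == Fraction(0.1) * Fraction(0.3)

# Cancellation: a*b rounds to exactly x, so the naive difference is lost.
a, b, x = 1.0 / 3.0, 3.0, 1.0
naive = x - a * b      # a*b is rounded to 1.0 before the subtraction
fused = fma(-a, b, x)  # the subtraction sees the unrounded product
print(naive)  # 0.0
print(fused)  # 2**-54, about 5.55e-17
```

In the second case the true value of x - a*b is 2**-54, which is exactly the size of the rounding error in a*b; the naive version returns zero, while the fused version recovers it.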