Bug#372544: libc6-dev: fma() is incorrect (inaccurate), not conform to C99
On 2006-06-10 03:11:51 +0200, Vincent Lefevre wrote:
> The cause is that fma() is implemented on the x86 with (x * y) + z.
> The ISO C99 standard requires:
> The fma functions compute (x × y) + z, rounded as one ternary
> operation: they compute the value (as if) to infinite precision
> and round once to the result format, according to the rounding
> mode characterized by the value of FLT_ROUNDS.
> I currently don't have the time to fix it, but I think this should
> be done like this:
> 1. Compute the product as an exact sum of two FP numbers with Dekker's
> 2. Determine the rounded results (see algorithms on FP expansions).
> Well, there would be 2 problems: First, the fact that values may be in
> extended precision (GCC bug). Then, the possible intermediate overflow
> or underflow should be avoided, probably by doing comparisons first
> and scale the values if need be (this problem is much less important
> than the current situation).
The overflow problem occurs on x86_64 (since SSE2 is used); see
attached C file.
I haven't tested it, but I assume that it also occurs in fmal
on all x86 machines.
Vincent Lefèvre <firstname.lastname@example.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)
/* $Id: fma-tests.c 17487 2007-05-21 11:03:03Z lefevre $
* Miscellaneous fma() test.
#define WRONG(S,X,Y,Z,F,C) \
fprintf (stderr, "ERROR in test '" S "'!\nfma (%a, %a, %a)\n" \
"returned %a instead of %a.\n", X, Y, Z, F, C)
/* Modified Nelson H. F. Beebe's fma() test from stds-754 list, 2006-06-09 */
static int beebe (void)
volatile double eps, e2, f, x, z;
eps = DBL_EPSILON;
e2 = eps * eps;
x = 1.0 + eps;
z = -1.0 - 2.0 * eps;
f = fma (x, x, z);
if (f != e2)
WRONG ("beebe", x, x, z, f, e2);
static int overflowed_mul (void)
volatile double x, f;
x = DBL_MAX;
f = fma (x, 2.0, -x);
if (f != x)
WRONG ("overflowed_mul", x, 2.0, -x, f, x);
int main (void)
int err = 0;
err += beebe ();
err += overflowed_mul ();