[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#372544: libc6-dev: fma() is incorrect (inaccurate), not conform to C99



On 2006-06-10 03:11:51 +0200, Vincent Lefevre wrote:
> The cause is that fma() is implemented on the x86 with (x * y) + z.
> The ISO C99 standard requires:
> 
>     The fma functions compute (x × y) + z, rounded as one ternary
>     operation: they compute the value (as if) to infinite precision
>     and round once to the result format, according to the rounding
>     mode characterized by the value of FLT_ROUNDS.
> 
> I currently don't have the time to fix it, but I think this should
> be done like this:
> 
> 1. Compute the product as an exact sum of two FP numbers with Dekker's
>    algorithm.
> 2. Determine the rounded results (see algorithms on FP expansions).
> 
> Well, there would be 2 problems: First, the fact that values may be in
> extended precision (GCC bug). Then, the possible intermediate overflow
> or underflow should be avoided, probably by doing comparisons first
> and scale the values if need be (this problem is much less important
> than the current situation).

The overflow problem occurs on x86_64 (since SSE2 is used); see
attached C file.

I haven't tested it, but I assume that it also occurs in fmal
on all x86 machines.

-- 
Vincent Lefèvre <vincent@vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)
/* $Id: fma-tests.c 17487 2007-05-21 11:03:03Z lefevre $
 *
 * Miscellaneous fma() test.
 */

#include <stdio.h>
#include <stdlib.h>
#include <float.h>
#include <math.h>

#define WRONG(S,X,Y,Z,F,C)                                        \
  fprintf (stderr, "ERROR in test '" S "'!\nfma (%a, %a, %a)\n"   \
           "returned %a instead of %a.\n", X, Y, Z, F, C)

/* Modified Nelson H. F. Beebe's fma() test from stds-754 list, 2006-06-09 */
static int beebe (void)
{
  volatile double eps, e2, f, x, z;

  eps = DBL_EPSILON;
  e2 = eps * eps;
  x = 1.0 + eps;
  z = -1.0 - 2.0 * eps;
  f = fma (x, x, z);
  if (f != e2)
    {
      WRONG ("beebe", x, x, z, f, e2);
      return 1;
    }
  return 0;
}

static int overflowed_mul (void)
{
  volatile double x, f;

  x = DBL_MAX;
  f = fma (x, 2.0, -x);
  if (f != x)
    {
      WRONG ("overflowed_mul", x, 2.0, -x, f, x);
      return 1;
    }
  return 0;
}

int main (void)
{
  int err = 0;

  err += beebe ();
  err += overflowed_mul ();
  return err;
}

Reply to: