
Re: [BUGS] Test suite fails on alpha architecture

Steve Langasek <vorlon@debian.org> writes:
> It may be specific to particular versions of glibc and the kernel.  At least
> one of the test regressions is actually due to the bug described in
> <http://lists.debian.org/debian-alpha/2007/10/msg00014.html>; I haven't dug
> into the rest of the failures further at this point.

Thanks for the tip about that bug.  Using the Gentoo project's
kindly-lent Alpha, I see that the failure in our float8 regression test
is indeed explained by floor() doing the wrong thing.  The case that
fails is

regression=# select (-34.84)::float8  ^ '1e200';
ERROR:  2201F: invalid argument for power function
LOCATION:  dpow, float.c:1337

where we are expecting to get "value out of range: overflow".  Instead, this
check in dpow() is firing:

    /*
     * The SQL spec requires that we emit a particular SQLSTATE error code for
     * certain error conditions.
     */
    if ((arg1 == 0 && arg2 < 0) ||
        (arg1 < 0 && floor(arg2) != arg2))
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_ARGUMENT_FOR_POWER_FUNCTION),
                 errmsg("invalid argument for power function")));

and indeed

regression=# select floor(1e200::float8) - 1e200::float8;
(1 row)

so it seems floor(3m) is off by one in the last place.

> But if it can be reproduced on other distros as well, all the better.

All the other diffs that Martin showed are divide-by-zero failures,
and I do not see any of them on Gentoo's machine.  I think that this
must be a compiler bug.  The first example in his diffs is just
"select 1/0", which executes this code:

    int32        arg1 = PG_GETARG_INT32(0);
    int32        arg2 = PG_GETARG_INT32(1);
    int32        result;

    if (arg2 == 0)
        ereport(ERROR,
                (errcode(ERRCODE_DIVISION_BY_ZERO),
                 errmsg("division by zero")));

    result = arg1 / arg2;

It looks to me like Debian's compiler must be allowing the division
instruction to be speculatively executed before the if-test branch
is taken.  Perhaps it is supposing that this is OK because control
will return from ereport(), when in fact it will not (the routine
throws a longjmp).  Since we've not seen such behavior on any other
platform, however, I suspect this is just a bug and not intentional.

FWIW the Gentoo machine is running

$ gcc -v
Using built-in specs.
Target: alpha-unknown-linux-gnu
Configured with: /var/tmp/portage/sys-devel/gcc-4.1.2/work/gcc-4.1.2/configure --prefix=/usr --bindir=/usr/alpha-unknown-linux-gnu/gcc-bin/4.1.2 --includedir=/usr/lib/gcc/alpha-unknown-linux-gnu/4.1.2/include --datadir=/usr/share/gcc-data/alpha-unknown-linux-gnu/4.1.2 --mandir=/usr/share/gcc-data/alpha-unknown-linux-gnu/4.1.2/man --infodir=/usr/share/gcc-data/alpha-unknown-linux-gnu/4.1.2/info --with-gxx-include-dir=/usr/lib/gcc/alpha-unknown-linux-gnu/4.1.2/include/g++-v4 --host=alpha-unknown-linux-gnu --build=alpha-unknown-linux-gnu --disable-altivec --enable-nls --without-included-gettext --with-system-zlib --disable-checking --disable-werror --enable-secureplt --disable-libunwind-exceptions --disable-multilib --disable-libmudflap --disable-libssp --disable-libgcj --enable-languages=c,c++,fortran --enable-shared --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu
Thread model: posix
gcc version 4.1.2 (Gentoo 4.1.2)

Bottom line is that I see nothing here that the Postgres project can
fix --- these are library and compiler bugs.

			regards, tom lane
