Hi,

Tom Lane [2007-11-07 13:49 -0500]:
> All the other diffs that Martin showed are divide-by-zero failures,
> and I do not see any of them on Gentoo's machine. I think that this
> must be a compiler bug. The first example in his diffs is just
> "select 1/0", which executes this code:
>
>     int32 arg1 = PG_GETARG_INT32(0);
>     int32 arg2 = PG_GETARG_INT32(1);
>     int32 result;
>
>     if (arg2 == 0)
>         ereport(ERROR,
>                 (errcode(ERRCODE_DIVISION_BY_ZERO),
>                  errmsg("division by zero")));
>
>     result = arg1 / arg2;
>
> It looks to me like Debian's compiler must be allowing the division
> instruction to be speculatively executed before the if-test branch
> is taken. Perhaps it is supposing that this is OK because control
> will return from ereport(), when in fact it will not (the routine
> throws a longjmp). Since we've not seen such behavior on any other
> platform, however, I suspect this is just a bug and not intentional.

I tried this on a Debian Alpha porter box (thanks, Steve, for pointing
me at it) with Debian's gcc 4.2.2. Latest sid indeed still has this
bug (the floor() one is confirmed fixed), not only on Alpha, but also
on sparc.

Since the simple test case did not reproduce the error, I tried to
make a more sophisticated one which more closely resembles what
PostgreSQL does (sigsetjmp/siglongjmp instead of exit(), some macros,
etc.). Unfortunately that was in vain: the test case still works
perfectly, both with no compiler options and with the ones used for
PostgreSQL. I attach it here nevertheless, just in case someone has
more luck than me.

So I tried to approach it from the other side: building postgresql
with CFLAGS="-O0 -g" or "-O1 -g" works correctly, but with "-O2 -g" I
get the above bug.

So I guess I'll build with -O1 for the time being on sparc and alpha
to get correct binaries until this is sorted out.

Any idea what else I could try?

Thanks,

Martin

--
Martin Pitt        http://www.piware.de
Ubuntu Developer   http://www.ubuntu.com
Debian Developer   http://www.debian.org

#include <stdio.h>
#include <stdlib.h>
#include <setjmp.h>

#define ERROR 20

#define ereport(elevel, rest) \
    (errstart(elevel, __FILE__, __LINE__, __func__) ? \
     (errfinish rest) : (void) 0)

#define PG_RE_THROW() \
    siglongjmp(PG_exception_stack, 1)

sigjmp_buf PG_exception_stack;

int
errstart(int elevel, const char *filename, int lineno, const char *funcname)
{
    printf("error: level %i %s:%i function %s\n", elevel, filename, lineno,
           funcname);
    return 1;
}

void
errfinish(int dummy, const char *msg)
{
    puts(msg);
    PG_RE_THROW();
}

int
do_div(char **argv)
{
    int arg1 = atoi(argv[1]);
    int arg2 = atoi(argv[2]);
    int result;

    if (arg2 == 0)
        ereport(ERROR, (1, "division by zero"));

    result = arg1 / arg2;

    return result;
}

int
main(int argc, char **argv)
{
    if (sigsetjmp(PG_exception_stack, 0) == 0)
    {
        int result = do_div(argv);

        printf("%d\n", result);
    }
    else
    {
        printf("caught error, aborting\n");
        return 1;
    }

    return 0;
}