
Bug#318959: libc6: unreproducible on powerpc



On Fri, Jul 22, 2005 at 10:34:08PM +0900, GOTO Masanori wrote:
> tags 318959 unreproducible, moreinfo
> thanks
> 
> At Wed, 20 Jul 2005 05:16:32 +0100,
> Paul Brossier wrote:
> > for info, i have been testing both testcases on powerpc and could not reproduce
> > the issue.
> 
> Me too.  I don't know what the actual problem is - it may be hardware
> dependent problem, or simply system has another libc6.  I'll close
> this bug.  giuseppe, if you have more information about this, please
> report to us in detail.
> 
> Regards,
> -- gotom

Hi all.

I don't think this is a hardware problem at all.  Instead, Thorsten found
a long-standing bug in glibc that had previously gone unnoticed.

I was able to reproduce it on a bunch of x86/32 systems, both Intel
and AMD, running several releases of Debian and RedHat: the tag
"unreproducible" is not appropriate.

I investigated a bit further, and I think the problem is that the code
relies on behaviour that the C89 and C99 standards leave undefined, in
glibc-2.3.2.ds1/build-tree/glibc-2.3.2/sysdeps/ieee754/dbl-64/s_lround.c,
more precisely in line 61:
	result = ((long int) i0 << (j0 - 20)) | (j >> (52 - j0));

With the argument reported by Thorsten, this statement is
executed with j==0x80000000 and (52-j0)==32.  
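
To show where a shift count of 32 can come from, here is a minimal sketch
of how the unbiased exponent j0 is derived from the high word of a double.
The helper name unbiased_exponent is hypothetical, and the sample value
1048576.0 (2^20) used below is my own assumption, not necessarily
Thorsten's argument; any double in [2^20, 2^21) gives j0 == 20 and hence
(52 - j0) == 32.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of glibc): return the unbiased IEEE 754
   exponent of a double, computed the same way as j0 in s_lround.c. */
static int unbiased_exponent(double x)
{
    uint64_t bits;

    assert(sizeof bits == sizeof x);   /* assumes a 64-bit IEEE 754 double */
    memcpy(&bits, &x, sizeof bits);    /* well-defined way to read the bits */

    /* High word: sign bit, 11 exponent bits, top 20 mantissa bits. */
    uint32_t hi = (uint32_t)(bits >> 32);

    /* Drop the mantissa bits, mask off the sign, remove the bias 0x3ff. */
    return (int)((hi >> 20) & 0x7ff) - 0x3ff;
}
```

With this helper, unbiased_exponent(1048576.0) is 20, so the right-hand
shift in line 61 would be by 52 - 20 = 32 bits.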

Kernighan & Ritchie, as well as ISO/IEC 9899:1999 (in 6.5.7 "Bitwise
shift operators") state that "... If the value of the right operand is
negative or is greater than or *equal to* the width of the promoted left
operand, the behavior is undefined." 

It turns out that on x86/32 (as far as I could test)
(0x80000000 >> 32) == 0x80000000, quite reproducibly. :-)
This behaviour is probably unintended: whoever wrote the line above
presumably expected (0x80000000 >> 32) == 0.
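
The value observed on x86 is consistent with the x86 shift instructions
using only the low 5 bits of the count for 32-bit operands, so that a
shift by 32 acts like a shift by 0.  The sketch below models both
behaviours without ever executing the undefined shift itself; the two
function names are made up for illustration.

```c
#include <stdint.h>

/* Models an x86-style 32-bit shift: the count is reduced modulo 32
   before shifting, so a count of 32 behaves like a count of 0. */
static uint32_t shr32_masked(uint32_t v, unsigned n)
{
    return v >> (n & 31u);
}

/* What the author of line 61 presumably expected: every count of 32
   or more shifts all bits out and yields 0. */
static uint32_t shr32_expected(uint32_t v, unsigned n)
{
    return n >= 32u ? 0u : v >> n;
}
```

For v == 0x80000000 and n == 32 the first function returns 0x80000000 and
the second returns 0; for any n below 32 the two agree.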

Now to the reproducibility: 

- Undefined behaviour is very likely to depend on compiler optimization
  options, which would explain the "mysterious behaviour" previously
  reported.  With '-O1' and above the compiler probably skips the
  computation altogether and substitutes 0.

- I don't know how uint32_t values are aligned and operated on on
  powerpc/64, but a different alignment (and thus a different "undefined
  behaviour") may be the reason why Paul could not reproduce the bug.
  BTW, which arch did you run your tests on, GOTO?

I don't know IEEE 754 well enough to dare submit a patch to libc (e.g., if
(52 - j0) > 31, use 0), but I believe dropping this bug as unreproducible
would be a mistake.  Better to downgrade it and forward it upstream.  It
reminds me of the FDIV bug in early Pentiums.
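
For what it's worth, the guard mentioned above could look roughly like the
sketch below.  It is not a tested patch for glibc: combine_words is a
hypothetical stand-alone function, its parameters are merely named after
i0, j and j0 in s_lround.c, and it returns int64_t instead of long int so
that the sketch behaves the same on every platform.  It assumes
20 <= j0 < 52.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch of line 61 with a guard on the shift count. */
static int64_t combine_words(uint32_t i0, uint32_t j, int j0)
{
    assert(j0 >= 20 && j0 < 52);            /* assumed range of this branch */

    unsigned shift = 52u - (unsigned)j0;    /* between 1 and 32 */

    /* A shift by 32 is undefined for a 32-bit operand: use 0 instead. */
    uint32_t low = shift > 31u ? 0u : j >> shift;

    /* Widen i0 first so that the left shift cannot overflow. */
    return (int64_t)(((uint64_t)i0 << (j0 - 20)) | low);
}
```

With i0 == 0x100000, j == 0x80000000 and j0 == 20 (the high word, rounding
term and exponent I would expect for 2^20, under the same assumption as
before) this returns 1048576, instead of a value with bit 31 set.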

Best regards.
giuseppe


