Re: Possible solution to Netscape crashing problem (was Re: Release-critical Bugreport for January 7, 2000)
Stefan Gybas <email@example.com> writes:
> Jules Bean wrote:
> > That's right. IIRC, doogie said that actually it was some complex
> > interaction with some code in xlib, and it wasn't technically a bug in
> > netscape.
> I did some investigation on this subject and I think I have now found a
> solution for this problem:
> The libc6 version of Netscape crashes when you close one of its windows or
> try to quit the program since the X libs were compiled using egcs/gcc 2.95.
> A call trace from the core file shows that the problem seems to be inside
> libXt so I recompiled libXt from XFree 3.3.5 using gcc 2.7.2 and have had
> no Netscape crashes so far.
Hm, glibc disables all of its inlined functions for gcc 2.7.2. As near as I
can tell, all of these also seem to depend on __OPTIMIZE__, so they shouldn't
kick in unless compiling with -O, but maybe there are some that missed
__OPTIMIZE__. In fact bits/string.h and bits/string2.h don't check
__OPTIMIZE__ (but then they don't seem to check for 2.7.2 either.)
Could you try compiling with __NO_MATH_INLINES and __NO_STRING_INLINES
defined and see whether the assembly is any different?
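Something like the following is what I have in mind, as a rough sketch only.
The file name and function names are made up for illustration, and whether
the two macros change anything at all depends on the glibc headers installed.

```c
/* inline_test.c (hypothetical file name, not part of any package).
 *
 * Build it twice and compare the generated assembly:
 *   gcc -O2 -S inline_test.c -o with_inlines.s
 *   gcc -O2 -D__NO_STRING_INLINES -D__NO_MATH_INLINES \
 *       -S inline_test.c -o without_inlines.s
 *   diff with_inlines.s without_inlines.s
 */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the compiler defined __OPTIMIZE__ (i.e. -O was given). */
int optimize_defined(void)
{
#ifdef __OPTIMIZE__
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if the string inlines were explicitly switched off. */
int string_inlines_disabled(void)
{
#ifdef __NO_STRING_INLINES
    return 1;
#else
    return 0;
#endif
}

/* A small function whose assembly may differ depending on whether the
 * headers substitute inline versions of strcpy/strlen. */
size_t copy_and_measure(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    strcpy(dst, src);
    return strlen(dst);
}
```

If the two .s files come out identical for a function like
copy_and_measure, the inlines are probably not the culprit.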
> Attached to this message is a patch for xfree86-1 which should IMHO be
> added to the Debian package (together with a build dependency on gcc272 on
> i386) if this version proves stable. I have not found any problems with
> other X applications so far.
The bug seems to be really quirky. Some people report that they have had
absolutely no problems, while others report they can't run netscape for a
minute before it crashes. People haven't reported enough details of their
systems to show any correlations, but there were no obvious differences.
> BTW, I don't think all this is caused by a bug in gcc 2.95 since the same
> crash happens when you use gcc 2.7.2 without optimization. I guess this is
> something like an alignment problem of some data structures as the Motif
> library used inside Netscape was compiled using "gcc-2.7.2 -O2".
Hm, possibly alignment, or possibly register allocation, or possibly a race
condition in the thread timing. The types of errors that have been happening
are all generally associated with race conditions. If it's a compiler issue it
could be the way it's compiling the locking primitives or some optimization
that does something non-thread-safe with the stack?
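To make the locking-primitive worry concrete, here is a minimal sketch of a
test-and-set lock. It is purely illustrative: the names are hypothetical, it
uses C11 <stdatomic.h> rather than whatever Xlib, Xt, or Netscape actually
use, and I am not claiming this is where the bug is. The point is only that
the whole lock hangs on one atomic read-modify-write; if a compiler turned
that into a separate load and store, two threads could both "acquire" it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical lock type; not taken from any X or Netscape source. */
typedef struct {
    atomic_flag held;
} sketch_lock;

/* Try to take the lock. Returns 1 on success, 0 if it is already held.
 * The test and the set happen as one atomic operation. */
int sketch_trylock(sketch_lock *l)
{
    assert(l != NULL);
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Release the lock so the next trylock can succeed. */
void sketch_unlock(sketch_lock *l)
{
    assert(l != NULL);
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Increment *counter only while holding the lock.
 * Returns 1 if the increment happened, 0 if the lock was busy. */
int locked_increment(sketch_lock *l, int *counter)
{
    assert(counter != NULL);
    if (!sketch_trylock(l))
        return 0;
    ++*counter;
    sketch_unlock(l);
    return 1;
}
```

A lock would be declared as `sketch_lock l = { ATOMIC_FLAG_INIT };` and a
second sketch_trylock on it must fail until sketch_unlock is called.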