
Re: Inconsistent Segfaults



First off, I really appreciate the input. :)

On Mon, Aug 06, 2001 at 06:25:16PM -0300, Rogério Brito wrote:
> On Aug 06 2001, Ryan Golbeck wrote:
> > Firstly, I compiled a 2.4.7 kernel and I set the CPU to Athlon/Duron
> > and every time I booted it seemed that either the kernel would segfault
> > or some of the startup processes would, and after the system got booted
> > random software would segfault as well.  Like vim would segfault as
> > I was editing or scrolling a file.
> 
> 	Well, just for the record, I'm using an Asus A7V board (which
> 	uses the VIA KT133 chipset) here with a Duron 600MHz (not
> 	overclocked), with kernel 2.2.19 and 2.4.7 (switching all the
> 	time between a potato and a sid install that I have, since
> 	I've acquired a DVD player) and I don't have any problems. In
> 	fact, the system is quite stable and good.
> 
> 	I'm moderately conservative with the settings that I use to
> 	build my kernels, though, and I only compile extensions for
> 	Pentiums.

Fair enough.  I've tried the 386 extensions, and it seems to segfault
less but even a little bit is not okay with me.  The Asus A7V board is
the exact board I have, so now I'm confident that it is not a hardware
incompatibility with the kernel.

> 
> > Now, I tried 2.2.19-reiserfs (with zoltan's reiserfs boot disks) and
> > this didn't seem to happen (that much) and I've tried 2.4.7 compiled
> > for a 386 processor and it seems to work okay.  The only problem I've
> > had with these two kernels seems to be with mozilla, in that the
> > run-mozilla.sh script sometimes segfaults, apparently on line 72,
> > which funny enough is the opening brace of the first function in the
> > shell script.  Which is really kind of odd.  But then if I try running
> > it again it works.
> 
> 	Does mozilla die on the first invocation? I did see mozilla
> 	dying with a segfault in the script, but only after I've been
> 	browsing a little bit and I suspect that the segfaults are
> 	actually not in the script, but propagated from one of the
> 	mozilla modules (otherwise, I'd say that bash would have the
> 	segfault).

Mozilla dies only on invocation (and sometimes during browsing, but of
course it did that on my old system too).  From experimenting I've
found that programs only seem to segfault at peak CPU load times.
That is, when mozilla starts or I'm starting X, the program gets
signal 11 (segfault), and when a CPU intensive screensaver started up
it would also cause a segfault and X would dump, so that seems to be
accurate to me.

Yeah, I've since figured it's not the script, but instead some CPU
intensive startup instructions of mozilla that are causing the crash.
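
One rough way to test the "only under peak CPU load" theory is to run a
CPU-bound computation whose result is fully deterministic, repeat it
several times, and compare the answers. The sketch below does that with
chained SHA-256 hashes. The function names (burn, check_consistency) and
the default pass and round counts are made up for illustration, not taken
from any tool mentioned in this thread. A mismatch or a crash would point
towards flaky hardware; a clean run proves nothing by itself.

```python
import hashlib


def burn(rounds: int, seed: bytes = b"stress") -> str:
    """Hash the seed with SHA-256 `rounds` times in a chain.

    The work is CPU-bound and deterministic, so the same inputs must
    always give the same hex digest on healthy hardware.
    """
    digest = seed
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()


def check_consistency(passes: int = 5, rounds: int = 200_000) -> bool:
    """Run `burn` several times and report whether all results agree."""
    reference = burn(rounds)
    for i in range(1, passes):
        if burn(rounds) != reference:
            # Hypothetical diagnostic message; adjust to taste.
            print(f"pass {i}: digest mismatch, hardware may be unreliable")
            return False
    return True


if __name__ == "__main__":
    print("consistent" if check_consistency() else "INCONSISTENT")
```

Raising the passes and rounds values keeps the CPU busy for longer, which
is closer to the conditions under which the crashes show up.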

> 	Ooops! Not that fast! I've seen quite strange things in the
> 	past when memory and cooling get unreliable. With the improper
> 	cooling I once had on a server (its fan died), even emacs
> 	was segfaulting, which is weird.

The cooling is proper. I've physically checked the CPU heatsink and
have had an external fan blow into the case during operation and it
didn't change anything (and the heat sink was barely getting warm).
So I believe the cooling is sufficient.

> 	Another thing that you might want to try is the memtest86
> 	program, to rigorously test your memory. The memory will get
> 	hot and stress-tested if you run it for a few days
> 	continuously, so you can see how reliable it is. Search google
> 	for memtest86 and grab the latest stable version.

The memory in the box is from a known working system and the memory
was flawless in that system (no problems at all).  There were three
chips I moved over, and I tested each one individually up to test #4
with memtest86; all but one came up clean, and I've since removed the
offending chip.

> 
> > But if it's not hardware I don't know what it is.  Anyone have any
> > ideas or had any experience with asus via133 motherboards?
> 
> 	Well, I hope that my comments would give you a hint of other
> 	places where the problem could be.

Again, I really appreciate your feedback.  It's especially good to know
that the board is known to work.  I'm beginning to believe the hardware
is just bad and I'm going to have it replaced; that's all I can think of
to do at this point.

At least the system is stable enough to work on for the time being, but
under any sort of load things start crashing, which is unacceptable. :)

Thanks!

-- 
Ryan Golbeck <rmgolbeck@uwaterloo.ca>
Computer Science
University Of Waterloo

GPG: 1024D/78916B84 
1B1B 2A87 3F00 A7FB 40F3  526D 36CF BA44 7891 6B84
