
Re: ThinkPad R51 creeping segmentation faults



Paul Ausbeck wrote:
> I recently replaced the hard disk in my ThinkPad R51 with a solid
> state drive

The ThinkPad R51 is a solid machine.  Don't let anyone tell you
otherwise.

> The symptom is that as time goes on more and more programs will cause a
> segmentation fault while loading. For instance, emacs commonly is the first
> program to go. Then maybe iceweasel. Just today iceweasel wouldn't load at
> all but then following another suspend/resume cycle it now loads to a point
> where it presents a safe mode dialog but then crashes if the mouse pointer
> is moved over the dialog box.

That sounds very much like a hardware fault, probably a ram failure.
That is the best type to have because ram is the cheapest part to
swap out.

If it isn't a ram failure then unfortunately it would most likely be a
cpu failure.  It would be possible to swap the cpu but much more
inconvenient.  Third most likely would be some failure on the
motherboard.

The root cause of a segmentation fault that isn't a software bug is
that bits are getting flipped.  Let's say a program follows a pointer
to some piece of memory but a bit of the pointer value has been
flipped.  The pointer may now point outside the memory the program
owns, and the access causes a segmentation violation.  The faults will
look random because programs are loaded at different locations at
different times and the flipped bits could be anywhere.  This is most
likely to occur when running programs that use a lot of memory.  That
is why you are seeing it on Iceweasel, which is a true memory hog, and
ahem, my favorite editor Emacs too.  Those programs are making the
most use of your memory and are therefore the most likely to suffer
from flipped bits.
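
To make the flipped bit idea concrete, here is a small sketch using
shell arithmetic.  The flip_bit helper and the example address are made
up for illustration and are not taken from your system.  It only shows
how far a pointer can move when a single bit changes.

```shell
# flip_bit is a hypothetical helper, not a standard tool.
# $1 = address (for example 0x08049f00), $2 = bit number (0 = lowest bit)
flip_bit() {
    # XOR with a one-bit mask toggles exactly that bit of the address.
    printf '0x%08x\n' $(( $1 ^ (1 << $2) ))
}

flip_bit 0x08049f00 2     # low bit flipped: the pointer moves 4 bytes
flip_bit 0x08049f00 27    # high bit flipped: the pointer moves 128 MiB
```

A flip in a low bit usually still lands inside the program's own memory
and silently corrupts data.  A flip in a high bit is far more likely to
land in unmapped memory, and that is the case that shows up as a
segfault.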

In the old days computers would use ECC ram throughout.  The ECC would
protect you from these problems.  For years however we have suffered
under MS quality hardware.  It doesn't make financial sense to make
hardware more reliable than the OS sold with it and most machines have
been sold with MS.

> I've looked around a bit on the internet for similar problems and come up
> short. In fact, this class of problem seems inherently difficult to drive to
> ground, at least with the knowledge that I currently possess. So what I hope
> is that the Debian mailing list can give me some good seeds for new
> knowledge to acquire. In particular I'd be interested in how others might
> have approached similar situations.

I would start by running memtest86+ overnight.

  apt-get install memtest86+

Then reboot into the memtest system and let it run overnight.
Hopefully it will indicate a problem.  That would be the best result.

> I've tried loading emacs and iceweasel with gdb to get stack
> backtraces.

If random programs are segfaulting then it is very unlikely to be a
problem with any of those programs.

> One last specific question that sort of embarrasses me to ask, is
> where should segmentation fault messages be logged?

/var/log/syslog logs all system messages.  I always look there.  Red
Hat calls it /var/log/messages and Debian logs there too.  The
/var/log/kern.log file is for the subset that are kernel messages.

To understand the difference look at /etc/rsyslog.conf and see what
gets logged in different places.  /var/log/syslog contains pretty much
everything and the other logs contain more specific things.  Mostly.
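
As a rough sketch of how to pull the segfault reports back out of those
files, something like the following should do.  I am assuming the usual
kernel message wording, which contains the text "segfault at", and the
stock rsyslog file names.  The find_segfaults name is just something
made up for this example.

```shell
# find_segfaults is a hypothetical helper; pass it the log files to search.
find_segfaults() {
    # -h drops the file name prefix so the lines read like the log itself.
    grep -h 'segfault at' "$@"
}

# Example, run as root:
#   find_segfaults /var/log/kern.log /var/log/kern.log.1
```

Older rotated logs are normally gzipped.  For those use zgrep in place
of grep.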

Do you have mcelog installed?  If not then install it.

  apt-get install mcelog

> I've grepped around and there are a few segfault messages from maybe
> a week ago in kern.log.1 and messages.1, but nothing in kern.log or
> messages.log. Perhaps these are still in a memory ring buffer
> somewhere? Is there some sort of tool for viewing user space log
> messages, I mean other than dmesg which doesn't appear to show any
> user space messages?

What I have told you applies to the Wheezy 7 system you are running,
which uses sysvinit.  There has been a lot of flamewar over the new
systemd binary file logging in Jessie 8.  I mention this only to give
you a heads up that much of what you have learned about the system up
through Wheezy 7 has changed in Jessie 8.  If you decide to stick with
sysvinit then what you learn about /etc/rsyslog.conf still applies.
If you go with the new systemd journal in Jessie 8 then the entire
universe is a different place and you will need to learn it all new
for systemd.

Bob
