Re: unpredictable crashes, lock up, freezes, whatever

on Mon, Jun 18, 2001 at 09:06:35AM +0900, Olaf Meeuwissen (olaf@epkowa.co.jp) wrote:
> "Karsten M. Self" <kmself@ix.netcom.com> writes:
> > on Fri, Jun 15, 2001 at 04:51:35PM +0900, Olaf Meeuwissen (olaf@epkowa.co.jp) wrote:
> > > Dear all,
> > > 
> > > I'm running mostly testing with some unstable under linux 2.2.19
> > > (hand rolled, of course) on an IBM ThinkPad i1476 (Type 2611).
> > > Since a few weeks, my machine completely locks up at unpredictable
> > > moments.  The screen is no longer updated, I can't switch to a
> > > virtual terminal, even the three finger salute doesn't do a thing.
> > > Pinging from another machine results in 100% lost packets but the
> > > PCMCIA network card keeps signalling traffic.  Just about the only
> > > thing that keeps on going is CD audio.
> > > 
> > > I've checked the logs but apart from occasional blocks of nulls just
> > > before a lock up, I haven't seen anything out of the ordinary.  Note,
> > > those null blocks only appear before _some_ lock ups, not all.
> > 
> > Look for power-change events under apmd.
> I doubt that has anything to do with it because the machine is on AC
> 99% of the time.  [Goes checking the logs now ...]   No correlation
> between power change events and crash times.

There may or may not be a correlation.  The question is whether you're
seeing power change events at all, particularly numerous or unexplained
ones.  In my case, a flaky onboard power port causes frequent changes.
The detachable base/docking unit for my laptop works better; it's what
I'm using at the moment.
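
If you want a quick way to see whether power events are numerous, here
is a rough sketch that buckets apmd log lines by hour.  It assumes the
traditional syslog layout ("Mon DD HH:MM:SS host apmd[pid]: text"); the
function names, the threshold, and the sample lines are mine and purely
illustrative, so adjust them to whatever your logs actually contain.

```python
import re
from collections import Counter

# Assumed layout: "Jun 18 09:06:35 host apmd[123]: message text"
SYSLOG_APMD = re.compile(
    r"^(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<hour>\d{2}):\d{2}:\d{2}\s+\S+\s+apmd(?:\[\d+\])?:"
)


def apmd_events_per_hour(lines):
    """Count apmd log lines per (month, day, hour) bucket."""
    counts = Counter()
    for line in lines:
        m = SYSLOG_APMD.match(line)
        if m:
            key = (m.group("month"), int(m.group("day")), int(m.group("hour")))
            counts[key] += 1
    return counts


def busy_hours(counts, threshold=3):
    """Return (bucket, count) pairs with at least `threshold` lines, busiest first."""
    hits = [(key, n) for key, n in counts.items() if n >= threshold]
    return sorted(hits, key=lambda pair: -pair[1])


# Hypothetical sample lines; the message text is made up for illustration.
sample = [
    "Jun 18 09:06:35 laptop apmd[123]: sample power event",
    "Jun 18 09:07:02 laptop apmd[123]: sample power event",
    "Jun 18 09:41:10 laptop apmd[123]: sample power event",
    "Jun 18 09:50:00 laptop kernel: unrelated line",
    "Jun 18 14:00:01 laptop apmd[123]: sample power event",
]

for bucket, count in busy_hours(apmd_events_per_hour(sample)):
    print(bucket, count)
```

To run it on a real log, pass the open file instead of the sample list,
e.g. apmd_events_per_hour(open(path)) with path pointing at your syslog
file.  An hour with many events while the machine sat on AC would be
the kind of unexplained change I mean.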

> > > Because I haven't experienced any lock up when using the console, I'm
> > > wondering if my graphics card (probed as Neomagic NM2200 according to
> > > XFree86 log, NeoMagic MagicMedia 256AV according to hardware spec) has
> > > gone bad.  Are there any tools a la memtest to test my graphics card?
> > 
> > Possible, but the card's pretty well supported in recent XF86 v.3 and
> > v.4 drivers.
> > 
> > It's not clear how long you're leaving your system in console mode to
> > establish whether or not this is a problem.  Might make a practice of
> > doing this on long breaks (lunch, overnight), and seeing what the
> > results are.
> Sorry, should have mentioned that; somewhere around 5, 6 hours.  Have
> only done that once though.  Could try leaving it in console mode
> overnight.

I'd try that for a few days, or some period commensurate with the
frequency of your lockups.  If the machine stays up in console mode for
2-3 times the typical interval between graphics-mode lockups, you might
consider the issue to be related to your graphics card.
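
As a back-of-the-envelope aid for the 2-3x rule of thumb, the sketch
below takes a list of lockup times and works out how long a console-only
run would need to last.  Using the median gap as the "typical" interval
is my own choice, and the dates are invented for illustration; plug in
the times you actually noted.

```python
from datetime import datetime, timedelta
from statistics import median


def typical_interval(lockups):
    """Median gap between consecutive lockup times, as a timedelta."""
    ordered = sorted(lockups)
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    if not gaps:
        raise ValueError("need at least two lockup times")
    return timedelta(seconds=median(gaps))


def console_test_window(lockups, low=2, high=3):
    """Return (shortest, longest) console-only run suggested by the rule of thumb."""
    typical = typical_interval(lockups)
    return low * typical, high * typical


# Hypothetical lockup times, not taken from anyone's real logs.
lockups = [
    datetime(2001, 6, 1, 10, 0),
    datetime(2001, 6, 2, 10, 0),
    datetime(2001, 6, 4, 10, 0),
    datetime(2001, 6, 5, 10, 0),
]

shortest, longest = console_test_window(lockups)
print("typical interval:", typical_interval(lockups))
print("run console-only for between", shortest, "and", longest)
```

If the box survives that long without X running, the graphics side
becomes the more likely suspect; if it locks up anyway, look elsewhere.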


> > I had similar problems associated with apmd and Speedstep (aka
> > Geyserville) on my TuxTops Amethyst 20U, exacerbated by a flaky
> > onboard power port (it breaks circuit when jiggled, resulting in APM
> > mode changes).  In system BIOS, I disabled speedstep functionality
> > -- my CPU is always running in full-speed mode (600 MHz), resulting
> > in shorter battery life, but longer uptime ;-).  I've had no
> > problems since changing this setting about two months ago.
> I believe I've disabled BIOS power savings settings but will double
> check at the next crash, er, reboot.

I believe the Speedstep settings are somewhat separate from other
energy saving settings, though I may be wrong on this, and results will
vary by BIOS.

