
Re: unpredictable crashes, lock up, freezes, whatever



"Karsten M. Self" <kmself@ix.netcom.com> writes:

> on Fri, Jun 15, 2001 at 04:51:35PM +0900, Olaf Meeuwissen (olaf@epkowa.co.jp) wrote:
> > Dear all,
> > 
> > I'm running mostly testing with some unstable under linux 2.2.19 (hand
> > rolled, of course) on an IBM ThinkPad i1476 (Type 2611).  For a few
> > weeks now, my machine has been locking up completely at unpredictable
> > moments.  The screen is no longer updated, I can't switch to a virtual
> > terminal, and even the three finger salute doesn't do a thing.  Pinging
> > from another machine results in 100% lost packets but the PCMCIA
> > network card keeps signalling traffic.  Just about the only thing that
> > keeps on going is CD audio.
> 
> CD audio is not mediated by the OS, other than to (sometimes) create the
> link between the CD drive and your speakers.  It's just plain hardware.
> This largely establishes that your system is working at a low (hardware)
> level but not necessarily otherwise.  CD audio functionality doesn't
> indicate any OS-level functionality, and your loss of low-level network
> functionality indicates the system is probably pretty much hosed.

So that's all my sound module is needed for then :-)  Anyway, that is
one of the reasons why I tried pinging the machine during a lock up;
to see if there was any kernel life left.

> > I regularly 'apt-get -t testing upgrade' and the problem hasn't gone
> > away.  I've tried other kernels, including the Debian vanilla ones,
> > but to no avail.  I've run memtest86 and found errors in one of my
> > DIMMs but the problem remains even after lobotomy.  That is, even when
> > I only use the DIMM that is okay (memtest86, 20+ passes, tests 1-7) my
> > machine randomly locks up.
> > 
> > I've checked the logs but apart from occasional blocks of nulls just
> > before a lock up, I haven't seen anything out of the ordinary.  Note,
> > those null blocks only appear before _some_ lock ups, not all.
> 
> Look for power-change events under apmd.

I doubt that has anything to do with it because the machine is on AC
99% of the time.  [Goes checking the logs now ...]   No correlation
between power change events and crash times.
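For anyone wanting to repeat that check, it could be scripted roughly as
below.  This is only a sketch: it assumes the power-change and crash
timestamps have already been pulled out of the logs by hand, and the
function name, the sample timestamps and the five-minute window are all
made up for illustration.

```python
from datetime import datetime, timedelta


def crashes_near_power_events(crashes, power_events, window=timedelta(minutes=5)):
    """Return (crash, event) pairs where an event precedes a crash within `window`.

    Both arguments are lists of datetime objects.  How they are extracted
    from the logs is left open, since the log format is an assumption.
    """
    hits = []
    for crash in crashes:
        for event in power_events:
            # Only count events that happened shortly *before* the crash.
            if timedelta(0) <= crash - event <= window:
                hits.append((crash, event))
                break
    return hits


# Hypothetical timestamps, for illustration only.
crashes = [datetime(2001, 6, 14, 12, 30), datetime(2001, 6, 15, 9, 5)]
power_events = [datetime(2001, 6, 14, 8, 0), datetime(2001, 6, 15, 9, 3)]

suspects = crashes_near_power_events(crashes, power_events)
print(f"{len(suspects)} of {len(crashes)} crashes follow a power event")
```

An empty result over a reasonable number of crashes would match the "no
correlation" reading above; a handful of hits would not prove much either
way.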

> > Because I haven't experienced any lock up when using the console, I'm
> > wondering if my graphics card (probed as Neomagic NM2200 according to
> > XFree86 log, NeoMagic MagicMedia 256AV according to hardware spec) has
> > gone bad.  Are there any tools a la memtest to test my graphics card?
> 
> Possible, but the card's pretty well supported in recent XF86 v.3 and
> v.4 drivers.
> 
> It's not clear how long you're leaving your system in console mode to
> establish whether or not this is a problem.  You might make a practice
> of doing this on long breaks (lunch, overnight), and seeing what the
> results are.

Sorry, should have mentioned that; somewhere around 5, 6 hours.  Have
only done that once though.  Could try leaving it in console mode
overnight.
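To see afterwards when (or whether) the machine froze during such a run,
a small heartbeat logger might help.  The following is a rough sketch
under my own assumptions, not something that was used here: the file
name, the one-minute interval and the function names are hypothetical.

```python
import os
import tempfile
import time


def write_heartbeat(path, stamp):
    """Append one timestamp line and force it to disk before returning."""
    with open(path, "a") as f:
        f.write(stamp + "\n")
        f.flush()
        # A hard lock up leaves no chance to flush buffers later.
        os.fsync(f.fileno())


def last_heartbeat(path):
    """Return the last timestamp line written, or None if the file is empty."""
    with open(path) as f:
        lines = f.read().splitlines()
    return lines[-1] if lines else None


def run(path, interval=60):
    """Write a heartbeat every `interval` seconds until interrupted."""
    while True:
        write_heartbeat(path, time.strftime("%Y-%m-%d %H:%M:%S"))
        time.sleep(interval)


# Small demonstration on a temporary file; for a real test, call run()
# with a path of your choosing from the console and leave it overnight.
demo = os.path.join(tempfile.mkdtemp(), "heartbeat.log")
write_heartbeat(demo, "2001-06-15 22:00:00")
write_heartbeat(demo, "2001-06-15 22:01:00")
print(last_heartbeat(demo))
```

After a reboot, the last line in the file gives the time of the freeze to
within one interval.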

> > Before you suggest, I have already tried both Gnome (with several
> > window managers) and KDE.  It doesn't matter.  The machine even locks
> > up when running (x|k)screensaver during lunch :-(
> > 
> > If you have other ideas as to what could be the matter, I'm open to
> > suggestions.
> 
> I had similar problems associated with apmd and SpeedStep (aka
> Geyserville) on my TuxTops Amethyst 20U, exacerbated by a flaky onboard
> power port (it breaks circuit when jiggled, resulting in APM mode
> changes).  In system BIOS, I disabled SpeedStep functionality -- my CPU
> is always running in full-speed mode (600 MHz), resulting in shorter
> battery life, but longer uptime ;-).  I've had no problems since
> changing this setting about two months ago.

I believe I've disabled BIOS power savings settings but will double
check at the next crash, er, reboot.

> I made a more complete report to debian-laptop; it should be in the
> archives.

That box gave you a bit of trouble, eh?  My symptoms seem very much
like yours.  I'll be going over my kernel APM configuration as well.
 
> You might isolate video card issues by running in console mode, by
> switching to a version 3 XF86 driver, or by switching from an
> accelerated driver to SVGA or VGA16.

I've been thinking about running X on the frame buffer device myself.

Thanks for the suggestions,
-- 
Olaf Meeuwissen       Epson Kowa Corporation, Research and Development

     Free Software: `No walls, no windows!  No fences, no gates!'


