
Re: Weird lockups



On Tue, Jan 18, 2000 at 09:14:49AM -0600, David Winchell wrote:
> Since the machine doesn't respond to sysrq, it may be stuck in an interrupt handler.
> If any of the machines in question have "halt" buttons (machines from
> Compaq should have them) I would like to know if pressing halt
> gets one back to the console ">>>" prompt.  If it does, one can use
> console commands to dump the pc, ra, sp.  Also, the stack can be dumped
> from the console.  If it's a multi-processor, one should halt all the processors at the console
> and get the registers on the various processors.

I have occasionally had the machine lock while not in X11.  What happens
is that I get the message "Machine check while in PALmode" and am
returned to the SRM prompt.  I have tried examining the registers and
see no clear pattern.  Sometimes the return address is inside the kernel
(various places) and sometimes it isn't.  How does one dump the stack
from the console?
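
For what it's worth, one way to tell whether a dumped pc or ra lies
inside the kernel is to look it up in the System.map that matches the
running kernel.  Below is a minimal Python sketch of that lookup.  It
assumes the usual three-column "address type name" layout; the sample
addresses and the example_* names are made up for illustration, and
treating _stext/_etext as the bounds of kernel text is an assumption
that may not hold for every build.

```python
import bisect

def load_system_map(lines):
    """Parse 'hexaddr type name' lines into a sorted list of (addr, name)."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        addr, _kind, name = parts
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols

def resolve(symbols, addr):
    """Return (name, offset) of the nearest symbol at or below addr, else None.

    An address far above the last symbol still resolves to that symbol
    with a large offset, so check in_kernel_text() as well.
    """
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    base, name = symbols[i]
    return name, addr - base

def in_kernel_text(symbols, addr):
    """True if addr lies in [_stext, _etext); assumes both symbols exist."""
    table = {name: a for a, name in symbols}
    return table["_stext"] <= addr < table["_etext"]

# Hypothetical map contents, for illustration only.
sample = [
    "fffffc0000310000 T _stext",
    "fffffc0000310100 T example_a",
    "fffffc0000320000 T example_b",
    "fffffc0000500000 A _etext",
]
symbols = load_system_map(sample)
```

With a real map one would pass open("System.map") to load_system_map()
and feed it the value printed by "e ra" at the console, for example
resolve(symbols, 0xfffffc0000320010).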

Greg

> 
> Something like this:
> 
> [hit halt button]
> >> e pc
> >> e ra
> >> e sp
> >> halt 1     /* if smp */
> >> set host 1   (or set cpu 1 - I forget)
> >> e pc
> etc
> 
> Now, under Compaq Unix one would eventually
> do
> 
> >> set cpu 0
> >> crash
> 
> and this would generate a crash dump.  Crash dumps
> under Linux are under development.  I am working
> on one and SGI has another.
> 
> I will soon, within the next few days, release an "in-memory"
> crash dump for Linux.  I have not yet figured out how
> to plug the crash command at the console into Linux kernel code.
> If anyone knows how to do this, please let me know.
> 
> -Dave
> 
> 
> 
> 
> "Matthew R. Pavlovich" wrote:
> 
> > > Just a "me too".  I have had hard lockups with the following system:
> > >
> > > 164LX2 533MHz
> >
> > Another 21164 system.  This is different from the PC164's.  Anyone know
> > what the difference is?  Which sub-arch would the LX2 use?
> >
> > > 4GB Seagate IDE drive
> > > 8MB ELSA Gloria Synergy video card.
> > > DEC Tulip NIC (Kingston KNE100TX)
> > > Debian 2.2
> >
> > Have you updated recently?  The autobuilder is back and a good number of
> > packages have been updated.
> >
> > > more often when running X.  I have tried swapping all removable parts of
> >
> > I think we are seeing a lot of this... Here's what I've summarized:
> >
> > BUG-- HARD system locks with Alpha processors.
> >
> > Description- Unrecoverable (SysRq fails) system lockups at various,
> > unpredictable, unrepeatable times.  Users have estimated error could be
> > with-
> >
> >         3 PC164's 1 LX2
> >         IDE controller support for alphas.
> >         CMD646 IDE controller specifically
> >         X - 2 ppl w/ G200 PCI's 1 w/ ELSA Gloria Synergy
> >         2 w/ 3Com 905b's
> >         2 w/ NCR 53C875 SCSI Controllers
> >
> > Diagnostics performed- All users have tried different hardware
> > configurations, including replacing all IDE hardware w/ SCSI.
> >
> > Notes- Some report problems with module loading, or modular kernels.
> >
> > Kernel versions- Various 2.2.x.. including 2.2.13 and 2.2.14.
> >
> >  Matthew R. Pavlovich
> 
> --
> David Winchell
> Chief Technology Officer
> Mission Critical Linux
> http://www.missioncriticallinux.com
> 
> 

-- 
Greg Johnson                          gjohnson@physics.clarku.edu
http://physics.clarku.edu/~gjohnson            finger for PGP key

