
Re: Weird lockups



Since the machine doesn't respond to SysRq, it may be stuck in an interrupt handler.
If any of the machines in question have "halt" buttons (machines from
Compaq should have them), I would like to know whether pressing halt
gets one back to the console ">>>" prompt.  If it does, one can use
console commands to dump the pc, ra, and sp.  The stack can also be dumped
from the console.  If it's a multiprocessor, one should halt all the processors
at the console and get the registers from each of them.

Something like this:

[hit halt button]
>>> e pc
>>> e ra
>>> e sp
>>> halt 1     /* if smp */
>>> set host 1   (or set cpu 1 - I forget)
>>> e pc
etc
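
Once the pc and ra values are written down, they still have to be matched to
kernel functions.  A minimal sketch of one way to do that, assuming you have
the System.map that belongs to the running kernel and that its lines have the
usual "<hex address> <type> <symbol>" layout.  The function names here
(load_system_map, resolve) are made up for illustration and are not part of
any existing tool:

```python
import bisect

def load_system_map(lines):
    """Parse System.map-style lines into a sorted list of (address, symbol)."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols

def resolve(symbols, addr):
    """Return (symbol, offset) for the nearest symbol at or below addr.

    Returns None if addr lies below the first symbol in the map.
    """
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    base, name = symbols[i]
    return name, addr - base
```

Feed it the lines of System.map (for example, open("System.map").readlines())
and call resolve() with each value read from "e pc" and "e ra", converted with
int(value, 16).  The result only tells you which symbol the address falls
after; it is a guess at the enclosing function, not a real backtrace.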

Now, under Compaq Unix one would eventually do

>>> set cpu 0
>>> crash

and this would generate a crash dump.  Crash dumps
under Linux are under development.  I am working
on one and SGI has another.

Within the next few days I will release an "in-memory"
crash dump for Linux.  I have not yet figured out how
to hook the console's crash command into Linux kernel code.
If anyone knows how to do this, please let me know.

-Dave
 
 
 

"Matthew R. Pavlovich" wrote:

> Just a "me too".  I have had hard lockups with the following system:
>
> 164LX2 533MHz

Another 21164 system.  This is different from the PC164's.  Anyone know
what the difference is?  Which sub-arch would the LX2 use?

> 4GB Seagate IDE drive
> 8MB ELSA Gloria Synergy video card.
> DEC Tulip NIC (Kingston KNE100TX)
> Debian 2.2

Have you updated recently?  The autobuilder is back and a good number of
packages have been updated.

> more often when running X.  I have tried swapping all removable parts of

I think we are seeing a lot of this... Here's what I've summarized:

BUG-- HARD system locks with Alpha processors.

Description- Unrecoverable (SysRq fails) system lockups at
unpredictable, unrepeatable times.  Users suspect the fault could lie
with-

        3 PC164's 1 LX2
        IDE controller support for alphas.
        CMD646 IDE controller specifically
        X - 2 ppl w/ G200 PCI's 1 w/ ELSA Gloria Synergy
        2 w/ 3Com 905b's
        2 w/ NCR 53C875 SCSI Controllers

Diagnostics performed- All users have tried different hardware
configurations, including replacing all IDE hardware w/ SCSI.

Notes- Some report problems with module loading, or modular kernels.

Kernel versions- Various 2.2.x, including 2.2.13 and 2.2.14.

 Matthew R. Pavlovich

-- 
David Winchell
Chief Technology Officer
Mission Critical Linux
http://www.missioncriticallinux.com
 