
zx2000: Machine check when booting



Hello,

I'm terribly sorry for posting off-topic: the problem is not with Debian
on my ia64 machine but with the machine itself. Here is my post on the HP
forum:

http://h30499.www3.hp.com/t5/Workstations-Itanium-Based/zx2000-Machine-check-when-booting/td-p/5900253

It is still unanswered. The only question I have is which part I should
replace: the motherboard or the processor?


I'd be grateful for any hint on how to interpret the BMC logs for my zx2000.
This is a continuation of the "Firmware Error" on rx/zx topic, now with
new data I retrieved over a null-modem connection. In short:
two months ago my zx2000 machine (Deerfield, b.2005, Debian lenny),
which had always been healthy, refused to boot. At first about 1 in 5-6
boot attempts was successful (during one of them I replaced the
BIOS battery and ran the IPF Offline Utilities), but now the machine seems
to be defunct: every boot fails. The IPF Offline Utilities report that almost
everything is OK, except for the Basic I/O test, which hangs the machine.


Here is an extract from the SEL log with a record of a successful boot
(judging by the incorrect time settings, that was the day I changed the
battery):


cli>sel
#  Sev Generator/Sensor Description  Event ID    Data, Timestamp
---- - ---------------- ------------ ----------- --------------------------
0990 - BMC              LPC reset    00-12:70:02        (Rel)   00 00:00:05
09A0 - SEL Time Set     Set          FD-C0:03:01        1998-01-01 00:04:52
09B0 2 CPU0                             000DC    DT 00  0000000000000000
09C0 2 CPU0                             000DC    Time   1998-01-01 00:04:54
09D0 - SFW              EFI boot mgr 00-12:6F:41 8F:--  1998-01-01 00:05:20
09E0 2 CPU0             EFI boot mgr    0020B    DT 04  0000000000000006
09F0 2 CPU0             EFI boot mgr    0020B    Time   1998-01-01 00:05:20
0A00 - BMC              LPC reset    00-12:70:02        1998-01-01 00:34:23



The following is the FPL record for a recent attempt to boot:



cli>fpl
Rec#   Sev Generator/Sensor Description  Event ID    Data, Timestamp
-------- - ---------------- ------------ ----------- --------------------------
00001DEF - BMC              LPC reset    00-12:70:02        2012-12-17 00:58:59
00001DF0 - SFW              Boot start   00-1D:0A:00        2012-12-17 00:58:59
00001DF1 2 CPU0             Boot start      00063    DT 06  0000000000000000
00001DF2 2 CPU0             Boot start      00063    Time   2012-12-17 00:58:59
00001DF3 0 CPU0                             00020    DT 00  0000000000000000
00001DF4 0 CPU0                             0000E    DT 06  0000000000010000
00001DF5 1 CPU0             CPU monarch     0000C    DT 06  0000000000000000
00001DF6 1 CPU0             CPU present     00261    DT 06  0000000000000000
00001DF7 0 CPU0                             00008    DT 00  0000000000000000
00001DF8 0 CPU0                             0024B    DT 00  0000000000000000
00001DF9 0 CPU0                             00006    DT 03  02000000002A0400
00001DFA 0 CPU0                             00056    DT 00  0000000000000000
00001DFB 0 CPU0                             0024C    DT 00  0000000000000000
00001DFC 0 CPU0                             0001D    DT 06  0000000000000000
00001DFD - SEL Time Set     Set          FD-C0:03:01        2012-12-17 00:59:05
00001DFE 0 CPU0                             002AF    DT 06  000000000000001F
00001DFF 0 CPU0                             0010B    DT 00  0000000000000000
00001E00 1 CPU0                             000A4    DT 00  0000000000000000
00001E01 0 CPU0                             000B1    DT 00  0000000000000000
00001E02 0 CPU0                             000DF    DT 00  0000000000000000
00001E03 0 CPU0                             000C6    DT 00  0000000000000000
00001E04 1 CPU0                             000FE    DT 00  0000000000000000
00001E05 0 CPU0                             000EC    DT 00  0000000000000000
00001E06 0 CPU0                             000A6    DT 00  0000000000000000
00001E07 0 CPU0                             000E7    DT 04  FFFFFFFF000AFF74
00001E08 0 CPU0                             000E7    DT 04  FFFFFFFF000BFF74
00001E09 0 CPU0                             000E5    DT 04  FFFFFFFF001AFF74
00001E0A 0 CPU0                             000E5    DT 04  FFFFFFFF001BFF74
00001E0B 0 CPU0                             00205    DT 00  0000000000000000
00001E0C 0 CPU0                             000B2    DT 00  0000000000000000
00001E0D 0 CPU0                             000C9    DT 00  0000000000000000
00001E0E 0 CPU0                             000C2    DT 00  0000000000000000
00001E0F 0 CPU0                             000A8    DT 00  0000000000000000
00001E10 0 CPU0                             000CE    DT 00  0000000000000000
00001E11 0 CPU0                             000B8    DT 00  0000000000000000
00001E12 0 CPU0                             000F6    DT 00  0000000000000000
00001E13 0 CPU0                             000F1    DT 00  0000000000000000
00001E14 0 CPU0                             000EF    DT 00  0000000000000000
00001E15 0 CPU0                             000A5    DT 00  0000000000000000
00001E16 1 CPU0             I/O discovry    00081    DT 00  0000000000000000
00001E17 0 CPU0                             00086    DT 04  000000FFFF00FF83
00001E18 0 CPU0                             00086    DT 04  000000FFFF04FF83
00001E19 0 CPU0                             00086    DT 04  000000FFFF05FF83
00001E1A 0 CPU0                             00086    DT 04  000000FFFF06FF83
00001E1B 0 CPU0                             00087    DT 04  000000FFFF00FF83
00001E1C 0 CPU0                             00087    DT 04  000000FFFF04FF83
00001E1D 0 CPU0                             00087    DT 04  000000FFFF05FF83
00001E1E 0 CPU0                             00087    DT 04  000000FFFF06FF83
00001E1F 2 CPU0                             00285    DT 06  0000000000000000
00001E20 2 CPU0                             00285    Time   2012-12-17 00:59:09
00001E21 - SFW              Machine chk  00-13:70:A1 3F:00  2012-12-17 00:59:09
00001E22 7 CPU0             Machine chk     00098    DT 06  000000000000000B
00001E23 7 CPU0             Machine chk     00098    Time   2012-12-17 00:59:09
00001E24 2 CPU0                             002A1    DT 06  28000000FFF21130
00001E25 2 CPU0                             002A1    Time   2012-12-17 00:59:09
00001E26 2 CPU0                             00115    DT 06  0000000000000000
00001E27 2 CPU0                             00115    Time   2012-12-17 00:59:09
00001E28 3 CPU0                             00107    DT 06  0000000000000000
00001E29 3 CPU0                             00107    Time   2012-12-17 00:59:09
00001E2A - BMC              LPC reset    00-12:70:02        2012-12-17 00:59:09
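
To make a dump like the one above easier to scan, here is a minimal Python
sketch that pulls out the record number, severity, event ID and data word from
each line and lists the entries at or above a chosen severity. It only
reformats what the log already shows. The function names parse_fpl and severe
are made up for illustration, the column layout is inferred from this dump
alone, and treating a larger Sev number as "more severe" is an assumption, not
something taken from HP documentation.

```python
import re

# A few lines copied verbatim from the FPL dump above.
SAMPLE = """\
00001E16 1 CPU0             I/O discovry    00081    DT 00  0000000000000000
00001E1F 2 CPU0                             00285    DT 06  0000000000000000
00001E20 2 CPU0                             00285    Time   2012-12-17 00:59:09
00001E21 - SFW              Machine chk  00-13:70:A1 3F:00  2012-12-17 00:59:09
00001E22 7 CPU0             Machine chk     00098    DT 06  000000000000000B
00001E24 2 CPU0                             002A1    DT 06  28000000FFF21130
"""

# Record number (hex), severity digit or "-", then the rest of the line.
LINE_RE = re.compile(r"^([0-9A-F]{4,8})\s+([-0-9])\s+(.*)$")
# Five-digit hex event ID followed by "DT xx <data>" or "Time <timestamp>".
EVENT_RE = re.compile(r"\s([0-9A-F]{5})\s+(DT [0-9A-F]{2}|Time)\s+(.+)$")


def parse_fpl(text):
    """Parse FPL/SEL console lines into dicts (hypothetical helper)."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # header, separator, prompt or blank line
        rec, sev, rest = m.groups()
        ev = EVENT_RE.search(rest)
        entries.append({
            "rec": rec,
            "sev": None if sev == "-" else int(sev),
            "event_id": ev.group(1) if ev else None,
            "kind": ev.group(2) if ev else None,
            "data": ev.group(3).strip() if ev else None,
            "raw": rest.strip(),
        })
    return entries


def severe(entries, threshold=2):
    """Keep data ("DT") entries whose severity is >= threshold.

    Assumption: a larger Sev value means a more serious event.
    The paired "Time" lines are skipped to avoid duplicates.
    """
    return [
        e for e in entries
        if e["sev"] is not None
        and e["sev"] >= threshold
        and e["kind"] is not None
        and e["kind"].startswith("DT")
    ]


if __name__ == "__main__":
    for e in severe(parse_fpl(SAMPLE)):
        print(e["rec"], e["sev"], e["event_id"], e["data"])
```

Feeding it the complete fpl output instead of SAMPLE should list every
high-severity event together with the events logged just before the machine
check, which may make it easier to look the event IDs up.
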


Which code caused this machine check? If it is actually the EFI code,
how can I replace the firmware when the only documented way to do so
is through the EFI shell? Of course, it could be purely a hardware
fault, and if that is the case, what should I replace: the processor
or the motherboard?

Thank you in advance,

Regards,
Valery

