zx2000: Machine check when booting
Hello,
I'm terribly sorry for posting off-topic: the problem is not with Debian
on my ia64 machine but with the machine itself. My post on the HP
forum,
http://h30499.www3.hp.com/t5/Workstations-Itanium-Based/zx2000-Machine-check-when-booting/td-p/5900253
is still unanswered. My main question is which part I should replace:
the motherboard or the processor?
I'd be grateful for any hints on how to interpret the BMC logs for my zx2000.
This is a continuation of the "Firmware Error" on rx/zx topic, now with
new data I retrieved over a null-modem connection. In a few words:
2 months ago my zx2000 machine (Deerfield, b.2005, Debian lenny),
which had always been healthy, started refusing to boot. At first about
1 in 5-6 boot attempts was successful (during one of them I replaced the
BIOS battery and ran the IPF Offline Utilities), but now the machine seems
to be dead: every boot fails. The IPF Offline Utilities report that almost
everything is OK except for the Basic I/O test, which hangs the machine.
Here is an extract from the SEL with a record of a successful boot
(judging by the incorrect time settings, that was the day I changed the
battery):
cli>sel
# Sev Generator/Sensor Description Event ID Data, Timestamp
---- - ---------------- ------------ ----------- --------------------------
0990 - BMC LPC reset 00-12:70:02 (Rel) 00 00:00:05
09A0 - SEL Time Set Set FD-C0:03:01 1998-01-01 00:04:52
09B0 2 CPU0 000DC DT 00 0000000000000000
09C0 2 CPU0 000DC Time 1998-01-01 00:04:54
09D0 - SFW EFI boot mgr 00-12:6F:41 8F:-- 1998-01-01 00:05:20
09E0 2 CPU0 EFI boot mgr 0020B DT 04 0000000000000006
09F0 2 CPU0 EFI boot mgr 0020B Time 1998-01-01 00:05:20
0A00 - BMC LPC reset 00-12:70:02 1998-01-01 00:34:23
The following is the FPL record for a recent attempt to boot:
cli>fpl
Rec# Sev Generator/Sensor Description Event ID Data, Timestamp
-------- - ---------------- ------------ ----------- --------------------------
00001DEF - BMC LPC reset 00-12:70:02 2012-12-17 00:58:59
00001DF0 - SFW Boot start 00-1D:0A:00 2012-12-17 00:58:59
00001DF1 2 CPU0 Boot start 00063 DT 06 0000000000000000
00001DF2 2 CPU0 Boot start 00063 Time 2012-12-17 00:58:59
00001DF3 0 CPU0 00020 DT 00 0000000000000000
00001DF4 0 CPU0 0000E DT 06 0000000000010000
00001DF5 1 CPU0 CPU monarch 0000C DT 06 0000000000000000
00001DF6 1 CPU0 CPU present 00261 DT 06 0000000000000000
00001DF7 0 CPU0 00008 DT 00 0000000000000000
00001DF8 0 CPU0 0024B DT 00 0000000000000000
00001DF9 0 CPU0 00006 DT 03 02000000002A0400
00001DFA 0 CPU0 00056 DT 00 0000000000000000
00001DFB 0 CPU0 0024C DT 00 0000000000000000
00001DFC 0 CPU0 0001D DT 06 0000000000000000
00001DFD - SEL Time Set Set FD-C0:03:01 2012-12-17 00:59:05
00001DFE 0 CPU0 002AF DT 06 000000000000001F
00001DFF 0 CPU0 0010B DT 00 0000000000000000
00001E00 1 CPU0 000A4 DT 00 0000000000000000
00001E01 0 CPU0 000B1 DT 00 0000000000000000
00001E02 0 CPU0 000DF DT 00 0000000000000000
00001E03 0 CPU0 000C6 DT 00 0000000000000000
00001E04 1 CPU0 000FE DT 00 0000000000000000
00001E05 0 CPU0 000EC DT 00 0000000000000000
00001E06 0 CPU0 000A6 DT 00 0000000000000000
00001E07 0 CPU0 000E7 DT 04 FFFFFFFF000AFF74
00001E08 0 CPU0 000E7 DT 04 FFFFFFFF000BFF74
00001E09 0 CPU0 000E5 DT 04 FFFFFFFF001AFF74
00001E0A 0 CPU0 000E5 DT 04 FFFFFFFF001BFF74
00001E0B 0 CPU0 00205 DT 00 0000000000000000
00001E0C 0 CPU0 000B2 DT 00 0000000000000000
00001E0D 0 CPU0 000C9 DT 00 0000000000000000
00001E0E 0 CPU0 000C2 DT 00 0000000000000000
00001E0F 0 CPU0 000A8 DT 00 0000000000000000
00001E10 0 CPU0 000CE DT 00 0000000000000000
00001E11 0 CPU0 000B8 DT 00 0000000000000000
00001E12 0 CPU0 000F6 DT 00 0000000000000000
00001E13 0 CPU0 000F1 DT 00 0000000000000000
00001E14 0 CPU0 000EF DT 00 0000000000000000
00001E15 0 CPU0 000A5 DT 00 0000000000000000
00001E16 1 CPU0 I/O discovry 00081 DT 00 0000000000000000
00001E17 0 CPU0 00086 DT 04 000000FFFF00FF83
00001E18 0 CPU0 00086 DT 04 000000FFFF04FF83
00001E19 0 CPU0 00086 DT 04 000000FFFF05FF83
00001E1A 0 CPU0 00086 DT 04 000000FFFF06FF83
00001E1B 0 CPU0 00087 DT 04 000000FFFF00FF83
00001E1C 0 CPU0 00087 DT 04 000000FFFF04FF83
00001E1D 0 CPU0 00087 DT 04 000000FFFF05FF83
00001E1E 0 CPU0 00087 DT 04 000000FFFF06FF83
00001E1F 2 CPU0 00285 DT 06 0000000000000000
00001E20 2 CPU0 00285 Time 2012-12-17 00:59:09
00001E21 - SFW Machine chk 00-13:70:A1 3F:00 2012-12-17 00:59:09
00001E22 7 CPU0 Machine chk 00098 DT 06 000000000000000B
00001E23 7 CPU0 Machine chk 00098 Time 2012-12-17 00:59:09
00001E24 2 CPU0 002A1 DT 06 28000000FFF21130
00001E25 2 CPU0 002A1 Time 2012-12-17 00:59:09
00001E26 2 CPU0 00115 DT 06 0000000000000000
00001E27 2 CPU0 00115 Time 2012-12-17 00:59:09
00001E28 3 CPU0 00107 DT 06 0000000000000000
00001E29 3 CPU0 00107 Time 2012-12-17 00:59:09
00001E2A - BMC LPC reset 00-12:70:02 2012-12-17 00:59:09
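To make a listing like the one above easier to scan, here is a minimal Python sketch that splits the CPU-generated FPL lines into fields and picks out the entries with the highest Sev digit. The column layout is inferred purely from this listing, not from HP documentation, and the sketch assumes that a larger Sev digit means a more serious event. The names FPL_LINE, parse_fpl and worst_entries are made up for illustration. Lines with "-" in the Sev column (BMC, SFW, SEL) use a different layout and are simply skipped.

```python
import re

# Assumed layout (inferred from the listing, not from HP documentation):
# record number, Sev digit, generator, optional description,
# 5-digit event ID, then "DT <type> <16 hex digits>" or "Time <timestamp>".
FPL_LINE = re.compile(
    r"^(?P<rec>[0-9A-F]{8}) (?P<sev>\d) (?P<gen>\S+) "
    r"(?:(?P<desc>.+?) )?(?P<event>[0-9A-F]{5}) "
    r"(?:DT (?P<dtype>[0-9A-F]{2}) (?P<data>[0-9A-F]{16})|Time (?P<time>.+))$"
)


def parse_fpl(text):
    """Return one dict per line that matches the assumed layout."""
    entries = []
    for line in text.splitlines():
        match = FPL_LINE.match(line.strip())
        if match:
            entry = match.groupdict()
            entry["sev"] = int(entry["sev"])
            entries.append(entry)
    return entries


def worst_entries(entries):
    """Return the entries that carry the highest Sev digit."""
    top = max(entry["sev"] for entry in entries)
    return [entry for entry in entries if entry["sev"] == top]


# A few lines copied from the FPL listing above.
SAMPLE = """\
00001E1F 2 CPU0 00285 DT 06 0000000000000000
00001E20 2 CPU0 00285 Time 2012-12-17 00:59:09
00001E21 - SFW Machine chk 00-13:70:A1 3F:00 2012-12-17 00:59:09
00001E22 7 CPU0 Machine chk 00098 DT 06 000000000000000B
00001E23 7 CPU0 Machine chk 00098 Time 2012-12-17 00:59:09
00001E24 2 CPU0 002A1 DT 06 28000000FFF21130
"""

if __name__ == "__main__":
    for entry in worst_entries(parse_fpl(SAMPLE)):
        print(entry["rec"], entry["sev"], entry["desc"], entry["event"],
              entry["data"] or entry["time"])
```

This only reorganises what the log already shows; it does not decode what an event ID or data word means.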
Which code caused this Machine check? If it is actually the EFI code,
how can I replace the firmware when the only documented way to do so
is through the EFI shell? Of course, it could be purely a hardware
fault, and if that is the case, what should I replace: the processor
or the motherboard?
Thank you in advance,
Regards,
Valery