
R815 machine checks under jessie



Jerry - I had some communication with you about a year ago.

I have four R815s. They have been running Debian since purchase about 6 years
ago. I upgraded from wheezy to jessie about a year ago. For the past year,
all four have exhibited sporadic machine checks that cause them to crash. I
never observed this before the upgrade. I upgraded a dozen T5500s and four
C6145s from wheezy to jessie at the same time and none of them exhibit this
problem. This appears to be specific to the R815s and jessie.

The R815s otherwise run jessie fine, except that they machine check anywhere
from a week to a few weeks after each reboot. I read online that this may be
related to fan speed or temperature, but I can't find that web page now.

Suggestions on how I might fix this?

    Thanks,
    Jeff (http://engineering.purdue.edu/~qobi)
--------------------------------------------------------------------------------
Relevant output from ipmitool

root@upplysingaoflun:~# ipmitool sel elist
   1 | 07/12/2017 | 18:29:12 | Event Logging Disabled SEL | Log area reset/cleared | Asserted
   2 | 07/17/2017 | 15:03:13 | Power Supply Status | Failure detected () | Asserted
   3 | 07/17/2017 | 15:03:14 | Power Supply PS Redundancy | Redundancy Lost | Asserted
   4 | 07/17/2017 | 15:03:15 | Power Supply Status | Failure detected () | Deasserted
   5 | 07/17/2017 | 15:03:19 | Power Supply PS Redundancy | Fully Redundant | Asserted
   6 | 07/18/2017 | 09:18:03 | Processor CPU Machine Chk | Transition to Non-recoverable | Asserted
   7 | 07/18/2017 | 09:18:04 | Unknown #0x28 |  | Asserted
   8 | 07/18/2017 | 09:18:04 | Unknown #0x28 |  | Asserted
   9 | 07/18/2017 | 09:18:04 | Unknown #0x28 |  | Asserted
   a | 07/18/2017 | 09:18:04 | Unknown #0x28 |  | Asserted
   b | 07/18/2017 | 09:18:04 | Unknown #0x28 |  | Asserted
   c | 07/18/2017 | 09:18:05 | Unknown #0x28 |  | Asserted
   d | 07/18/2017 | 09:18:05 | Unknown #0x28 |  | Asserted
   e | 07/18/2017 | 09:18:05 | Unknown #0x28 |  | Asserted
   f | 07/22/2017 | 10:50:29 | Processor CPU Machine Chk | Transition to Non-recoverable | Asserted
  10 | 07/22/2017 | 10:50:29 | Unknown #0x28 |  | Asserted
  11 | 07/22/2017 | 10:50:29 | Unknown #0x28 |  | Asserted
  12 | 07/22/2017 | 10:50:30 | Unknown #0x28 |  | Asserted
  13 | 07/22/2017 | 10:50:30 | Unknown #0x28 |  | Asserted
  14 | 07/22/2017 | 10:50:30 | Unknown #0x28 |  | Asserted
  15 | 07/22/2017 | 10:50:30 | Unknown #0x28 |  | Asserted
  16 | 07/22/2017 | 10:50:31 | Unknown #0x28 |  | Asserted
  17 | 07/22/2017 | 10:50:31 | Unknown #0x28 |  | Asserted
  18 | 07/24/2017 | 20:26:16 | Power Supply Status | Failure detected () | Asserted
  19 | 07/24/2017 | 20:26:17 | Power Supply Status | Failure detected () | Deasserted
  1a | 07/24/2017 | 20:26:22 | Power Supply PS Redundancy | Fully Redundant | Asserted
root@upplysingaoflun:~#
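
To compare the four machines, the log above can be summarized with a short
script. This is a minimal sketch that assumes the six-column, pipe-separated
layout shown in the `ipmitool sel elist` output above. The function names
(parse_sel, machine_checks, preceding) and the 24-hour window are made up for
illustration and are not part of ipmitool. It reports the gap between machine
checks and lists whatever else was logged shortly before each one, without
implying that those earlier events are the cause.

```python
from datetime import datetime, timedelta


def parse_sel(text):
    """Parse `ipmitool sel elist` text into (record, stamp, sensor, event, state) tuples."""
    events = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 6:
            continue  # skip shell prompts, blank lines, anything not in the assumed layout
        rec_id, date, time, sensor, event, state = fields
        try:
            record = int(rec_id, 16)  # record IDs are printed in hex (a, b, ... 1a)
            stamp = datetime.strptime(date + " " + time, "%m/%d/%Y %H:%M:%S")
        except ValueError:
            continue  # skip lines whose ID or timestamp does not parse
        events.append((record, stamp, sensor, event, state))
    return events


def machine_checks(events):
    """Return only the events whose sensor name mentions a machine check."""
    return [e for e in events if "Machine Chk" in e[2]]


def preceding(events, stamp, window=timedelta(hours=24)):
    """Return events logged within `window` before `stamp` (window size is an assumption)."""
    return [e for e in events if stamp - window <= e[1] < stamp]


# A few lines copied from the log above, including the trailing prompt.
SAMPLE = """\
   2 | 07/17/2017 | 15:03:13 | Power Supply Status | Failure detected () | Asserted
   6 | 07/18/2017 | 09:18:03 | Processor CPU Machine Chk | Transition to Non-recoverable | Asserted
   7 | 07/18/2017 | 09:18:04 | Unknown #0x28 |  | Asserted
   f | 07/22/2017 | 10:50:29 | Processor CPU Machine Chk | Transition to Non-recoverable | Asserted
root@upplysingaoflun:~#
"""

events = parse_sel(SAMPLE)
checks = machine_checks(events)

# Time between consecutive machine checks.
for earlier, later in zip(checks, checks[1:]):
    print("gap between machine checks:", later[1] - earlier[1])

# Anything else logged in the 24 hours before each machine check.
for check in checks:
    for event in preceding(events, check[1]):
        print("within 24h before record %x:" % check[0], event[2], "|", event[3])
```

To run it against a live machine, the output of `ipmitool sel elist` can be
saved to a file and its contents passed to parse_sel in place of SAMPLE.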

