Re: Please help - kernel crashes often
Yes, I was able to go down, get on the console, and record it, and I
found a thread on how to decipher it.
The MCE was:
CPU 0: Machine Check Exception: 4 Bank 0: f60da00000000833
TSC 23fd7acec1e ADDR 797db2c0
Kernel panic - not syncing: Machine check
The output from "mcelog" was:
web03:~# mcelog --k8 --ascii <mce.txt
CPU 0 0 data cache TSC 23fd7acec1e
Data cache ECC error (syndrome 1b)
bit45 = uncorrected ecc error
bit57 = processor context corrupt
bit61 = error uncorrected
bit62 = error overflow (multiple errors)
bus error 'local node origin, request didn't time out
data read mem transaction
memory access, level generic'
STATUS f60da00000000833 MCGSTATUS 4
Kernel panic - not syncing: Machine check
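For anyone who wants to cross-check the mcelog decoding by hand, below is a minimal Python sketch that pulls the same fields out of the STATUS and MCGSTATUS values. The meanings of bits 45, 57, 61 and 62 are taken from the mcelog output above. Everything else (bit 63 as "valid", bit 58 as "address valid", the syndrome in bits 54:47, the bus-error code layout in the low 16 bits, and the MCGSTATUS bit names) is my assumption about the K8 layout and should be checked against AMD's documentation. The function and table names are made up for illustration.

```python
# Hand-decode of the K8 machine-check words quoted in this message.
# Bits 45/57/61/62 are named as in the mcelog output; all other field
# positions are ASSUMPTIONS about the K8 layout, not taken from the post.

STATUS = 0xF60DA00000000833
MCGSTATUS = 0x4

STATUS_FLAGS = {
    63: "valid",                             # assumed
    62: "error overflow (multiple errors)",  # from mcelog output
    61: "error uncorrected",                 # from mcelog output
    58: "address valid",                     # assumed
    57: "processor context corrupt",         # from mcelog output
    45: "uncorrected ecc error",             # from mcelog output
}

MCG_FLAGS = {0: "RIPV", 1: "EIPV", 2: "MCIP"}  # assumed bit names

PARTICIPATION = ["local node origin", "local node response",
                 "local node observed", "generic"]
REQUEST = {0: "generic error", 1: "generic read", 2: "generic write",
           3: "data read", 4: "data write", 5: "instruction fetch",
           6: "prefetch", 7: "evict", 8: "snoop"}
MEM_IO = ["memory access", "reserved", "i/o access", "generic"]
LEVEL = ["level 0", "level 1", "level 2", "level generic"]


def set_flags(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for bit, name in sorted(table.items(), reverse=True)
            if (value >> bit) & 1]


def decode_status(status):
    """Split a bank STATUS word into flags, ECC syndrome and error code."""
    return {
        "flags": set_flags(status, STATUS_FLAGS),
        "syndrome": (status >> 47) & 0xFF,  # assumed: bits 54:47
        "error_code": status & 0xFFFF,      # low 16 bits
    }


def decode_bus_error(code):
    """Decode a bus error code of the form 0000 1PPT RRRR IILL.

    Returns None if the code does not match the bus-error pattern.
    """
    if (code >> 11) != 1:
        return None
    return {
        "participation": PARTICIPATION[(code >> 9) & 0x3],
        "timed_out": bool((code >> 8) & 1),
        "request": REQUEST.get((code >> 4) & 0xF, "unknown"),
        "mem_io": MEM_IO[(code >> 2) & 0x3],
        "level": LEVEL[code & 0x3],
    }


if __name__ == "__main__":
    decoded = decode_status(STATUS)
    print("flags:     ", ", ".join(decoded["flags"]))
    print("syndrome:   %x" % decoded["syndrome"])
    print("bus error: ", decode_bus_error(decoded["error_code"]))
    print("mcgstatus: ", set_flags(MCGSTATUS, MCG_FLAGS))
```

Under those assumptions the sketch agrees with what mcelog printed: syndrome 1b, a data read from local node origin that did not time out, memory access, level generic.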
I've been running memtest86 V3.3 (if I recall the exact title) on all
the machines starting earlier today and will be looking at them in the
next day or two to figure out what they say.
One thing that disturbs me is that memtest shows ECC: no, even
when I force-enable it - and the RAM is most definitely ECC...
On 1/30/06, Anthony DeRobertis <anthony@derobert.net> wrote:
> ECC failures will generate MCE's. The MCE message *should* provide some
> hint as to what is wrong.