
Re: Install Fails on T5240



Jeremy,

I'm not sure whether this helps with FBDIMMs, but you could try reseating the CPU and cleaning the CPU socket and the affected memory module socket with compressed air.

I have had success with "weird" memory errors on SDRAM and DDR systems in the past by reseating the CPU.
FB-DIMM is a bit different, though, as the modules sit on a serial, bus-like structure IIRC, so I'm not sure whether a single module on a channel can be affected while the others are fine if the CPU connection is the culprit...

Also, if the system has pluggable VRMs for the CPUs and memory, try swapping them left/right and see whether the error follows.

Greetings,
Robin


On 02.09.2025 at 06:30, Jeremy Leonard wrote:
Adrian,

So, I was able to pick up some replacement RAM. It doesn't seem to matter which modules I have where.

Sometimes I get /SYS/MB/CMP0/BR1/CH0/D0 failed. I've tried moving modules around and the failure stays at this location.

Other times I get a bunch of errors on multiple locations:

379 Chassis Log critical Tue Sep 2 00:16:46 2025 Host has been powered off
378 Chassis Log critical Tue Sep 2 00:16:46 2025 Sep 2 00:16:42 FATAL: No memory available
377 Chassis Log major Tue Sep 2 00:16:46 2025 Sep 2 00:16:42 ERROR: Unsupported memory configuration
376 Fault Fault critical Tue Sep 2 00:16:45 2025 SP detected fault at time Tue Sep 2 00:16:42 2025. Sep 2 00:16:42 ERROR: Unsupported memory configuration
375 Chassis Log critical Tue Sep 2 00:16:45 2025 Sep 2 00:16:42 FATAL: The HOST Processor has a configuration error, forcing a power-down
374 Chassis Log critical Tue Sep 2 00:16:45 2025 Sep 2 00:16:42 FATAL: No useable memory branches.
373 Chassis Log major Tue Sep 2 00:16:45 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: does not meet system criteria, not configured
372 Chassis Log major Tue Sep 2 00:16:45 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: TCASE_DELTA does not meet requirements
371 Chassis Log major Tue Sep 2 00:16:45 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: TREFI does not meet temperature requirements
370 Chassis Log major Tue Sep 2 00:16:45 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: TREFI does not meet requirements
369 Chassis Log major Tue Sep 2 00:16:45 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: Does not meet cas latency requirements
368 Chassis Log major Tue Sep 2 00:16:45 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: Unsupported num of row bits: 12, must be 13, 14, or 15
367 Chassis Log major Tue Sep 2 00:16:45 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: Unsupported num of column bits: 9, must be 10 or 11
366 Chassis Log major Tue Sep 2 00:16:45 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: Unsupported number of ranks: 0, must be 1 or 2
365 Chassis Log major Tue Sep 2 00:16:45 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: Unsupported burst length 0, must support 4 and 8
364 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: Unsupported key 0, must be FBDIMM!
363 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: Unsupported SPD revision 0
362 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: does not meet system criteria, not configured
361 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: TCASE_DELTA does not meet requirements
360 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: TREFI does not meet temperature requirements
359 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: TREFI does not meet requirements
358 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: Does not meet cas latency requirements
357 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: Unsupported num of row bits: 12, must be 13, 14, or 15
356 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: Unsupported num of column bits: 9, must be 10 or 11
355 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: Unsupported number of ranks: 0, must be 1 or 2
354 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: Unsupported burst length 0, must support 4 and 8
353 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: Unsupported key 0, must be FBDIMM!
352 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: Unsupported SPD revision 0
351 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR0/CH1/D0/J0700: does not meet system criteria, not configured
350 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR0/CH1/D0/J0700: TCASE_DELTA does not meet requirements
349 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR0/CH1/D0/J0700: TREFI does not meet temperature requirements
348 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR0/CH1/D0/J0700: TREFI does not meet requirements
347 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR0/CH1/D0/J0700: Does not meet cas latency requirements
346 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR0/CH1/D0/J0700: Unsupported num of row bits: 12, must be 13, 14, or 15
345 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR0/CH1/D0/J0700: Unsupported num of column bits: 9, must be 10 or 11
344 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH1/D0/J0700: Unsupported number of ranks: 0, must be 1 or 2
343 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH1/D0/J0700: Unsupported burst length 0, must support 4 and 8
342 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH1/D0/J0700: Unsupported key 0, must be FBDIMM!
341 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH1/D0/J0700: Unsupported SPD revision 0
340 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH0/D0/J0500: does not meet system criteria, not configured
339 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH0/D0/J0500: TCASE_DELTA does not meet requirements
338 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH0/D0/J0500: TREFI does not meet temperature requirements
337 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH0/D0/J0500: TREFI does not meet requirements
336 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH0/D0/J0500: Does not meet cas latency requirements
335 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH0/D0/J0500: Unsupported num of row bits: 12, must be 13, 14, or 15
334 Chassis Log major Tue Sep 2 00:16:42 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH0/D0/J0500: Unsupported num of column bits: 9, must be 10 or 11
333 Chassis Log major Tue Sep 2 00:16:42 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH0/D0/J0500: Unsupported number of ranks: 0, must be 1 or 2
332 Chassis Log major Tue Sep 2 00:16:42 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH0/D0/J0500: Unsupported burst length 0, must support 4 and 8
331 Chassis Log major Tue Sep 2 00:16:41 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH0/D0/J0500: Unsupported key 0, must be FBDIMM!
330 Chassis Log major Tue Sep 2 00:16:41 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH0/D0/J0500: Unsupported SPD revision 0
329 Chassis Log major Tue Sep 2 00:16:38 2025 Host has been powered on

These errors aren't accurate, and they seem to affect all the modules on CPU0.
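
To check which locations and which CPU a log excerpt like the one above covers, the per-DIMM ERROR lines can be grouped by their FRU path. The following is only a minimal sketch: the regular expression is an assumption derived from the line format visible in this log, and the function names `group_errors` and `cpus_affected` are made up for illustration. It groups messages and does not diagnose anything.

```python
import re
from collections import defaultdict

# Assumed pattern, based only on the log lines quoted above, e.g.
# "... ERROR: MB/CMP0/BR1/CH1/D0/J1100: Unsupported SPD revision 0"
FRU_RE = re.compile(r"ERROR: (MB/CMP\d+/BR\d+/CH\d+/D\d+/J\d+): (.+)$")


def group_errors(lines):
    """Return {fru_path: [messages]} for every per-DIMM ERROR line."""
    grouped = defaultdict(list)
    for line in lines:
        match = FRU_RE.search(line.strip())
        if match:
            grouped[match.group(1)].append(match.group(2))
    return dict(grouped)


def cpus_affected(grouped):
    """Return the sorted CMP components (e.g. 'CMP0') seen in the FRU paths."""
    return sorted({path.split("/")[1] for path in grouped})


# Three lines copied from the log above, plus one line without a FRU path.
sample = [
    "344 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:41 ERROR: "
    "MB/CMP0/BR0/CH1/D0/J0700: Unsupported number of ranks: 0, must be 1 or 2",
    "342 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:41 ERROR: "
    "MB/CMP0/BR0/CH1/D0/J0700: Unsupported key 0, must be FBDIMM!",
    "330 Chassis Log major Tue Sep 2 00:16:41 2025 Sep 2 00:16:41 ERROR: "
    "MB/CMP0/BR0/CH0/D0/J0500: Unsupported SPD revision 0",
    "329 Chassis Log major Tue Sep 2 00:16:38 2025 Host has been powered on",
]

grouped = group_errors(sample)
for path, messages in sorted(grouped.items()):
    print(path, len(messages), "errors")
print("CPUs affected:", cpus_affected(grouped))
```

Fed the full log, this would list each reported DIMM location with its error count, which makes it easy to compare one failed boot against the next.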

If I power cycle the system I can occasionally get it to boot. I've been able to get Solaris 11.3 installed and I've booted to it a few times. 

Also, with the latest build of Debian I was able to complete the install, but when it rebooted it still went to Solaris. So I shut it down, planning to boot the other disk from OpenBoot, but I haven't been able to get it to come back up since.

I'm thinking I may have a more serious issue than just memory modules. OR I have a LOT of bad modules. 

I originally had 8 x 4GB modules and I now have 8 x 8GB modules. But I get the same behaviour regardless of which modules are installed and in which slots. I don't have a mezzanine card installed. This is just on the motherboard.

On Sat, Jul 26, 2025 at 3:10 AM John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> wrote:
Hi Jeremy

On Fri, 2025-07-25 at 21:48 -0400, Jeremy Leonard wrote:
> So this probably isn't the proper place to continue with this since this
> list is for Debian. But I don't really know where to go from here. I'm
> just trying to start with Sun/Sparc things.

It's not 100% clear to me from the logs what could be the source of the problem,
but since the error comes from the "cpumem-diagnosis" diagnostic engine, I would just
try replacing memory modules if you have the possibility.

If you don't have any replacement modules available, I would just suggest removing
two of the modules (if there are enough) and then booting Solaris again. Cycle
through all memory modules until the error is gone.
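
The cycling described above could be written down as a simple test plan before touching the hardware. This is only a sketch under assumptions: the labels `DIMM1` to `DIMM8` are hypothetical placeholders, not real slot names, and the plan ignores any platform rules about which slots must stay populated, so each round still needs to be checked against the system's memory configuration guidelines.

```python
def removal_rounds(modules, group_size=2):
    """Build one test round per group of modules to pull.

    Each round records which modules are removed and which stay installed.
    """
    rounds = []
    for start in range(0, len(modules), group_size):
        removed = modules[start:start + group_size]
        remaining = [m for m in modules if m not in removed]
        rounds.append({"removed": removed, "remaining": remaining})
    return rounds


# Hypothetical labels for eight modules; replace with your own markings.
modules = [f"DIMM{i}" for i in range(1, 9)]

plan = removal_rounds(modules)
for number, entry in enumerate(plan, start=1):
    print(f"Round {number}: remove {entry['removed']}, boot and note the result")
```

Noting the boot result next to each round shows whether the error follows a particular pair of modules or stays regardless of which pair is removed.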

PS: We can continue the discussion on this list, even hardware failures are on topic.

Adrian

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913


--
Jeremy Leonard
Cell: (517) 285-8309
