
Re: Install Fails on T5240



Hi Jeremy,

The T5140/T5240 only support 1.5V ECC FB-DIMMs. The system will raise a fault if it detects a DIMM that:

- is not an FB-DIMM (e.g., standard DDR2 ECC),

- has the wrong voltage (e.g., 1.8V), or

- is not recognized because of a firmware mismatch or an unsupported FRU. The FRU part number(s) should match exactly!

In addition, DIMMs in a CMP branch must be identical in capacity and FRU.

The fault looks like this:

DIMM in slot X must be FB-DIMM

This is a hard fault, and the system will disable the DIMM.
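The "identical within a CMP branch" rule can be checked on paper before plugging anything in. Below is a minimal Python sketch of that check. It assumes DIMM paths of the form /SYS/MB/CMPn/BRn/CHn/Dn (the layout that appears in the log later in this thread); the inventory and the part numbers are made-up placeholders, not real FRU numbers.

```python
from collections import defaultdict

def branch_of(path):
    """Return the CMP/branch part of a DIMM path, e.g. 'CMP0/BR1'."""
    parts = path.strip("/").split("/")
    cmp_part = next(p for p in parts if p.startswith("CMP"))
    br_part = next(p for p in parts if p.startswith("BR"))
    return f"{cmp_part}/{br_part}"

def mismatched_branches(dimms):
    """dimms maps a DIMM path to (capacity_gb, fru_part_number).

    Returns the branches that contain more than one distinct DIMM type.
    """
    by_branch = defaultdict(set)
    for path, spec in dimms.items():
        by_branch[branch_of(path)].add(spec)
    return sorted(b for b, specs in by_branch.items() if len(specs) > 1)

# Hypothetical inventory: PART-A and PART-B are placeholders.
inventory = {
    "/SYS/MB/CMP0/BR0/CH0/D0": (8, "PART-A"),
    "/SYS/MB/CMP0/BR0/CH1/D0": (8, "PART-A"),
    "/SYS/MB/CMP0/BR1/CH0/D0": (8, "PART-A"),
    "/SYS/MB/CMP0/BR1/CH1/D0": (4, "PART-B"),
}
print(mismatched_branches(inventory))
```

This only compares what you type into the inventory; it does not read anything from the machine.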

Try clearing all system faults from the ILOM CLI, especially the memory-related ones. Also note that installing DIMMs in the wrong order, or in mismatched sets, may cause problems as well. You will have to try different memory combinations, clearing the faults after each unsuccessful attempt. I had something similar happen on my T5140 years ago, but I can't say this will resolve your problem, especially if you have bad hardware (motherboard, DIMMs, incorrect FRU for the DIMMs, etc.).


Try this:

To clear memory faults on a Sun SPARC Enterprise T5140/T5240, use the ILOM (Integrated Lights Out Manager) interface.

1) See the page linked below. Also remember to avoid static electricity and ground yourself before touching anything on the motherboard.

https://docs.oracle.com/cd/E19712-01/E21412-01/z40000741389175.html

2) Access the ILOM CLI. Connect via serial console or SSH to the service processor. You should see the -> prompt.

-> show /SYS

-> show faulty

# This should list all currently faulted components.

Run the clear fault command. Use the following syntax:

-> set /SYS/component clear_fault_action=true

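# Replace /SYS/component with the actual faulted component path, as reported by show faulty. For example:

-> set /SYS/MB/CMP0/BR1/CH0/D0 clear_fault_action=true

Note:

On this platform, DIMMs are labeled by CMP, branch, and channel, like:

/SYS/MB/CMP0/BR0/CH0/D0

/SYS/MB/CMP0/BR1/CH0/D0

etc.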

3) Verify that the faults are cleared. Run:

-> show /SYS

-> show faulty

4) Make sure the system is not in diagnostic mode, unless you're actively troubleshooting:

-> set /SYS keyswitch_state=normal
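
If several components are faulted, typing the clear command for each one gets tedious. Here is a small Python sketch that only builds the command lines as text, so they can be pasted at the -> prompt. It does not talk to the service processor, and the path in the example is simply the one reported elsewhere in this thread.

```python
def clear_fault_commands(paths):
    """Build one ILOM 'set ... clear_fault_action=true' line per component path."""
    commands = []
    for path in paths:
        path = path.strip()
        # Guard against pasting something that is not a /SYS component path.
        if not path.startswith("/SYS/"):
            raise ValueError(f"not a /SYS component path: {path!r}")
        commands.append(f"set {path} clear_fault_action=true")
    return commands

# Copy the faulted paths from the output of 'show faulty'.
faulted = ["/SYS/MB/CMP0/BR1/CH0/D0"]
for line in clear_fault_commands(faulted):
    print(line)
```

Run 'show faulty' again afterwards to confirm that the list is empty.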


Regards,
Tony

On 9/2/25 2:18 AM, Robin Cremer wrote:
Jeremy,

I'm unsure if this helps with FB-DIMMs, but you could try reseating the CPU and cleaning the CPU socket and the affected memory module socket with compressed air.

I have had success with "weird" memory errors on SDRAM and DDR systems in the past by reseating the CPU. FB-DIMM is a bit different, though, as the modules sit on a serial bus-like structure IIRC, so I'm unsure whether a single module on a channel can be affected while the others are fine if the CPU connection is the culprit...

Also, if the system has pluggable VRMs for CPUs and Memory, try swapping them left/right and see if the error follows.

Greetings,
Robin


On 02.09.2025 at 06:30, Jeremy Leonard wrote:
Adrian,

So, I was able to pick up some replacement RAM. It doesn't seem to matter which modules I have where.

Sometimes I get /SYS/MB/CMP0/BR1/CH0/D0 failed. I've tried moving modules and it's still this location.

Other times I get a bunch of errors on multiple locations:

379 Chassis Log critical Tue Sep 2 00:16:46 2025 Host has been powered off
378 Chassis Log critical Tue Sep 2 00:16:46 2025 Sep 2 00:16:42 FATAL: No memory available
377 Chassis Log major Tue Sep 2 00:16:46 2025 Sep 2 00:16:42 ERROR: Unsupported memory configuration
376 Fault Fault critical Tue Sep 2 00:16:45 2025 SP detected fault at time Tue Sep 2 00:16:42 2025. Sep 2 00:16:42 ERROR: Unsupported memory configuration
375 Chassis Log critical Tue Sep 2 00:16:45 2025 Sep 2 00:16:42 FATAL: The HOST Processor has a configuration error, forcing a power-down
374 Chassis Log critical Tue Sep 2 00:16:45 2025 Sep 2 00:16:42 FATAL: No useable memory branches.
373 Chassis Log major Tue Sep 2 00:16:45 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: does not meet system criteria, not configured
372 Chassis Log major Tue Sep 2 00:16:45 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: TCASE_DELTA does not meet requirements
371 Chassis Log major Tue Sep 2 00:16:45 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: TREFI does not meet temperature requirements
370 Chassis Log major Tue Sep 2 00:16:45 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: TREFI does not meet requirements
369 Chassis Log major Tue Sep 2 00:16:45 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: Does not meet cas latency requirements
368 Chassis Log major Tue Sep 2 00:16:45 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: Unsupported num of row bits: 12, must be 13, 14, or 15
367 Chassis Log major Tue Sep 2 00:16:45 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: Unsupported num of column bits: 9, must be 10 or 11
366 Chassis Log major Tue Sep 2 00:16:45 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: Unsupported number of ranks: 0, must be 1 or 2
365 Chassis Log major Tue Sep 2 00:16:45 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: Unsupported burst length 0, must support 4 and 8
364 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: Unsupported key 0, must be FBDIMM!
363 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: Unsupported SPD revision 0
362 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: does not meet system criteria, not configured
361 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: TCASE_DELTA does not meet requirements
360 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: TREFI does not meet temperature requirements
359 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: TREFI does not meet requirements
358 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: Does not meet cas latency requirements
357 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: Unsupported num of row bits: 12, must be 13, 14, or 15
356 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: Unsupported num of column bits: 9, must be 10 or 11
355 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: Unsupported number of ranks: 0, must be 1 or 2
354 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: Unsupported burst length 0, must support 4 and 8
353 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: Unsupported key 0, must be FBDIMM!
352 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: Unsupported SPD revision 0
351 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR0/CH1/D0/J0700: does not meet system criteria, not configured
350 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR0/CH1/D0/J0700: TCASE_DELTA does not meet requirements
349 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR0/CH1/D0/J0700: TREFI does not meet temperature requirements
348 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR0/CH1/D0/J0700: TREFI does not meet requirements
347 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR0/CH1/D0/J0700: Does not meet cas latency requirements
346 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR0/CH1/D0/J0700: Unsupported num of row bits: 12, must be 13, 14, or 15
345 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR0/CH1/D0/J0700: Unsupported num of column bits: 9, must be 10 or 11
344 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH1/D0/J0700: Unsupported number of ranks: 0, must be 1 or 2
343 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH1/D0/J0700: Unsupported burst length 0, must support 4 and 8
342 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH1/D0/J0700: Unsupported key 0, must be FBDIMM!
341 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH1/D0/J0700: Unsupported SPD revision 0
340 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH0/D0/J0500: does not meet system criteria, not configured
339 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH0/D0/J0500: TCASE_DELTA does not meet requirements
338 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH0/D0/J0500: TREFI does not meet temperature requirements
337 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH0/D0/J0500: TREFI does not meet requirements
336 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH0/D0/J0500: Does not meet cas latency requirements
335 Chassis Log major Tue Sep 2 00:16:43 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH0/D0/J0500: Unsupported num of row bits: 12, must be 13, 14, or 15
334 Chassis Log major Tue Sep 2 00:16:42 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH0/D0/J0500: Unsupported num of column bits: 9, must be 10 or 11
333 Chassis Log major Tue Sep 2 00:16:42 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH0/D0/J0500: Unsupported number of ranks: 0, must be 1 or 2
332 Chassis Log major Tue Sep 2 00:16:42 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH0/D0/J0500: Unsupported burst length 0, must support 4 and 8
331 Chassis Log major Tue Sep 2 00:16:41 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH0/D0/J0500: Unsupported key 0, must be FBDIMM!
330 Chassis Log major Tue Sep 2 00:16:41 2025 Sep 2 00:16:41 ERROR: MB/CMP0/BR0/CH0/D0/J0500: Unsupported SPD revision 0
329 Chassis Log major Tue Sep 2 00:16:38 2025 Host has been powered on

These errors aren't accurate. And it seems to be all the modules on CPU0.
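
One way to check the "all the modules on CPU0" impression is to group the per-slot errors by DIMM path. The Python sketch below assumes the event log has been saved with one entry per line; the regular expression is my own guess at the message layout based on the entries above, and the sample lines are copied from that log.

```python
import re
from collections import defaultdict

# Matches 'ERROR: MB/CMP0/BR1/CH1/D0/J1100: <message>' inside one log entry.
SLOT_ERROR = re.compile(r"ERROR:\s*(MB/CMP\d+/BR\d+/CH\d+/D\d+)/J\d+:\s*(.+)")

def errors_by_slot(lines):
    """Group per-DIMM error messages by slot path; other entries are skipped."""
    grouped = defaultdict(list)
    for line in lines:
        match = SLOT_ERROR.search(line)
        if match:
            slot, message = match.groups()
            grouped[slot].append(message.strip())
    return dict(grouped)

sample = [
    "377 Chassis Log major Tue Sep 2 00:16:46 2025 Sep 2 00:16:42 ERROR: Unsupported memory configuration",
    "364 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: Unsupported key 0, must be FBDIMM!",
    "363 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH1/D0/J1100: Unsupported SPD revision 0",
    "353 Chassis Log major Tue Sep 2 00:16:44 2025 Sep 2 00:16:42 ERROR: MB/CMP0/BR1/CH0/D0/J0900: Unsupported key 0, must be FBDIMM!",
]

for slot, messages in sorted(errors_by_slot(sample).items()):
    print(slot, len(messages), "error(s)")
```

Fed the full log, this makes it easy to see whether every slot reports the same set of messages or whether one slot stands out.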

If I power cycle the system I can occasionally get it to boot. I've been able to get Solaris 11.3 installed and I've booted to it a few times.

Also, with the latest build of Debian I was able to complete the install but when it rebooted it still went to Solaris. So I shut it down planning to boot the other disk in OpenBoot but I haven't been able to get it to come back up since.

I'm thinking I may have a more serious issue than just memory modules. OR I have a LOT of bad modules.

I originally had 8 x 4GB modules and I now have 8 x 8GB modules. But I have the same behaviour regardless which modules are installed and in what slots. I don't have a mezzanine card installed. This is just on the motherboard.

On Sat, Jul 26, 2025 at 3:10 AM John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> wrote:

    Hi Jeremy

    On Fri, 2025-07-25 at 21:48 -0400, Jeremy Leonard wrote:
    > So this probably isn't the proper place to continue with this since this
    > list is for Debian. But I don't really know where to go from here. I'm
    > just trying to start with Sun/Sparc things.

    It's not 100% clear to me from the logs what could be the source of the problem
    but since error comes from the "cpumem-diagnosis" diagnostic engine, I would just
    try replacing memory modules if you have the possibility.

    If you don't have any replacement modules available, I would just suggest removing
    two of the modules (if there are enough) and then booting Solaris again. Cycle
    through all memory modules until the error is gone.

    PS: We can continue the discussion on this list, even hardware failures are on topic.

    Adrian

--  .''`.  John Paul Adrian Glaubitz
    : :' :  Debian Developer
    `. `'   Physicist
      `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



--
Jeremy Leonard
JeremyL@elite4god.com
Cell: (517) 285-8309


