Bug#992238: debian-installer: Installation fails on HP ProLiant m400 Server: additional cores crash, kernel hangs in acpi_init



On Mon, Aug 16, 2021 at 12:54:47PM +0200, Justus Winter wrote:
>Steve McIntyre <steve@einval.com> writes:
>> On Mon, Aug 16, 2021 at 10:09:49AM +0200, Justus Winter wrote:
>>>Package: debian-installer
>>>Version: 20210731
>>>Severity: critical
>>>Tags: d-i
>>>Justification: breaks the whole system
>>>
>>>Dear Maintainer,
>>>
>>>I'm trying to install Debian Bullseye on a ProLiant m400 Server
>>>Cartridge.  The cartridge is in EFI mode, we boot the EFI shim, GRUB,
>>>the kernel, and the netboot initrd via PXE and tftp.  We added the
>>>necessary kernel command line flags to redirect the kernel log to the
>>>serial console early on, and various debugging flags.
>>>
>>>Reading the log we believe that there are two problems.  First, while
>>>bringing up additional CPU cores, we see them crash immediately.
>>>Adding nosmp to the kernel command line avoids this, but doesn't make
>>>the second problem go away.  Second, the kernel calls acpi_init, which
>>>does not seem to return.
>>
>> Hmmm. The m400 sleds are getting quite old, and AIUI they're basically
>> EOL in terms of firmware support etc.
>
>Well, I'm also getting quite old, yet, I'd like to use Debian :)

Sure. :-)

>> I've got some Mustang (X-Gene 1) machines here, which are the same
>> core APM hardware but packaged on a standard motherboard (Mini-ITX, I
>> think?). I'm just trying a bullseye update on one now.
>
>Thanks.

OK, and my upgrade worked just fine. The key difference that I'm
seeing is that on my system ACPI is *not* used:

root@mustang4:/home/steve# grep ACPI /var/log/syslog
Aug 16 11:20:27 mustang4 kernel: [    0.000000] efi: ACPI=0x43fa700000 ACPI 2.0=0x43fa700014 SMBIOS 3.0=0x43fa9db000 ESRT=0x43ff006d18 MOKvar=0x43fd2b2000 MEMRESERVE=0x43fa5e0718 
Aug 16 11:20:27 mustang4 kernel: [    1.293700] ACPI: Interpreter disabled.
Aug 16 11:20:27 mustang4 kernel: [    1.322457] pnp: PnP ACPI: disabled
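
To run the same check from a script, for example across several machines, here is a minimal Python sketch of the grep above. It only looks for the "ACPI: Interpreter disabled." message seen in this log; the function name `acpi_interpreter_disabled` is made up for illustration and is not part of any existing tool.

```python
def acpi_interpreter_disabled(log_lines):
    """Return True if any kernel log line reports the ACPI interpreter as disabled.

    log_lines can be any iterable of strings, e.g. an open log file or
    the output of dmesg split into lines.
    """
    return any("ACPI: Interpreter disabled" in line for line in log_lines)
```

It could be called as `acpi_interpreter_disabled(open("/var/log/syslog", errors="replace"))`, assuming kernel messages end up in that file as they do on mustang4.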

Basically, the firmware on these older machines is too old for ACPI to
work well. This brings back memories of X-Gene 1 oddities: the way
they boot the extra CPU cores depends on specific setup in the DTB. My
machine boots that way, but my guess is that whatever logic in the
kernel makes this decision is *not* automatically disabling ACPI on
your machine.

Pondering: do things work better for you if you add "acpi=off" to the
kernel command line?
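
If the PXE/GRUB configuration is generated by a script, the parameter can be added programmatically. The following is a rough sketch only: the helper name `with_kernel_param` is hypothetical, and it assumes a simple space-separated command line with no quoted values containing spaces.

```python
def with_kernel_param(cmdline, param="acpi=off"):
    """Return cmdline with param appended.

    Any existing token with the same key (the part before '=') is dropped
    first, so an earlier acpi=... setting cannot conflict with the new one.
    """
    key = param.split("=", 1)[0]
    kept = [tok for tok in cmdline.split() if tok.split("=", 1)[0] != key]
    return " ".join(kept + [param])
```

For example, `with_kernel_param("nosmp")` returns `"nosmp acpi=off"`. The resulting string still has to be placed on the `linux` line of whatever GRUB entry boots the installer.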

-- 
Steve McIntyre, Cambridge, UK.                                steve@einval.com
"Yes, of course duct tape works in a near-vacuum. Duct tape works
 anywhere. Duct tape is magic and should be worshipped."
   -- Andy Weir, "The Martian"

