
Re: Bullseye (mostly) not booting on Proliant DL380 G7



In the meantime I was able to find out more by removing "quiet" from the kernel command line in GRUB.
The pcc_cpufreq_init messages do not seem to affect booting; they are just warnings.

The following messages appear on the console before the server freezes:

[ OK ] Finished Load Kernel Module fuse.
[ 62.887855] systemd[1]: Mounting FUSE Control File System...
   Mounting FUSE Control File System...
[ 62.891852] systemd[1]: Finished Apply Kernel Variables.
[ OK ] Finished Apply Kernel Variables.
[ 62.892237] systemd[1]: Mounted FUSE Control File System.
[ OK ] Mounted FUSE Control File System.
[ 62.900668] systemd[1]: Finished Create System Users.
[ OK ] Finished Create System Users.
[ 62.902224] systemd[1]: Starting Create Static Device Nodes in /dev...
  Starting Create Static Device Nodes in /dev...
[ 62.920767] systemd[1]: modprobe@drm.service: Succeeded.
[ 62.921202] systemd[1]: Finished Load Kernel Module drm.
[ OK ] Finished Load Kernel Module drm.
[ 62.921979] systemd[1]: Finished Create Static Device Nodes in /dev.
[ OK ] Finished Create Static Device Nodes in /dev.
[ 62.925007] systemd[1]: Starting Rule-based Manager for Device Events and Files...
   Starting Rule-based Manager for Device Events and Files...
[ 62.955322] systemd[1]: Finished Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling.
[ OK ] Finished Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling.
[ 62.962186] systemd[1]: Started Rule-based Manager for Device Events and Files.

After this there are no further messages and no login prompt, and the server no longer reacts to keyboard input. Only a hardware reset helps in this case.
Out of ~10 server reboots, this problem occurred 4 or 5 times.

Could it have something to do with drm? I've seen a drm driver error during an earlier boot phase:

Jun 28 16:15:05 irczsrvp08 kernel: [   63.182074] [drm] radeon kernel modesetting enabled.
Jun 28 16:15:05 irczsrvp08 kernel: [   63.182197] radeon 0000:01:03.0: vgaarb: deactivate vga console
Jun 28 16:15:05 irczsrvp08 kernel: [   63.183720] Console: switching to colour dummy device 80x25
Jun 28 16:15:05 irczsrvp08 kernel: [   63.184088] [drm] initializing kernel modesetting (RV100 0x1002:0x515E 0x103C:0x31FB 0x02).
Jun 28 16:15:05 irczsrvp08 kernel: [   63.184208] radeon 0000:01:03.0: VRAM: 128M 0x00000000E8000000 - 0x00000000EFFFFFFF (64M used)
Jun 28 16:15:05 irczsrvp08 kernel: [   63.184210] radeon 0000:01:03.0: GTT: 512M 0x00000000C8000000 - 0x00000000E7FFFFFF
Jun 28 16:15:05 irczsrvp08 kernel: [   63.184219] [drm] Detected VRAM RAM=128M, BAR=128M
Jun 28 16:15:05 irczsrvp08 kernel: [   63.184220] [drm] RAM width 16bits DDR
Jun 28 16:15:05 irczsrvp08 kernel: [   63.184302] [TTM] Zone  kernel: Available graphics memory: 49487844 KiB
Jun 28 16:15:05 irczsrvp08 kernel: [   63.184304] [TTM] Zone   dma32: Available graphics memory: 2097152 KiB
Jun 28 16:15:05 irczsrvp08 kernel: [   63.184305] [TTM] Initializing pool allocator
Jun 28 16:15:05 irczsrvp08 kernel: [   63.184310] [TTM] Initializing DMA pool allocator
Jun 28 16:15:05 irczsrvp08 kernel: [   63.184333] [drm] radeon: 64M of VRAM memory ready
Jun 28 16:15:05 irczsrvp08 kernel: [   63.184334] [drm] radeon: 512M of GTT memory ready.
Jun 28 16:15:05 irczsrvp08 kernel: [   63.184371] [drm] GART: num cpu pages 131072, num gpu pages 131072
Jun 28 16:15:05 irczsrvp08 kernel: [   63.205645] [drm] PCI GART of 512M enabled (table at 0x00000000FFF00000).
Jun 28 16:15:05 irczsrvp08 kernel: [   63.205890] radeon 0000:01:03.0: WB disabled
Jun 28 16:15:05 irczsrvp08 kernel: [   63.205894] radeon 0000:01:03.0: fence driver on ring 0 use gpu addr 0x00000000c8000000
Jun 28 16:15:05 irczsrvp08 kernel: [   63.205967] [drm] radeon: irq initialized.
Jun 28 16:15:05 irczsrvp08 kernel: [   63.205980] [drm] Loading R100 Microcode
Jun 28 16:15:05 irczsrvp08 kernel: [   63.206233] radeon 0000:01:03.0: firmware: failed to load radeon/R100_cp.bin (-2)
Jun 28 16:15:05 irczsrvp08 kernel: [   63.206241] firmware_class: See https://wiki.debian.org/Firmware for information about missing firmware
Jun 28 16:15:05 irczsrvp08 kernel: [   63.206246] radeon 0000:01:03.0: Direct firmware load for radeon/R100_cp.bin failed with error -2
Jun 28 16:15:05 irczsrvp08 kernel: [   63.206311] [drm:r100_cp_init [radeon]] *ERROR* Failed to load firmware!
Jun 28 16:15:05 irczsrvp08 kernel: [   63.206318] radeon 0000:01:03.0: failed initializing CP (-2).
Jun 28 16:15:05 irczsrvp08 kernel: [   63.206321] radeon 0000:01:03.0: Disabling GPU acceleration
Jun 28 16:15:05 irczsrvp08 kernel: [   63.206329] [drm] radeon: cp finalized
Jun 28 16:15:05 irczsrvp08 kernel: [   63.206961] [drm] No TV DAC info found in BIOS
Jun 28 16:15:05 irczsrvp08 kernel: [   63.206996] [drm] Radeon Display Connectors
Jun 28 16:15:05 irczsrvp08 kernel: [   63.206997] [drm] Connector 0:
Jun 28 16:15:05 irczsrvp08 kernel: [   63.206998] [drm]   VGA-1
Jun 28 16:15:05 irczsrvp08 kernel: [   63.206999] [drm]   DDC: 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60
Jun 28 16:15:05 irczsrvp08 kernel: [   63.207000] [drm]   Encoders:
Jun 28 16:15:05 irczsrvp08 kernel: [   63.207001] [drm]     CRT1: INTERNAL_DAC1
Jun 28 16:15:05 irczsrvp08 kernel: [   63.207002] [drm] Connector 1:
Jun 28 16:15:05 irczsrvp08 kernel: [   63.207003] [drm]   VGA-2
Jun 28 16:15:05 irczsrvp08 kernel: [   63.207004] [drm]   DDC: 0x6c 0x6c 0x6c 0x6c 0x6c 0x6c 0x6c 0x6c
Jun 28 16:15:05 irczsrvp08 kernel: [   63.207004] [drm]   Encoders:
Jun 28 16:15:05 irczsrvp08 kernel: [   63.207005] [drm]     CRT2: INTERNAL_DAC2
Jun 28 16:15:05 irczsrvp08 kernel: [   63.236242] kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL does not work properly. Using workaround
Jun 28 16:15:05 irczsrvp08 kernel: [   63.245005] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: (null)
Jun 28 16:15:05 irczsrvp08 kernel: [   63.250269] [drm] fb mappable at 0xE8040000
Jun 28 16:15:05 irczsrvp08 kernel: [   63.250270] [drm] vram apper at 0xE8000000
Jun 28 16:15:05 irczsrvp08 kernel: [   63.250271] [drm] size 1572864
Jun 28 16:15:05 irczsrvp08 kernel: [   63.250271] [drm] fb depth is 16
Jun 28 16:15:05 irczsrvp08 kernel: [   63.250272] [drm]    pitch is 2048

Could this be related to the known bullseye erratum https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=989863 ?
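
To check the other servers for the same firmware problem, something like the following could pull the failed firmware loads out of a kernel log. This is only a rough sketch: the function name missing_firmware and the regular expression are my own assumptions, modelled on the "firmware: failed to load" line above, and are not part of any existing tool.

```python
import re

# Modelled on the kernel log line seen above, e.g.:
#   radeon 0000:01:03.0: firmware: failed to load radeon/R100_cp.bin (-2)
# The pattern is an assumption based on that single line.
FAILED_FW = re.compile(r"firmware: failed to load (\S+) \((-?\d+)\)")


def missing_firmware(log_lines):
    """Return sorted, unique (file, error code) pairs for failed firmware loads."""
    found = set()
    for line in log_lines:
        match = FAILED_FW.search(line)
        if match:
            found.add((match.group(1), int(match.group(2))))
    return sorted(found)


# Sample input: two lines copied from the log above.
sample = [
    "Jun 28 16:15:05 irczsrvp08 kernel: [   63.205980] [drm] Loading R100 Microcode",
    "Jun 28 16:15:05 irczsrvp08 kernel: [   63.206233] radeon 0000:01:03.0: "
    "firmware: failed to load radeon/R100_cp.bin (-2)",
]
print(missing_firmware(sample))
```

For the sample lines this reports radeon/R100_cp.bin with error -2 (ENOENT, i.e. the file was not found). On a real system the lines could come from the saved kernel log instead of the hard-coded sample.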



On Mon, Jun 28, 2021 at 8:32 PM Claudio Kuenzler <ck@claudiokuenzler.com> wrote:
Hello!

I am currently testing the new Bullseye release (using firmware-bullseye-DI-rc2-amd64-netinst.iso) and am seeing strange behavior on an HP Proliant DL380 G7 server.

During boot, the following messages show up in the console:

[63.063844] pcc_cpufreq_init: Too many CPUs, dynamic performance scaling disabled
[63.063895] pcc_cpufreq_init: Try to enable another scaling driver through BIOS settings
[63.063943] pcc_cpufreq_init: and complain to the system vendor

According to https://patchwork.kernel.org/project/linux-pm/patch/5423012.ZZnfdYddaT@aspire.rjw.lan/ these messages were introduced by a kernel patch from July 2018.
According to Andreas Herrmann, the relevant setting can be changed in the HP server BIOS:

Power Management -> Advanced Power Options -> Collaborative Power Control = enabled

This is enabled (the default, I believe). The Power Regulator is set to "Dynamic Power Savings Mode".

After these messages show up on the console, no login prompt appears and the network is not started. The server seems frozen; it does not even react to CTRL+ALT+DEL on the console anymore. I am not sure whether this is caused by cpufreq or something else, though.

This boot problem happened on 2 out of 3 server boots.

Is this a bug in Bullseye?

Thanks for any hints.

