
Bug#971371: marked as done (linux-image-5.8.0-2-amd64: crashes in amdgpu)



Your message dated Wed, 19 Feb 2025 16:00:35 +0100 (CET)
with message-id <20250219150035.42E54BE2EE7@eldamar.lan>
and subject line Closing this bug (BTS maintenance for src:linux bugs)
has caused the Debian Bug report #971371,
regarding linux-image-5.8.0-2-amd64: crashes in amdgpu
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
971371: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=971371
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Subject: linux-image-5.8.0-2-amd64: crashes in amdgpu
Package: src:linux
Version: 5.8.10-1
Severity: important

Dear Maintainer,

I recently got a new laptop and tried to install Debian Buster on it.
As it turned out, support for Renoir GPUs was only added in recent
kernels. Because of that, with
linux-image-4.19.0-10-amd64 (4.19.132-1) I can only access the text
console; it is impossible to run Xorg.

I have tried the following kernels, with similar results:

linux-image-5.6.0-0.bpo.2-amd64          5.6.14-2~bpo10+1
linux-image-5.7.0-0.bpo.2-amd64          5.7.10-1~bpo10+1
linux-image-5.8.0-2-amd64                5.8.10-1

In particular, with these kernels lightdm started successfully.

With linux-image-5.6.0-0.bpo.2-amd64 (5.6.14-2~bpo10+1) and
linux-image-5.7.0-0.bpo.2-amd64 (5.7.10-1~bpo10+1), pressing
Ctrl-Alt-F1 to open a console showed kernel traces and messages about
"oops" and "kernel BUG", scrolling every few seconds.
The console did not respond to regular keys (alphabetic, Enter
and so on), and Ctrl-Alt-F2 showed an empty, unresponsive console with a
blinking underline in the top-left corner. The ssh connections through
which I tried to dump dmesg also hung. After forcibly shutting down
(a long press on the "power" button) and booting with
linux-image-4.19.0-10-amd64 (4.19.132-1), I found that these messages
had been saved in /var/log/syslog. The output of "grep -a kernel syslog"
is attached as "syslog_kernel.txt.bz2".
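
For readers who want to repeat that filtering step without grep, the
following is a rough Python sketch of what "grep -a kernel" does: it
treats the log as raw bytes, keeps every line containing "kernel", and
decodes leniently so stray binary data does not abort the run. The
function name, the host name, the timestamps and the sshd line in the
sample are hypothetical and are not taken from the attached file.

```python
def kernel_lines(data: bytes) -> list[str]:
    """Return every line containing b"kernel", decoded leniently.

    Working on bytes mirrors grep's -a flag: binary junk in the log
    does not stop the search, it is just replaced when decoding.
    """
    return [
        line.decode("utf-8", errors="replace")
        for line in data.splitlines()
        if b"kernel" in line
    ]


# Hypothetical sample input. Only the kernel message texts come from
# the report; the syslog prefixes and the sshd line are made up.
SAMPLE = (
    b"Sep 29 10:00:00 host kernel: [   99.144918] kernel BUG at mm/slub.c:304!\n"
    b"Sep 29 10:00:01 host sshd[123]: \xff\xfe binary junk\n"
    b"Sep 29 10:00:02 host kernel: [   99.144925] invalid opcode: 0000 [#1] SMP NOPTI\n"
)

if __name__ == "__main__":
    for line in kernel_lines(SAMPLE):
        print(line)
```

On a real system the input would presumably be read with
open("/var/log/syslog", "rb").read(), which needs read permission on
that file.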

After that I tried to install newer kernel and firmware packages from
the bullseye repositories. I chose to mix distribution versions so that
I could still reboot into the stable kernel afterwards. To prevent
automatic upgrades I created /etc/apt/apt.conf with the following
content:

APT {
  Default-Release "buster";
}

With linux-image-5.8.0-2-amd64 (5.8.10-1) I was able to start the MATE
desktop and glxgears (I had not tried glxgears with the previous
kernels). However, pressing Ctrl-Alt-F1 led to a black screen and no
response from the keyboard (Ctrl-Alt-F7 did not work either). The
laptop remained accessible through ssh, which is how I am running
reportbug now.

Also, if I run lspci after these crashes, it does not show any output
and cannot be killed. I had to send SIGTERM to one of the scripts run by
reportbug, which was waiting for lspci, so that reportbug could
continue. With linux-image-4.19.0-10-amd64 (4.19.132-1) lspci runs as it
should.

P.S.: Please let me know whether this can be fixed soon. I have less
than two weeks to return this laptop to the store and get a refund
"just because this laptop is not to my liking", but it would be a pity
to do that.

-- Package-specific info:
** Version:
Linux version 5.8.0-2-amd64 (debian-kernel@lists.debian.org) (gcc-10
(Debian 10.2.0-9) 10.2.0, GNU ld (GNU Binutils for Debian) 2.35) #1 SMP
Debian 5.8.10-1 (2020-09-19)

** Command line:
BOOT_IMAGE=/boot/vmlinuz-5.8.0-2-amd64
root=UUID=51aec92c-f92a-4b5a-94a5-65295c1b46b9 ro quiet

** Tainted: DW (640)
 * kernel died recently, i.e. there was an OOPS or BUG
 * kernel issued warning
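
The number 640 in the "Tainted" line is a bit mask: 640 = 128 + 512,
that is bits 7 and 9, which correspond to the letters D and W listed
above. A minimal Python sketch of that decoding follows. The bit table
is reproduced from memory of the kernel's tainted-kernels documentation
and should be checked against the running kernel's version of it; the
function name decode_taint is made up for illustration.

```python
# Bit position -> (letter, short meaning). Assumed to match the table in
# the kernel's "Tainted kernels" documentation; verify before relying on it.
TAINT_FLAGS = {
    0: ("P", "proprietary module was loaded"),
    1: ("F", "module was force loaded"),
    2: ("S", "kernel running on an out-of-specification system"),
    3: ("R", "module was force unloaded"),
    4: ("M", "processor reported a machine check exception"),
    5: ("B", "bad page referenced or unexpected page flags"),
    6: ("U", "taint requested by userspace"),
    7: ("D", "kernel died recently, i.e. there was an OOPS or BUG"),
    8: ("A", "ACPI table overridden by user"),
    9: ("W", "kernel issued warning"),
    10: ("C", "staging driver was loaded"),
    11: ("I", "workaround for a platform firmware bug applied"),
    12: ("O", "out-of-tree module was loaded"),
    13: ("E", "unsigned module was loaded"),
    14: ("L", "soft lockup occurred"),
    15: ("K", "kernel has been live patched"),
    16: ("X", "auxiliary taint, defined by distributions"),
    17: ("T", "kernel built with the struct randomization plugin"),
}


def decode_taint(value: int) -> list[tuple[str, str]]:
    """Return (letter, meaning) for every taint bit set in value."""
    return [
        flag
        for bit, flag in sorted(TAINT_FLAGS.items())
        if value & (1 << bit)
    ]


if __name__ == "__main__":
    # 640 is the value reported in this bug.
    for letter, meaning in decode_taint(640):
        print(letter, "-", meaning)
```

On a live system the current mask can be read from
/proc/sys/kernel/tainted and passed to the same function.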

** Kernel log:
[   99.144304] Hardware name: Micro-Star International Co., Ltd. Bravo
15 A4DDR/MS-16WK, BIOS E16WKAMS.10E 05/21/2020
[   99.144309] Workqueue: pm pm_runtime_work
[   99.144467] RIP: 0010:dm_suspend+0xa6/0xb0 [amdgpu]
[   99.144469] Code: 48 89 ef 48 89 85 00 53 01 00 48 89 c6 e8 f2 b9 ff
ff 48 8b bd 80 38 01 00 e8 36 fe ff ff 48 89 ef e8 8e 1d 00 00 31 c0 5d
c3 <0f> 0b e9 73 ff ff ff 0f 1f 00 0f 1f 44 00 00 48 8b 47 08 8b 57 38
[   99.144470] RSP: 0018:ffffb2e9c06fbcb0 EFLAGS: 00010282
[   99.144472] RAX: ffffffffc0883ba0 RBX: ffff8a36c5c35cf0 RCX:
0000000000000000
[   99.144474] RDX: 000000000000000a RSI: 0000000000003fe0 RDI:
ffff8a36c5c20000
[   99.144475] RBP: ffff8a36c5c20000 R08: 0000000000000000 R09:
ffff8a35cec1c42c
[   99.144476] R10: 0000000000000018 R11: 0000000000000018 R12:
ffff8a36c5c20000
[   99.144477] R13: ffff8a36dfeb60b0 R14: ffff8a36c5c20000 R15:
ffff8a36dfeb6238
[   99.144479] FS:  0000000000000000(0000) GS:ffff8a36e7940000(0000)
knlGS:0000000000000000
[   99.144480] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   99.144481] CR2: 00007f9af455a000 CR3: 0000000224568000 CR4:
0000000000340ee0
[   99.144483] Call Trace:
[   99.144597]  amdgpu_device_ip_suspend_phase1+0x83/0xe0 [amdgpu]
[   99.144708]  amdgpu_device_suspend+0x89/0x2b0 [amdgpu]
[   99.144713]  ? cpumask_next_and+0x19/0x20
[   99.144773]  amdgpu_pmops_runtime_suspend+0x9e/0x140 [amdgpu]
[   99.144775]  pci_pm_runtime_suspend+0x5e/0x170
[   99.144776]  ? vga_switcheroo_runtime_resume+0x60/0x60
[   99.144777]  vga_switcheroo_runtime_suspend+0x22/0xb0
[   99.144778]  ? vga_switcheroo_runtime_resume+0x60/0x60
[   99.144778]  ? vga_switcheroo_runtime_resume+0x60/0x60
[   99.144779]  __rpm_callback+0x81/0x140
[   99.144780]  ? vga_switcheroo_runtime_resume+0x60/0x60
[   99.144780]  rpm_callback+0x1f/0x70
[   99.144781]  ? vga_switcheroo_runtime_resume+0x60/0x60
[   99.144781]  rpm_suspend+0x148/0x680
[   99.144783]  ? __switch_to+0x80/0x3d0
[   99.144784]  ? __switch_to_asm+0x36/0x70
[   99.144785]  pm_runtime_work+0x8e/0x90
[   99.144787]  process_one_work+0x1b4/0x370
[   99.144788]  worker_thread+0x53/0x3e0
[   99.144788]  ? process_one_work+0x370/0x370
[   99.144789]  kthread+0x11b/0x140
[   99.144790]  ? __kthread_bind_mask+0x60/0x60
[   99.144791]  ret_from_fork+0x22/0x30
[   99.144792] ---[ end trace 124c505234b7db1b ]---
[   99.144917] ------------[ cut here ]------------
[   99.144918] kernel BUG at mm/slub.c:304!
[   99.144925] invalid opcode: 0000 [#1] SMP NOPTI
[   99.144926] CPU: 13 PID: 373 Comm: kworker/13:2 Tainted: G        W
      5.8.0-2-amd64 #1 Debian 5.8.10-1
[   99.144926] Hardware name: Micro-Star International Co., Ltd. Bravo
15 A4DDR/MS-16WK, BIOS E16WKAMS.10E 05/21/2020
[   99.144927] Workqueue: pm pm_runtime_work
[   99.144930] RIP: 0010:__slab_free+0x1ce/0x360
[   99.144931] Code: 34 24 00 57 9d 0f 1f 44 00 00 4d 85 c0 75 77 80 7c
24 5b 00 79 05 40 84 f6 74 1b 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d
c3 <0f> 0b 80 4c 24 5b 80 e9 29 ff ff ff 48 8d 65 d8 4c 89 e6 4c 89 f7
[   99.144932] RSP: 0018:ffffb2e9c06fbbc0 EFLAGS: 00010246
[   99.144932] RAX: ffff8a36a780e100 RBX: 000000008020001f RCX:
ffff8a36a780e000
[   99.144933] RDX: ffff8a36a780e000 RSI: ffffe417c79e0300 RDI:
ffff8a35c7c06d80
[   99.144933] RBP: ffffb2e9c06fbc58 R08: 0000000000000001 R09:
ffffffffc0803d24
[   99.144934] R10: ffff8a36a780e000 R11: 0000000000000001 R12:
ffffe417c79e0300
[   99.144934] R13: ffff8a36a780e000 R14: ffff8a35c7c06d80 R15:
ffff8a36dfeb6238
[   99.144935] FS:  0000000000000000(0000) GS:ffff8a36e7940000(0000)
knlGS:0000000000000000
[   99.144936] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   99.144936] CR2: 00007f9af455a000 CR3: 0000000224568000 CR4:
0000000000340ee0
[   99.144937] Call Trace:
[   99.144939]  ? __alloc_pages_nodemask+0x15e/0x300
[   99.144940]  ? __free_one_page+0x142/0x410
[   99.144985]  ? kfd_gtt_sa_free+0x56/0x80 [amdgpu]
[   99.145028]  ? kernel_queue_uninit+0x84/0xf0 [amdgpu]
[   99.145029]  kfree+0x214/0x230
[   99.145070]  kernel_queue_uninit+0x84/0xf0 [amdgpu]
[   99.145113]  stop_cpsch+0x97/0xc0 [amdgpu]
[   99.145154]  kgd2kfd_suspend.part.0+0x2f/0x40 [amdgpu]
[   99.145191]  amdgpu_device_suspend+0x95/0x2b0 [amdgpu]
[   99.145192]  ? cpumask_next_and+0x19/0x20
[   99.145228]  amdgpu_pmops_runtime_suspend+0x9e/0x140 [amdgpu]
[   99.145229]  pci_pm_runtime_suspend+0x5e/0x170
[   99.145230]  ? vga_switcheroo_runtime_resume+0x60/0x60
[   99.145231]  vga_switcheroo_runtime_suspend+0x22/0xb0
[   99.145231]  ? vga_switcheroo_runtime_resume+0x60/0x60
[   99.145232]  ? vga_switcheroo_runtime_resume+0x60/0x60
[   99.145233]  __rpm_callback+0x81/0x140
[   99.145234]  ? vga_switcheroo_runtime_resume+0x60/0x60
[   99.145234]  rpm_callback+0x1f/0x70
[   99.145235]  ? vga_switcheroo_runtime_resume+0x60/0x60
[   99.145236]  rpm_suspend+0x148/0x680
[   99.145237]  ? __switch_to+0x80/0x3d0
[   99.145238]  ? __switch_to_asm+0x36/0x70
[   99.145238]  pm_runtime_work+0x8e/0x90
[   99.145239]  process_one_work+0x1b4/0x370
[   99.145240]  worker_thread+0x53/0x3e0
[   99.145241]  ? process_one_work+0x370/0x370
[   99.145242]  kthread+0x11b/0x140
[   99.145243]  ? __kthread_bind_mask+0x60/0x60
[   99.145244]  ret_from_fork+0x22/0x30
[   99.145244] Modules linked in: bnep btusb btrtl btbcm btintel
bluetooth jitterentropy_rng drbg edac_mce_amd snd_hda_codec_realtek
kvm_amd snd_hda_codec_generic ledtrig_audio kvm snd_hda_codec_hdmi
nls_ascii irqbypass nls_cp437 crc32_pclmul snd_hda_intel vfat
snd_intel_dspcfg aes_generic fat snd_hda_codec ghash_clmulni_intel
iwlwifi snd_hda_core aesni_intel snd_hwdep snd_pcm ansi_cprng
crypto_simd cryptd ecdh_generic glue_helper efi_pstore snd_timer ecc
joydev cfg80211 gpio_keys efivars libaes rapl hid_multitouch serio_raw
msi_wmi snd sparse_keymap ccp pcspkr soundcore rfkill sp5100_tco
rng_core watchdog k10temp ac acpi_cpufreq soc_button_array evdev
efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs
blake2b_generic xor zstd_decompress zstd_compress raid6_pq libcrc32c
crc32c_generic hid_generic amdgpu gpu_sched i2c_algo_bit ttm
drm_kms_helper cec ahci xhci_pci libahci drm xhci_hcd libata nvme
nvme_core psmouse usbcore crc32c_intel scsi_mod r8169 t10_pi crc_t10dif
realtek
[   99.145254]  crct10dif_generic i2c_piix4 crct10dif_pclmul libphy
usb_common crct10dif_common wmi i2c_hid battery video hid button
[   99.145258] ---[ end trace 124c505234b7db1c ]---
[   99.281810] RIP: 0010:__slab_free+0x1ce/0x360
[   99.281813] Code: 34 24 00 57 9d 0f 1f 44 00 00 4d 85 c0 75 77 80 7c
24 5b 00 79 05 40 84 f6 74 1b 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d
c3 <0f> 0b 80 4c 24 5b 80 e9 29 ff ff ff 48 8d 65 d8 4c 89 e6 4c 89 f7
[   99.281814] RSP: 0018:ffffb2e9c06fbbc0 EFLAGS: 00010246
[   99.281816] RAX: ffff8a36a780e100 RBX: 000000008020001f RCX:
ffff8a36a780e000
[   99.281816] RDX: ffff8a36a780e000 RSI: ffffe417c79e0300 RDI:
ffff8a35c7c06d80
[   99.281817] RBP: ffffb2e9c06fbc58 R08: 0000000000000001 R09:
ffffffffc0803d24
[   99.281817] R10: ffff8a36a780e000 R11: 0000000000000001 R12:
ffffe417c79e0300
[   99.281818] R13: ffff8a36a780e000 R14: ffff8a35c7c06d80 R15:
ffff8a36dfeb6238
[   99.281819] FS:  0000000000000000(0000) GS:ffff8a36e7940000(0000)
knlGS:0000000000000000
[   99.281820] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   99.281820] CR2: 00007f9af455a000 CR3: 0000000224568000 CR4:
0000000000340ee0

** Model information
sys_vendor: Micro-Star International Co., Ltd.
product_name: Bravo 15 A4DDR
product_version: REV:1.0
chassis_vendor: Micro-Star International Co., Ltd.
chassis_version: Default string
bios_vendor: American Megatrends Inc.
bios_version: E16WKAMS.10E
board_vendor: Micro-Star International Co., Ltd.
board_name: MS-16WK
board_version: REV:1.0

** Loaded modules:
bnep
btusb
btrtl
btbcm
btintel
bluetooth
jitterentropy_rng
drbg
edac_mce_amd
snd_hda_codec_realtek
kvm_amd
snd_hda_codec_generic
ledtrig_audio
kvm
snd_hda_codec_hdmi
nls_ascii
irqbypass
nls_cp437
crc32_pclmul
snd_hda_intel
vfat
snd_intel_dspcfg
aes_generic
fat
snd_hda_codec
ghash_clmulni_intel
iwlwifi
snd_hda_core
aesni_intel
snd_hwdep
snd_pcm
ansi_cprng
crypto_simd
cryptd
ecdh_generic
glue_helper
efi_pstore
snd_timer
ecc
joydev
cfg80211
gpio_keys
efivars
libaes
rapl
hid_multitouch
serio_raw
msi_wmi
snd
sparse_keymap
ccp
pcspkr
soundcore
rfkill
sp5100_tco
rng_core
watchdog
k10temp
ac
acpi_cpufreq
soc_button_array
evdev
efivarfs
ip_tables
x_tables
autofs4
ext4
crc16
mbcache
jbd2
btrfs
blake2b_generic
xor
zstd_decompress
zstd_compress
raid6_pq
libcrc32c
crc32c_generic
hid_generic
amdgpu
gpu_sched
i2c_algo_bit
ttm
drm_kms_helper
cec
ahci
xhci_pci
libahci
drm
xhci_hcd
libata
nvme
nvme_core
psmouse
usbcore
crc32c_intel
scsi_mod
r8169
t10_pi
crc_t10dif
realtek
crct10dif_generic
i2c_piix4
crct10dif_pclmul
libphy
usb_common
crct10dif_common
wmi
i2c_hid
battery
video
hid
button

** PCI devices:

-- System Information:
Debian Release: 10.6
  APT prefers stable
  APT policy: (990, 'stable'), (500, 'stable-updates'), (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 5.8.0-2-amd64 (SMP w/16 CPU cores)
Kernel taint flags: TAINT_DIE, TAINT_WARN
Locale: LANG=uk_UA.UTF-8, LC_CTYPE=uk_UA.UTF-8 (charmap=UTF-8),
LANGUAGE=uk_UA.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages linux-image-5.8.0-2-amd64 depends on:
ii  initramfs-tools [linux-initramfs-tool]  0.133+deb10u1
ii  kmod                                    26-1
ii  linux-base                              4.6

Versions of packages linux-image-5.8.0-2-amd64 recommends:
ii  apparmor             2.13.2-10
ii  firmware-linux-free  3.4

Versions of packages linux-image-5.8.0-2-amd64 suggests:
pn  debian-kernel-handbook  <none>
ii  grub-efi-amd64          2.02+dfsg1-20+deb10u2
pn  linux-doc-5.8           <none>

Versions of packages linux-image-5.8.0-2-amd64 is related to:
ii  firmware-amd-graphics     20200918-1
pn  firmware-atheros          <none>
pn  firmware-bnx2             <none>
pn  firmware-bnx2x            <none>
pn  firmware-brcm80211        <none>
pn  firmware-cavium           <none>
pn  firmware-intel-sound      <none>
pn  firmware-intelwimax       <none>
pn  firmware-ipw2x00          <none>
pn  firmware-ivtv             <none>
pn  firmware-iwlwifi          <none>
pn  firmware-libertas         <none>
ii  firmware-linux-nonfree    20200918-1
ii  firmware-misc-nonfree     20200918-1
pn  firmware-myricom          <none>
pn  firmware-netxen           <none>
pn  firmware-qlogic           <none>
pn  firmware-realtek          <none>
pn  firmware-samsung          <none>
pn  firmware-siano            <none>
pn  firmware-ti-connectivity  <none>
pn  xen-hypervisor            <none>

-- no debconf information

Attachment: syslog_kernel.txt.bz2
Description: application/bzip


--- End Message ---
--- Begin Message ---
Hi

This bug was filed for a very old kernel or the bug is old itself
without resolution.

If you can reproduce it with

- the current version in unstable/testing
- the latest kernel from backports

please reopen the bug, see https://www.debian.org/Bugs/server-control
for details.

Regards,
Salvatore

--- End Message ---
