Bug#1077516: amdgpu (xorg) crashes with kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled
Package: firmware-amd-graphics
Version: 20230210-5
Occasionally, the screen of my Lenovo/T14s goes black or the
X-session is restarted.
Here is a procedure to reproduce this issue:
- Running the installer of FlowJo [1] within Wine [2] crashes the
X-session or blanks the screen.
[1]
https://fjinstallers.s3.amazonaws.com/FlowJo/FlowJo-Win64-10.10.0.exe
[2] I used wine-devel (wine 9.13), but I expect that this will be
reproducible with wine 8.x too.
However, occasionally the issue also occurs in other cases.
The logs during this crash are attached.
After uninstalling "firmware-amd-graphics/20230210-5" from Debian
bookworm and rebooting the machine, the problem went away (at the
cost that no second screen is available).
Installing the more recent version,
firmware-amd-graphics/20240709-1,
from the testing repository also fixes this issue.
Therefore, I believe that firmware-amd-graphics/20230210-5 is buggy
and that the more recent version of firmware-amd-graphics should be used.
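For anyone who wants to check whether a machine still carries the older firmware snapshot, here is a minimal sketch. It assumes the package version has the form YYYYMMDD-revision, as in the two versions named above, and compares only the snapshot date. The function names are hypothetical, and this is not a full Debian version comparison (dpkg --compare-versions is the authoritative tool for that).

```python
from datetime import date


def firmware_snapshot(version: str) -> date:
    """Return the upstream snapshot date of a version such as '20230210-5'.

    Assumption: the part before the first '-' is a YYYYMMDD date.
    """
    upstream = version.split("-", 1)[0]
    return date(int(upstream[:4]), int(upstream[4:6]), int(upstream[6:8]))


def is_older_than(installed: str, reference: str = "20240709-1") -> bool:
    """True if the installed snapshot predates the reference snapshot.

    The default reference is the version from testing that worked here.
    """
    return firmware_snapshot(installed) < firmware_snapshot(reference)
```

With the versions from this report, is_older_than("20230210-5") is true, so the bookworm package counts as the older snapshot.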
-- System Information:
Debian Release: 12.6
APT prefers stable-security
APT policy: (500, 'stable-security'), (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386
Kernel: Linux 6.1.0-23-amd64 (SMP w/16 CPU threads; PREEMPT)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8),
LANGUAGE=en_US:en
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
firmware-amd-graphics depends on no packages.
firmware-amd-graphics recommends no packages.
Versions of packages firmware-amd-graphics suggests:
ii initramfs-tools 0.142
Jul 29 15:27:49 pascal kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=834899, emitted seq=834901
Jul 29 15:27:49 pascal kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 23803 thread Xorg:cs0 pid 23804
Jul 29 15:27:49 pascal kernel: amdgpu 0000:c3:00.0: amdgpu: GPU reset begin!
Jul 29 15:27:49 pascal kernel: [drm] REG_WAIT timeout 1us * 100000 tries - optc1_wait_for_state line:817
Jul 29 15:27:49 pascal kernel: [drm] REG_WAIT timeout 1us * 100000 tries - optc1_wait_for_state line:817
Jul 29 15:27:49 pascal kernel: [drm] REG_WAIT timeout 1us * 100000 tries - optc1_wait_for_state line:817
Jul 29 15:27:50 pascal kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
Jul 29 15:27:50 pascal kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
Jul 29 15:27:50 pascal kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
Jul 29 15:27:50 pascal kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
Jul 29 15:27:50 pascal kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
Jul 29 15:27:50 pascal kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
Jul 29 15:27:50 pascal kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
Jul 29 15:27:50 pascal kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
Jul 29 15:27:50 pascal kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
Jul 29 15:27:50 pascal kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
Jul 29 15:27:50 pascal kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
Jul 29 15:27:50 pascal kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
Jul 29 15:27:51 pascal kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
Jul 29 15:27:51 pascal kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
Jul 29 15:27:51 pascal kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
Jul 29 15:27:51 pascal kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
Jul 29 15:27:51 pascal kernel: [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
Jul 29 15:27:51 pascal kernel: [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
Jul 29 15:27:51 pascal kernel: [drm:gfx_v11_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
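To make the log above easier to compare with other reports, the following sketch condenses such kernel lines into a short summary. It assumes the message formats shown above (the "ring ... timeout, signaled seq=..., emitted seq=..." line and the "[drm:<function> ..." prefix on *ERROR* lines). The function name summarize and the regular expressions are illustrative only and are not part of any amdgpu tooling.

```python
import re
from collections import Counter

# Matches e.g. "ring gfx_0.0.0 timeout, signaled seq=834899, emitted seq=834901"
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
# Captures the function name after "[drm:", stopping at the first
# character that is not a letter, digit, or underscore.
ERROR_RE = re.compile(r"\[drm:(?P<func>\w+)")


def summarize(lines):
    """Return (ring timeouts, per-function *ERROR* counts) for kernel log lines."""
    timeouts = []
    errors = Counter()
    for line in lines:
        if "*ERROR*" not in line:
            continue
        func = ERROR_RE.search(line)
        if func:
            errors[func.group("func")] += 1
        hit = TIMEOUT_RE.search(line)
        if hit:
            timeouts.append(
                (hit.group("ring"), int(hit.group("signaled")), int(hit.group("emitted")))
            )
    return timeouts, errors
```

Passing the lines of a saved journal excerpt to summarize() yields the timed-out ring with its two sequence numbers, plus a count of how often each driver function reported an error.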