
Bug#1088522: nouveau: Unable to boot with 3 monitors on Nvidia GPU



Package: linux-image-6.13.1+debian+tj
Followup-For: Bug #1088522
X-Debbugs-Cc: tj.iam.tj@proton.me

Thank you for the boot logs, Benjamin. The 3-monitor log initially
confused me since it started with the 2-monitor log and looked
identical. Once I'd figured out that it contained multiple boots and
cleaned it up, I compared the differences using:

 diff -u <( sed -n '/kernel:/ s/^.*kernel: \(nouveau.*\)/\1/p' /tmp/bootlog_2_screens.log ) <( sed -n '/kernel:/ s/^.*kernel: \(nouveau.*\)/\1/p' /tmp/bootlog_3_screens.log  )

--- /dev/fd/63  2025-04-03 18:21:05.169414147 +0100
+++ /dev/fd/62  2025-04-03 18:21:05.169414147 +0100
@@ -44,18 +44,303 @@
 nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies
 nouveau 0000:01:00.0: sec2: cmdq: timeout waiting for queue ready
 nouveau 0000:01:00.0: gr: init failed, -110
-nouveau 0000:01:00.0: DRM: allocated 2560x1440 fb: 0x200000, bo 0000000094c55ef2
+nouveau 0000:01:00.0: DRM: allocated 2560x1440 fb: 0x200000, bo 00000000afe506a5
 nouveau 0000:01:00.0: [drm] fb0: nouveaudrmfb frame buffer device
 nouveau 0000:01:00.0: DRM: Disabling PCI power management to avoid bug
 nouveau 0000:01:00.0: gr: fecs falcon already acquired by gr!
 nouveau 0000:01:00.0: gr: init failed, -16
+nouveau 0000:01:00.0: Xorg[734]: failed to idle channel 2 [Xorg[734]]
 nouveau 0000:01:00.0: gr: fecs falcon already acquired by gr!
 nouveau 0000:01:00.0: gr: init failed, -16
+nouveau 0000:01:00.0: Xorg[734]: failed to idle channel 2 [Xorg[734]]
+nouveau 0000:01:00.0: i2c: aux 0004: magic wait 00009000
+nouveau 0000:01:00.0: i2c: aux 0004: magic wait 00009000
+nouveau 0000:01:00.0: i2c: aux 0004: magic wait 00009000
...
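
A minimal sketch of one way to separate a log that holds several boots
before diffing. It assumes each boot begins with a kernel line
containing "Linux version"; split_boots and the output prefix are
made-up names for illustration, not something from the logs above.

```shell
# Split a log that contains several boots into one file per boot.
# Assumption: every boot starts with a "kernel: Linux version" line.
# Lines before the first such marker are dropped.
split_boots() {
    log=$1      # input log file
    prefix=$2   # output files are named <prefix>.1, <prefix>.2, ...
    awk -v prefix="$prefix" '
        /kernel: Linux version/ {
            n++
            if (out) close(out)
            out = sprintf("%s.%d", prefix, n)
        }
        n > 0 { print > out }
    ' "$log"
}
```

For example, split_boots /tmp/bootlog_3_screens.log /tmp/boot3 would
write /tmp/boot3.1, /tmp/boot3.2 and so on, one per boot found.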

After looking at other logs and reports it seems the "sec2" and "gr"
messages are expected for the TU117 models. The important part here is
"failed to idle channel 2".
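
To see at a glance which channels are affected, the messages can be
tallied per channel. This is only a sketch: idle_failures is a
hypothetical helper name, and it relies on the message wording shown in
the diff above.

```shell
# Summarise "failed to idle channel N" kernel messages per channel.
# idle_failures is a hypothetical helper; pass it the path of a log file.
idle_failures() {
    sed -n 's/^.*failed to idle channel \([0-9][0-9]*\).*$/\1/p' "$1" |
        sort -n | uniq -c |
        awk '{ printf "channel %s: %s\n", $2, $1 }'
}
```

Running idle_failures /tmp/bootlog_3_screens.log prints one line per
channel number with the count of failures logged for it.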

Based on my experience using an Nvidia Quadro NVS420, which contains 2
GPUs and has 4 outputs, I *think* the channels are paired such that:

Channel Pair Monitor
0       1    A
1       1    B
2       2    C
3       2    D

It may be the case that for the GPU model here there is a problem in
operating the second pair (channel 2 in your case).

I've skimmed through the nouveau commit history from v6.1 up to current
master but could not see anything obviously related to this. Then
again, the code is extremely complex and covers so many models that I
suspect it needs someone intimately familiar with the hardware and code
to know what to look for.

I'd recommend the following in an attempt to narrow down the cause and
possibly identify a fix:

1) Test a recent mainline kernel [0] (if building a kernel is a problem
maybe try a pre-built kernel from experimental [1]). If the issue is
resolved we can try to identify where the fix is, either through commit
history or a git bisect. Note, however, that even if a fix is found it
may not be possible to back-port it to v6.1 since the code has
undergone some large refactors between these versions.

2) Whether or not you can identify a fixed version, report this to the
nouveau developers via their issue tracker [2] since they're best placed
to know what is going on and suggest further steps. I did browse through
closed and open reports for something similar that might shed light, but
the single report there has no follow-up.

I was going to recommend reporting to their mailing list [3] as well
but, at least for me, their mail archive web-site is unreachable.
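
For step 1, if a newer kernel turns out to work, the bisect would be
looking for the commit that fixes the problem rather than one that
broke it. A rough sketch using git's custom bisect terms follows.
bisect_for_fix is a hypothetical helper, the two revisions are
placeholders to be filled in once testing has established them, and
limiting the search to drivers/gpu/drm/nouveau is an assumption that
the fix lives in the driver itself.

```shell
# Sketch: bisect for the commit that FIXES the problem.
# bisect_for_fix is a hypothetical helper; run it inside a kernel git tree.
#   $1 = newest revision known to be broken
#   $2 = oldest revision known to be fixed
bisect_for_fix() {
    # Custom terms avoid having to mark a working kernel as "bad".
    # The path limit is an assumption; drop it if the fix may lie elsewhere.
    git bisect start --term-old=broken --term-new=fixed -- drivers/gpu/drm/nouveau &&
    git bisect broken "$1" &&
    git bisect fixed "$2"
}
```

After building and booting each revision git checks out, mark it with
"git bisect broken" or "git bisect fixed", and finish with
"git bisect reset".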

[0] https://www.kernel.org/
[1] https://tracker.debian.org/pkg/linux
[2] https://gitlab.freedesktop.org/drm/nouveau
[3] https://lists.freedesktop.org/archives/nouveau/

