Bug#953066: libpciaccess0: nVidia driver finds no devices, sddm dies with ABRT
Package: libpciaccess0
Version: 0.13.4-1+b2
Severity: important
-- System Information:
Debian Release: 9.12
APT prefers oldstable-updates
APT policy: (500, 'oldstable-updates'), (500, 'oldstable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386
Kernel: Linux 4.19.0-0.bpo.6-amd64 (SMP w/40 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
Versions of packages libpciaccess0 depends on:
ii libc6 2.24-11+deb9u4
ii zlib1g 1:1.2.8.dfsg-5
libpciaccess0 recommends no packages.
Versions of packages libpciaccess0 suggests:
ii pciutils 1:3.5.2-1
-- no debconf information
SUMMARY:
- this is fixed by manually replacing the distribution's library with
the .so from libpciaccess0 0.14-1. It appears to be the PCI address
space size issue (see the bug referenced below).
- this bug seems pretty major, so is there any chance the fix will
make it into the stretch distribution?
- it possibly only affects our systems that boot via UEFI, as we have
some BIOS-boot systems with similar configurations that don't
have this problem (I'm not entirely sure the hardware is all similar).
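To compare affected and unaffected machines on the UEFI/BIOS point, the boot mode can be recorded on each one. A minimal sketch, relying on the kernel exposing /sys/firmware/efi only when booted via UEFI; the function name boot_mode is made up for illustration:

```shell
#!/bin/sh
# Print "UEFI" if the EFI sysfs directory exists, "BIOS" otherwise.
# The optional argument (an assumption, for testing) overrides the directory.
boot_mode() {
    efi_dir=${1:-/sys/firmware/efi}
    if [ -d "$efi_dir" ]; then
        echo "UEFI"
    else
        echo "BIOS"
    fi
}

boot_mode
```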
NOTE: some of the report data below is from another system, which was run with both the non-backports kernel and the backports kernel (no difference in symptoms)
Ref: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=892754
dmesg:
[ 15.915748] systemd[1]: sddm.service: Failed with result 'signal'.
[ 17.214465] systemd[1]: sddm.service: Main process exited, code=killed, status=6/ABRT
[ 17.216271] systemd[1]: sddm.service: Failed with result 'signal'.
[ 18.447283] systemd[1]: sddm.service: Main process exited, code=killed, status=6/ABRT
[ 18.449123] systemd[1]: sddm.service: Failed with result 'signal'.
[ 19.700575] systemd[1]: sddm.service: Start request repeated too quickly.
[ 19.701925] systemd[1]: Failed to start Simple Desktop Display Manager.
[ 19.703344] systemd[1]: sddm.service: Failed with result 'signal'.
hostname:.../share/doc/NVIDIA_GLX-1.0# grep NULL /var/log/Xorg.0.log
[ 13.286] (II) NVIDIA(0): nvCommonPlatformProbe: Device is NULL
[ 13.286] (II) NVIDIA(0): nvCommonPlatformProbe: Device is NULL
The following may be irrelevant
hostname:/etc/default/grub.d# dmesg | grep BAR\.\*size
[ 4.401542] pci 10000:00:02.0: BAR 13: no space for [io size 0xb000]
[ 4.401544] pci 10000:00:02.0: BAR 13: failed to assign [io size 0xb000]
[ 4.401546] pci 10000:00:03.0: BAR 13: no space for [io size 0xc000]
[ 4.401547] pci 10000:00:03.0: BAR 13: failed to assign [io size 0xc000]
[ 4.401551] pci 10000:00:02.0: BAR 13: no space for [io size 0xb000]
[ 4.401552] pci 10000:00:02.0: BAR 13: failed to assign [io size 0xb000]
[ 4.401554] pci 10000:00:03.0: BAR 13: no space for [io size 0xc000]
[ 4.401556] pci 10000:00:03.0: BAR 13: failed to assign [io size 0xc000]
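The 10000: prefix in the addresses above is the PCI domain, and it does not fit in 16 bits. If the "PCI address space size issue" is about domain width (an assumption, not confirmed in this report), listing the devices with such domains may help tell affected machines apart. A rough sketch; the function names are hypothetical:

```shell
#!/bin/sh
# Succeed if the PCI address in $1 (domain:bus:dev.fn) has a domain
# larger than 0xffff, i.e. one that needs more than 16 bits.
pci_domain_is_wide() {
    domain=${1%%:*}    # hex text before the first ':'
    [ "$((0x$domain))" -gt 65535 ]
}

# Print every device address under the sysfs directory in $1
# (default: /sys/bus/pci/devices) whose domain is wider than 16 bits.
list_wide_domain_devices() {
    sysfs_dir=${1:-/sys/bus/pci/devices}
    for dev in "$sysfs_dir"/*; do
        [ -e "$dev" ] || continue
        addr=${dev##*/}
        if pci_domain_is_wide "$addr"; then
            echo "$addr"
        fi
    done
}

list_wide_domain_devices
```

On the system above, this would be expected to list the 10000:00:02.0 and 10000:00:03.0 bridges but not the GPU at 0000:65:00.0.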
card is there:
hostname:~# nvidia-smi -L
GPU 0: Quadro P400 (UUID: GPU-10d81f86-****-****-****-********cdad)
NOTE: '*' used to sanitize
device info is not properly enumerated:
hostname:~# cat /proc/driver/nvidia/gpus/*/information
Model: Unknown
IRQ: 81
GPU UUID: GPU-????????-????-????-????-????????????
Video BIOS: ??.??.??.??.??
Bus Type: PCIe
DMA Size: 36 bits
DMA Mask: 0xfffffffff
Bus Location: 0000:65:00.0
Device Minor: 0
Blacklisted: No
# NOTE: the '?' above are literal (not sanitization replacements)
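The same check can be scripted for several machines. A small sketch that flags any GPU whose information file reports "Model: Unknown"; the function name is made up, and treating that one line as the marker for a failed enumeration is an assumption based only on the output above:

```shell
#!/bin/sh
# Succeed if the information file in $1 reports an unidentified GPU.
gpu_info_is_unenumerated() {
    grep -Eq '^Model:[[:space:]]*Unknown' "$1"
}

# Walk the per-GPU files exposed by the nVidia kernel module, if any.
for info in /proc/driver/nvidia/gpus/*/information; do
    [ -r "$info" ] || continue
    if gpu_info_is_unenumerated "$info"; then
        echo "not enumerated: $info"
    fi
done
```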
I am using the following (flawed) script to work around this problem and get my
desktop systems functional (it fixes the problem, and the X11 + sddm
system works as expected):
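dl_file="http://ftp.us.debian.org/debian/pool/main/libp/libpciaccess/libpciaccess0_0.14-1_amd64.deb"
repl_file="/usr/lib/x86_64-linux-gnu/libpciaccess.so.0.11.1"
# extract relative to / so that .${repl_file} lands on the installed library
cd / || exit 1
# keep a copy of the distribution library
# (prefix with __ to keep ldconfig from symlinking SONAME of old
# library)
cp "${repl_file}" "${repl_file%/*}/__${repl_file##*/}.DIST" || exit 1
# -f makes curl fail on HTTP errors instead of piping an error page onward
if curl -fsS "${dl_file}" \
    | dpkg-deb --fsys-tarfile - \
    | tar xvf - ".${repl_file}"
then
    ldconfig
else
    # $? here is the exit status of the last command in the pipeline (tar)
    echo "Something bad happened. rc=$?" >&2
    exit 1
fi
echo "YOU SHOULD REBOOT NOW"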
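Before rebooting, it may be worth confirming that the library was actually replaced. A minimal sketch that compares the installed file with the backup the script saves; the paths follow the script above, and the function name library_was_replaced is hypothetical:

```shell
#!/bin/sh
# Paths as used by the workaround script; the backup name is built the same way.
repl_file="/usr/lib/x86_64-linux-gnu/libpciaccess.so.0.11.1"
backup_file="${repl_file%/*}/__${repl_file##*/}.DIST"

# Succeed only if both files exist and their contents differ.
library_was_replaced() {
    [ -f "$1" ] && [ -f "$2" ] && ! cmp -s "$1" "$2"
}

if library_was_replaced "$repl_file" "$backup_file"; then
    echo "replaced: installed library differs from the saved original"
else
    echo "not replaced (or backup missing)"
fi
```

Copying the .DIST file back over the installed library and running ldconfig again should undo the workaround.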
Thanks,
--stephen