
Bug#953066: libpciaccess0: nVidia driver finds no devices, sddm dies with ABRT



Package: libpciaccess0
Version: 0.13.4-1+b2
Severity: important


-- System Information:
Debian Release: 9.12
  APT prefers oldstable-updates
  APT policy: (500, 'oldstable-updates'), (500, 'oldstable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.19.0-0.bpo.6-amd64 (SMP w/40 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages libpciaccess0 depends on:
ii  libc6   2.24-11+deb9u4
ii  zlib1g  1:1.2.8.dfsg-5

libpciaccess0 recommends no packages.

Versions of packages libpciaccess0 suggests:
ii  pciutils  1:3.5.2-1

-- no debconf information

SUMMARY:
  - This is fixed by manually replacing the distribution's .so library
    with the one from libpciaccess0 0.14-1.  It appears to be the PCI
    address space size issue (see the bug reference below).
  - This bug seems fairly major; is there any chance the fix will make
    it into the stretch distribution?
  - It may only affect our systems that use UEFI boot: we have some
    BIOS-boot systems with similar configurations that don't have this
    problem (though I am not entirely sure the hardware is all similar).

NOTE: some of the report data below is from another system, which was tested with both the non-backports kernel and the backports kernel (no difference in symptoms).

Ref: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=892754

dmesg:
[   15.915748] systemd[1]: sddm.service: Failed with result 'signal'.
[   17.214465] systemd[1]: sddm.service: Main process exited, code=killed, status=6/ABRT
[   17.216271] systemd[1]: sddm.service: Failed with result 'signal'.
[   18.447283] systemd[1]: sddm.service: Main process exited, code=killed, status=6/ABRT
[   18.449123] systemd[1]: sddm.service: Failed with result 'signal'.
[   19.700575] systemd[1]: sddm.service: Start request repeated too quickly.
[   19.701925] systemd[1]: Failed to start Simple Desktop Display Manager.
[   19.703344] systemd[1]: sddm.service: Failed with result 'signal'. 

hostname:.../share/doc/NVIDIA_GLX-1.0# grep NULL /var/log/Xorg.0.log
[    13.286] (II) NVIDIA(0): nvCommonPlatformProbe: Device is NULL
[    13.286] (II) NVIDIA(0): nvCommonPlatformProbe: Device is NULL 

The following may be irrelevant:
hostname:/etc/default/grub.d# dmesg | grep BAR\.\*size
[    4.401542] pci 10000:00:02.0: BAR 13: no space for [io  size 0xb000]
[    4.401544] pci 10000:00:02.0: BAR 13: failed to assign [io  size 0xb000]
[    4.401546] pci 10000:00:03.0: BAR 13: no space for [io  size 0xc000]
[    4.401547] pci 10000:00:03.0: BAR 13: failed to assign [io  size 0xc000]
[    4.401551] pci 10000:00:02.0: BAR 13: no space for [io  size 0xb000]
[    4.401552] pci 10000:00:02.0: BAR 13: failed to assign [io  size 0xb000]
[    4.401554] pci 10000:00:03.0: BAR 13: no space for [io  size 0xc000]
[    4.401556] pci 10000:00:03.0: BAR 13: failed to assign [io  size 0xc000] 
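
The `10000:` prefix in the lines above is a PCI domain number that does not fit in 16 bits. A minimal sketch for listing such devices from sysfs follows. It assumes that wide domains are relevant to this bug, which the report does not confirm; the helper name `domain_exceeds_16bit` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a PCI address (DOMAIN:BUS:DEV.FN, hex)
# has a domain number larger than 0xffff.
domain_exceeds_16bit() {
    domain="${1%%:*}"                 # keep everything before the first ':'
    [ "$((0x${domain}))" -gt 65535 ]
}

# Walk the sysfs device list; the glob stays literal if the directory
# is missing (for example in a container), so skip non-existent entries.
for dev in /sys/bus/pci/devices/*; do
    [ -e "$dev" ] || continue
    addr="${dev##*/}"
    if domain_exceeds_16bit "$addr"; then
        echo "wide PCI domain: $addr"
    fi
done
```

On a system where this prints nothing, every device sits in a 16-bit domain, so the BAR messages above would more likely be unrelated.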

card is there:

hostname:~# nvidia-smi -L
GPU 0: Quadro P400 (UUID: GPU-10d81f86-****-****-****-********cdad) 
NOTE: '*' used to sanitize

device info is not properly enumerated:
hostname:~# cat /proc/driver/nvidia/gpus/*/information
Model:           Unknown
IRQ:             81
GPU UUID:        GPU-????????-????-????-????-????????????
Video BIOS:      ??.??.??.??.??
Bus Type:        PCIe
DMA Size:        36 bits
DMA Mask:        0xfffffffff
Bus Location:    0000:65:00.0
Device Minor:    0
Blacklisted:     No 
# NOTE: the '?' above are literal (not sanitization replacements)
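
To check several machines for the same symptom, something like the following sketch could be used. It only detects the placeholder values shown above (Model "Unknown", a UUID made of '?') and says nothing about their cause; the helper name `gpu_info_unenumerated` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given NVIDIA "information" file
# still contains placeholder values instead of real device data.
gpu_info_unenumerated() {
    grep -Eq '^(Model:[[:space:]]+Unknown|GPU UUID:[[:space:]]+GPU-\?{8})' "$1"
}

# Check every GPU the kernel driver exposes; skip if none are readable.
for f in /proc/driver/nvidia/gpus/*/information; do
    [ -r "$f" ] || continue
    if gpu_info_unenumerated "$f"; then
        echo "not fully enumerated: $f"
    fi
done
```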

I am using the following (flawed) script to work around this problem and
get my desktop systems functional (it fixes the problem, and X11 + sddm
works as expected):

	dl_file="http://ftp.us.debian.org/debian/pool/main/libp/libpciaccess/libpciaccess0_0.14-1_amd64.deb"
	repl_file="/usr/lib/x86_64-linux-gnu/libpciaccess.so.0.11.1"
	cd / || exit 1
	# Keep a copy of the distribution library.  (Prefix with __ to keep
	# ldconfig from symlinking SONAME of old library.)
	cp "${repl_file}" "${repl_file%/*}/__${repl_file##*/}.DIST" || exit 1
	# Extract only the library from the package, relative to /.
	# -f makes curl fail on HTTP errors instead of piping an error page.
	if curl -fsS "${dl_file}" \
	    | dpkg-deb --fsys-tarfile - \
	    | tar xvf - ".${repl_file}"
	then
	    ldconfig
	else
	    echo "Something bad happened.  rc=$?"
	    exit 1
	fi
	echo "YOU SHOULD REBOOT NOW"
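
For completeness, a sketch of undoing the workaround from the `.DIST` copy the script above leaves behind. The function names `backup_path` and `restore_dist_library` are hypothetical, and this has not been run on the affected systems.

```shell
#!/bin/sh
# Hypothetical helper: derive the backup name used by the workaround
# script, i.e. same directory, "__" prepended and ".DIST" appended.
backup_path() {
    printf '%s/__%s.DIST\n' "${1%/*}" "${1##*/}"
}

# Copy the saved distribution library back over the replacement and
# refresh the linker cache.  Must be run as root.
restore_dist_library() {
    bak="$(backup_path "$1")"
    [ -f "$bak" ] || { echo "no backup at $bak" >&2; return 1; }
    cp -p "$bak" "$1" && ldconfig
}

# Usage (as root):
#   restore_dist_library /usr/lib/x86_64-linux-gnu/libpciaccess.so.0.11.1
```

Nothing is changed until `restore_dist_library` is called explicitly, so the block is safe to source first and inspect.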

Thanks,
--stephen

