
Re: Trouble with nvidia drivers in Debian 12 Bookworm



Solved my own problem: the machine is running the cloud kernel
(6.1.0-10-cloud-amd64, per the log below), so I had to run `apt install
linux-headers-cloud-amd64` instead of `apt install
linux-headers-amd64`.
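
For anyone hitting the same thing, here is a rough sketch of how the right
headers package could be picked from the running kernel's release string.
The function name `headers_metapackage` is made up for illustration, and the
mapping only covers the amd64 flavors I'm aware of (generic, cloud, rt); treat
it as an assumption, not a complete list.

```shell
#!/bin/sh
# Hypothetical helper (not part of any Debian tool): map a kernel release
# string, as printed by `uname -r`, to the matching headers metapackage.
headers_metapackage() {
    case "$1" in
        *-cloud-amd64) echo "linux-headers-cloud-amd64" ;;
        *-rt-amd64)    echo "linux-headers-rt-amd64" ;;
        *-amd64)       echo "linux-headers-amd64" ;;
        # Anything else: fall back to the exact versioned package name.
        *)             echo "linux-headers-$1" ;;
    esac
}

# Example with the release string from the log in this thread:
headers_metapackage "6.1.0-10-cloud-amd64"
```

In practice this would be used as
`apt install "$(headers_metapackage "$(uname -r)")"`. Installing the exact
versioned package with `apt install linux-headers-$(uname -r)` should also
work, though the metapackage has the advantage of following kernel upgrades.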

On Thu, Jul 13, 2023 at 2:28 PM Sam Clearman <sam@samclearman.com> wrote:
>
> Hi,
> I'm trying to get a Tesla T4 working under Debian 12.
>
> So far I've tried two approaches:
> 1. Using the Debian-provided drivers, per
> https://wiki.debian.org/NvidiaGraphicsDrivers
> 2. Using the NVIDIA-provided drivers installed via runfile, per
> https://docs.nvidia.com/datacenter/tesla/tesla-installation-notes/index.html
>
> For 1 (installing the drivers from the Debian non-free repository),
> everything seems to install fine, but the drivers don't load properly.
> systemctl reports the following:
>
> $ systemctl status systemd-modules-load
> × systemd-modules-load.service - Load Kernel Modules
>      Loaded: loaded (/lib/systemd/system/systemd-modules-load.service; static)
>      Active: failed (Result: exit-code) since Thu 2023-07-13 21:05:08
> UTC; 18min ago
>        Docs: man:systemd-modules-load.service(8)
>              man:modules-load.d(5)
>     Process: 220 ExecStart=/lib/systemd/systemd-modules-load
> (code=exited, status=1/FAILURE)
>    Main PID: 220 (code=exited, status=1/FAILURE)
>         CPU: 29ms
>
> Jul 13 21:05:08 localhost systemd-modules-load[226]: modprobe: ERROR:
> could not insert 'nvidia': Invalid argument
> Jul 13 21:05:08 localhost systemd-modules-load[230]: modprobe: FATAL:
> Module nvidia-current-modeset not found in directory
> /lib/modules/6.1.0-10-cloud-amd64
> Jul 13 21:05:08 localhost systemd-modules-load[223]: modprobe: ERROR:
> ../libkmod/libkmod-module.c:1047 command_do() Error running install
> command 'modprobe nvidia ; modprobe -i nvidia-current-modeset ' for m>
> Jul 13 21:05:08 localhost systemd-modules-load[223]: modprobe: ERROR:
> could not insert 'nvidia_modeset': Invalid argument
> Jul 13 21:05:08 localhost systemd-modules-load[232]: modprobe: FATAL:
> Module nvidia-current-drm not found in directory
> /lib/modules/6.1.0-10-cloud-amd64
> Jul 13 21:05:08 localhost systemd-modules-load[220]: Error running
> install command 'modprobe nvidia-modeset ; modprobe -i
> nvidia-current-drm ' for module nvidia_drm: retcode 1
> Jul 13 21:05:08 localhost systemd-modules-load[220]: Failed to insert
> module 'nvidia_drm': Invalid argument
> Jul 13 21:05:08 localhost systemd[1]: systemd-modules-load.service:
> Main process exited, code=exited, status=1/FAILURE
> Jul 13 21:05:08 localhost systemd[1]: systemd-modules-load.service:
> Failed with result 'exit-code'.
> Jul 13 21:05:08 localhost systemd[1]: Failed to start
> systemd-modules-load.service - Load Kernel Modules.
>
> When I try to use the runfile (specifically, this file:
> https://us.download.nvidia.com/tesla/535.54.03/NVIDIA-Linux-x86_64-535.54.03.run)
> it is unable to read the kernel headers that I have installed. If I
> don't specify a location, it says it can't find them; no matter which
> location I specify, it finds something unexpected about what's there.
>
> Any help is appreciated!
>
> PS: Secure Boot is disabled; I get the following from mokutil:
> $ mokutil --sb-state
> SecureBoot disabled

