
Re: sbuild schroot setup with amdgpu support



Hi Mo,

On 2022-06-06 00:23, M. Zhou wrote:
> To use nvidia cards inside a docker container, one
> has to map devices from the host to the container:
>
> docker run -d -p6666:22 \
>   --device=/dev/nvidia0:/dev/nvidia0 \
>   --device=/dev/nvidiactl:/dev/nvidiactl \
>   --device=/dev/nvidia-modeset:/dev/nvidia-modeset \
>   --device=/dev/nvidia-uvm:/dev/nvidia-uvm \
>   --device=/dev/nvidia-uvm-tools:/dev/nvidia-uvm-tools
>
> I believe the mapping should be similar for AMD cards.

The magic incantation I use for AMD cards is:

docker run -it \
    --device=/dev/dri \
    --device=/dev/kfd \
    --security-opt seccomp=unconfined \
    --group-add video \
    --group-add render
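
As a quick sanity check that the devices actually show up inside the
container, you can run rocminfo directly. The rocm/rocm-terminal image
name below is just for illustration; any image with the ROCm userland
installed should behave the same:

docker run -it \
    --device=/dev/dri \
    --device=/dev/kfd \
    --security-opt seccomp=unconfined \
    --group-add video \
    --group-add render \
    rocm/rocm-terminal \
    /opt/rocm/bin/rocminfo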

> One thing to remember is that the container (schroot)
> user space has to match the version of the host kernel module.
> At least nvidia-smi requires a matching version.
>
> I don't know whether ROCm requires the same.

I do all my development and testing inside docker containers with a wide variety of ROCm versions. The interface between the kernel module and userland is very stable. I think rocgdb might emit a warning that it needs newer kernel functionality when you pair a recent ROCm release with an older driver, but for most components you would never even notice.

I've never encountered a problem related to mixing the version of rock-dkms or amdgpu-dkms installed on my host and the version of ROCm in the container. I have limited experience with the built-in kernel driver, but I would expect its interface to be at least as stable as the dkms modules.
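
If you want to compare the two sides yourself, something like the
following should do it. Take the paths as assumptions about a typical
packaged ROCm install rather than guarantees; the in-tree amdgpu module
may not report a version field at all:

# on the host: version of the kernel module
modinfo amdgpu | grep -i ^version
dkms status | grep amdgpu

# inside the container: version of the ROCm userland
cat /opt/rocm/.info/version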

> If that's true, then basically it requires the build
> machine to be Debian unstable, while our infrastructure
> only runs Debian stable, with at most backported packages.

I don't think there's any issue in general with a mismatch. To me, the main question is whether Linux 5.10 is new enough to have all the features that ROCm requires.
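
For what it's worth, a quick way to see what a given host kernel
exposes (the sysfs layout below is what amdkfd normally provides, so
treat it as an assumption):

uname -r                                # kernel version on the build machine
ls -l /dev/kfd                          # the compute device ROCm talks to
ls /sys/class/kfd/kfd/topology/nodes/   # GPU/CPU nodes the driver enumerates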

Sincerely,
Cory Bloor
