
Re: sbuild schroot setup with amdgpu support



Thanks for sharing this.  I have had a similar experience, though not
under schroot.  To use nvidia cards inside a docker container, one
has to map the devices from the host into the container:

docker run -d -p6666:22 \
  --device=/dev/nvidia0:/dev/nvidia0 \
  --device=/dev/nvidiactl:/dev/nvidiactl \
  --device=/dev/nvidia-modeset:/dev/nvidia-modeset \
  --device=/dev/nvidia-uvm:/dev/nvidia-uvm \
  --device=/dev/nvidia-uvm-tools:/dev/nvidia-uvm-tools

I believe the mapping should be similar for AMD cards.
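
For AMD cards and ROCm, a rough sketch of the equivalent invocation
(the image name here is only a placeholder) would be along these lines:

docker run -it \
  --device=/dev/kfd \
  --device=/dev/dri \
  --group-add video \
  <some-rocm-image>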

One thing to remember is that the container (or schroot) user space
has to match the version of the host kernel module; at least
nvidia-smi requires matching versions.
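
With the nvidia stack, one quick way to spot such a mismatch (just a
sanity check, not specific to schroot) is to compare the version
reported by the kernel with the one reported by the user space tools:

# kernel module version as loaded on the host
cat /proc/driver/nvidia/version
# user space driver version as seen by the tools
nvidia-smi --query-gpu=driver_version --format=csv,noheader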

I don't know whether ROCm requires the same.  If it does, then it
basically requires the build machine to run Debian unstable, while
our infrastructure only runs Debian stable, with at most backported
packages.

On Fri, 2022-06-03 at 23:19 +0200, Étienne Mollier wrote:
> Hi all,
> 
> TL;DR: if one wishes to expose a gpu to an sbuild environment, to
> run the build time test suite in almost the same configuration
> as on buildd, one quick and easy, though perhaps not very safe,
> way to do it is to bind mount the whole /dev into schroots via
> /etc/schroot/default/fstab.  For a more selective approach, see
> the next paragraphs.
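> 
> For reference, such a blanket binding would be a single line in
> /etc/schroot/default/fstab, of the same shape as the entries
> shown further below:
> 
> 	/dev            /dev            none    rw,bind         0       0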
> 
> In rocm-hipamd, I adjusted the test suite to be triggered only
> if there is an amd gpu available[1].  For the moment, I only
> check for the existence of the /dev/kfd character device, but
> perhaps I should be more diligent in the verification?  Anyway,
> for the tests to trigger, the build environment must expose said
> amdgpu, which is not the default behavior of the classical
> sbuild based environment used on buildd servers, I believe.
> 
> [1]: https://salsa.debian.org/rocm-team/rocm-hipamd/-/commit/6da7b85a08a142c5ac38cd856a3572a2e01168e5
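> 
> As a rough illustration of the idea (not necessarily the exact
> form used in the commit above), such a guard in debian/rules
> boils down to something like:
> 
> 	# debian/rules fragment (illustrative)
> 	override_dh_auto_test:
> 		if [ -c /dev/kfd ]; then \
> 			dh_auto_test; \
> 		else \
> 			echo "No amdgpu device available, skipping tests."; \
> 		fi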
> 
> To run the build, I'm using sbuild with an schroot slightly
> adjusted to expose the gpu device, and only that device.  Below
> is a sample of my /etc/schroot/schroot.conf; note the profile
> set to amdgpu:
> 
> 	# schroot.conf(5)
> 	[sid-amd64-amdgpu]
> 	aliases=amdgpu
> 	description=Debian sid (unstable) with amdgpu binding
> 	type=directory
> 	directory=/mnt/archs/sid-amd64
> 	union-type=overlay
> 	users=emollier
> 	groups=sbuild
> 	root-groups=sbuild
> 	profile=amdgpu
> 
> This profile is defined partly in /etc/schroot/amdgpu/.  I made
> /etc/schroot/amdgpu/copyfiles as a simple copy of the same file
> from the default profile, and did the same for nssdatabases.
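> 
> In practice that amounts to something like (a sketch, assuming
> the default profile lives in /etc/schroot/default/):
> 
> 	mkdir -p /etc/schroot/amdgpu
> 	cp /etc/schroot/default/copyfiles /etc/schroot/amdgpu/copyfiles
> 	cp /etc/schroot/default/nssdatabases /etc/schroot/amdgpu/nssdatabases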
> 
> Regarding /etc/schroot/amdgpu/fstab, it also includes a binding
> pulling the cards in via /dev/dri/, but this could instead be
> exposed by script if fine-grained selection of gpus were needed,
> I think:
> 
> 	# fstab: static file system information for chroots.
> 	# Note that the mount point will be prefixed by the chroot path
> 	# (CHROOT_PATH)
> 	#
> 	# <file system> <mount point>   <type>  <options>       <dump>  <pass>
> 	/proc           /proc           none    rw,bind         0       0
> 	/sys            /sys            none    rw,bind         0       0
> 	/dev/pts        /dev/pts        none    rw,bind         0       0
> 	tmpfs           /dev/shm        tmpfs   defaults        0       0
> 	/dev/dri        /dev/dri        none    rw,bind         0       0
> 
> Finally, to pull in /dev/kfd but not the rest of /dev, I added
> the script /etc/schroot/setup.d/15mkkfd below; it basically
> looks up what the real /dev/kfd character device looks like on
> the host, then rebuilds it inside the schroot environment:
> 
> 	#!/bin/sh
> 	set -e
> 	
> 	. "$SETUP_DATA_DIR/common-data"
> 	. "$SETUP_DATA_DIR/common-functions"
> 	. "$SETUP_DATA_DIR/common-config"
> 	
> 	mkamdgpunod () {
> 		# We can bind mount /dev/dri/, so it is part of the profile's fstab.
> 		# However, the node /dev/kfd must be reconstructed from the host's.
> 		AMDGPU_DEV_DIR="$CHROOT_MOUNT_LOCATION/dev"
> 		AMDGPU_KFD_MAJOR="$(stat --format 0x%t /dev/kfd)"
> 		AMDGPU_KFD_MINOR="$(stat --format 0x%T /dev/kfd)"
> 		AMDGPU_KFD_MODE="$(stat --format 0%a /dev/kfd)"
> 		AMDGPU_KFD_USER="$(stat --format %U /dev/kfd)"
> 		AMDGPU_KFD_GROUP="$(stat --format %G /dev/kfd)"
> 		mknod "$AMDGPU_DEV_DIR/kfd" --mode "$AMDGPU_KFD_MODE" \
> 			c "$AMDGPU_KFD_MAJOR" "$AMDGPU_KFD_MINOR"
> 		chown "$AMDGPU_KFD_USER:$AMDGPU_KFD_GROUP" "$AMDGPU_DEV_DIR/kfd"
> 		unset AMDGPU_KFD_GROUP AMDGPU_KFD_USER AMDGPU_KFD_MODE
> 		unset AMDGPU_KFD_MINOR AMDGPU_KFD_MAJOR AMDGPU_DEV_DIR
> 	}
> 	
> 	if [ "$CHROOT_PROFILE" = "amdgpu" ] && [ "$STAGE" = "setup-start" ]
> 	then test ! -c /dev/kfd || mkamdgpunod
> 	fi
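> 
> To check that the devices actually show up in the chroot, one
> can for instance run (a quick sanity check, nothing more):
> 
> 	$ schroot -c sid-amd64-amdgpu -- ls -l /dev/kfd /dev/dri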
> 
> Once all this is set up, and assuming an otherwise working sbuild
> configuration, I trigger builds of a given package by choosing
> the amdgpu specific schroot with the -c flag:
> 
> 	$ sbuild -c sid-amd64-amdgpu rocm-hipamd_5.0.0-1~exp1.dsc
> 
> I don't consider myself an schroot guru, so I would welcome any
> improvements one might see.  As revealed by tests in the past
> few weeks, the test suite may impede the display of the host, or
> trigger kernel tainting, so I suppose exposing only an auxiliary
> gpu might be more prudent, at least initially, hence the possible
> need for selective device reconstruction below /dev/dri/; a
> sketch of what that could look like follows.
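> 
> Such a selective variant would drop the /dev/dri line from the
> profile's fstab and extend 15mkkfd with something along these
> lines (a sketch only; the render node path is a placeholder for
> whichever auxiliary gpu should be exposed):
> 
> 	mkdrinod () {
> 		# Recreate a single render node instead of binding all of /dev/dri.
> 		AMDGPU_DRI_DIR="$CHROOT_MOUNT_LOCATION/dev/dri"
> 		AMDGPU_DRI_NODE="/dev/dri/renderD128"
> 		mkdir -p "$AMDGPU_DRI_DIR"
> 		mknod "$AMDGPU_DRI_DIR/${AMDGPU_DRI_NODE##*/}" \
> 			--mode "$(stat --format 0%a "$AMDGPU_DRI_NODE")" \
> 			c "$(stat --format 0x%t "$AMDGPU_DRI_NODE")" \
> 			  "$(stat --format 0x%T "$AMDGPU_DRI_NODE")"
> 		chown "$(stat --format %U:%G "$AMDGPU_DRI_NODE")" \
> 			"$AMDGPU_DRI_DIR/${AMDGPU_DRI_NODE##*/}"
> 		unset AMDGPU_DRI_NODE AMDGPU_DRI_DIR
> 	}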
> 
> The next step might be to wrap up some working autopkgtests for
> these packages, and thus determine a similar configuration for
> lxc, the Debian CI container type used for running autopkgtests.
> But lazy me has been sticking to testing packages in schroot
> instead of lxc so far, which is not the recommended way, so I
> don't have a procedure for that yet.
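> 
> For what it's worth, on the lxc side the analogous exposure would
> presumably go through the container configuration; a rough,
> untested sketch (the container name and the kfd major number are
> placeholders, to be checked on the host with stat) might read:
> 
> 	# /var/lib/lxc/<container>/config (hypothetical)
> 	# allow the DRM devices (char major 226) and the kfd device
> 	lxc.cgroup2.devices.allow = c 226:* rwm
> 	lxc.cgroup2.devices.allow = c <kfd-major>:* rwm
> 	lxc.mount.entry = /dev/dri dev/dri none bind,optional,create=dir 0 0
> 	lxc.mount.entry = /dev/kfd dev/kfd none bind,optional,create=file 0 0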
> 
> In hope this helps with setting up build-test environments,
> Have a nice day,  :)

