Hi Christian,

On 2025-05-12 10:54, Christian Kastner wrote:
> [3]: Sadly, after much trying, it seems that the analog for [1] in rootless containers, using the 'podman+rocm' backend, is not possible due to some cgroupsv2 restriction. However, I still have the code for that, and I guess I could ship it for people who want to try it in rootful containers.
I take it you are referring to setting environment variables in podman workers? The ROCR_VISIBLE_DEVICES variable can isolate the GPU at a fairly low level in the ROCm userland [2]. Or do you mean passing through only a subset of the devices? I forget how you were approaching this.
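To illustrate the environment variable route, here is a minimal sketch, not the code you already have. It assumes the worker launches its job as a child process; the helper name run_with_visible_gpus is hypothetical, and the child in the example only echoes the variable rather than running a real ROCm workload.

```python
import os
import subprocess
import sys


def run_with_visible_gpus(cmd, device_indices):
    """Run cmd with ROCR_VISIBLE_DEVICES restricted to the given GPU indices.

    Hypothetical helper for illustration only.
    """
    env = dict(os.environ)
    # ROCR_VISIBLE_DEVICES takes a comma-separated list of device indices.
    env["ROCR_VISIBLE_DEVICES"] = ",".join(str(i) for i in device_indices)
    return subprocess.run(cmd, env=env, check=True,
                          capture_output=True, text=True)


if __name__ == "__main__":
    # Stand-in child process: it only prints the variable it inherited.
    child = [sys.executable, "-c",
             "import os; print(os.environ['ROCR_VISIBLE_DEVICES'])"]
    print(run_with_visible_gpus(child, [0]).stdout.strip())
```

Note that this only hides devices from the ROCm runtime inside the process; a job that resets the variable could still see the other GPUs.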
In any case, isolation via rootful containers would probably be useful as an option. I'd like to limit Pinwheel and Arctophylax to a single GPU [3]. They're getting a fair bit of interactive use now and that would make it easier to share them. It's up to you, though.
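For the device pass-through variant, a rough sketch of what I have in mind follows; it may well differ from your approach. It only builds a podman command line that exposes /dev/kfd plus a single render node. The function name, the image name, and the choice of /dev/dri/renderD128 are assumptions for illustration; which render node corresponds to which GPU depends on the host.

```python
def podman_single_gpu_args(image, render_node="/dev/dri/renderD128"):
    """Build a podman argv that passes through a single GPU.

    Hypothetical helper: render_node must be chosen per host.
    """
    return [
        "podman", "run", "--rm",
        # /dev/kfd is the compute interface shared by all ROCm GPUs.
        "--device", "/dev/kfd",
        # Only this render node is exposed, so only one GPU is usable.
        "--device", render_node,
        image,
    ]


if __name__ == "__main__":
    # "example-image" is a placeholder, not a real image name.
    print(" ".join(podman_single_gpu_args("example-image")))
```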
Sincerely,
Cory Bloor

[2]: https://rocm.docs.amd.com/en/docs-6.4.0/conceptual/gpu-isolation.html
[3]: https://salsa.debian.org/rocm-team/community/team-project/-/wikis/Continuous-integration-workers