gpuenv-utils 0.1.4 uploaded: Powersaving / uptime management
Hi,
I've released a new version of gpuenv-utils to our APT repo [1].
It ships a new gpuenv-auto-uptime service [2] aimed at CI workers. Once
enabled, it will
(1) reboot a host if a GPU has become unresponsive
(2) shut down an idle host.
(1) should eliminate one of the last remaining reasons for manual
intervention, which is currently needed frequently because of a kernel
bug [3].
When coupled with a daily wake-up by RTC alarm via the BIOS, (2) should
reduce power consumption.
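As a rough illustration of how (1) and (2) fit together, here is a
minimal sketch in Python. This is not the actual gpuenv-auto-uptime
code: the names HostState, Action and decide, the two input fields, and
the 30-minute idle limit are all made up for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"
    REBOOT = "reboot"
    SHUTDOWN = "shutdown"


@dataclass
class HostState:
    # Both fields are hypothetical inputs; the real service may gather
    # this information differently.
    gpu_responsive: bool   # result of some GPU health check
    idle_seconds: float    # time since the last GPU job finished


def decide(state: HostState, idle_limit: float = 1800.0) -> Action:
    """Pick an action for the host; the idle limit is an assumed default."""
    # (1) An unresponsive GPU takes priority: reboot to recover it.
    if not state.gpu_responsive:
        return Action.REBOOT
    # (2) A healthy host that has been idle long enough is shut down.
    if state.idle_seconds >= idle_limit:
        return Action.SHUTDOWN
    return Action.NONE
```

In this sketch the unresponsive-GPU check comes first, so a hung host is
rebooted rather than powered off in a broken state.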
This only works if either the pre-test hook from debci_3.10+rocm3 or the
new gpuenv-aware execution driver from debci_3.10+rocm4 is used.
Otherwise, gpuenv-server will not be aware of GPU utilization.
Best,
Christian
[1]: https://apt.rocm.debian.net
[2]: https://salsa.debian.org/rocm-team/gpuenv-utils#services
[3]: https://salsa.debian.org/rocm-team/autopkgtest/-/issues/9