Re: Preparing Argo and Lyra for the CI (Was: Preparing Ursa and Lyra for the CI)



Hi folks,

Lyra is now functional and stable. This server has four workers, each with 8 CPU cores, 30 GiB of RAM and one MI25 GPU (Vega 10; gfx900). The PCIe passthrough works flawlessly with qemu, though I am using a workaround to enable "-cpu host" until a debci issue is resolved [1]. Lyra is quite slow, but the system is rock solid.

rocBLAS is failing due to an alarm timeout [2]. This appears to be because rocblas-test uses a timer to abort the program if a test does not complete fast enough, on the assumption that the test must be deadlocked. I'm tempted to patch the timer out of rocblas-test, since a deadlock would eventually be caught by the autopkgtest timeout anyway.
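
To illustrate the failure mode, here is a minimal Python sketch of an alarm-based watchdog of the general kind described above. It is not the rocblas-test code, which I have not reproduced here; the names run_with_watchdog and slow_but_healthy_test and the time budgets are hypothetical. It only shows how a test that is slow, but not deadlocked, gets treated as hung on a slow machine.

```python
import signal
import time


class WatchdogTimeout(Exception):
    """Raised when the alarm fires before the test finishes."""


def _on_alarm(signum, frame):
    # The watchdog cannot tell "slow" from "deadlocked"; it only
    # knows that the time budget has run out.
    raise WatchdogTimeout("test exceeded its time budget")


def run_with_watchdog(test_fn, budget_seconds):
    """Run test_fn, giving up if it takes longer than budget_seconds.

    Hypothetical helper for illustration only (Unix, main thread).
    """
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(budget_seconds)  # arm the timer
    try:
        test_fn()
        return "passed"
    except WatchdogTimeout:
        return "timed out"
    finally:
        signal.alarm(0)  # disarm the timer
        signal.signal(signal.SIGALRM, previous)


def slow_but_healthy_test():
    # Stands in for a test that would pass if given enough time.
    time.sleep(2)


if __name__ == "__main__":
    # With a 1 second budget, the healthy 2 second test is aborted.
    print(run_with_watchdog(slow_but_healthy_test, 1))
    # With a 5 second budget, the same test passes.
    print(run_with_watchdog(slow_but_healthy_test, 5))
```

An outer timeout, such as the one autopkgtest applies to the whole test run, would still catch a real deadlock, which is why removing the inner timer looks safe to me.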

Now that Lyra has stabilized, the gfx900 tests on the CI should be a useful resource for finding problems in ROCm packages on Debian.

Sincerely,
Cory Bloor

[1]: https://salsa.debian.org/rocm-team/debci/-/issues/6
[2]: https://ci.rocm.debian.net/data/autopkgtest/unstable/amd64+gfx900/r/rocblas/912/log.gz

