Preparing Argo and Lyra for the CI (Was: Preparing Ursa and Lyra for the CI)
Hi folks,
I've confirmed that the AMD FirePro S9300 X2 is a Fiji GPU. It predates
Polaris 10 (a.k.a. Ellesmere), so I felt "Ursa" was not appropriate.
Fiji is from the Pirate Islands series, so I've renamed "Ursa" to "Argo".
Argo and Lyra are now connected to the Debian ROCm CI and running jobs
for gfx803 and gfx900 [1]. There were some firmware troubles, so it was
a surprisingly long road to get to this point. Once the hardware was
working, Christian was of great help in getting the CI software
configured. The CI still lists many failures on gfx803 and gfx900, but
new builds should be working for some packages.
The rocfft test suite is one of the failures: it times out after five
hours [2]. These old servers have terrible single-thread performance, so
the suite takes a long time to run. The rocsparse,
rocblas, and rocsolver packages are also failing. Those tests crash with
the error "Illegal instruction" [3]. I've not yet determined the cause
of this problem, but it does not occur when the QEMU CPU model is
configured as pass-through. It's not clear to me why this problem is not
seen on the gfx1030 CI machine.
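One way to narrow this down might be to compare the CPU feature flags
the guest sees against those of the host, since "Illegal instruction"
typically means the binary executed an instruction that the (virtual)
CPU does not support. The sketch below is only an illustration, not
something taken from the CI setup: it assumes Linux x86 /proc/cpuinfo
text captured from both host and guest, and the function names are
hypothetical.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from the first 'flags' line.

    Assumes the Linux x86 /proc/cpuinfo layout ("flags : fpu vme ...").
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Match the key exactly so that e.g. "vmx flags" is not picked up.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def flags_missing_in_guest(host_cpuinfo, guest_cpuinfo):
    """Return, sorted, the flags the host advertises but the guest does not."""
    host_flags = parse_cpu_flags(host_cpuinfo)
    guest_flags = parse_cpu_flags(guest_cpuinfo)
    return sorted(host_flags - guest_flags)
```

Feeding it the contents of /proc/cpuinfo from the host and from a guest
using the non-pass-through CPU model would list the features hidden from
the guest. Whether any of them is the one the failing tests rely on
would still need to be checked separately.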
Argo and Lyra use a combined total of 450 W at idle, so I might shut
them down when the job queue is empty. I'm sure we can do something
clever with IPMI to only boot the systems when they're needed, but for
now I'll handle it manually.
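As a rough sketch of the IPMI idea, the logic could look something like
the following. This is an illustration under stated assumptions rather
than a description of any existing setup: the BMC hostname, the user
name, and the way the job count is obtained are all hypothetical. It
also assumes ipmitool is installed and that the BMC password is supplied
through the IPMI_PASSWORD environment variable (the -E option).

```python
import subprocess


def power_action(pending_jobs, powered_on):
    """Decide what to do: 'on', 'soft' (graceful shutdown), or None."""
    if pending_jobs > 0 and not powered_on:
        return "on"
    if pending_jobs == 0 and powered_on:
        return "soft"
    return None


def ipmitool_command(bmc_host, user, action):
    """Build the ipmitool argument list for a chassis power action.

    -E makes ipmitool read the password from IPMI_PASSWORD.
    """
    return ["ipmitool", "-I", "lanplus", "-H", bmc_host, "-U", user, "-E",
            "chassis", "power", action]


def apply_power_policy(bmc_host, user, pending_jobs, powered_on,
                       run=subprocess.run):
    """Run the needed ipmitool command, if any, and return the action taken."""
    action = power_action(pending_jobs, powered_on)
    if action is not None:
        run(ipmitool_command(bmc_host, user, action), check=True)
    return action
```

A timer could call apply_power_policy() periodically. The job count
passed in would need to include running jobs as well as queued ones, so
that a machine is not shut down in the middle of a test run.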
Sincerely,
Cory Bloor
[1]: https://ci.rocm.debian.net/
[2]: https://ci.rocm.debian.net/packages/r/rocfft/unstable/amd64+gfx900/
[3]: https://ci.rocm.debian.net/packages/r/rocblas/unstable/amd64+gfx900/