Hi Gard,

On 2024-02-10 06:16, Gard Spreemann wrote:
Hi, First of all: Thanks for your massive effort in maintaining ROCm in Debian, and contributing to saving all of us machine learning people from CUDA prison! I see that you also maintain ci.rocm.debian.net for CI testing of ROCm packages on real AMD hardware. I was wondering: is this service also available for OpenCL packages? If so, what steps are required? Sorry if I'm overlooking some obvious documentation somewhere.
I'd be thrilled to run tests for OpenCL packages on AMD GPU hardware. The ROCm CI system is up and running and providing useful feedback for GPU-enabled packages, but it's still very rough around the edges. It would benefit greatly from skilled developers using it and contributing improvements when they find shortcomings.
Christian Kastner is the expert on the CI system for ROCm packages, but I believe you need to:
1. Add a wrapper script that skips the test if not running on a node with a GPU available. You could consult the rocsparse test script for an example [1].
2. Ask Christian to add the package to the ci.rocm.debian.net watch list.
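To give a rough idea of step 1, here is a minimal sketch of such a wrapper. It is written in Python for illustration and is not copied from the rocsparse script. It assumes that the presence of /dev/kfd (the amdgpu compute device node) is a good enough signal that a GPU is available, and that the test declares "Restrictions: skippable" in debian/tests/control so that autopkgtest treats exit status 77 as a skip. The names run, gpu_available and KFD_NODE are hypothetical.

```python
#!/usr/bin/env python3
"""Hypothetical autopkgtest wrapper: skip unless a GPU device node exists."""
import os
import subprocess
import sys

# autopkgtest reports a test as skipped when it exits with 77 and the test
# declares "Restrictions: skippable" in debian/tests/control.
SKIP_EXIT_CODE = 77

# Assumption: /dev/kfd exists only when the amdgpu driver exposes a GPU.
KFD_NODE = "/dev/kfd"


def gpu_available(node=KFD_NODE):
    """Return True if the given device node is present."""
    return os.path.exists(node)


def run(test_cmd, node=KFD_NODE):
    """Run test_cmd if a GPU is present, otherwise return the skip status."""
    if not gpu_available(node):
        print(f"{node} not present; skipping GPU test", file=sys.stderr)
        return SKIP_EXIT_CODE
    # Propagate the real test's exit status unchanged.
    return subprocess.call(test_cmd)


# Only act when a test command was given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(run(sys.argv[1:]))
```

The wrapper would then be invoked with the real test command as its arguments, so the same test entry can run unchanged on nodes with and without a GPU.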
In my case, I'm thinking about clblast, which is currently only undergoing regular testing with the CPU-based OpenCL implementation POCL on ci.debian.net. Testing on real hardware would be great.
That would be good. AMD is no longer testing OpenCL on some of their older GPUs (e.g. Vega 10), so the community is going to have to test on that hardware if we want to have any chance of catching bugs when they are inevitably introduced.
Feel free to CC any reply to the appropriate list.
Done.
Best, Gard
Sincerely, Cory Bloor

[1]: https://salsa.debian.org/rocm-team/rocsparse/-/blob/debian/5.7.1-1/debian/tests/upstream-binaries?ref_type=tags