
Re: ROCm CI for OpenCL packages?



Hi Christian, Cordell.

Christian Kastner <ckk@debian.org> writes:

> Hi Gard,
>
> glad that you're interested in our CI!
>
> On 2024-02-10 21:25, Cordell Bloor wrote:
>> On 2024-02-10 06:16, Gard Spreemann wrote:
>> I'd be thrilled to run tests for OpenCL packages on AMD GPU hardware.
>> The ROCm CI system is up and running and providing useful feedback for
>> GPU-enabled packages, but it's still very rough around the edges. It
>> would benefit greatly from skilled developers using it and contributing
>> improvements when they find shortcomings.
>> 
>> Christian Kastner is the expert on the CI system for ROCm packages, but
>> I believe you need to:
>>
>> 1. Add a wrapper script that skips the test if not running on a node
>> with a GPU available. You could consult the rocsparse test script for
>> an example [1].
>
> Yep, that's correct. You'd need to add a new autopkgtest to src:clblast
> that checks if a GPU is available, and exits with code 77 if not. Feel
> free to use [1] that Cory linked.
>
> This will lead to the GPU-needing tests being skipped on the official
> ci.debian.net (where no GPUs are available), rather than failing.

Thanks. This is very helpful.
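
To check my understanding, here is a minimal sketch of such a wrapper in
Python. The device paths (/dev/kfd plus a renderD* node under /dev/dri) are
my own assumption about how to detect an AMD GPU, not something taken from
the rocsparse script, and the names gpu_available and main are made up for
illustration.

```python
#!/usr/bin/env python3
"""Sketch: run a test command only when a GPU appears to be present."""
import os
import subprocess
import sys

# Exit code that autopkgtest reports as "skipped" when the test
# declares "Restrictions: skippable".
SKIP_EXIT_CODE = 77

# Assumption: these device nodes indicate a usable AMD GPU.
KFD_DEVICE = "/dev/kfd"
DRI_DIR = "/dev/dri"


def gpu_available(kfd=KFD_DEVICE, dri=DRI_DIR):
    """Return True if the compute device and a render node both exist."""
    if not os.path.exists(kfd):
        return False
    try:
        entries = os.listdir(dri)
    except OSError:
        # Directory missing or unreadable: treat as "no GPU".
        return False
    return any(name.startswith("renderD") for name in entries)


def main(argv, detect=gpu_available):
    """Skip with code 77 if no GPU is found, else run argv as the test."""
    if not detect():
        print("No GPU detected; skipping test.", file=sys.stderr)
        return SKIP_EXIT_CODE
    if not argv:
        # Nothing to run; report success.
        return 0
    # Hypothetical: the real test command is passed on the command line.
    return subprocess.call(argv)


# In an actual test script, finish with:
#     sys.exit(main(sys.argv[1:]))
```

If I read the autopkgtest documentation correctly, the corresponding stanza
in debian/tests/control also needs "Restrictions: skippable" for exit code
77 to count as a skip rather than a failure.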


>> 2. Ask Christian to add the package to the ci.rocm.debian.net watch
>> list.
>
> Even easier: if any binary packages built by src:clblast depend on any
> ROCm library, they will be automatically tested on ci.rocm.debian.net.
>
> It's a bit trickier if there is no binary dependency (e.g. some dlopen
> approach), but that can be worked around. Please let me know if that is
> the case.
>
> Our CI is still under development and feedback from maintainers would be
> greatly appreciated, so please do share feature requests, if you have
> them.

Got it. Thanks.

Is it possible to be granted (temporary) access to this hardware for the
purpose of writing tests?

Finally: Are the OpenCL components of ROCm actually in Debian yet? I had
a very superficial go at writing a test using an AWS instance, but I
could only find the GPU exposed as a Clover platform. It was my
understanding that Clover has been deprecated? (I'm very impressed by your
massive ROCm efforts – I'm just confused about the relationships between
the various components, I think.)


Thanks.


 Best,
 Gard
