Re: RFS: rccl/5.3.3-2 -- ROCm Communication Collectives Library
On 2023-05-18 20:26, Cordell Bloor wrote:
> Hurrah. Another ROCm component on Unstable. I think I can faintly see the light at the end of the tunnel!
Indeed, well done!
> We might also want to look into packaging rccl-tests [1] to
> expand the test suite for this package, as there's only a limited
> selection of tests in the rccl repo. I'm not really sure what the best
> way to do that would be, though.
The common pattern seems to be to ship a -tests package:
$ apt-cache search '\-tests$'
I was sloppy and shipped rocrand's tests as -test (singular), which I'll
have to fix.
> We also need a multi-gpu system to test on to do a full verification,
> as it can really only run smoke tests on a single-gpu system.
Good point.
I've got some testing questions of my own to add to that. I'll
start a wiki page where we can draft a general testing strategy.
With only a month or so to go until unstable is unfrozen, I'd really
like to have a solid testing strategy in place before we start upgrading
components.
Best,
Christian