Re: RFS: rccl/5.3.3-2 -- ROCm Communication Collectives Library



On 2023-05-18 20:26, Cordell Bloor wrote:
> Hurrah. Another ROCm component on Unstable. I think I can faintly see the light at the end of the tunnel! 

Indeed, well done!

> We might also want to look into packaging rccl-tests [1] to
> expand the test suite for this package, as there's only a limited
> selection of tests in the rccl repo. I'm not really sure what the best
> way to do that would be, though.

The common pattern seems to be to ship a -tests package:

  $ apt-cache search '\-tests$'

I was sloppy and shipped rocrand's tests as -test (singular), which I'll
have to fix.

> We also need a multi-gpu system to test on to do a full verification,
> as it can really only run smoke tests on a single-gpu system.

Good point.

I've got some testing questions of my own to add to that. I'll
start a wiki page where we can draft a general testing strategy.

With only a month or so to go until unstable is unfrozen, I'd really
like to have a solid testing strategy in place before we start upgrading
components.

Best,
Christian

