
Re: rocBLAS on arm64 and ppc64el



Hi Wookey,

On 2023-07-18 15:52, Wookey wrote:
> On 2023-07-17 15:51 -0600, Cordell Bloor wrote:
>>  the Tensile build times out after two and a half hours
> 
> AIUI the roc* stuff is all about using GPU accelerators. So when these
> tests are run on our buildds (which in the arm64 case do not have AMD
> GPU cards fitted SFAIK; do the amd64 buildds have them?), what is
> actually being tested? Presumably fallback CPU-only paths? (which will
> be most of why it's very slow).

It's the *build* that times out. We are building the tests, for use on
individual machines, and for use with autopkgtests on specialized infra.

> 
> There is a whole range of GPU and NPU acceleration hardware out there,
> some of which is arch-specific, and some of which isn't. This is
> particularly relevant to debian-science and debian-ai packages, and we
> are presumably currently very limited in what we can test as a distro?

Correct, the official infra does not support this. [1] summarizes the
major problems.

To address these, we're setting up our own infra, with patched versions
of debci, autopkgtest, and so on, with the intention of feeding these
back once they're proven. (buildds have lower priority at the moment.)

Our classic Architecture field is insufficient to describe a lot of
modern computation. I think Policy will need to evolve so that it can
express GPUs, NPUs and other accelerators to a certain degree, and we
hope that our prototypes can help accomplish this.

> I wonder what subset of the possible combinations we should be trying
> to test, and what facilities exist for us to do that testing on. It's
> not practical for Debian's buildds to cover very much of this space,
> but perhaps 3rd parties could (or already do) provide hardware we
> could use for testing? e.g. if there is currently no place for random
> OSS projects to test things like GPUs on arm64 then ARM could probably
> be persuaded to do something about that. It might take a while, but
> knowing what was actually useful would be a good start.

If ARM could host something with an x16 slot in which to stick a GPU,
that would be *fantastic*, because mainboards like that seem to be
unobtanium for consumers.

We'd like to support arm64, but we have no way to test whether our work
there actually succeeds.

Best,
Christian

[1] https://lists.debian.org/debian-ai/2023/03/msg00038.html

