
Re: rocBLAS on arm64 and ppc64el



On Mon, 2023-07-17 at 15:51 -0600, Cordell Bloor wrote:
> Hello,
> 
> In my RFS for rocblas 5.5.1+dfsg-1~exp2, I ended up just disabling
> the 
> autopkgtests for arm64 and ppc64el rather than entirely disabling
> those 
> architectures.

Understandable. Note, though, that if you need access to arm64 or
ppc64el machines, you can apply for access to the Debian porterboxes,
as long as the work does not require an AMD GPU there.

> There are two reasons why the build is failing on those platforms:
> (a) 
> the rocm-hipamd bug with linking compiler_rt on ppc64el [1], and (b),
> the Tensile build times out after two and a half hours [2]. Problem A
> will be fixed by rocm-hipamd 5.2.3-11, but Problem B requires some
> thought.
> 
> I see three possible solutions:
> 
> 1. The timeout could be increased. I expect that the Tensile build
> would 
> eventually complete successfully.

Can we patch the program so that it prints something to the screen at
a regular interval (say, every 5 minutes)? On the buildds this is a
straightforward workaround for the inactivity timeout.
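
To illustrate the periodic-output idea, here is a minimal sketch of a
wrapper that runs a command and prints a line at a fixed interval until
the command exits. The name run_with_heartbeat and the message text are
made up for this example; nothing here is taken from Tensile or rocBLAS,
and the real fix might be better placed inside the build tool itself.

```python
import subprocess
import sys
import threading
import time


def run_with_heartbeat(cmd, interval=300.0, out=sys.stderr):
    """Run cmd and print a line every `interval` seconds until it exits.

    Hypothetical helper: returns the exit status of cmd.
    """
    done = threading.Event()

    def beat():
        start = time.monotonic()
        # Event.wait() returns False on timeout and True once `done` is set,
        # so the loop ends shortly after the command finishes.
        while not done.wait(interval):
            elapsed = int(time.monotonic() - start)
            print(f"still building ({elapsed}s elapsed)", file=out, flush=True)

    thread = threading.Thread(target=beat, daemon=True)
    thread.start()
    try:
        return subprocess.call(cmd)
    finally:
        done.set()
        thread.join()
```

The long-running build step would then be invoked through the wrapper,
for example run_with_heartbeat(["some-build-command"]), where the command
is a placeholder for whatever step goes silent.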

Also, is the program using too much swap space? Sometimes tests or
builds time out because swap is extremely slow.
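
One rough way to check the swap question on a Linux machine is to read
/proc/meminfo while the build is running. The sketch below assumes the
usual SwapTotal and SwapFree fields, reported in kB; the function name
swap_used_kb is made up for this example.

```python
def swap_used_kb(meminfo_text):
    """Return swap in use (kB) from the contents of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            # The first token after the colon is the numeric value.
            fields[key.strip()] = int(parts[0])
    return fields["SwapTotal"] - fields["SwapFree"]


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        print(f"swap in use: {swap_used_kb(f.read())} kB")
```

If this number keeps growing during the Tensile step, slow swap is a
plausible cause of the stall; if it stays near zero, the build is
probably just slow.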

> 2. The rocblas library could be built without Tensile on arm64 and
> ppc64el. Without Tensile, some parts of the rocBLAS API would be
> missing and the performance of the library would be greatly reduced.
> 3. We could limit the rocblas library to amd64.

I'd rank limiting the package to amd64 as the last resort.
Even disabling the tests on non-amd64 architectures is still
better than that.

