Re: rocBLAS on arm64 and ppc64el
On Mon, 2023-07-17 at 15:51 -0600, Cordell Bloor wrote:
> Hello,
>
> In my RFS for rocblas 5.5.1+dfsg-1~exp2, I ended up just disabling the
> autopkgtests for arm64 and ppc64el rather than entirely disabling those
> architectures.
Understandable. But note that if you need access to arm64 or ppc64el
machines, you can apply for access to the Debian porter boxes,
as long as we do not require an AMD GPU there.
> There are two reasons why the build is failing on those platforms: (a)
> the rocm-hipamd bug with linking compiler_rt on ppc64el [1], and (b)
> the Tensile build times out after two and a half hours [2]. Problem A
> will be fixed by rocm-hipamd 5.2.3-11, but Problem B requires some
> thought.
>
> I see three possible solutions:
>
> 1. The timeout could be increased. I expect that the Tensile build would
> eventually complete successfully.
Can we patch the program so that it prints something to the screen at
a regular interval (say, every 5 minutes)? For buildd, this is a
straightforward workaround for the timeout.
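
A minimal sketch of that idea, assuming the buildd timeout is triggered
by a lack of output (as the suggestion above implies). It wraps the
build command instead of patching Tensile itself; the name
run_with_heartbeat and the message text are hypothetical, not part of
rocBLAS or Tensile.

```python
import subprocess
import sys
import threading
import time


def run_with_heartbeat(cmd, interval=300.0, out=sys.stderr):
    """Run cmd and print a heartbeat line every `interval` seconds until it exits.

    Hypothetical helper: it only keeps the log active, it does not speed up
    the build. Returns the exit status of cmd.
    """
    done = threading.Event()
    start = time.monotonic()

    def beat():
        # Event.wait() returns False on timeout and True once `done` is set,
        # so the loop ends as soon as the build command has finished.
        while not done.wait(interval):
            elapsed = int(time.monotonic() - start)
            print(f"[heartbeat] still building after {elapsed}s",
                  file=out, flush=True)

    worker = threading.Thread(target=beat, daemon=True)
    worker.start()
    try:
        return subprocess.call(cmd)
    finally:
        done.set()
        worker.join()
```

A build script could then call run_with_heartbeat() with the long-running
build command as an argument list and pass the returned exit status on
to the caller.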
Also, is the program using too much swap space? Sometimes the tests
or the build time out because swap is extremely slow.
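
One way to check this on a Linux machine is to sample /proc/meminfo
while the build runs. This is only a sketch: the function names are
hypothetical, and it reports how much swap is in use, not which process
is using it.

```python
def swap_used_kib(meminfo_text):
    """Return swap in use (KiB), given the text of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        # Lines look like "SwapTotal:       8388604 kB".
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            fields[key.strip()] = int(parts[0])
    return fields["SwapTotal"] - fields["SwapFree"]


def read_swap_used_kib(path="/proc/meminfo"):
    """Read the current swap usage from the running Linux system."""
    with open(path) as f:
        return swap_used_kib(f.read())
```

If the value keeps growing while Tensile is compiling, the build is
probably memory-bound and fewer parallel jobs may help more than a
longer timeout.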
> 2. The rocblas library could be built without Tensile on arm64 and
> ppc64el. Without Tensile, some parts of the rocBLAS API would be
> missing and the performance of the library would be greatly reduced.
> 3. We could limit the rocblas library to amd64.
I'd rank "limiting to amd64-only" as the last resort.
Something like disabling tests on non-amd64 architectures is still
better than that.