Packaging hipblaslt in progress

To: debian-ai@lists.debian.org
Subject: Packaging hipblaslt in progress
From: Kari Pahula <kaol@debian.org>
Date: Sun, 10 Mar 2024 14:19:50 +0200
Message-id: <Ze2lZoKbenQoguUV@sammakko3.piperka.net>

I've been working on getting hipblaslt packaged (ITP #1064071). The
status right now is that it compiles, and I have been trying to get
the tests to run before seeing to the rest of the packaging flow. I
tried to jump to 6.0.2 first, but ROCm 5.7 wasn't quite ready for it,
so I stuck to 5.7 for now.

I've been using the rocblas package as a reference since the two have
a lot in common, especially the Tensile dependency. As with rocblas,
I had to patch hipblaslt to extend GPU compatibility so that it uses
the gfx1030 kernels, since I have a gfx1032. Running rocblas-test
works for me without failures, but what I have with hipblaslt fails a
lot.
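
For readers unfamiliar with this kind of patch, the substitution
amounts to a lookup from the detected GPU architecture to the
architecture whose kernels get loaded. The sketch below is purely
illustrative and is not the actual patch: the table KERNEL_FALLBACK
and the function kernel_arch_for are hypothetical names, and only the
gfx1032 to gfx1030 pair comes from this message.

```python
# Hypothetical illustration only; the real patch lives in the
# hipblaslt sources and may look quite different.
KERNEL_FALLBACK = {
    "gfx1032": "gfx1030",  # the pair described in the message
}


def kernel_arch_for(device_arch: str) -> str:
    """Return the architecture whose kernels should be used.

    Architectures without a fallback entry map to themselves.
    """
    return KERNEL_FALLBACK.get(device_arch, device_arch)


print(kernel_arch_for("gfx1032"))
print(kernel_arch_for("gfx1030"))
```

If a lookup like this is applied in one place but not in another (for
example at build time but not when kernels are selected at run time),
the patched and unpatched builds could behave alike.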

$ obj-x86_64-linux-gnu/clients/staging/hipblaslt-test --gtest_break_on_failure
hipBLASLt version: 300

Query device success: there are 1 devices
-------------------------------------------------------------------------------
Device ID 0 : AMD Radeon Pro W6600 gfx1032
with 8.6 GB memory, max. SCLK 2910 MHz, max. MCLK 875 MHz, compute capability 10.3
maxGridDimX 2147483647, sharedMemPerBlock 65.5 KB, maxThreadsPerBlock 1024, warpSize 32
-------------------------------------------------------------------------------
info: parsing of test data may take a couple minutes before any test output appears...

[==========] Running 8603 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 8558 tests from _/matmul_test
[ RUN ] _/matmul_test.matmul/pre_checkin_matmul_bad_arg_bad_arg
[ OK ] _/matmul_test.matmul/pre_checkin_matmul_bad_arg_bad_arg (109 ms)
[ RUN ] _/matmul_test.matmul/pre_checkin_matmul_bad_arg_bad_arg_t2
[ OK ] _/matmul_test.matmul/pre_checkin_matmul_bad_arg_bad_arg_t2 (0 ms)
[ RUN ] _/matmul_test.matmul/pre_checkin_matmul_bad_arg_bad_arg_t3
[ OK ] _/matmul_test.matmul/pre_checkin_matmul_bad_arg_bad_arg_t3 (0 ms)
[ RUN ] _/matmul_test.matmul/pre_checkin_alpha_beta_zero_NaN_f16_rf16_rf16_rf16_rf32_r_NN_256_128_64_nnan_256_64_nnan_256_256_1
clients/gtest/../include/unit.hpp:208: Failure
Expected equality of these values:
float(hCPU[i + j * size_t(lda) + k * strideA])
Which is: 0
float(hGPU[i + j * size_t(lda) + k * strideA])
Which is: 0.033782959

In all, 8492 tests failed. If you think it would be useful, I can
share the full output. I also tried with a gfx1036 (the Ryzen 9
7900X's integrated GPU), but that fared no better than Dimgrey
Cavefish (the gfx1032), except that it also got errors about
insufficient memory. I suspect my patch to substitute gfx1030 is
missing something, since the results look very similar to a run
without the patch at all.
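
The failing assertion compares a CPU reference buffer with the GPU
result element by element, using the index expression shown in the
output, i + j * lda + k * strideA. A minimal Python sketch of that
kind of check follows. The function name first_mismatch, the loop
order, and the tiny 2x2 input are assumptions made for illustration;
the real check is in clients/include/unit.hpp and may treat NaN or
tolerances differently.

```python
def first_mismatch(h_cpu, h_gpu, rows, cols, lda, stride, batch_count):
    """Return (i, j, k, cpu, gpu) for the first differing element.

    Buffers are flat, column-major, strided-batched: element (i, j)
    of batch k lives at index i + j * lda + k * stride.
    Returns None when every compared element is equal.
    """
    for k in range(batch_count):
        for j in range(cols):
            for i in range(rows):
                idx = i + j * lda + k * stride
                if h_cpu[idx] != h_gpu[idx]:
                    return (i, j, k, h_cpu[idx], h_gpu[idx])
    return None


# Made-up 2x2 single-batch buffers; the nonzero GPU value is the one
# from the failure message, placed at an arbitrary position.
h_cpu = [0.0, 0.0, 0.0, 0.0]
h_gpu = [0.0, 0.0, 0.033782959, 0.0]

print(first_mismatch(h_cpu, h_gpu, rows=2, cols=2, lda=2, stride=4,
                     batch_count=1))
```

A CPU value of exactly 0 next to a small nonzero GPU value in an
alpha/beta-zero test is at least consistent with the output buffer
not being overwritten as expected.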

Could someone who's more familiar with this have a look? Judging by
radeontop, the GPU is at least doing something when I run the
program.

As a second matter, do any of the data files in the included Tensile
have DFSG problems? I've done a DFSG repack for my source package, but
that only removed docs/cleanup_text.sh, since a file with just a "By
Lee Killough" comment looked suspicious to me and it's not used for
the build. rocblas has the same file, though, so it may be fine. I do
think the included yaml files need some reviewing: rocblas removes
some due to DFSG violations, and I can't tell whether the ones
included with hipblaslt also need to be removed or replaced.

As I said above, the packaging part of this is obviously still a work
in progress. The current version doesn't even have a debian/copyright.
I'll finish that once I've verified that what I have is functional in
the first place.

I'm guessing it could be a future project to package
https://github.com/ROCm/Tensile and use that for both rocblas and
hipblaslt. Buildd maintainers would like it at least, since running
Tensile during a build seems to be quite a room warmer.

Finally, I didn't coordinate with anyone about ITPing on this RFS; I
just saw it on the list and took it. At least nobody said no when I
told the list about it, so I guess it's okay for me to work on it and
nobody else was doing so yet.

I've placed my current work at
https://salsa.debian.org/rocm-team/hipblaslt . I may yet amend and
trash it until upload unless someone else wants to start working on it
already.

Any names I should add to the Uploaders field?
