
Packaging hipblaslt in progress



I've been working on getting hipblaslt packaged (ITP #1064071).  It
compiles now, and I have been trying to get the tests to run before
seeing to the rest of the packaging.  I first tried to jump to 6.0.2,
but ROCm 5.7 wasn't quite ready for it, so I'm sticking with 5.7 for
now.

I've been using the rocblas package as a reference since the two have
a lot in common, especially the Tensile dependency.  As with rocblas,
I had to patch hipblaslt to extend GPU compatibility so that it uses
the gfx1030 kernels, since I have a gfx1032.  rocblas-test runs
without failures for me, but most of the hipblaslt tests fail.
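
For readers unfamiliar with this kind of patch, the idea can be
sketched in a few lines of Python.  This is an illustration only, not
the actual patch and not hipblaslt or Tensile code: the table and the
name kernel_arch are hypothetical, and it assumes that gfx1030 kernels
run correctly on the listed GPUs and that the device may report feature
flags after a colon.

```python
# Hypothetical sketch of an architecture fallback table.
# Assumption: kernels built for gfx1030 also work on these GPUs.
FALLBACK_ARCH = {
    "gfx1032": "gfx1030",  # Radeon Pro W6600
    "gfx1036": "gfx1030",  # Ryzen 9 7900X integrated GPU
}


def kernel_arch(device_arch: str) -> str:
    """Return the architecture whose kernels should be loaded."""
    # Drop feature flags such as ":sramecc+:xnack-" if present.
    base = device_arch.split(":")[0]
    # Architectures without an entry keep their own kernels.
    return FALLBACK_ARCH.get(base, base)
```

If the lookup is skipped anywhere on the path from device query to
kernel selection, the library would behave as if it were unpatched.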

$ obj-x86_64-linux-gnu/clients/staging/hipblaslt-test --gtest_break_on_failure
hipBLASLt version: 300

Query device success: there are 1 devices
-------------------------------------------------------------------------------
Device ID 0 : AMD Radeon Pro W6600 gfx1032
with 8.6 GB memory, max. SCLK 2910 MHz, max. MCLK 875 MHz, compute capability 10.3
maxGridDimX 2147483647, sharedMemPerBlock 65.5 KB, maxThreadsPerBlock 1024, warpSize 32
-------------------------------------------------------------------------------
info: parsing of test data may take a couple minutes before any test output appears...

[==========] Running 8603 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 8558 tests from _/matmul_test
[ RUN      ] _/matmul_test.matmul/pre_checkin_matmul_bad_arg_bad_arg
[       OK ] _/matmul_test.matmul/pre_checkin_matmul_bad_arg_bad_arg (109 ms)
[ RUN      ] _/matmul_test.matmul/pre_checkin_matmul_bad_arg_bad_arg_t2
[       OK ] _/matmul_test.matmul/pre_checkin_matmul_bad_arg_bad_arg_t2 (0 ms)
[ RUN      ] _/matmul_test.matmul/pre_checkin_matmul_bad_arg_bad_arg_t3
[       OK ] _/matmul_test.matmul/pre_checkin_matmul_bad_arg_bad_arg_t3 (0 ms)
[ RUN      ] _/matmul_test.matmul/pre_checkin_alpha_beta_zero_NaN_f16_rf16_rf16_rf16_rf32_r_NN_256_128_64_nnan_256_64_nnan_256_256_1
clients/gtest/../include/unit.hpp:208: Failure
Expected equality of these values:
  float(hCPU[i + j * size_t(lda) + k * strideA])
    Which is: 0
  float(hGPU[i + j * size_t(lda) + k * strideA])
    Which is: 0.033782959

8492 tests failed in all; I can share the full output if you think it
would be useful.  I also tried a gfx1036 (the Ryzen 9 7900X's
integrated GPU), but that fared no better than the W6600 (Dimgrey
Cavefish), except that it also got errors about insufficient memory.
I suspect my patch to substitute gfx1030 is missing something, since
the results look very similar to a run without the patch at all.
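
With this many failures, a summary may be easier to share than the
full log.  The sketch below is one way to do that, assuming the log
comes from a run without --gtest_break_on_failure and uses the standard
GoogleTest "[  FAILED  ]" lines.  The function names and the grouping
depth are hypothetical choices, not part of hipblaslt.

```python
import re
from collections import Counter

# GoogleTest prints "[  FAILED  ] Suite.Name (N ms)" per failure and
# repeats "[  FAILED  ] Suite.Name, where ..." in the final summary.
FAILED = re.compile(r"^\[  FAILED  \] (\S+\.\S+)")


def failed_tests(lines):
    """Return the unique failed test names in first-seen order."""
    names, known = [], set()
    for line in lines:
        match = FAILED.match(line)
        if match:
            name = match.group(1).rstrip(",")
            if name not in known:
                known.add(name)
                names.append(name)
    return names


def group_failures(names, depth=4):
    """Count failures by the first `depth` '_'-separated name tokens."""
    counts = Counter()
    for name in names:
        param = name.rsplit("/", 1)[-1]
        counts["_".join(param.split("_")[:depth])] += 1
    return counts

# Example use on a saved log (the file name is an assumption):
#   with open("hipblaslt-test.log") as log:
#       print(group_failures(failed_tests(log)).most_common())
```

Comparing such a summary between the patched and unpatched builds would
show whether the patch changes which tests fail.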

Could someone who's more familiar with this have a look?  According to
radeontop the GPU is at least doing something when I run the program.

As a second matter, are any of the data files in the bundled Tensile
copy problematic under the DFSG?  I've done a DFSG repack for my source
package, but that only removed docs/cleanup_text.sh: a file whose only
attribution is a "By Lee Killough" comment looked suspicious to me, and
it isn't used for the build.  rocblas ships the same file, though, so
it may be fine.  I do think the included yaml files need reviewing.  I
see that rocblas removes some of them due to DFSG violations, and I
can't tell from the ones included with hipblaslt whether they also need
to be removed or replaced.
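
One possible starting point for that review is to check whether any
yaml file in the hipblaslt tree is byte-identical to a file in the
pristine rocblas upstream tree, and then compare the matches against
the files rocblas excludes.  The sketch below assumes both trees are
unpacked locally; the function names are hypothetical, and modified
copies of a file would not be detected.

```python
import hashlib
from pathlib import Path


def yaml_digests(root):
    """Map SHA-256 digest -> relative paths of *.yaml files under root."""
    digests = {}
    for path in sorted(Path(root).rglob("*.yaml")):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        digests.setdefault(digest, []).append(path.relative_to(root))
    return digests


def shared_yaml(tree_a, tree_b):
    """Return digest -> (paths in tree_a, paths in tree_b) for files
    whose content occurs in both trees."""
    a, b = yaml_digests(tree_a), yaml_digests(tree_b)
    return {d: (a[d], b[d]) for d in a.keys() & b.keys()}

# Example use (both directory names are assumptions):
#   for in_a, in_b in shared_yaml("hipblaslt", "rocblas").values():
#       print(in_a, "<->", in_b)
```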

As I said above, the packaging part of this is still obviously work in
progress.  The current version doesn't even have a debian/copyright.
I'll finish that once I've verified that what I have is functional in
the first place.

I'm guessing it could be a future project to package
https://github.com/ROCm/Tensile and use that for both rocblas and
hipblaslt.  Buildd maintainers would like it at least, running Tensile
during a build seems to be quite a room warmer.

Finally, I didn't coordinate with anyone before filing the ITP; I just
saw the request on the list and took it.  Nobody objected when I told
the list about it, so I assume it's okay for me to work on it and that
nobody else had started yet.

I've placed my current work at
https://salsa.debian.org/rocm-team/hipblaslt .  Until the upload I may
still amend commits and rewrite the history there, unless someone else
wants to start working on it already.

Any names I should add to the Uploaders field?

