
Re: Packaging hipblaslt in progress



Hi Kari,

On 2024-03-10 06:19, Kari Pahula wrote:
> I've been working on getting hipblaslt packaged (ITP #1064071).  The
> status right now is that it compiles and I have been trying to get the
> tests run before seeing to the rest of the packaging flow.  I tried to
> jump to 6.0.2 first but ROCm 5.7 wasn't quite ready for it so I stuck
> to 5.7 for now.

Interesting. It relies on functionality from ROCm 6?

> I've been using rocblas package as a reference since they have a lot
> in common, especially with the Tensile dependency.  Like with rocblas,
> I had to patch hipblaslt to extend GPU compatibility to use gfx1030
> kernels since I have a gfx1032.  Running rocblas-test works for me
> without failures but what I have with hipblaslt fails a lot.
>
> ...
> [==========] Running 8603 tests from 2 test suites.
> ...
>
> 8492 failed tests in all, if you think it would be useful to see I can
> share a full output.  I also tried with a gfx1036 (Ryzen 9 7900X's
> integrated GPU) but that fared no better than Dimgrey Cavefish, only
> that it got errors about insufficient memory too.  I suspect my patch
> to substitute with gfx1030 is missing something since the results look
> very similar to a run without the patch at all.

If 8492/8603 tests are failing, I think that's a reasonable assessment.

> Could someone who's more familiar with this have a look?  Looking at
> radeontop it looks like the GPU is doing something at least when I run
> the program.

I'll take a look later this week, unless someone beats me to it.

> As a second matter, are any of the data files in the included Tensile
> DFSG burdened?  I've done a dfsg repack for my source package but that
> only removed docs/cleanup_text.sh since a file with just "By Lee
> Killough" comment looked suspicious to me and it's not used for the
> build.  But rocblas has the same file so it may be fine.  But I think
> the included yaml files do need some reviewing, I see that rocblas
> removes some due to DFSG violations and I don't think I can tell from
> the ones included with hipblaslt if they also need to be removed or
> replaced.

Lee Killough was an AMD employee on the rocBLAS team. AMD owns the copyright to all his contributions to the ROCm libraries. The root license file should apply, but feel free to file an issue asking for clarification. There should probably be a standard copyright header in every code file.

With regard to DFSG issues in the Tensile sources, there were a few kernels that were provided as binary blobs. For those, the problem was that the shader language used for the original sources did not have an open source compiler, so the compiled output was checked in instead of the sources. The YAML files that were removed from rocBLAS were those that referenced the removed shaders. I believe that the binary blobs were removed in ROCm 6, so Tensile should no longer be DFSG burdened. However, I have not actually reviewed the updates yet myself.

> I'm guessing it could be a future project to package
> https://github.com/ROCm/Tensile and use that for both rocblas and
> hipblaslt.  Buildd maintainers would like it at least, running Tensile
> during a build seems to be quite a room warmer.

I've filed an ITP for rocm-tensile [1]. However, the separate Tensile package will not have any significant effect on build times.

You can imagine Tensile as a combination compiler and build system written in Python that takes N input source files written in YAML and outputs M machine code files (plus a few metadata files) [2]. A library like rocBLAS consists of roughly five million lines of YAML source files (and a few hundred thousand lines of C++). The slow part is compiling all those YAML files, but despite being built by Tensile they are not part of the Tensile project.

Splitting the build tool into a separate package is the right thing to do, but the TensileHost library is quite small and that is the only duplicated work that would be saved. The vast majority of the time during the build is spent using TensileCreateLibrary to compile the project's YAML sources, and those are unique per-project.

> Finally, I didn't coordinate with anyone about ITPing on this RFS, I
> just saw it on the list and took it.  At least nobody said no when I
> told the list about it, so I guess it's okay for me to look on it and
> nobody was doing it yet.

Seems reasonable to me.

Sincerely,
Cory Bloor

[1]: https://bugs.debian.org/1064257
[2]: Describing Tensile as a "combination compiler and build system written in Python" with high fan-in and fan-out explains why projects using Tensile are so slow and resource-intensive to build, although that only scratches the surface of the dysfunction. For example, if you run a profiler, you'll notice that nearly all of the time in the Python portions of the build is spent in copy.deepcopy.
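
For anyone who wants to see what a copy.deepcopy hotspot looks like in a profile, here is a minimal sketch using the standard cProfile and pstats modules. It is not Tensile code: make_solution and expand_solutions are hypothetical stand-ins for "parse a kernel description, then copy it once per variant", and the data shape is an assumption made up for illustration.

```python
import copy
import cProfile
import io
import pstats


def make_solution(index):
    # Hypothetical stand-in for one parsed YAML kernel description.
    return {
        "name": f"kernel_{index}",
        "tile": [64, 64, 16],
        "params": {f"p{k}": list(range(8)) for k in range(32)},
    }


def expand_solutions(base, count):
    # Deep-copy the whole structure for every variant, then change one field.
    variants = []
    for i in range(count):
        variant = copy.deepcopy(base)
        variant["name"] = f"{base['name']}_v{i}"
        variants.append(variant)
    return variants


def profile_expand(count=500):
    # Profile only the expansion step and return the report as text.
    profiler = cProfile.Profile()
    profiler.enable()
    variants = expand_solutions(make_solution(0), count)
    profiler.disable()

    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)
    return variants, stream.getvalue()


if __name__ == "__main__":
    _, report = profile_expand()
    print(report)
```

In a toy like this, the entries sorted by cumulative time should be dominated by deepcopy and its helpers in copy.py. Whether a real Tensile build shows the same shape depends on the version and the inputs, so treat this only as a way to recognise the pattern in a profiler report.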

