
Re: ROCm Some elements for January

Please, you and Etienne, don't apologize if we're in a do-ocracy :)

During the last two weeks, I mostly bumped the existing
packages into slightly better shape and explored further up the stack.
I finally tackled the full compilation with Cordell's script [0] a few days ago (~2h) and, Etienne, all tests passed on a Radeon VII, without touching the kernel modules!

I see three main fronts where we need to make a decision and/or focus our next efforts:

# I. To package-or-not-to-package AMD-LLVM

Most of my efforts so far have gone toward not packaging amd-llvm.
Given what Cordell said about the differences between the AMD fork and vanilla LLVM [1],
it makes sense to me not to package it.

The Arch packagers, on the other hand, package LLVM with ROCm [2].

The gist of my efforts is that, with a minor patch to comgr and a few other corrections detailed here [3],
all rocRAND tests pass (!). rocBLAS and everything above it do not compile yet [4].
The "hip compiler" `hipcc` is just a Perl script wrapping Clang and the AMDGPU LLVM backend.
Among other things, I had to disable parallel jobs while doing that.
I feel that we are not very far from a clean un-bundling, but I am not sure.
The LLVM folks on IRC pointed me towards Matthew Arsenault from AMD,
who is an LLVM expert: maybe we could invite him to an upcoming meeting?

Or we can "just" package this LLVM flavor...
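
To make the point about `hipcc` being a thin wrapper more concrete, here is a minimal sketch in Python of the kind of Clang command line such a wrapper could assemble. This is not what `hipcc` actually does internally (the real script also handles include paths, device libraries and linking against the HIP runtime); the function name `hip_clang_command` and the file names are hypothetical. The flags `-x hip`, `--offload-arch=` and `--rocm-path=` are Clang driver options, and gfx906 is the target of the Radeon VII.

```python
def hip_clang_command(source, output, gpu_arch="gfx906",
                      rocm_path=None, clang="clang++"):
    """Build (but do not run) a Clang command line for a HIP source file.

    Hypothetical helper for illustration only: it shows the core of what
    a hipcc-like wrapper passes to Clang, nothing more.
    """
    # Tell Clang to treat the input as HIP and pick the GPU target.
    cmd = [clang, "-x", "hip", f"--offload-arch={gpu_arch}"]
    # Optionally point Clang at a ROCm installation (device libraries etc.).
    if rocm_path is not None:
        cmd.append(f"--rocm-path={rocm_path}")
    cmd += [source, "-o", output]
    return cmd
```

Passing the returned list to `subprocess.run` would launch the compilation; whether a vanilla Clang then finds everything it needs is exactly the open question of this section.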

# II. The install paths

As Cordell stated [5], the layout of the stack installation is currently being discussed again upstream. I have attached a proposal for a stub directory structure that goes full multi-arch.

I think there are two other main options. One is to stick as closely as possible to the install paths that upstream CMake declares (respect debian/tmp), which is mostly what our current packages do, and it is quite messy: /usr/amdgcn/'bitcode', a mix of /usr/lib/<triplet>/<package> and /usr/lib/<package>...

Or we could follow the approach of some other big stacks, e.g. LLVM, which seem to pour their whole debian/tmp into /usr/lib/<package>, make symlinks from /bin and /share to it, be done with it and be tolerated.

The proposal that I attached needs the most work, either through local patches to the CMake files, by pushing for changes upstream, or both. The upside is that I think it better respects the FHS
and the recent multi-arch direction of the distribution.
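
As a rough illustration of how the LLVM-style option and a multi-arch layout differ, here is a small Python sketch that maps a path relative to debian/tmp/usr to its final destination under each scheme. It is not the attached proposal: the function names, the package name and the file names in the usage note are assumptions chosen for the example, and the multi-arch triplet is passed in by hand rather than queried from the system.

```python
from pathlib import PurePosixPath


def llvm_style(package, rel):
    """Everything lands under /usr/lib/<package>/, keeping upstream's tree."""
    return PurePosixPath("/usr/lib") / package / rel


def multiarch_style(triplet, rel):
    """Libraries go to /usr/lib/<triplet>/; other files stay directly in /usr."""
    rel = PurePosixPath(rel)
    if rel.parts and rel.parts[0] == "lib":
        # lib/foo.so -> /usr/lib/<triplet>/foo.so
        return PurePosixPath("/usr/lib", triplet, *rel.parts[1:])
    # bin/foo, share/foo, include/foo ... keep their FHS location.
    return PurePosixPath("/usr") / rel
```

For instance, `llvm_style("hip", "lib/libamdhip64.so")` gives /usr/lib/hip/lib/libamdhip64.so, whereas `multiarch_style("x86_64-linux-gnu", "lib/libamdhip64.so")` gives /usr/lib/x86_64-linux-gnu/libamdhip64.so. The first scheme then needs symlinks for anything users expect on their PATH; the second needs the upstream CMake files to accept the new locations.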

# III. Documentation

Documentation is lackluster across the whole platform, which is quite a pity [6]...
Happily, it gets much better once you reach the user-facing libraries.
I am nowhere near knowledgeable enough to write good manpages.
Could we ask Jeremy Newton for help on the lower part of the stack, maybe
in exchange for some packaging help in Fedora :) ?

This is my feedback for today; it remains to be seen how all of these topics evolve in January. There might be a new ROCm release soon, since I know that there is a release each month.
When is 5.0 due?

Best regards, Maxime

[0] https://gist.github.com/cgmb/7cd9a481c42ce132b5d6420380becef3
[1] https://lists.debian.org/debian-ai/2021/05/msg00034.html
[2] https://github.com/rocm-arch/rocm-arch
[3] https://github.com/ROCm-Developer-Tools/HIP/issues/2449#issuecomment-1003305883
[4] https://github.com/ROCmSoftwarePlatform/Tensile/issues/1455
[5] https://lists.debian.org/debian-ai/2021/11/msg00053.html
[6] https://github.com/RadeonOpenCompute/ROCm/issues/1652

Attachment: v1.tar
Description: Unix tar archive
