ROCm packaging session memo (Was: Next ROCm packaging work session?)

Good day AI enthusiasts,

M. Zhou, on 2022-01-23:
> Memos will be appreciated.
> On Sun, 2022-01-23 at 18:07 +0100, maxzor wrote:
> > Hello,
> > 
> > Can I solicit another Jitsi meeting? I am hoping for the last three of
> > us participants, Cory, Etienne and me, but obviously anyone joining,
> > either already in the team or not (wink Mo), is very welcome!
> > What about same hour as last time, 19h00 UTC+1 next Tuesday (01/25)?
> > 
> > Topics could be for example
> > - packaging status,
> > - testing status,
> > - installation feedback (HIP /usr/include, high-level libs...),
> > - re-aligning on TODOs,
> > - anything you choose :)

Please find the memo hereafter.  I've attempted to gather my
notes and Maxime's and incidentally ended up with a novel, so
I hope the big blurb of text will be okay…

ROCm integration into Debian: meeting notes (2022-01-25)

Attendees:
  * Cordell Bloor
  * Jeremy Newton
  * Maxime Chambonnet
  * Étienne Mollier

Jeremy Newton presented his contributions:
  * update on device libs: he is working on fixing the packaging structure
    (pending review on the Fedora side [1]);
  * he is also helping with the comgr packaging in Fedora;
  * he can help us work out where to address patch and licensing issues.
    For instance, comgr embeds code under different licenses (MIT, NCSA,
    BSD 3-Clause); this is normally inventoried in a dedicated section in
    NOTICES.txt.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=2044664

Maxime raised the question of checking whether his comgr build is okay, with
 the few modifications necessary to build for Debian, especially given that part
 of the code was cloned from amd-stg-open instead of the last point release.
Jeremy spent some time locating the changes needed to get comgr v4.5.2 to build
 with llvm-13 [2].  Thanks!  His work could be reused for a patch series on top
 of comgr 4.5.2 for the Debian package.  amd-stg-open is the tip of the public
 part of the development.  Its version is not entirely clear yet, but it would
 likely be whatever ROCm 5.1 becomes, which will surely depend on llvm-14.
 (To be noted, ROCm 5.0 will probably depend on a mid-development snapshot of
 llvm-14, which may thus make this particular version unsuited for Debian.)
 The next components in line would then be rocclr, hip and any of its modules.
 comgr, rocclr, and the rocr-runtime should then be sufficient to get hip.

[2]: https://github.com/Mystro256/ROCm-CompilerSupport/commits/rocm-4.5.2-llvm13

Building hip would need four repos at once (rocclr, hip, hipamd, opencl):
  * hipamd: this is the AMD backend
  * hip: this is the generic caller (it still might have an NVIDIA wrapper)
  * hip and hipamd would probably be built together, given a remark from Cory
  * Jeremy mentioned opengl being built separately alongside mesa and co., so
    thought hip and hipamd could eventually be separate as well, but it seems
    not to be the case yet.

Cordell is focusing on the math libraries (blas, etc).
 ROCclr needed to be embedded into at least rocm-opencl-runtime and rocm-hipamd
 to get a component to build.  Fedora has a strict policy against vendoring,
 but the situation is not clear to everyone on the Debian side.
Étienne: tried to highlight that it might depend on the interpretation of Policy
 item 4.13 [3], but embedded copies are normally not well accepted there either.

[3]: https://www.debian.org/doc/debian-policy/ch-source.html#embedded-code-copies

Étienne: mentioned that rocm-cmake and rocm-device-libs are uploaded to NEW.
 An update of rocr-runtime should be possible once both are available in the
 archive (rocr-runtime is already available in experimental, as provided in
 some early ROCm 3.x, but stuck at that version due to missing dependencies).
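
As an aside, the ordering constraints mentioned so far can be written down as
a small dependency graph.  The Python sketch below is only illustrative: it
encodes just the relations stated in these notes (rocr-runtime waiting on
rocm-cmake and rocm-device-libs; hip needing comgr, rocclr and rocr-runtime)
and is not a complete ROCm dependency graph.  DEPENDS_ON and upload_order are
hypothetical names chosen for this example.

```python
from graphlib import TopologicalSorter  # standard library, Python >= 3.9

# Only the relations stated in the meeting notes; real packages have more
# build dependencies than listed here (assumption: this is a partial view).
DEPENDS_ON = {
    "rocr-runtime": {"rocm-cmake", "rocm-device-libs"},
    "hip": {"comgr", "rocclr", "rocr-runtime"},
}


def upload_order(graph):
    """Return one valid order in which every package follows its dependencies."""
    return list(TopologicalSorter(graph).static_order())


order = upload_order(DEPENDS_ON)
print(" -> ".join(order))
```

Several orders are valid; the only guarantee is that a package never appears
before something it depends on.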

Étienne: tried answering questions about what NEW processing is: making
 sure the upload to the archive is legal (copyright review), FHS checks, making
 sure package and file naming is sane, etc.  Indicated the risk of version
 skew as the numerous sources go through the ftpmaster NEW queue, whose
 processing time is hard to predict.  Version skew is mitigated by sticking to
 experimental for the moment.

Maxime hit issues in the CMakeLists.txt of rocblas, something to do with
 GNUInstallDirs, when trying to get symlinks to behave properly in an FHS context.
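
One way to spot this kind of problem after a build is to scan the staged
install tree for symlinks whose target does not exist.  The Python sketch
below is a hypothetical helper, not something taken from rocblas or from the
packaging; the file names in the demo are made up.  Note that it resolves
links on the real filesystem, so it is only meaningful for relative links
inside a staging directory.

```python
import os
import tempfile


def dangling_symlinks(root):
    """Return sorted paths (relative to root) of symlinks with a missing target."""
    broken = []
    for dirpath, dirnames, filenames in os.walk(root):
        # os.walk lists links to directories in dirnames and other links,
        # including broken ones, in filenames; it does not follow them.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) and not os.path.exists(path):
                broken.append(os.path.relpath(path, root))
    return sorted(broken)


if __name__ == "__main__":
    # Demo on a throwaway tree: one good link, one link to a missing file.
    with tempfile.TemporaryDirectory() as stage:
        libdir = os.path.join(stage, "usr", "lib")
        os.makedirs(libdir)
        open(os.path.join(libdir, "libexample.so.0"), "w").close()
        os.symlink("libexample.so.0", os.path.join(libdir, "libexample.so"))
        os.symlink("../hip/lib/libexample.so.0", os.path.join(libdir, "libstale.so"))
        print(dangling_symlinks(stage))
```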

Maxime: What is the test suite running GPU tests on salsa CI?
 There is Vega20, MI25, ...
Étienne: This sounds like the upstream AMD CI being triggered by the Debian CI.
 It could be a stray .gitlab-ci.yml, which might need to be overridden by setting
 the parameter in Salsa to use d/salsa-ci.yml instead.  Another option could be
 to Files-Exclude the CI file in d/copyright if it comes from upstream.

Maxime: Is there a list of Debian GPUs available for testing/buildd?
Étienne: Wondering whether salsa admins could provide runners supporting GPUs.
 Noticed later that the question had been misunderstood.  Currently the buildds
 do not have GPUs (or should not be expected to have any).

Étienne: made a naive adjustment to put bitcode below /usr/share/amdgcn instead
 of /usr/amdgcn in rocm-device-libs.
Maxime: raised that changes are already committed to use the DEVICE_LIB_PATH or
 HIP_DEVICE_LIB_PATH cmake variables in the higher-level libraries.
Étienne: Need to adjust the rocm-device-libs patch to make use of the cmake
 variables instead, especially given that the files could possibly be deemed
 better located in /usr/lib/amdgcn, for instance.
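
While the final location is undecided, a quick way to see where a given
build actually put the bitcode is to count the .bc files under each of the
three directories discussed above.  The Python sketch below is a hypothetical
checking aid, assuming the bitcode files carry a .bc extension; it does not
reflect how the cmake variables are resolved.

```python
import tempfile
from pathlib import Path

# The three candidate locations mentioned in the discussion, relative to
# the root of an unpacked package or staging directory.
CANDIDATE_DIRS = ("usr/lib/amdgcn", "usr/share/amdgcn", "usr/amdgcn")


def find_bitcode_dirs(root, candidates=CANDIDATE_DIRS):
    """Map each candidate directory holding .bc files to the number found."""
    found = {}
    for rel in candidates:
        directory = Path(root) / rel
        if directory.is_dir():
            count = sum(1 for _ in directory.rglob("*.bc"))
            if count:
                found[rel] = count
    return found


if __name__ == "__main__":
    # Demo on a throwaway tree with a single made-up bitcode file.
    with tempfile.TemporaryDirectory() as stage:
        bitcode = Path(stage) / "usr/share/amdgcn/bitcode"
        bitcode.mkdir(parents=True)
        (bitcode / "example.bc").write_bytes(b"")
        print(find_bitcode_dirs(stage))
```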

Open points on the HIP install layout:
  * HIP in usr/include and not usr/hip; -isystem dirty flags
  * cmake: delete usr/include?
  * Cory considers poking the HIP team, and suggests also asking the Clang team.
  * One of the hard parts left for the current 4.5.2.
  * usr/hip triplet?

Cory rewrote rocsolver cmake for rocblas, so is quite knowledgeable on the
 topic.  Paraphrasing him, he'd be happy to help with fixing any build problems
 encountered in the math libraries, and upstreaming the changes; he's been
 slowly improving the CMake for several of the libraries, but there's still a
 lot left to do.

Thank you all for your time, be it by participating or even just

Have a nice day,  :)
Étienne Mollier <emollier@emlwks999.eu>
Fingerprint:  8f91 b227 c7d6 f2b1 948c  8236 793c f67e 8f0d 11da
Sent from /dev/pts/1, please excuse my verbosity.
On air: Thank You Scientist - Anchor
