
Re: AMD ROCm packaging session followup notes

On Fri, Nov 26, 2021, 5:06 AM Cordell Bloor <cgmb-deb@slerp.xyz> wrote:

Great summary, Étienne.

    Thanks Cordell for your time, don't hesitate to complement or
    correct me;
Thank you for taking the meeting. The time was arranged to be convenient for Japan and Canada, so it was rather extraordinary that you came from Europe. Your English is also a lot better than my French. Je allé à l'école française, mais je oublie beaucoup.

Upon closer inspection, I see that you wanted me to complement you. I suppose I can do that too.

    I think one source of
    confusion is that there seem to be distinct Github teams
    involved in the various ROCm components.  I gathered some
    components on RadeonOpenCompute [1], and some other such as
    the HIP compiler in ROCm-Developer-Tools [2].  There might
    be other locations, I didn't manage to locate rocclr for
    instance.

    [1]: https://github.com/RadeonOpenCompute/
    [2]: https://github.com/ROCm-Developer-Tools/
There is also https://github.com/ROCmSoftwarePlatform/

Generally speaking, the lowest-level components are found under RadeonOpenCompute, the middle-of-the-stack components are found under ROCm-Developer-Tools and the high-level libraries and frameworks are found under ROCmSoftwarePlatform.

    I didn't manage to locate rocclr for instance.

https://github.com/ROCm-Developer-Tools/ROCclr

 Q: In which order to build the different components?

 A: This is still not entirely clear, but trying to package
    targets will eventually reveal dependency trees (hopefully
    without loops involved).
There won't be any loops involved.

It's not a minimal dependency tree, but I've attached a file with the tarball URLs, build commands and packages that I used to build the first few components (up to and including comgr) on the Debian unstable docker image. I didn't isolate the packages from each other as I was building them, so anything installed in an earlier stage _could_ be a dependency for a later stage (but few actually are).

Going through the file from top to bottom defines one possible (serialized) build order. Attempting each step in an isolated environment would quickly lead you to the actual minimal tree.
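To illustrate the idea of deriving a serialized build order from a dependency tree, here is a minimal sketch using Python's standard graphlib module (Python 3.9 or later). The component names and the edges between them are hypothetical placeholders, not the contents of the attached file and not the real ROCm dependency data; the helper name build_order is also made up for this example. A loop in the graph is reported as an error rather than silently ignored.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical dependency map: each key lists the components that must be
# built before it. Names and edges are placeholders, not real ROCm data.
DEPENDS_ON = {
    "low-level-runtime": set(),
    "device-libs": {"low-level-runtime"},
    "code-object-manager": {"device-libs"},
    "compute-runtime": {"low-level-runtime", "code-object-manager"},
    "math-library": {"compute-runtime"},
}


def build_order(depends_on):
    """Return one valid serialized build order, or raise ValueError on a loop."""
    try:
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError as err:
        # The second element of args holds the nodes forming the cycle.
        raise ValueError(f"dependency loop: {err.args[1]}") from err


if __name__ == "__main__":
    for step, name in enumerate(build_order(DEPENDS_ON), start=1):
        print(f"{step}. {name}")
```

Once each step has been retried in an isolated environment, the real minimal dependencies could replace the placeholder map, and the same function would then give a build order for the actual packages.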

It's good to see that AMD is working with Debian on packaging ROCm, in addition to making it open source, unlike cuDNN.

Nvidia didn't even bother to make cuDNN publicly available without creating an Nvidia developer account, and it only provides packages for Ubuntu 18.04 (I mean that for every other distro the package repositories are incomplete).

Another issue we have to address is TensorFlow packaging, which is still pending.

Hopefully we can build TensorFlow with CPU + ROCm + oneDNN for now, and later think about adding cuDNN support if anything changes with Nvidia.

This insightfully written policy needs some attention:
https://people.debian.org/~lumin/debian-dl.html
