Re: AMD ROCm packaging session followup notes

To: "M. Zhou" <lumin@debian.org>
Cc: debian-ai@lists.debian.org
Subject: Re: AMD ROCm packaging session followup notes
From: Karthik <karthikgatiganti@gmail.com>
Date: Fri, 26 Nov 2021 21:55:51 +0530
Message-id: <CAOwTTknmNdyrCB3xLmQwdTnqOP4w5jXj9ViGA=9JkVUiYTwMYw@mail.gmail.com>
In-reply-to: <5d58090d6dd2716f1520cd739b5d4d3234c62f32.camel@riseup.net>
References: <YZ6/HwZXUxcwGovD@fusion> <6aad08a6-b0dc-2e16-f831-f6fc8e538bb3@slerp.xyz> <CAOwTTk=t58mis=B7o9JMaZK4Gk7CHCX42zh+OTXRAtt-4oJwbA@mail.gmail.com> <5d58090d6dd2716f1520cd739b5d4d3234c62f32.camel@riseup.net>

On Fri, Nov 26, 2021 at 8:38 PM M. Zhou <lumin@debian.org> wrote:
>
> On Fri, 2021-11-26 at 12:59 +0530, Karthik wrote:
> >
> > It's good to see that AMD is working with Debian on packaging ROCm,
> > in addition to making it open source, unlike cuDNN.
> >
> > NVIDIA doesn't even make cuDNN publicly available without
> > creating an NVIDIA developer account, and only provides packages for
> > Ubuntu 18.04 (I mean, for every other distro the package repositories
> > are incomplete).
>
> In fact, the cuDNN library can be redistributed under a series of
> complicated conditions as per the cuDNN EULA, after downloading
> with a personal account. Anonymous download links for cuDNN also
> exist (see the Arch Linux PKGBUILD for cudnn).
>

Yes, my current setup is:
Debian unstable,
NVIDIA proprietary driver from the nvidia-driver deb package,
CUDA from the nvidia-cuda-toolkit deb package, held at version 11.2.2-3,
cuDNN 8 from https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/libcudnn8_8.1.1.33-1+cuda11.2_amd64.deb
extracted to /usr/local/lib,
TensorFlow 2.6 from pip.

Basically, everything is from the Debian archive except cuDNN and
TensorFlow.
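
For anyone wanting to check a similar mixed setup, here is a minimal
sketch that reports which of the pieces are discoverable on a machine.
The function name describe_setup is hypothetical and only the Python
standard library is used. It assumes the extracted cuDNN files are in a
directory the dynamic linker knows about (running ldconfig after
extracting the .deb may be needed), and it only checks that things can
be found, not that the versions actually match.

```python
import ctypes.util
import importlib.util
import shutil


def describe_setup():
    """Report which parts of a CUDA/cuDNN/TensorFlow stack can be found.

    Hypothetical helper for illustration: values are a path or library
    name when found, None when not, and a bool for the Python package.
    """
    return {
        # CUDA compiler, shipped by the nvidia-cuda-toolkit package
        "nvcc": shutil.which("nvcc"),
        # CUDA runtime and cuDNN shared libraries, as seen by the linker
        "libcudart": ctypes.util.find_library("cudart"),
        "libcudnn": ctypes.util.find_library("cudnn"),
        # TensorFlow installed from pip (checked without importing it)
        "tensorflow": importlib.util.find_spec("tensorflow") is not None,
    }


if __name__ == "__main__":
    for name, found in describe_setup().items():
        print(f"{name}: {found if found else 'not found'}")
```

A "not found" for libcudnn with everything else present would point at
the manually extracted library rather than at the Debian packages.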

> I once tried to read the cuDNN license and get it into Debian.
> I eventually gave up, since I strongly felt I was not working
> with a friendly thing.
>
> I use CUDA to train neural networks for my day job, but for Debian
> development I look forward to SYCL/ROCm breaking the monopoly
> in an open-source manner.
>
> PyTorch is also a good candidate for adding ROCm support.
>
Yeah, I thought of giving PyTorch a try, but stopped because of the CUDA
version mismatch between PyTorch and TensorFlow. I never bothered to
create isolated environments, as PyTorch's Android support seems to be
experimental.
I mainly use TensorFlow because of its good Android support.

> > Hopefully we can build TensorFlow with CPU+ROCm+oneDNN for now and
> > later think about adding cuDNN support if anything changes with NVIDIA.
>
> TBH, cuDNN is the most useful library for neural network acceleration,
> but it is also the most annoying library to package.
>
Agreed. No wonder NVIDIA is now worth ~$800B
(I mean, it contributed significantly, apart from gaming).

> > This insightfully written policy needs some attention:
> > https://people.debian.org/~lumin/debian-dl.html
> >
>
> Thank you for the attention. It's not out-of-date, but I think I should
> add some more updates to it when I have time.

Sure, feel free to contact me for any discussion.
I think we should also consider some other approaches to model
reproducibility, as we can't get several terabytes of data (typical
DNN datasets) into the Debian archive.
