
Re: AMD ROCm packaging session followup notes



On Fri, Nov 26, 2021 at 8:38 PM M. Zhou <lumin@debian.org> wrote:
>
> On Fri, 2021-11-26 at 12:59 +0530, Karthik wrote:
> >
> > It's good to see that AMD is working with Debian on packaging ROCm,
> > in addition to making it open source, unlike cuDNN.
> >
> > Nvidia didn't even bother to make cuDNN publicly available without
> > creating an Nvidia developer account, and it only provides packages
> > for Ubuntu 18.04 (I mean that for every other distro the package
> > repositories are incomplete).
>
> In fact, the cuDNN library can be redistributed under a series of
> complicated conditions as per the cuDNN EULA, after downloading it
> with a personal account. Anonymous download links for cuDNN also
> exist (see the Arch Linux PKGBUILD for cudnn).
>

Yes, my current setup is:
Debian unstable,
Nvidia proprietary driver from the nvidia-driver deb package,
CUDA from the nvidia-cuda-toolkit deb package, held at version 11.2.2-3,
cuDNN 8 from https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/libcudnn8_8.1.1.33-1+cuda11.2_amd64.deb
extracted to /usr/local/lib,
TensorFlow 2.6 from pip

Basically everything is from the Debian archive except cuDNN and TF.
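
As a rough sketch of why CUDA is held at 11.2.x here: the cuDNN file name
itself encodes the CUDA series it was built for, so the two can be compared
mechanically. The helper names below (parse_cudnn_deb, cuda_series_matches)
are hypothetical and not part of any Debian or Nvidia tool, and the file
name pattern is only assumed from the single package name above.

```python
import re

# Hypothetical pattern, inferred only from the one package name quoted above:
# libcudnn<major>_<cudnn version>-<revision>+cuda<cuda series>_<arch>.deb
DEB_NAME = re.compile(
    r"^libcudnn(?P<major>\d+)_(?P<cudnn>[\d.]+)-\d+"
    r"\+cuda(?P<cuda>[\d.]+)_(?P<arch>\w+)\.deb$"
)


def parse_cudnn_deb(filename):
    """Split a cuDNN .deb file name into its version fields."""
    match = DEB_NAME.match(filename)
    if match is None:
        raise ValueError(f"unrecognised cuDNN package name: {filename}")
    return match.groupdict()


def cuda_series_matches(deb_cuda, toolkit_version):
    """Compare only major.minor, e.g. "11.2" against "11.2.2-3"."""
    return toolkit_version.split(".")[:2] == deb_cuda.split(".")[:2]


info = parse_cudnn_deb("libcudnn8_8.1.1.33-1+cuda11.2_amd64.deb")
print(info["cudnn"], info["cuda"], cuda_series_matches(info["cuda"], "11.2.2-3"))
```

With the versions from my setup this reports a match; a newer
nvidia-cuda-toolkit series would not match this particular cuDNN build.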

> I once tried to read the cuDNN license and get it into Debian.
> I eventually gave up, since I strongly felt I was not working
> with something friendly.
>
> I use CUDA to train neural networks for my day job, but for Debian
> development I look forward to SYCL/ROCm breaking the monopoly
> in an open-source manner.
>
> PyTorch is also a good candidate for adding ROCm support.
>
Yeah, I thought of giving PyTorch a try but stopped because of the CUDA
version mismatch between PyTorch and TF, and I never bothered to create
isolated environments, since PyTorch's Android support seems to be
experimental.
I mainly use TF because of its good Android support.

> > Hopefully we can build TF with CPU+ROCm+oneDNN for now and later
> > think about adding cuDNN support if anything changes with Nvidia.
>
> TBH cuDNN is the most useful library for neural network acceleration,
> but it is also the most annoying library to package.
>
Agreed. No wonder Nvidia is now worth ~$800B
(I mean it contributed significantly, apart from gaming).

> > This insightfully written policy needs some attention:
> > https://people.debian.org/~lumin/debian-dl.html
> >
>
> Thank you for the attention. It's not out-of-date, but I think I should
> add some more updates to it when I have time.

Sure, feel free to contact me for any discussion.
I think we should also consider some other approaches to model
reproducibility, as we can't get several terabytes of data (typical
DNN datasets) into the deb archive.

