Re: HIP on the NVIDIA platform

To: Cordell Bloor <cgmb-deb@slerp.xyz>, debian-ai@lists.debian.org
Subject: Re: HIP on the NVIDIA platform
From: "M. Zhou" <lumin@debian.org>
Date: Wed, 19 Oct 2022 21:28:13 -0400
Message-id: <4f9582d23838fff10cb73537a5b5105718cb8abe.camel@debian.org>
In-reply-to: <373a1a31-9481-7f5c-5f7c-6589cb05844f@slerp.xyz>
References: <373a1a31-9481-7f5c-5f7c-6589cb05844f@slerp.xyz>

I'd like to share some personal thoughts regarding CUDA.

The "non-free" keyword makes a huge difference in many details
of the Debian development process. For instance, because CUDA is
non-free, packages depending on it won't be built automatically
by Debian's own infrastructure. In my experience, dealing with
non-free packages is definitely not easy, especially for CUDA,
let alone a dependency tree on top of it.

One example: CUDA's GCC support always lags behind Debian sid.
When there is an incompatibility, we can patch the CUDA headers
locally to work around it, but the EULA prohibits us from
modifying the package distributed through the Debian archive.
This is what things looked like several years ago. I eventually
lost nearly all motivation to deal with CUDA in Debian, and my
CUDA-related Debian work can drag on for years (for example,
pytorch-cuda, cupy, etc.).

I acknowledge that I use CUDA (PyTorch) for research every day,
but in Debian I prefer to exercise software freedom rather than
be pointlessly constrained by a strict EULA.

Potential contributors should be aware of the implications of
a Debian package being non-free.

On Wed, 2022-10-19 at 14:20 -0600, Cordell Bloor wrote:
> Hello,
> 
> I'm starting to look at packaging hipFFT and hipSPARSE. These
> libraries each have two variants; they can be built for either the
> AMD ROCm platform or the NVIDIA CUDA platform. In the case that
> hipSPARSE is built for the AMD platform, it depends on rocSPARSE.
> However, when it is built for the NVIDIA platform, it depends on
> cuSPARSE. This is more or less how all the
> hip{FFT,SPARSE,RAND,BLAS,SOLVER} libraries work.
> 
> Of the hip* libraries, hipRAND is the farthest along in packaging on
> Debian. For hipRAND, only the AMD platform variant was packaged thus
> far. That seems like a reasonable way to start, given that the CUDA
> platform variant would require the cuRAND headers provided by
> nvidia-cuda-dev [1], which is a non-free package. However, I wonder
> what the ultimate goal should look like.
> 
> What limitations might we run into if we were to build and package
> the NVIDIA variant of hipRAND? I presume that the binary package for
> that variant would have to be uploaded to the non-free archive. It
> would also need a package name to distinguish it from the AMD
> platform variant.
> 
> This is not an urgent topic, as my priority is getting them working
> on the AMD platform first. Nevertheless, I wanted to start thinking
> about NVIDIA support so I could keep it in mind as I'm working on the
> other packaging.
> 
> Sincerely,
> Cory Bloor
> 
> [1]: https://packages.debian.org/sid/nvidia-cuda-dev
>
