
Re: pytorch-cuda: What range of GPUs to support?



Hi Mo,

Personally, I think at minimum we should support everything that
typically shipped with 8GB+ of VRAM, if it is recent enough (your
8-year cutoff is definitely that). 8GB+ cards are easily within the
reach of your typical non-corporate Debian user.

That support should include everything up to Hopper, as the A100/H100
class is where the exciting stuff is happening now.

At the lower end, I think 8 years is more than sufficient. While Debian
indeed likes to support "ancient hardware" on principle, I don't think
that really applies to GPGPU compute.

It may be technically possible to do stuff on ancient cards, but why
would you, apart from as a party trick? The performance per watt, and
the other ancillary costs, are simply not worth it.

On 2025-02-19 03:15, M. Zhou wrote:
> On Tue, 2025-02-18 at 18:24 -0500, M. Zhou wrote:
> After reading the CUDA documentation, I realize that the previous list
> is problematic. Some GPUs like the A100 will be excluded from support.
> https://docs.nvidia.com/cuda/hopper-compatibility-guide/index.html
> Nvidia A100 is architecture 8.0. The only 8.X cubin included in the
> list "6.1;7.5;8.6" is 8.6, which is higher than the device capability.
> So the built binary will not run on A100 (according to the documentation).
> 
> I intended to exclude GPUs older than the GTX 1080 (6.1), but the
> upstream PyTorch PyPI package seems to support even older GPUs:
> 
> In [3]: torch.cuda.get_arch_list()
> Out[3]: ['sm_50', 'sm_60', 'sm_70', 'sm_75', 'sm_80', 'sm_86', 'sm_90']
> 
> So I'm going to revise the architecture list, following upstream.
> That will make the binary much larger, but I think it is a less
> questionable configuration. Maybe something like this:
> 
>   5.0;6.0;7.0;7.5;8.0;8.6;9.0+PTX

Sounds reasonable to me.
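
For anyone who wants to sanity-check a list like that against a given
device, here is a rough sketch of the compatibility rule from the
NVIDIA docs (my own toy code, not anything from the packaging): a
cubin built for X.y only runs on devices X.z with z >= y within the
same major architecture, while a PTX entry can be JIT-compiled for any
device with a capability at or above it.

def covered(arch_list, capability):
    """True if a device with (major, minor) capability can run a
    binary built for arch_list, with entries written the way
    torch.cuda.get_arch_list() reports them: 'sm_XY' for cubins,
    'compute_XY' for PTX."""
    major, minor = capability
    for arch in arch_list:
        kind, _, num = arch.partition("_")
        m, n = divmod(int(num), 10)
        if kind == "compute" and (m, n) <= (major, minor):
            return True  # PTX: JIT-compiled forward onto newer devices
        if kind == "sm" and m == major and n <= minor:
            return True  # cubin: binary-compatible within the same major
    return False

# The old list: the only 8.x cubin is 8.6, so an A100 (8.0) falls
# through every case and is not covered.
print(covered(["sm_61", "sm_75", "sm_86"], (8, 0)))       # False
# The revised list ships an 8.0 cubin (plus 9.0 PTX), so it is covered.
print(covered(["sm_50", "sm_60", "sm_70", "sm_75", "sm_80",
               "sm_86", "sm_90", "compute_90"], (8, 0)))  # True

On a live system you can feed it torch.cuda.get_arch_list() and
torch.cuda.get_device_capability() directly.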

Best,
Christian

