
Re: pytorch-cuda: What range of GPUs to support?



Hi Mo,

Personally, I think at minimum we should support everything that
typically shipped with 8GB+ of VRAM, if it is recent enough (your
8-year cutoff is definitely that). 8GB+ cards are easily within the
reach of your typical non-corporate Debian user.

That support should include everything up to Hopper, as the A100/H100
class is where the exciting stuff is happening now.

At the lower end, I think 8 years is more than sufficient. While Debian
indeed likes to support "ancient hardware" on principle, I don't think
that really applies to GPGPU compute.

It may be technically possible to do stuff on ancient cards, but why
would you, apart from as a party trick? The performance per watt, and
the other ancillary costs, are simply not worth it.

On 2025-02-19 03:15, M. Zhou wrote:
> On Tue, 2025-02-18 at 18:24 -0500, M. Zhou wrote:
> After reading the CUDA documentation, I realize that the previous list
> is problematic. Some GPUs like the A100 will be excluded from support.
> https://docs.nvidia.com/cuda/hopper-compatibility-guide/index.html
> Nvidia A100 is architecture 8.0. The only 8.X cubin included in the
> list "6.1;7.5;8.6" is 8.6, which is higher than the device capability.
> So the built binary will not run on A100 (according to the documentation).
> 
> I intended to exclude GPUs older than the GTX 1080 (6.1), but the
> upstream PyTorch PyPI package seems to support even older GPUs:
> 
> In [3]: torch.cuda.get_arch_list()
> Out[3]: ['sm_50', 'sm_60', 'sm_70', 'sm_75', 'sm_80', 'sm_86', 'sm_90']
> 
> So I'm going to revise the architecture list, following upstream.
> That will make the binary much larger, but I think it is a less
> questionable configuration. Maybe something like this:
> 
>   5.0;6.0;7.0;7.5;8.0;8.6;9.0+PTX

Sounds reasonable to me.
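
For anyone who wants to sanity-check a list like that against a given
device, here is a rough sketch of the compatibility rule from the
NVIDIA docs (my own toy code, not anything from the packaging): a
cubin built for X.y only runs on devices X.z with z >= y within the
same major architecture, while a PTX entry can be JIT-compiled for any
device with a capability at or above it.

def covered(arch_list, capability):
    """True if a device with (major, minor) capability can run a
    binary built for arch_list, with entries written the way
    torch.cuda.get_arch_list() reports them: 'sm_XY' for cubins,
    'compute_XY' for PTX."""
    major, minor = capability
    for arch in arch_list:
        kind, _, num = arch.partition("_")
        m, n = divmod(int(num), 10)
        if kind == "compute" and (m, n) <= (major, minor):
            return True  # PTX: JIT-compiled forward onto newer devices
        if kind == "sm" and m == major and n <= minor:
            return True  # cubin: binary-compatible within the same major
    return False

# The old list: the only 8.x cubin is 8.6, so an A100 (8.0) falls
# through every case and is not covered.
print(covered(["sm_61", "sm_75", "sm_86"], (8, 0)))       # False
# The revised list ships an 8.0 cubin (plus 9.0 PTX), so it is covered.
print(covered(["sm_50", "sm_60", "sm_70", "sm_75", "sm_80",
               "sm_86", "sm_90", "compute_90"], (8, 0)))  # True

On a live system you can feed it torch.cuda.get_arch_list() and
torch.cuda.get_device_capability() directly.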

Best,
Christian

