
Re: pytorch-cuda: What range of GPUs to support?



On Tue, 2025-02-18 at 18:24 -0500, M. Zhou wrote:
> Hi Team,
> 
> (CC'ed debian-science, but please redirect discussion to -ai@l.d.o)
> 

After reading the CUDA documentation, I realized that the previous list
is problematic: some GPUs, such as the A100, would not be supported.
https://docs.nvidia.com/cuda/hopper-compatibility-guide/index.html
The Nvidia A100 has compute capability 8.0. The only 8.x cubin included
in the list "6.1;7.5;8.6" is 8.6, which is higher than the device
capability. So the built binary will not run on the A100 (according to
the documentation).
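
To make the rule concrete, here is a minimal sketch of the compatibility
check as I understand it from the documentation. The helper names
(parse_arch_list, can_run) are hypothetical and not part of PyTorch or
CUDA. It assumes that a cubin runs only on devices of the same major
version with an equal or higher minor version, and that PTX is embedded
only for entries marked "+PTX". Architecture-specific targets such as
"9.0a" are not handled.

```python
def parse_arch_list(arch_list):
    """Split e.g. '6.1;7.5;8.6+PTX' into ((major, minor), has_ptx) pairs."""
    entries = []
    for item in arch_list.split(";"):
        item = item.strip()
        if not item:
            continue
        has_ptx = item.endswith("+PTX")
        version = item[:-len("+PTX")] if has_ptx else item
        major, minor = version.split(".")
        entries.append(((int(major), int(minor)), has_ptx))
    return entries


def can_run(arch_list, device):
    """Return True if a binary built for arch_list should load on device.

    device is a (major, minor) compute capability tuple, e.g. (8, 0).
    """
    for cc, has_ptx in parse_arch_list(arch_list):
        # cubin: same major version, device minor >= cubin minor
        if cc[0] == device[0] and cc[1] <= device[1]:
            return True
        # PTX: can be JIT-compiled for any device of equal or higher capability
        if has_ptx and cc <= device:
            return True
    return False


# A100 is compute capability 8.0; the old list has no usable cubin for it.
a100_supported = can_run("6.1;7.5;8.6", (8, 0))
```

Under these assumptions a100_supported is False, which matches the
reading of the documentation above.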

I intended to exclude GPUs older than the GTX 1080 (6.1), but the
upstream PyTorch package on PyPI seems to support even older GPUs:

In [3]: torch.cuda.get_arch_list()
Out[3]: ['sm_50', 'sm_60', 'sm_70', 'sm_75', 'sm_80', 'sm_86', 'sm_90']

So I'm going to revise the architecture list to follow upstream.
That will make the binary much larger, but I think it is a less
questionable configuration. Maybe something like this:

  5.0;6.0;7.0;7.5;8.0;8.6;9.0+PTX
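
For a rough idea of what such a list means for the build, here is a
sketch of how it conventionally expands into nvcc -gencode options: one
cubin (code=sm_XX) per entry, plus embedded PTX (code=compute_XX) for an
entry marked "+PTX". The helper name gencode_flags is hypothetical, and
the exact flags emitted by the PyTorch build system may differ.

```python
def gencode_flags(arch_list):
    """Expand e.g. '8.0;9.0+PTX' into a list of nvcc -gencode options."""
    flags = []
    for item in arch_list.split(";"):
        item = item.strip()
        if not item:
            continue
        has_ptx = item.endswith("+PTX")
        version = item[:-len("+PTX")] if has_ptx else item
        num = version.replace(".", "")  # "8.6" -> "86"
        # Native machine code (cubin) for this architecture
        flags.append(f"-gencode=arch=compute_{num},code=sm_{num}")
        if has_ptx:
            # Also embed PTX so newer GPUs can JIT-compile the kernels
            flags.append(f"-gencode=arch=compute_{num},code=compute_{num}")
    return flags


flags = gencode_flags("5.0;6.0;7.0;7.5;8.0;8.6;9.0+PTX")
for flag in flags:
    print(flag)
```

Each entry adds another copy of every kernel to the binary, which is
where the size increase comes from.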

I hope this will not lead to linker overflow.

