
Re: RFC: Adding an X-ROCm-Built-For field to packages targeting ISAs



Hi Christian,

There is perhaps an additional complexity with the generic targets that you may wish to consider in your design.

The new generic targets have a hidden version number. When you specify that you wish to build for gfx11-generic, the compiler turns that into a command to build for gfx11-generic-v0. Suppose a new gfx11 GPU were released and added to gfx11-generic, and supporting it required changes to the gfx11-generic code generation. LLVM would then increment the internal ISA version number [1]. Using that newer version of LLVM, a request to build for gfx11-generic would build for gfx11-generic-v1.

The version number exists so that if you attempt to run an old gfx11-generic-v0 binary on that new and incompatible gfx11 GPU, the HIP Runtime knows that the code object is not compatible and does not load it. The fix is to rebuild the binary for gfx11-generic with the newer compiler, which outputs gfx11-generic-v1 code objects that the HIP Runtime recognizes as compatible with the new hardware.

In any case, my point is that I think the information you care about is the compiler's full target name, including the version number. This distinction doesn't matter yet, as we're not using generic targets on Debian and those are the only targets that have a version number. Also, I don't think LLVM has ever incremented a generic target version number. Nevertheless, it's something to consider for the future if we're designing for the long term.
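
To illustrate why the full target name matters, here is a minimal Python sketch that parses a target name with an optional version suffix and checks whether a code object would load. The `-vN` spelling follows the names used in this message. `Target`, `parse_target`, and `can_load` are hypothetical names, and the rule that a code object loads only when its version is at least the version the GPU requires is an assumption based on the scenario above, not the HIP Runtime's actual logic.

```python
import re
from typing import NamedTuple, Optional


class Target(NamedTuple):
    name: str               # e.g. "gfx11-generic" or "gfx1030"
    version: Optional[int]  # None for targets that carry no version


# Assumed spelling: "<family>-generic-v<N>", as written in this message.
_VERSIONED = re.compile(r"^(?P<name>.+-generic)-v(?P<version>\d+)$")


def parse_target(text: str) -> Target:
    """Split a full target name into its base name and optional version."""
    match = _VERSIONED.match(text)
    if match:
        return Target(match.group("name"), int(match.group("version")))
    return Target(text, None)


def can_load(built: Target, required: Target) -> bool:
    """Assumed rule: names must match, and a versioned code object must be
    at least as new as the version the GPU requires."""
    if built.name != required.name:
        return False
    if built.version is None or required.version is None:
        # Unversioned targets only match other unversioned targets.
        return built.version == required.version
    return built.version >= required.version
```

Under these assumptions, a package field that recorded only gfx11-generic could not distinguish a v0 build from a v1 build, whereas one that recorded the full name could.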

On 2025-06-27 00:38, Christian Kastner wrote:
> I would like to propose that all binary packages built for AMD GPU ISAs
> document those ISAs in an X-ROCm-Built-For field.
>
> For example:
>
>    X-ROCm-Built-For: gfx900 gfx1030 gfx1200 ...
>
> I do mean *all* packages, so including all our reverse dependencies.

It would be nice if this could be consistent across different types of accelerators. Ultimately, this field basically means that the program was built by calling `clang++ --offload-arch=<$X-ROCm-Built-For[0]> --offload-arch=<$X-ROCm-Built-For[1]> ...`.

For AMD GPUs, those values are gfx900 gfx1030 gfx1200 ...
For Intel GPUs, those values are bdw, acm_g10, acm_g11, pvc ...
For NVIDIA GPUs, those values are sm_60, sm_70, sm_80, sm_90 ...
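
The mapping from the field to the compiler invocation described above can be sketched in Python as follows. This is only an illustration: it assumes the field value is a whitespace-separated list, as in the proposed example, and the function names and the `kernel.hip` file name are hypothetical.

```python
import shlex


def offload_arch_flags(field_value: str) -> list[str]:
    """Turn a whitespace-separated field value into --offload-arch flags."""
    return [f"--offload-arch={arch}" for arch in field_value.split()]


def clang_command(field_value: str, sources: list[str]) -> list[str]:
    """Build the argument list for the corresponding clang++ call."""
    return ["clang++", *offload_arch_flags(field_value), *sources]


# Print the command line for the AMD example values from the proposal.
print(shlex.join(clang_command("gfx900 gfx1030 gfx1200", ["kernel.hip"])))
```

The same helper would work unchanged for the Intel or NVIDIA values, which is one argument for a vendor-neutral field.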

I also wonder if other sorts of accelerators might be supported through the same mechanism (e.g., NPUs). To compile code for the XDNA NPU, you invoke clang using --target=aie2-none-unknown-elf.

Would it make sense to have one field that specifies the accelerator architectures for all vendors? Or would it make more sense to have a different field for each vendor / accelerator toolchain? e.g., X-Offload-Arch vs. X-<Vendor>-<Device Type/Runtime/Toolchain>-Arch?

Your approach is basically the latter (modulo minor naming differences). I don't have anything against that. It's just worth making an explicit decision to take that approach, if that's the plan.

> It's debatable whether this should also be added to -dev packages.
> I myself don't think this would contribute much, other than extra
> maintenance work.

I don't think it makes sense on them anyway, as they don't contain any GPU code. That also cleanly solves matters for libraries such as rocfft, where the library does not contain any GPU code (because it depends on run-time compilation) and therefore works on many different GPUs.

> We could also use this list to "bridge" back to our CI. Does a package
> pass all its tests on the listed ISAs -> otherwise, report a bug.

Although, I suppose this idea implies a somewhat different interpretation of the field. You are not saying, "this is the ISA that the package was built for" but rather "these are the GPUs that the package supports". Those are very different things in the case of generic targets, the SPIR-V target, and run-time compilation. You'll need to be clear about which you mean.
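
As a rough sketch of the CI idea under the "these are the GPUs that the package supports" interpretation, the check could look like the Python below. The function name and the shape of the CI results (a mapping from ISA name to pass/fail) are assumptions for illustration, not an existing interface.

```python
def check_listed_isas(
    field_value: str, ci_results: dict[str, bool]
) -> tuple[list[str], list[str]]:
    """Compare the ISAs listed in the field against per-ISA CI results.

    Returns (failed, untested): listed ISAs whose tests failed, and listed
    ISAs for which no CI result exists at all.
    """
    failed = []
    untested = []
    for isa in field_value.split():
        if isa not in ci_results:
            untested.append(isa)
        elif not ci_results[isa]:
            failed.append(isa)
    return failed, untested
```

Note that for a generic target or SPIR-V, the name in the field would not correspond to any single GPU in the CI results, which is exactly where the two interpretations diverge.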

Sincerely,
Cory Bloor

[1]: Of course, if a new gfx11 GPU did not require any changes to the gfx11-generic code generation to function, the version number would not be incremented. That's the best-case scenario, because it means that old binaries remain compatible with newer hardware and require nothing but an update to the driver / runtime.

