Hello,
Pierre van Houtryve, an AMD compiler developer, has opened a pull request to add 'generic' ISAs to LLVM [1] as a key feature of Code Object v6. This change would introduce four new instruction sets: gfx9-generic, gfx10.1-generic, gfx10.3-generic and gfx11-generic. I'll briefly describe them.
1. gfx11-generic is the lowest common denominator of the features of gfx1100, gfx1101, gfx1102, gfx1103, gfx1150, and gfx1151. This ISA covers all RDNA 3 GPUs.
2. gfx10.3-generic is the lowest common denominator of the features of gfx1030, gfx1031, gfx1032, gfx1033, gfx1034, gfx1035, gfx1036, and gfx1037. I believe those ISAs are all identical, so the generic ISA will be gfx1030 in all but name. This ISA covers all RDNA 2 GPUs.
3. gfx10.1-generic is the lowest common denominator of the features of gfx1010, gfx1011, gfx1012, and gfx1013. This ISA covers all RDNA 1 GPUs.
4. gfx9-generic is the lowest common denominator of the features of gfx900, gfx902, gfx904, gfx906, gfx909, and gfx90c. This new ISA is practically the same as gfx900, but with a handful of instructions removed (since they were not available in gfx904). This ISA covers all Vega GPUs.
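To make the coverage above concrete, here is a rough sketch in Python of the same information as a lookup table. The architecture lists are copied from the descriptions above; the names GENERIC_TARGETS and generic_target_for are made up for this illustration and do not correspond to anything in LLVM or ROCm.

```python
# Coverage of each proposed generic ISA, as listed in this email.
GENERIC_TARGETS = {
    "gfx9-generic": ["gfx900", "gfx902", "gfx904", "gfx906", "gfx909", "gfx90c"],
    "gfx10.1-generic": ["gfx1010", "gfx1011", "gfx1012", "gfx1013"],
    "gfx10.3-generic": ["gfx1030", "gfx1031", "gfx1032", "gfx1033",
                        "gfx1034", "gfx1035", "gfx1036", "gfx1037"],
    "gfx11-generic": ["gfx1100", "gfx1101", "gfx1102", "gfx1103",
                      "gfx1150", "gfx1151"],
}


def generic_target_for(arch):
    """Return the generic ISA that covers `arch`, or None if none does.

    Hypothetical helper for illustration only.
    """
    for generic, members in GENERIC_TARGETS.items():
        if arch in members:
            return generic
    return None
```

With this table, generic_target_for("gfx1031") gives "gfx10.3-generic", and an architecture that is not in any of the four lists gives None.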
With most of these generic ISAs, there is a tradeoff between
performance and compatibility. When you compile for a generic ISA,
you increase the range of hardware that can be supported with a
single code object. However, the compiler may not be able to use
the best instructions for each platform (e.g., the dot-product
instructions from gfx1011 are excluded from gfx10.1-generic).
Additionally, the generic ISA must include the union of all
workarounds for hardware bugs found on all of the architectures it
supports.
The generic targets are versioned internally, so while clang
users may only specify a target name like gfx11-generic, the fully
qualified name as seen by the runtime loader will actually be
something like gfx11-generic-v1. If the generic ISA must be
revised to add support for new hardware, this will be handled by
bumping the generic ISA's version in the compiler. This versioning
is not useful for the existing generic targets (where we know
exactly what the lowest common denominator is), but it will be
important for future generations of hardware (since the generic
ISA may be released before all hardware in the generation has been
released). The runtime loader will know which gfx architectures
are supported on each version of a generic ISA, which ensures that
code objects are not loaded on incompatible hardware.
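As a rough illustration of the versioned check described above, the following Python sketch models a loader that accepts a code object only if the device architecture is listed for that exact version of the generic ISA. This is not the actual runtime loader logic. The "-vN" suffix format is assumed from the example name above, the function names are invented, and the v2 entry with its "gfx-hypothetical" architecture exists only to show what a version bump might look like.

```python
import re

# Architectures accepted for each (generic ISA, version) pair.
# The v1 list is copied from this email. The v2 entry and the
# "gfx-hypothetical" architecture are invented for illustration.
SUPPORTED = {
    ("gfx11-generic", 1): {"gfx1100", "gfx1101", "gfx1102",
                           "gfx1103", "gfx1150", "gfx1151"},
}
SUPPORTED[("gfx11-generic", 2)] = (
    SUPPORTED[("gfx11-generic", 1)] | {"gfx-hypothetical"}
)


def parse_versioned_name(name):
    """Split e.g. 'gfx11-generic-v1' into ('gfx11-generic', 1).

    The '-vN' suffix format is an assumption based on the example name.
    """
    match = re.fullmatch(r"(gfx[\d.]+-generic)-v(\d+)", name)
    if match is None:
        raise ValueError(f"not a versioned generic ISA name: {name!r}")
    return match.group(1), int(match.group(2))


def can_load(code_object_isa, device_arch):
    """Return True if the device is listed for that generic ISA version."""
    key = parse_versioned_name(code_object_isa)
    return device_arch in SUPPORTED.get(key, set())
```

In this model, can_load("gfx11-generic-v1", "gfx1100") is True, while a v1 code object is rejected on the invented "gfx-hypothetical" device and only a v2 code object is accepted there.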
It is still very early in the development cycle for this feature.
It will be many months before support is available in a ROCm
release. Nevertheless, I wanted to bring attention to this change.
I believe it is an important feature for distributions such as
Debian, because it allows for supporting a much larger range of
hardware with smaller binaries. This feature will (eventually)
obsolete the Debian-specific patches introduced into rocr-runtime
[2] and rocm-hipamd [3].
Sincerely,
Cory Bloor
[1]: https://github.com/llvm/llvm-project/pull/76955
[2]: https://salsa.debian.org/rocm-team/rocr-runtime/-/blob/debian%2F5.2.3-6/debian/patches/0004-extend-isa-compatibility-check.patch?ref_type=heads
[3]: https://salsa.debian.org/rocm-team/rocm-hipamd/-/blob/debian%2F5.2.3-13/debian/patches/0026-extend-hip-isa-compatibility-check.patch?ref_type=heads