
Notes about ROCm on RDNA2



Hi folks,

I wanted to briefly share some of what I've learned about using ROCm with RDNA2 GPUs. I'm sure many of you are aware that only a couple of RDNA2 GPUs are officially supported by the AMD ROCm project upstream: the Radeon Pro W6800 and the Radeon Pro V620. In practice, however, all Navi 21 GPUs share the same processor id (gfx1030) and work just fine despite not being officially supported.

Other Navi 2x GPUs have different processor ids. If I recall correctly, Navi 22 is gfx1031, Navi 23 is gfx1032, and Navi 24 is gfx1034. Thus, if you try to use the AMD binaries built for Navi 21 on Navi 22, 23, or 24, no compatible code objects will be found and your program will exit with a fatal error from the HIP runtime.

I'd thought that this incompatibility was fundamental, but it seems I was wrong. LLVM treats the gfx1030–gfx1036 targets identically. It generates code objects with different processor ids stamped in the metadata, but the executable code is all the same. In fact, if you tell libhsakmt to report the device as being gfx1030, any of the gfx103x GPUs will be able to load and execute code compiled for gfx1030. This can be done by setting an environment variable [1]:

    export HSA_OVERRIDE_GFX_VERSION=10.3.0

As far as I can tell, despite having differing ids, all RDNA2 desktop GPUs share the same ISA and can execute the same code. This was a pleasant surprise, as it greatly expands the list of hardware that could be used with ROCm.
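
For anyone who wants to apply the override only where it is needed, here is a rough sketch. The two function names are hypothetical helpers of my own, not part of ROCm, and the detection step assumes that rocminfo is installed and prints the target name (e.g. gfx1031) for the GPU. On a machine with more than one GPU it only looks at the first match, so treat it as a starting point rather than a robust tool.

```shell
#!/bin/sh
# Hypothetical helper (not part of ROCm): succeeds when the given target is
# one of gfx1031 through gfx1036, i.e. a gfx103x id other than gfx1030.
needs_gfx1030_override() {
    case "$1" in
        gfx1031|gfx1032|gfx1033|gfx1034|gfx1035|gfx1036) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper. Assumption: rocminfo is on PATH and its output
# contains the target name of the GPU. Prints the first gfx name found,
# or nothing if rocminfo is missing or reports no GPU.
detect_gfx_target() {
    rocminfo 2>/dev/null | grep -o 'gfx[0-9a-f]*' | head -n 1
}

target="$(detect_gfx_target || true)"
if needs_gfx1030_override "$target"; then
    # Report the device as gfx1030 so that gfx1030 code objects are loaded.
    export HSA_OVERRIDE_GFX_VERSION=10.3.0
fi
```

The export needs to happen in the shell that launches your program, so source the snippet rather than running it as a separate script. To limit the override to a single run, it can also be set on the command line, e.g. `HSA_OVERRIDE_GFX_VERSION=10.3.0 ./my_hip_program`, where my_hip_program stands in for whatever binary you are running.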

Sincerely,
Cory Bloor

[1]: https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/blob/rocm-5.2.0/src/topology.c#L1180

