
Re: Notes about ROCm on RDNA2



Dear Cordell,
 I tested HSA_OVERRIDE_GFX_VERSION=10.3.0 on my RX 5700 XT with ROCm 5.2; both tensorflow-rocm and pytorch-1.12.0 can run mnist properly.
 So, do we have to support Navi 10 separately? It looks like we can use the Navi 21 code.


Email: xyz20003@gmail.com



---- Replied Message ----
From: Cordell Bloor <cgmb-deb@slerp.xyz>
Date: 07/11/2022 08:49
To: debian-ai@lists.debian.org
Subject: Notes about ROCm on RDNA2
Hi folks,

I wanted to briefly share some of what I've learned about using
ROCm with RDNA2 GPUs. I'm sure many of you are aware that there are
only a couple of RDNA2 GPUs officially supported by the AMD ROCm
project upstream: the Radeon Pro W6800 and Radeon Pro V620. In
practice, however, all Navi 21 GPUs share the same processor id
(gfx1030) and will work just fine despite not being officially
supported.

Other Navi 2x GPUs have a different processor id. If I recall correctly,
Navi 22 is gfx1031, Navi 23 is gfx1032, Navi 24 is gfx1033, etc. Thus,
if you try to use the AMD binaries built for Navi 21 on Navi 22 or
above, no compatible code objects will be found and your program will
exit with a fatal error from the HIP runtime.
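As a simplified illustration of the failure mode (this is a sketch of the
idea, not the actual HIP runtime implementation), code-object selection
behaves roughly like an exact match on the processor id:

```python
# Simplified model of HIP code-object loading: the runtime looks for a
# code object whose processor id matches the device's id exactly.
# Illustrative only -- the real runtime matches full target triples.

def select_code_object(device_id, code_objects):
    """Return a code object matching device_id, or None if none match."""
    for obj in code_objects:
        if obj == device_id:
            return obj
    return None

# A binary built only for Navi 21 carries only gfx1030 code objects.
shipped = ["gfx1030"]

# On a Navi 21 (gfx1030) device, a compatible object is found:
assert select_code_object("gfx1030", shipped) == "gfx1030"

# On a Navi 22 (gfx1031) device, nothing matches, and the HIP runtime
# exits with a fatal "no compatible code object" error:
assert select_code_object("gfx1031", shipped) is None
```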

I'd thought that this incompatibility was fundamental, but it seems I
was wrong. LLVM treats the gfx1030–gfx1036 targets identically. It
generates code objects with different processor ids stamped in the
metadata, but the executable code is all the same. In fact, if you tell
libhsakmt to report the device as being gfx1030, any of the gfx103x GPUs
will be able to load and execute code compiled for gfx1030. This can be
done by setting an environment variable [1]:

    export HSA_OVERRIDE_GFX_VERSION=10.3.0
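For what it's worth, the override string appears to just be the
major.minor.stepping form of the gfx processor id (gfx1030 → 10.3.0).
A small sketch of that mapping, assuming the last two characters of the
id are single hex digits for the minor version and stepping (which would
also cover ids like gfx90a → 9.0.10):

```python
def gfx_to_override(gfx: str) -> str:
    """Convert an LLVM gfx processor id (e.g. 'gfx1030') into the
    major.minor.stepping string used by HSA_OVERRIDE_GFX_VERSION.

    Assumes the last two characters are hex minor/stepping digits and
    everything between 'gfx' and them is the decimal major version --
    an observed pattern, not a documented guarantee.
    """
    body = gfx.removeprefix("gfx")
    major, minor, stepping = body[:-2], body[-2], body[-1]
    return f"{int(major)}.{int(minor, 16)}.{int(stepping, 16)}"

assert gfx_to_override("gfx1030") == "10.3.0"  # Navi 21
assert gfx_to_override("gfx1031") == "10.3.1"  # Navi 22
```

So setting HSA_OVERRIDE_GFX_VERSION=10.3.0 causes the device to be
reported as gfx1030 regardless of which gfx103x part it actually is.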

As far as I can tell, despite having differing ids, all RDNA2 desktop
GPUs share the same ISA and can execute the same code. This was a
pleasant surprise, as it greatly expands the list of hardware that could
be used with ROCm.

Sincerely,
Cory Bloor

[1]:
https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/blob/rocm-5.2.0/src/topology.c#L1180
