Hi folks,
It looks like we will lose the last vestiges of upstream gfx803 support in the update to ROCm 6 [1]:
> ROCm 5.7.1 is the last working version for polaris (I'm stuck on it
> with an RX580), though compilation needs various
> `-DAMDGPU_TARGETS=gfx803` flags. PyTorch 2.2 works, building with
> `export USE_ROCM=1` and `-DAMDGPU_TARGETS=gfx803`. (Newer PyTorch
> needs to address https://github.com/pytorch/pytorch/issues/119081
> before it'll work again.)
>
> If you try ROCm 6, run clinfo with export AMD_LOG_LEVEL=1, you'll
> probably see:
>
>   Unsupported HSA device gfx803 (PCI ID 67df) for ISA amdgcn-amd-amdhsa--gfx803
>   Error creating new instance of Device.
>
> Best I can tell from poking around the code diffs, it's not easy to
> just make it work regardless. The code for ROC_ENABLE_PRE_VEGA is
> gone, and even if you sneak it back in there, clinfo works but any
> actual opencl program fails.
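For anyone who wants to check their own machine for the failure described in that comment, here is a minimal Python sketch. It assumes that clinfo is on the PATH and that the rejection message matches the wording quoted above, which may differ between ROCm releases. The helper names (gfx803_rejected, run_clinfo_with_logging) are made up for illustration and are not part of any ROCm tooling.

```python
import os
import shutil
import subprocess

# Assumption: marker text copied from the quoted report; the exact wording
# may vary between ROCm versions.
REJECTION_MARKER = "Unsupported HSA device gfx803"


def gfx803_rejected(log_text: str) -> bool:
    """Return True if the runtime log contains the gfx803 rejection message."""
    return REJECTION_MARKER in log_text


def run_clinfo_with_logging() -> str:
    """Run clinfo with AMD_LOG_LEVEL=1 and return stdout and stderr combined."""
    env = dict(os.environ, AMD_LOG_LEVEL="1")
    result = subprocess.run(
        ["clinfo"], env=env, capture_output=True, text=True, check=False
    )
    return result.stdout + result.stderr


if __name__ == "__main__":
    if shutil.which("clinfo") is None:
        print("clinfo not found on PATH")
    else:
        print("gfx803 rejected:", gfx803_rejected(run_clinfo_with_logging()))
```

A plain substring match is used deliberately, since the log format is not documented here and anything stricter would be guessing.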
I have a pile of previously supported gfx803 GPUs, including the
RX570, WX 7100, S9300 X2, MI6 and MI8. I'm willing to use them to
run whatever tests folks want. However, the reality is that ROCm
has been kinda broken on this hardware ever since AMD dropped
official support back in 2020. When I test older versions of
rocBLAS, I find out-of-bounds writes showing up in its test suite
since ROCm 3.7 (the first version after official support was
dropped).
It's unfortunate, as there is still a lot of gfx803 hardware out in
the wild. The reality is, though, that neither AMD nor the broader
community has taken the steps needed to keep this architecture
alive. As the support for this hardware is broken and nobody has
stepped up to fix it, I think we should drop gfx803 from the library
builds when we upgrade to ROCm 6.1.
Sincerely,
Cory Bloor
[1]: https://www.reddit.com/r/ROCm/comments/1fbck0z/comment/lm1j2n6/