
Re: ROCM host architectures



Hi Christian,

On 2/21/23 14:06, Christian Kastner wrote:
> the builds for the recent rocr-runtime [1] failed on all 32-bit
> architectures, and rocm-hipamd [2] only succeeded on amd64 and arm64. I
> assume this will be the case for many other related packages.

I'm very curious if the arm64 build actually works. Does anybody have hardware to test it?

> This is hardly surprising, as some of these architectures (most notably
> armel, armhf, mipsel, i386) are probably not at all what ROCm was
> intended for. For example, it seems that PCIe 3.0 with atomics is
> required for anything GPU-related.

I suspect that trying to enable 32-bit architectures would be a very painful process, given that upstream is not considering them at all. That said, the requirement for PCIe 3.0 with atomics is perhaps a bit overstated. In September 2021, AMD driver developer Alex Deucher clarified [1]:

    PCI atomics are not required by the firmware on vega parts. They are required by the firmware on navi parts, but we are in the process of fixing that. Beyond the firmware, there are no requirements for PCI atomics in the greater ROCm stack in general, although they are required for certain features (e.g., atomic shader instructions writing to system memory).

I'm afraid the docs have probably steered you wrong. Alas, there are a lot of things in ROCm where the only real way to know if something works is to try it and see.

> Some of these packages present quite a burden to the buildds of these
> already very constrained systems.
>
> I propose that unless we expect these builds to pass and the packages to
> also be usable, we should initially limit to:
>
>     Architecture: amd64 arm64 ppc64el

That seems reasonable.

> For package builds that already pass on architectures not listed above,
> I'm somewhat skeptical that we'd want them in testing, if only for the
> maintenance burden that they can cause (unusable or FTBFS -> RC).
>
> I don't think the above would have any negative impact on actual users,
> but on the positive side, it would reduce the burden on the relevant
> buildds, and possibly also on the maintainers.
>
> Going forward, assuming that these packages will sooner or later all
> have autopkgtests anyway, I would suggest that we extend the
> architecture list as soon as we get the tests to pass.
>
> Thoughts?

Sounds good to me.
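
As a rough illustration of the proposed restriction, here is a minimal Python sketch that checks whether a given Debian architecture name is covered by a space-separated Architecture field. The helper name `is_listed` is hypothetical and not part of any Debian tooling. It only handles literal architecture names plus `any`; other wildcards such as `linux-any` are deliberately not handled.

```python
# Hypothetical helper for illustration only; not part of any Debian tooling.
# The default list is the one quoted in the proposal above.
PROPOSED = "amd64 arm64 ppc64el"


def is_listed(arch: str, field: str = PROPOSED) -> bool:
    """Return True if arch appears literally in a space-separated
    Architecture field, or if the field contains "any"."""
    names = field.split()
    # "any" matches every architecture; other wildcards are not handled.
    return "any" in names or arch in names


if __name__ == "__main__":
    for arch in ("amd64", "arm64", "ppc64el", "i386", "armhf"):
        print(arch, "listed" if is_listed(arch) else "not listed")
```

Extending the list once autopkgtests pass on a new architecture would then just mean adding its name to the field.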

Sincerely,
Cory Bloor

[1]: https://www.phoronix.com/forums/forum/linux-graphics-x-org-drivers/open-source-amd-linux/1276533-does-rocm-require-pcie-3-0-with-pcie-atomics-or-not
