Re: Enabling ROCm on Everything
Hi Étienne,
On 3/21/23 14:34, Étienne Mollier wrote:
> So perhaps this is a non-problem (at least regarding rocsparse,
> but other components may prove to be more difficult if they are
> much larger).
The rocfft library is about 4x the size of rocsparse, which is why it is
split up into five shared object libraries.
On 3/21/23 14:34, Étienne Mollier wrote:
> I'm not sure what to think. Long term there will be a need
> upstream to split the libraries when architectures will add up,
> otherwise the model will not scale due to the issues pointed out
> by Cory. Short term the monolithic library is not good, but
> fair enough, and splitting would introduce a number of issues
> pointed out by Mo Zhou.
I have no idea what upstream will do when they hit the binary size
limit. If their packaging infrastructure is sophisticated enough, maybe
they'll split their packages too. However, there's always the
possibility that they will just start dropping old architectures from
their binaries.
By contrast, Debian specializes in packaging and could significantly
improve the experience of working with ROCm. Consider two case studies
[1][2]. In the first case, the author began struggling with ROCm in part
because they wanted to work with gfx1011 (Radeon Pro V520) and gfx1031
(RX 6700 XT) but neither of those architectures is included in upstream
binaries despite being fully functional [1]. In the second case, the
author began by fighting with the amdgpu-install script and ended with
fruitlessly trying to enable gfx90c (which is not included in upstream
binaries) [2]. With the solution I propose, they could instead just `apt
install` their desired -gfx10 or -gfx9 packages.
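
To illustrate how a user (or a helper tool) might work out which
per-family package covers a given GPU, here is a minimal sketch. It
assumes the LLVM-style target naming, where the last two characters of
a gfx name are the minor version and stepping, and it assumes a
hypothetical "<base>-<family>" package naming scheme. The names
gfx_family, package_name and librocsparse0 are illustrative only, not
actual Debian package or tool names.

```python
import re

# Assumed naming: gfx<major><minor><stepping>, where minor and stepping
# are one hex digit each (e.g. gfx90c -> major 9, minor 0, stepping c).
_GFX_RE = re.compile(r"^gfx(\d+)([0-9a-f])([0-9a-f])$")


def gfx_family(target: str) -> str:
    """Return the major-version family of a gfx target, e.g. 'gfx10'."""
    match = _GFX_RE.match(target)
    if match is None:
        raise ValueError(f"unrecognized gfx target: {target!r}")
    return f"gfx{match.group(1)}"


def package_name(base: str, target: str) -> str:
    """Build a hypothetical per-family package name for a gfx target."""
    return f"{base}-{gfx_family(target)}"


if __name__ == "__main__":
    for target in ("gfx1031", "gfx90c"):
        print(target, "->", package_name("librocsparse0", target))
```

Under these assumptions, gfx1031 falls into the gfx10 family and gfx90c
into the gfx9 family, so a single package per family would cover
architectures that are missing from the upstream binaries.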
I strongly believe that with a little help from upstream, some clever
packaging in Debian could dramatically improve the average person's
experience working with AMD GPU libraries. It would be nice if upstream
had a bytecode format that made fancy packaging unnecessary. Maybe
one day that will happen, but I wouldn't hold my breath waiting.
We can fix this situation with the tools that we have here and now. It
might not be the simplest packaging, but I think that the expanded
hardware support it enables is worth the complexity.
Sincerely,
Cory Bloor
[1]: https://threedots.ovh/blog/2022/05/amd-rocm-a-wasted-opportunity/
[2]: https://scalability.org/state-of-amds-rocm/