Re: ROCm in-box support at Computex 2025
Hi Cory,
On 2025-05-28 08:24, Cordell Bloor wrote:
> AMD certainly wanted to have Debian on that list, so maybe we should
> talk about why it wasn't there.
>
> Ultimately, the problem was that AMD couldn't commit to having the ROCm
> stack on Debian in a state in which it could be recommended as an
> alternative AMD-packaged ROCm stack within the next year. To get it to a
> state that it could be recommended to users, we'd have to get ROCm
> updated to the latest upstream release and package the remaining
> components required for PyTorch.
>
> AMD is working towards getting ROCm to that point on Debian. The
> inability to commit to getting it there by a specific date is a
> reflection of (1) the structure of the Debian project, in which nobody
> but the FTP Masters can decide that something will be accepted,
That is technically true, but ftp-master's involvement in accepting
packages is in a relatively superficial compliance capacity; one that
any larger organization distributing code and/or binaries will have,
including AMD and Ubuntu.
Could you (some time) elaborate on what problems AMD saw here? There
is a BoF event scheduled for DebConf25 on package acceptance [3] and
this would be valuable input for it.
> and (2) the fact that the Debian project is still so far away from
> that goal.
That applies even more so to Ubuntu though.
>> And we now all find ourselves in the strange situation where new
>> packaging contributors will need to be onboarded from the Ubuntu side,
>> whereas contributors from Debian must re-evaluate how to continue (see
>> below).
>
> I am also helping to onboard some AMD-hired Debian packaging
> contributors from Fedora. It's rather unfortunate that Mario and myself
> are the only two AMD developers
> [...]
> In any case, I don't yet know exactly what additional resources AMD will
> be bringing onboard with this announcement. Your concerns are heard, but
> I think it's premature to criticize the contributors being on-boarded
> when nobody knows who they are.
Please note that I wasn't criticizing the people (I don't even know
them), I was criticizing the situation. I mean, you yourself just
characterized one way it involves you as unfortunate.
>> (1) We know the packaging will happen for Ubuntu, so there is little
>> point in expending our own limited capacities, rather than to wait
>> for that. That doesn't just avoid duplication, it reduces the risk
>> of conflicting development paths.
>
> I think this would be true of any solution which involved paid
> contributors and volunteers. Why would you volunteer to do something
> that other people would be paid to do otherwise?
For one, that's a pretty odd question to ask a Debian contributor, of
all ;-)
There are many reasons why people contribute to FOSS. Sometimes it's
about a shared idealistic goal. Or just for the fun of it. Or to scratch
one's own itch. Why would it affect my own motivations if others do it
to pay the rent? I'd probably have to stop contributing to >90% of my
upstreams if doing work others get paid for were an issue. It's not.
In any case, when done right, "paid contributors exist" isn't a
displacing factor, it's an attracting one, as it means there are more
people to collaborate with.
> It makes total sense to reallocate your time elsewhere. If AMD is going
> to dedicate the resources to building out that base, then maybe you can
> spend your time on more interesting things?
I know you meant this well in a personal sense, but can you see how that
might sound displacing to anyone who chose to contribute to the Debian
ROCm Team so far? As opposed to a call-out for a Ubuntu+Debian
collaboration, for example.
Also, what if AMD ends up not building out that base? Or decides to
postpone it? Or makes design decisions downstream that are
irreconcilable upstream? Etc.
We know that individual contributors at AMD, including yourself, are
going above and beyond for Debian! But you also know what kind of a
show-stopper "not officially supported" can be.
>> (2) We can expect that ROCm in Ubuntu will see extensive CI testing,
>> so again, there is little point in operating, much less expanding,
>> our own CI.
[Clarification going forward: with "our own CI", I meant our specific
instance at ci.rocm.debian.net. And I'm not the author of debci, I
maintain a fork of it.]
> I think it is unlikely that the Ubuntu continuous integration system
> will have anywhere near the breadth of hardware that the Debian ROCm CI
> contains. And I don't think it has public logs, either.
Yet another reason why I would have liked to hear about a Ubuntu+Debian
effort.
> Nevertheless, there is an important reason why I think you should
> continue to work on the Debian Continuous Integration system: Debian
> should be building a vendor-neutral system for supporting
> accelerator architectures (Intel, NVIDIA, AMD GPUs, NPUs, and FPGAs)
> on the DebCI.
Well, yes, I've repeatedly stated that to be the goal from the
beginning (e.g. [4]). Our policies will eventually need to be updated,
and our ROCm CI was the perfect experimental playground to inform that
process.
> And weren't you working on expanding the Debian ROCm CI system to cover
> both Debian and Ubuntu?
That work is long done, see my RFC [5] for how/when to activate this.
> If the testing on Ubuntu is sufficient to reason
> about Debian, then wouldn't the existing testing on Debian already be
> sufficient to reason about Ubuntu?
Only to a degree. Enough of the stack diverges, even if only
occasionally, to require proper testing on both. Kernel, compilers,
GLIBC, Python, etc.
> AMD will be doing internal testing for ROCm components and key
> applications on both Debian and Ubuntu, but I still don't think that's a
> replacement for the DebCI.
Neither do I, hence why I would have liked to hear about a Ubuntu+Debian
effort.
>> Again, I realize how egotistical this might sound, and I'm sorry for
>> that. I know this is a win for open source in general, I just wish this
>> were more of a win for Debian, given how much we contributed to this.
>
> I think it is an enormous win for Debian. All I can ask is that you
> withhold judgement until we can share more details.
That is certainly a fair ask. I'll refrain from commenting further on
this until we have more details.
Best,
Christian
>> [1] Not to diminish the extra Ubuntu work you did to get our packages
>> updated and synced for 24.04.
> [2]: https://lists.debian.org/debian-ai/2024/06/msg00001.html
[3]: https://debconf25.debconf.org/talks/3-package-acceptance-in-debian-challenges-and-opportunities/
[4]: https://lists.debian.org/debian-ai/2023/07/msg00098.html
[5]: https://lists.debian.org/debian-ai/2024/07/msg00078.html