Re: Using hwcaps instead of simde dispatcher scripts?
Hi Nilesh, Michael,
thank you for sharing your existing solution, and the interesting
discussion!
On 2025-07-27 13:40, Michael R. Crusoe wrote:
> On 26/07/2025 19.32, Nilesh Patra wrote:
>
> Using hwcaps for dynamically-linked scientific computing libraries is a
> great idea, yes! I recommend improving the documentation at
> https://wiki.debian.org/InstructionSelection#hwcaps with concrete
> Debian-specific examples (or perhaps linking to a new wiki page if that
> gets too long).
Good idea, I will do that. I wasn't aware of that page, thanks!
> Unfortunately, I think that many of the packages from the scientific
> Debian Blends teams don't put their performance critical functions in a
> dynamically loaded library, and thus would NOT benefit from the GLIBC
> 2.33+ hwcaps feature. Using your example of the "scrappie" Debian
> package, we see that there are only binaries, and no dynamic libraries:
> https://packages.debian.org/sid/amd64/scrappie/filelist
> https://packages.debian.org/unstable/scrappie
>
> I would love to see a generic Debian dispatcher script that could be
> used for amd64 systems (and eventually arm64 & riscv64 systems) to
> select between binaries using a similar naming scheme to GLIBC hwcaps,
> but anchored in /usr/bin/ (/usr/bin/x86_64-v[1234]/* ?)
> For binary selection, we could add a script to
> https://tracker.debian.org/pkg/subarch-select which would be symlinked
> from /usr/bin/app-name and would use subarch-select to choose between
> /usr/bin/x86_64-v[1234]/app-name based upon the current CPU's capabilities.
That sounds like a fantastic idea. This may be going too far for now, but
FHS forbids subdirectories in /usr/bin. Something like
/usr/libexec/debian-hwcaps/<level> or similar would of course work as well.
And I wasn't aware of subarch-select. Really neat.
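To make the idea a bit more concrete, here is a minimal sketch of what such
a dispatcher could look like. It does not use subarch-select (I don't know
its interface yet) and reads /proc/cpuinfo directly instead. The
/usr/libexec/debian-hwcaps layout, the x86-64-vN directory names (borrowed
from glibc-hwcaps, with a v1 directory for the baseline) and all function
names are assumptions for illustration only, and the flag sets are an
approximation of the psABI levels using the kernel's flag names.

```python
import os
import sys

# Approximate /proc/cpuinfo flags per x86-64 level (assumption: kernel flag
# names, where "pni" is SSE3 and "abm" covers LZCNT).
LEVEL_FLAGS = {
    2: {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"},
    3: {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"},
    4: {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"},
}

# Hypothetical install root; not an existing Debian convention.
HWCAPS_ROOT = "/usr/libexec/debian-hwcaps"


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the flag set of the first CPU listed in /proc/cpuinfo."""
    with open(path) as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()


def detect_level(flags):
    """Return the highest level whose flags, and all lower levels', are present."""
    level = 1
    for candidate in (2, 3, 4):
        if not LEVEL_FLAGS[candidate] <= flags:
            break
        level = candidate
    return level


def pick_binary(app, level, root=HWCAPS_ROOT, exists=os.path.exists):
    """Return the best available build at or below `level`, or None."""
    for candidate in range(level, 0, -1):
        path = os.path.join(root, f"x86-64-v{candidate}", app)
        if exists(path):
            return path
    return None


def main():
    # The script is meant to be symlinked as /usr/bin/<app-name>.
    app = os.path.basename(sys.argv[0])
    target = pick_binary(app, detect_level(read_cpu_flags()))
    if target is None:
        sys.exit(f"{app}: no suitable build found under {HWCAPS_ROOT}")
    os.execv(target, [target] + sys.argv[1:])
```

An installed dispatcher would simply call main() at the end. Falling back
to lower levels means a package does not have to ship all four builds.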
> Likewise I would love to see shared helpers for d/rules for building
> both shared library packages and single-binary packages which automate
> the multiple builds and multiple installation locations needed, thus
> simplifying the work required to take full advantage of GLIBC hwcaps
> and/or the debian-wide shared dispatcher script mentioned above. (Some
> packages might have critical code in both an application binary and
> shared libraries, thus benefiting from using both of the multi-build
> approaches outlined above).
This would be terrific, but I fear it would also be extremely complicated.
I'm not even sure this can be solved with additional helpers; it might need
changes to debhelper itself.
For example, what would an override of dh_auto_configure look like? The
same problem applies to many, though not all, of the other steps in the sequence.
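To illustrate the scale of the problem, here is a rough sketch of the
per-level build matrix that any helper would have to drive. It assumes an
autoconf-style package and a compiler that accepts -march=x86-64-vN (GCC 11
or later); the function names and the build-directory naming are made up,
and the command line is only a guess at what one configure step could look
like, not a tested d/rules recipe.

```python
import shlex

# Subdirectory names that glibc 2.33+ searches under glibc-hwcaps on amd64.
LEVELS = ("x86-64-v2", "x86-64-v3", "x86-64-v4")
MULTIARCH = "x86_64-linux-gnu"


def build_plan(levels=LEVELS, multiarch=MULTIARCH):
    """Describe one extra library build per micro-architecture level."""
    plan = []
    for level in levels:
        plan.append({
            "level": level,
            "builddir": f"build-{level}",           # hypothetical naming
            "cflags": f"-O2 -march={level}",
            "libdir": f"/usr/lib/{multiarch}/glibc-hwcaps/{level}",
        })
    return plan


def configure_command(step):
    """Build a hypothetical dh_auto_configure call for one level."""
    return [
        "dh_auto_configure",
        f"--builddirectory={step['builddir']}",
        "--",
        f"CFLAGS={step['cflags']}",
        f"--libdir={step['libdir']}",
    ]


# Print the configure step for every level; build, test and install
# would each need the same treatment.
for step in build_plan():
    print(shlex.join(configure_command(step)))
```

Every later step (build, test, install) needs to be repeated per build
directory in the same way, which is why I suspect this goes beyond what a
simple helper can do.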
Given where compute is going, it's definitely something that we as a
Project should be looking into.
> For RISCV64, I would suggest that the RISC-V Application Profiles
> (RVA{20,22,23}) would be used in the same way that the x86_64-v{1,2,3,4}
> micro-architectures are used on amd64; but this is not yet supported by
> GLIBC. However Debian could support them in the same way that I suggest
> above for amd64 in /usr/bin/x86_64-v[234]/*, perhaps using
> /usr/bin/riscv64-RVA{20,22,23}/*.
There was a RISCV64 BoF at DebConf25 where I inquired about this.
Aurelien said GLIBC upstream is working on hwcaps there.
> For arm64, I think this would require a bit more research. I'm not sure
> that subsequent ARMv{8,9} revisions are strictly followed as I've
> noticed that ARM suggests checking for specific CPU features and not for
> architecture revisions like "ARMv8.6".
From what I gather, ARMv{8,9} don't have uarch levels that guarantee a
given set of features; the features are all optional. So yeah, you really
have to test for each feature individually. I found this out via ggml.
A good example is Apple's M4 chips: none of them appears to support SVE,
yet the M4 supports SME, which was introduced after SVE.
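Per-feature testing on arm64 could be sketched roughly as follows, assuming
a Linux kernel that exposes a "Features" line in /proc/cpuinfo. The function
names and the sample text are made up for illustration; a real
implementation would more likely query the hwcaps via getauxval().

```python
def parse_features(cpuinfo_text):
    """Return the feature set from the first 'Features' line, if any."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return set(value.split())
    return set()


def supports(features, *required):
    """True if every required feature is present; no level is implied."""
    return set(required) <= features


def read_features(path="/proc/cpuinfo"):
    """Read the feature set of the running system (Linux only)."""
    with open(path) as f:
        return parse_features(f.read())


# Hypothetical sample: a CPU that reports SME without SVE.
SAMPLE = "processor\t: 0\nFeatures\t: fp asimd aes sme\n"
sample_features = parse_features(SAMPLE)
print("sve:", supports(sample_features, "sve"))
print("sme:", supports(sample_features, "sme"))
```

The point is that each feature is queried on its own; the presence of a
newer extension says nothing about an older one.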
In any case, I guess this is a discussion that should also include
debian-science and the debhelper maintainers. I wasn't aware of most of
the stuff you shared, and I assume that many others on -science won't
yet be aware of any of the solutions discussed here.
I'm not sure everyone will be able to contribute solutions, but at least
we should be able to collect more use cases that could inform some of
the design decisions: are we forgetting something, are we overlooking
quirks, and so on.
At some point, this could probably also be a DEP, I guess?
Best,
Christian