RFC: threading-aware virtual BLAS/LAPACK?
Through some previous discussions I realized that the threading
implementation of the BLAS can sometimes be troublesome. For example,
we can sometimes observe severe performance regressions from a pthread
program + BLAS with gomp, or even severe calculation errors from a gomp
program + BLAS with iomp (aka llvm openmp).
These threading disasters will eventually propagate to our end users,
and may harm their scientific computing experience.
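A rough way to check whether a process has ended up with more than one
OpenMP runtime is to look at the shared objects mapped into it. The
sketch below is only an illustration: it assumes Linux
(/proc/self/maps), and the function names and the runtime table are
made up for this example, not part of any existing tool.

```python
import os

# Illustrative table (an assumption of this sketch): library stem -> runtime.
KNOWN_RUNTIMES = {
    "libgomp": "GNU OpenMP (gomp)",
    "libiomp5": "Intel OpenMP (iomp)",
    "libomp": "LLVM OpenMP (omp)",
}


def detect_runtimes(paths):
    """Return the set of OpenMP runtimes named by the given library paths."""
    found = set()
    for path in paths:
        # "libgomp.so.1" -> "libgomp"
        stem = os.path.basename(path).split(".so")[0]
        if stem in KNOWN_RUNTIMES:
            found.add(KNOWN_RUNTIMES[stem])
    return found


def loaded_libraries(maps_file="/proc/self/maps"):
    """Return the file paths mapped into the current process (Linux only)."""
    libs = set()
    try:
        with open(maps_file) as fh:
            for line in fh:
                # Fields: address perms offset dev inode [pathname]
                fields = line.split(None, 5)
                if len(fields) == 6 and fields[5].startswith("/"):
                    libs.add(fields[5].strip())
    except OSError:
        # No /proc on this platform; report nothing rather than fail.
        pass
    return libs


if __name__ == "__main__":
    runtimes = detect_runtimes(loaded_libraries())
    if len(runtimes) > 1:
        print("warning: multiple OpenMP runtimes loaded:", sorted(runtimes))
    else:
        print("OpenMP runtimes loaded:", sorted(runtimes))
```

Run after importing the numerical libraries of interest, this only
reports that several runtimes are present; it cannot tell whether the
combination actually misbehaves.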
Recall what Fedora does for the BLAS libraries: they don't provide any
virtual package at all. OpenBLAS packages with different threading
support are given different sonames:
openblas + pthread: libopenblasp.so.*
openblas + openmp: libopenblaso.so.* (IIRC)
openblas + serial: libopenblas.so.* (IIRC)
Although this makes implementations not switchable at runtime, at
least the users won't have to struggle with threads.
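With distinct sonames it is also possible to double-check which
threading model a given library was built with. OpenBLAS exports
openblas_get_parallel(), which returns 0 (sequential), 1 (pthread) or
2 (OpenMP). The sketch below calls it through ctypes; the helper names
are invented for this example, and the ".0" soname version in the usage
line is an assumption.

```python
import ctypes

# Return values of openblas_get_parallel() as defined in OpenBLAS's cblas.h.
PARALLEL_MODELS = {0: "serial", 1: "pthread", 2: "openmp"}


def describe_parallel(code):
    """Translate an openblas_get_parallel() return code to a short name."""
    return PARALLEL_MODELS.get(code, "unknown")


def query_threading_model(soname):
    """Load the library `soname` and ask it for its threading model.

    Returns None if the library cannot be loaded or does not export
    openblas_get_parallel (for example, if it is not OpenBLAS).
    """
    try:
        lib = ctypes.CDLL(soname)
        func = lib.openblas_get_parallel
    except (OSError, AttributeError):
        return None
    func.restype = ctypes.c_int
    func.argtypes = []
    return describe_parallel(func())


if __name__ == "__main__":
    # The soname version suffix here is assumed, adjust for your system.
    print(query_threading_model("libopenblasp.so.0"))
```

On a system without that library the call simply returns None, so the
check is safe to run anywhere.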
I wrote the alternatives mechanism for Gentoo's blas/lapack, which
resembles Debian's. However, Gentoo's package management system
supports a "USE flag" feature which allows the user to set a global
threading implementation for the whole system.
------------- what can we do
Maybe we can provide some more virtual shared objects such as
libblasp.so, which has candidates such as openblas-pthread and
In that case, if a Debian maintainer intentionally chooses to link
against a pthread BLAS (libblasp) to avoid issues of an openmp BLAS
(libblaso), we can help users avoid threading issues even when they
don't read any documentation (actually I think 99% of the users won't
read the doc, or won't have enough background to understand the
threading issue).
We can dig deeper in this direction if it turns out to be a sensible
direction for improvement.
This is a part of my GSoC2020 project, although this topic is not on the