Re: Cross-distro Call for BLAS64/LAPACK64 Convention Alignment
- To: Mo Zhou <lumin@debian.org>, Felix Yan <felixonmars@archlinux.org>, Susi Lehtola <jussilehtola@fedoraproject.org>, Sébastien Villemot <sebastien@debian.org>, Benda Xu <heroxbd@gentoo.org>, Milan Bouchet-Valat <nalimilan@club.fr>, François Bissey <frp.bissey@gmail.com>, openblas-owner@fedoraproject.org, atlas-owner@fedoraproject.org, blas-owner@fedoraproject.org, blis-owner@fedoraproject.org
- Cc: debian-science@lists.debian.org
- Subject: Re: Cross-distro Call for BLAS64/LAPACK64 Convention Alignment
- From: Susi Lehtola <jussilehtola@fedoraproject.org>
- Date: Sat, 3 Aug 2019 13:00:05 +0300
- Message-id: <ec28cef1-a425-cff8-95df-4ff32fc8c4b0@fedoraproject.org>
- In-reply-to: <b0faa822655d68f0d50cb85e0392b59b@debian.org>
- References: <b0faa822655d68f0d50cb85e0392b59b@debian.org>
Dear all,
I'm adding
- openblas-owner@fedoraproject.org
- atlas-owner@fedoraproject.org
- blis-owner@fedoraproject.org and
- blas-owner@fedoraproject.org
to the list of recipients, as these packages and their maintainers are
also affected by the proposal.
On 8/3/19 8:55 AM, Mo Zhou wrote:
Dear BLAS/LAPACK maintainers[0],
(Please invite more distribution maintainers to join the discussion
if you know the right contact, thanks. Please keep the debian-science
public mailing list in CC to keep the discussion public.)
Modern BLAS/LAPACK implementations, such as OpenBLAS, BLIS, etc.,
support compile-time options to enable the array-64bit-indexing[1]
feature. BLAS/LAPACK libraries with this feature enabled are not
ABI-compatible with the 32bit-indexing versions, and there is no
standard for their SONAMEs and ELF symbol names. Please allow
me to omit the explanation of why we need to reach a consensus
on a BLAS64/LAPACK64 convention and why scientific computing users
need these libraries, as I think all the participants have enough
background on this problem.
In this mail I'll simply call these 64bit-indexing implementations
BLAS64/LAPACK64. I'm trying to reach a consensus with major
distributions on the convention for BLAS64/LAPACK64 libraries,
including SONAME and ELF symbol names. @nalimilan already
raised this on GitHub during Julia's debate about their
libopenblas64_.so.
------ First let's look at the SONAME ---------
32-bit indexing BLAS/LAPACK libraries are not ABI-compatible
with the 64-bit indexing BLAS/LAPACK libraries.
If we don't change the SONAME for the 64-bit ones,
everyone, including developers and users, will be
confused about which configuration a given library provides.
Fedora's .spec compiles OpenBLAS more than once, producing
a series of libraries (which share the same functionality)
with different index lengths and different threading models.
In particular, they named the default 64bit-indexing variant
of OpenBLAS "libopenblas64.so".
Last year I raised a similar discussion on debian-devel
about the convention. My first proposal was names
like "libblas-ilp64.so" and "libopenblas-ilp64.so",
following MKL/Fortran's convention. But opinion
eventually converged on "libopenblas64.so", "libblas64.so",
etc.
Hereby I propose:
* We (cross-distro) append the "64" string to the SONAME
    of a 64bit-indexing BLAS/LAPACK variant in order to
    differentiate them from the currently existing
    32bit-indexing ones.
Indeed, this is what we are doing in Fedora at the moment.
----- Then let's look at the symbol names ------------
Different 32-bit BLAS/LAPACK libraries are expected to
be compatible with each other at both the API and ABI level.
Based on this prior knowledge Debian developed a
runtime-switching mechanism[3] to switch the BLAS/LAPACK
library that is actually used, which has been working
well for years. Recently we've even introduced
intel-mkl (non-free) into this mechanism.
We discussed this briefly some time ago, but not much has happened so 
far. I think it could be done, but it would take quite a lot of work; 
the BLAS and the LAPACK libraries are tied together since several 
optimized BLAS libraries like OpenBLAS also provide some further 
optimized LAPACK routines which are probably not cross-compatible.
Also, Debian's choice is not made at runtime, as it is a system-wide choice
based on alternatives. A better method would be environment modules, 
where the choice can be made by the user on a session level; this is 
what we are using for MPI libraries, e.g. OpenMPI vs MPICH.
For most BLAS/LAPACK implementations that support
64bit-indexing, the symbol names are not mangled,
which means both libblas.so and libblas64.so
provide symbols such as "sgemm_" (the Fortran
subroutine for single-precision general
matrix-matrix multiplication), but with different
index types (int32_t vs. int64_t).
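A small sketch may help show why identical symbol names with different
index types are not interchangeable. Fortran BLAS routines receive their
integer arguments by reference, so a 64bit-indexing callee reads 8 bytes
where a 32bit-indexing caller only wrote 4. The snippet below is a
hypothetical illustration in Python, not code from any BLAS library; the
memory layout (two adjacent little-endian int32 values, here called m
and n) is an assumption made for the example.

```python
import struct

# Assumption for illustration: a caller built for 32-bit indexing has
# two adjacent int32 arguments in memory, m = 3 and n = 7.
caller_memory = struct.pack("<ii", 3, 7)

# A 32bit-indexing "sgemm_" reads 4 bytes for m and sees the intended value.
m_as_int32 = struct.unpack_from("<i", caller_memory, 0)[0]

# A 64bit-indexing "sgemm_" reads 8 bytes from the same address, so the
# neighbouring value ends up in the upper half of m.
m_as_int64 = struct.unpack_from("<q", caller_memory, 0)[0]

print(m_as_int32)  # 3
print(m_as_int64)  # 30064771075, i.e. 3 + 7 * 2**32
```

Nothing at link time catches this, because both libraries export the
same name; the SONAME is the only place where the difference is visible.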
Hereby I propose:
* We (cross-distro) don't mangle the symbols in
   a 64bit-indexing BLAS/LAPACK library, to avoid
   surprising scientific computing
   users.
Again, this is also what we are doing at the moment, at least for
both OpenBLAS and the reference BLAS/LAPACK.
The following special case explains why I propose so:
----- And there are special cases : Julia -------
Julia re-SONAME'ed their vendored openblas into
libopenblas64_.so, and mangled all the exported
symbols by suffixing "64_".
To some extent this choice was made in order to
avoid symbol clashes with system libraries if both
libblas.so and libblas64.so are indirectly
dynamically linked into the same program.
This choice has a downside: Julia's pre-built binaries
are all linked against libopenblas64_.so (e.g. Arpack.jl),
so distribution packages are no longer useful
to Julia's ecosystem.
Julia is indeed a special case, because they want to be able to use 
*both* the 32-bit and the 64-bit versions of BLAS/LAPACK from inside the 
same binary. This, of course, requires symbol mangling. In Fedora, we 
supply special versions of OpenBLAS just for Julia.
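The effect of the "64_" suffix on symbol clashes can be sketched as
follows. This is only an illustration: the helper name mangle() and the
two-symbol sets are hypothetical and not taken from OpenBLAS or Julia;
only the suffix "64_" and the symbol "sgemm_" come from the discussion
above.

```python
# Suffix that, per the mail above, Julia appends to every exported symbol.
JULIA_SUFFIX = "64_"


def mangle(symbol, suffix=JULIA_SUFFIX):
    """Hypothetical helper: return the suffixed form of a BLAS symbol."""
    return symbol + suffix


# Tiny stand-ins for the export tables of the libraries involved.
blas32_symbols = {"sgemm_", "dgemm_"}
blas64_unmangled = {"sgemm_", "dgemm_"}
blas64_mangled = {mangle(s) for s in blas64_unmangled}

# Names exported by both libraries would clash inside one process.
clash_without_mangling = blas32_symbols & blas64_unmangled
clash_with_mangling = blas32_symbols & blas64_mangled

print(mangle("sgemm_"))                # sgemm_64_
print(sorted(clash_without_mangling))  # ['dgemm_', 'sgemm_']
print(sorted(clash_with_mangling))     # []
```

With the suffix, the two symbol sets are disjoint, which is what lets
one binary use both index widths; the price is that callers must be
built against the suffixed names.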
What's your opinion? If we make a good start,
I'll later try to contact some upstreams, such as Intel's MKL team.
I would like to raise a further issue: parallel / threaded libraries. 
For instance, ATLAS and OpenBLAS can be compiled as a sequential
library, as a parallel library based on pthreads, or as a parallel
library based on OpenMP.
OpenMP is probably the most used parallelism method in scientific
software, since it is extremely easy to work with. In these cases, it is 
important to use an OpenMP BLAS/LAPACK library so that new threads don't 
get created inside regions where the application code is already running 
in parallel. However, the pthreads LAPACK/BLAS libraries may be faster 
in applications that are not OpenMP parallel.
Again, the point here is not to mangle symbol names, but different
sonames should be used. For OpenBLAS in Fedora, we have the following 
variants:
- libopenblas: sequential version
- libopenblasp: pthreads version
- libopenblaso: OpenMP version
... with 3 variants of each corresponding to the index width, e.g.
libopenblas is 32-bit, libopenblas64 is 64-bit, and libopenblas64_ is
64-bit with symbol mangling for Julia, allowing it to be linked in
combination with the 32-bit libopenblas.
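The naming scheme above can be written down as a small sketch. The
function openblas_library_name() and its argument values are
hypothetical names chosen for illustration. The sequential names
(libopenblas, libopenblas64, libopenblas64_) and the threading letters
are taken from the list above; that the threading letter comes before
the index suffix in the threaded variants (e.g. libopenblasp64) is an
assumption of this sketch.

```python
# Threading letters from the list above.
THREADING_LETTER = {"sequential": "", "pthreads": "p", "openmp": "o"}

# Index-width suffixes: 32-bit, 64-bit, and 64-bit with mangled symbols.
INDEX_SUFFIX = {"32": "", "64": "64", "64-mangled": "64_"}


def openblas_library_name(threading, indexing):
    """Hypothetical helper: compose a library name from the two choices.

    Assumption: the threading letter precedes the index suffix.
    """
    return "libopenblas" + THREADING_LETTER[threading] + INDEX_SUFFIX[indexing]


# Enumerate all 3 x 3 combinations.
names = [
    openblas_library_name(t, i)
    for t in THREADING_LETTER
    for i in INDEX_SUFFIX
]

for name in names:
    print(name)
```

The point of the scheme is that every ABI-relevant choice (index width,
symbol mangling) and every runtime-relevant choice (threading model)
shows up in the library name, while the symbol names stay untouched
except in the Julia-specific variant.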
--
Susi Lehtola
Fedora Project Contributor
jussilehtola@fedoraproject.org