Towards lapack / lapack64 packaging
Hi science team,
I'm trying to add multi-flavor support to the openblas
package, as a part of the ongoing BLAS64 + LAPACK64 work.
However, there are some problems that need to be discussed.
This email covers two of them:
(1) build problem with OpenBLAS's liblapack64.so
(2) confirming details for our standard of BLAS/LAPACK virtual packages
To any other developers: If you maintain a (recursive)
reverse-dependency of libblas.so or liblapack.so, please
at least read point 1 in section (2),
which warns about a performance pitfall.
(1) build problem with OpenBLAS's liblapack64.so
------------------------------------------------
For those who are not sure what the "64" suffix in BLAS64
and LAPACK64 means:
BLAS and LAPACK are very important numerical
linear algebra libraries that operate on contiguous
numerical arrays.
libblas.so and liblapack.so provide functions
with 32-bit array indexing, e.g.
float cblas_sasum(const int N, const float *X, const int incX);
which calculates
sum_{i=1}^N abs(x_i)
However, "int" is 32 bits wide on amd64.
This simply doesn't work with arrays containing
more than 2^31 - 1 elements. Hence we need a 64-bit
indexing variant, for example:
float cblas_sasum(const int64_t N, const float *X, const int64_t incX);
Note, as pointed out by Ben a long time ago, the
correct type for a pointer offset should be size_t
or ptrdiff_t, IIRC.
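To make the indexing limit concrete, here is a minimal C sketch.
It is not the OpenBLAS code: ref_sasum64 and fits_blas_int32 are
hypothetical names used only for illustration. The first is a plain
reference loop with a 64-bit count and stride, the second checks
whether an element count still fits the 32-bit interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference routine (not the OpenBLAS implementation):
 * sum of absolute values with a 64-bit element count and stride. */
float ref_sasum64(int64_t n, const float *x, int64_t incx)
{
    float sum = 0.0f;

    /* Same early exit as the reference BLAS: nothing to sum. */
    if (n <= 0 || incx <= 0)
        return sum;
    assert(x != NULL);

    for (int64_t i = 0; i < n; i++) {
        float v = x[i * incx];
        sum += (v < 0.0f) ? -v : v;
    }
    return sum;
}

/* Returns 1 when an element count can be passed through a
 * 32-bit "int" argument, 0 otherwise. */
int fits_blas_int32(int64_t n)
{
    return n >= 0 && n <= INT32_MAX;
}
```

An array with 2^31 elements already fails the fits_blas_int32
check, which is exactly the case the 64-bit variants are meant for.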
The 64-bit variants are needed by some scientific
computing users, and by packages including the Julia language.
Sébastien pointed out that the `liblapack64.so` library
in my implementation[1] mixes 32-bit indexing code
and 64-bit indexing code. liblapack64.so is
built from compiled objects from two sources:
(1) bin:liblapack-pic (32-bit indexing static lib)
(2) openblas's optimized lapack subset
When I turn on the INTERFACE64=1 flag in order to
build a 64-bit variant, the linker just mixes
symbols from the 32-bit indexing bin:liblapack-pic
with symbols from the 64-bit indexing openblas code,
yielding a quite problematic liblapack64.so.
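A rough illustration of why such a mix is problematic, assuming
the usual Fortran convention of passing integer arguments by
reference: a routine built with 64-bit integers reads 8 bytes from
an address where a 32-bit caller stored only 4. The sketch below
simulates that safely with memcpy. Both function names are
hypothetical and nothing here is taken from lapack or openblas.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callee built with 64-bit integers:
 * it reads 8 bytes from the address it receives. */
int64_t callee_reads_int64(const void *n_by_reference)
{
    int64_t n;
    memcpy(&n, n_by_reference, sizeof n);
    return n;
}

/* Hypothetical 32-bit caller: N occupies 4 bytes, and the
 * following 4 bytes belong to whatever sits next to it. */
int64_t call_with_int32(int32_t n, int32_t neighbour)
{
    int32_t slot[2] = { n, neighbour };

    assert(sizeof slot == sizeof(int64_t));
    return callee_reads_int64(slot);
}
```

As soon as the neighbouring bytes are non-zero, the callee sees a
size different from the one the caller meant, so a library that
mixes both kinds of objects cannot be trusted.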
Sébastien provided some possible solutions:
1. build a 64-bit indexing variant of src:lapack
2. provide a liblapack64-pic (Sébastien prefers this)
Yes, solution 2 poses very little workload because
we just need to rebuild lapack with the fortran flag "-i8"
(spelled "-fdefault-integer-8" for gfortran).
However, I'm thinking about a third solution:
3. disentangle src:lapack and src:openblas and just
use src:openblas's embedded copy of src:lapack.
(currently that embedded copy is removed from the debian
source)
This (maybe) poses even less workload for me than solution 2.
[1] https://salsa.debian.org/science-team/openblas/tree/lumin/
(2) confirming details for our standard of BLAS/LAPACK virtual packages
-----------------------------------------------------------------------
Disambiguation is very important before starting this section.
Everything will definitely turn into a mess if I don't do so.
In this section, I'll use the following notations:
* Uppercased "BLAS" means the standard BLAS API and ABI,
fortran-based. Debian's virtual packages libblas.so and
libblas.so.3 provide BLAS API and ABI. A typical BLAS
symbol looks like "sasum_" (suffixed by an underscore)
* Uppercased "CBLAS" means the C version of the standard
BLAS API and ABI. A typical CBLAS symbol looks like
"cblas_sasum". (prefixed by "cblas_") The CBLAS ABI
has been squashed into libblas.so{,.3} . It's not
recommended to link against libcblas.so if you
find one in the Atlas package, which split
the BLAS and CBLAS ABI into different shared objs.
* Uppercased "LAPACK" means the standard LAPACK API and ABI,
also fortran-based. Debian's liblapack.so and liblapack.so.3
provide the ABI.
* Uppercased "LAPACKE" means the C version of the LAPACK
API and ABI. On Debian it is shipped by bin:liblapacke,
instead of being squashed into liblapack.so (sounds a bit messy)
* Uppercased BLAS64, CBLAS64, LAPACK64, LAPACKE64 are
the corresponding 64-bit indexing variants.
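To illustrate the difference between the two kinds of symbols in
the notation above, here is a minimal C sketch. It assumes
gfortran-style conventions (trailing underscore, every argument
passed by reference). The demo_ names are hypothetical stand-ins
for "sasum_" and "cblas_sasum", chosen so they cannot clash with a
real BLAS library.

```c
#include <assert.h>
#include <stddef.h>

/* Fortran-style entry point (hypothetical, mimicking "sasum_"):
 * trailing underscore, all arguments passed by reference. */
float demo_sasum_(const int *n, const float *x, const int *incx)
{
    float sum = 0.0f;

    assert(n != NULL && incx != NULL);
    if (*n <= 0 || *incx <= 0)
        return sum;

    const size_t step = (size_t)*incx;
    for (size_t i = 0; i < (size_t)*n; i++) {
        float v = x[i * step];
        sum += (v < 0.0f) ? -v : v;
    }
    return sum;
}

/* C-style entry point (hypothetical, mimicking "cblas_sasum"):
 * scalars by value, forwarding to the Fortran-style symbol. */
float demo_cblas_sasum(int n, const float *x, int incx)
{
    return demo_sasum_(&n, x, &incx);
}
```

The 64-bit variants keep the same shape and only widen the integer
type of n and incx.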
It's important to differentiate fortran stuff from C stuff
because fortran stores arrays in column-major order, while C
uses row-major order. Now let me point out some messy stuff:
1. BLAS/CBLAS packages look relatively tidy, except Atlas,
which split CBLAS into a separate libcblas.so .
That's a pitfall and numpy once fell into it: #913567
Debian's Atlas is terribly slow due to its ISA baseline.[2]
Should we squash Atlas's libcblas.so back into its
libblas.so ? [3] All other alternative libraries do so.
2. LAPACK and LAPACKE are well separated into different
shared libraries. Sometimes LAPACKE is simply not
built. LAPACK has been registered in the alternatives
system: "liblapack.so", "liblapack.so.3".
Can we confirm that it's fine to provide only LAPACK
via liblapack.so and not register LAPACKE in the
alternatives system?
If most reverse dependencies only require the fortran
ABI (LAPACK) instead of the C ABI (LAPACKE), then I
think it's fine to keep Debian's LAPACKE packages
as they are for now.
[2] That's fine. We have well-optimized implementations
as alternatives: src:blis, src:intel-mkl, src:openblas
[3] That said, I think I'm not going to do it because
Atlas lost my interest as it is not fast enough and
not easy to tune.