
Re: priority score of libflame (LAPACK alternative)?



Some updates.

Previously I ran the experiments with LD_PRELOAD=...libflame.so to
override the lapack symbols. When libflame.so is actually used as the
liblapack.so provider, programs may fail because some symbols are
missing.

Now the missing symbols are taken from liblapack_pic.a (liblapack-dev)
when building the shared lapack.so for libflame. A package built from
the master branch should therefore be usable as a general liblapack.so
provider.
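
A quick way to check whether a given liblapack.so provider resolves the
symbols a program needs is sketched below. This is only an illustration,
not part of the packaging: the helper name missing_symbols and the
library path in the usage comment are hypothetical, and the two symbol
names are simply the ones mentioned in this thread.

```python
import ctypes


def missing_symbols(lib_path, names):
    """Return the names from `names` that cannot be resolved via lib_path.

    Note: the lookup also searches the library's own dependencies, so a
    symbol counts as present if the library or anything it links to
    exports it.
    """
    lib = ctypes.CDLL(lib_path)  # raises OSError if loading fails
    missing = []
    for name in names:
        try:
            getattr(lib, name)  # ctypes looks the symbol up on access
        except AttributeError:
            missing.append(name)
    return missing


# Hypothetical usage; adjust the path to the provider under test:
# print(missing_symbols("/usr/lib/x86_64-linux-gnu/liblapack.so.3",
#                       ["sgesvd_", "cgesvdq_"]))
```

An empty list means every requested symbol was found; it says nothing
about whether the routines compute correct results.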

Programs no longer fail with errors such as "undefined symbols", and
the speed gain is retained. However, I noted that 5 of the numpy unit
tests fail with the libflame::lapack.so backend.
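
To put a number on the speed gain, the approximate sgesvd_ timings from
the quoted message below can be turned into ratios. This is plain
arithmetic on those earlier figures (all of them marked "~"), not a new
measurement; the dictionary and function names are made up for the
example.

```python
# Approximate timings in ms, copied from the quoted results:
# BLAS -> (LAPACK=netlib, LAPACK=libflame)
timings_ms = {
    "netlib": (820, 700),
    "atlas": (600, 490),
    "blis": (560, 415),
}


def speedup(baseline_ms, new_ms):
    """Ratio > 1 means the new configuration is faster."""
    return baseline_ms / new_ms


for blas, (netlib_lapack, flame_lapack) in timings_ms.items():
    print(f"BLAS={blas}: {speedup(netlib_lapack, flame_lapack):.2f}x")
# BLAS=netlib: 1.17x
# BLAS=atlas: 1.22x
# BLAS=blis: 1.35x
```

The same ratio (560 / 415, about 1.35x) applies to BLAS=openblas when
comparing LAPACK=openblas against LAPACK=libflame.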

On Mon, Dec 02, 2019 at 01:13:04PM +0000, Mo Zhou wrote:
> Hi science team,
> 
> As usual, I'd like to inform the team before registering a new lapack
> implementation into our blas/lapack ecosystem.  The new implementation
> is called "libflame", from the upstream of BLIS:
>   https://github.com/flame/libflame
> Similar to BLIS, it is a lapack-like object-based implementation, and
> provides a compatibility layer to the traditional (fortran) lapack
> called "lapack2flame".
> 
> I noticed this library because it is part of AMD's revived math library
> stack (to some extent the MKL counterpart?): https://developer.amd.com/amd-aocl/
> AMD has also upstreamed their patches to BLIS, which is a healthy sign.
> 
> My preliminary tests of single precision SVD factorization demonstrate a
> significant improvement over the netlib lapack and the openblas
> lapack[1] implementations. Please find the results in the last part of
> this mail.
> 
> Given these observations, I propose to
> 
>   * set the priority value of `libflame` (as a liblapack.so.3 provider)
>     to 80,
> 
> because 1) I'm still not sure whether the libflame compat layer provides
> the complete ABI; 2) we have not tested it sufficiently; 3) 80 is close
> to the BLIS priority values (for libblas.so.3).
> 
> ---
> 
> My test code can be found in the MKL packaging:
> https://salsa.debian.org/science-team/intel-mkl/blob/master/debian/tests/test-gesvd.cc
> Preliminary packaging can be found here:
> https://salsa.debian.org/science-team/libflame
> Switching alternatives has been made easy by my tiny util:
> https://tracker.debian.org/pkg/rover
> 
> Results on Xeon Gold 6126 (sgesvd_, 512x512 matrix size):
> 
>   BLAS=openblas LAPACK=openblas -> ~560ms  # pthread
>   BLAS=atlas    LAPACK=atlas    ->  N/A    # cgesvdq_ symbol not found
>   
>   BLAS=netlib   LAPACK=netlib   -> ~820ms
>   BLAS=atlas    LAPACK=netlib   -> ~600ms
>   BLAS=blis     LAPACK=netlib   -> ~560ms  # BLIS_NUM_THREADS=1
>   
>   BLAS=netlib   LAPACK=libflame -> ~700ms
>   BLAS=atlas    LAPACK=libflame -> ~490ms
>   BLAS=blis     LAPACK=libflame -> ~415ms  # BLIS_NUM_THREADS=1
>   BLAS=openblas LAPACK=libflame -> ~415ms
> 
> I didn't compare it with MKL (non-free). That's unnecessary.
> 
> [1] openblas lapack $\approx$ netlib lapack, except for a few routines.
> 

