Re: Altivec in baseline for ppc64?
On Tue, Jul 13, 2021 at 7:21 PM Sébastien Villemot <firstname.lastname@example.org> wrote:
> Hi Mathieu,
> On Tuesday 13 July 2021 at 18:56 +0200, Mathieu Malaterre wrote:
> > On Tue, Jul 13, 2021 at 2:04 PM Sébastien Villemot <email@example.com> wrote:
> > >
> > > The wiki page that synthesizes architecture specificities indicates
> > > that Altivec is included in the baseline for the ppc64 port:
> > > https://wiki.debian.org/ArchitectureSpecificsMemo#ppc64
> > >
> > > However my understanding is that this port supports any powerpc64 CPU,
> > > including some that don’t have Altivec (e.g. POWER4 or POWER5). This is
> > > also what the main wiki page for PPC64 says:
> > > https://wiki.debian.org/PPC64
> > >
> > > Can someone please clarify the situation?
> > >
> > > (I’m asking because I’m the maintainer of the openblas package, and
> > > knowing whether Altivec is available or not, and more generally what is
> > > in the baseline, is essential for proper packaging).
> > I do not believe that you can do much as a packager. You cannot assume
> > anything on the target arch. You need to do the same thing as ffmpeg
> > is doing for avx2/sse4 on amd64, you need to do runtime detection. So
> > unless upstream is doing something very clever you cannot compile blas
> > using any of the fancy altivec instructions :(
> > The man page for ld.so mentions something about optimized libraries
> > (search for "/usr/lib/sse2/"), but this is currently not in use in
> > Debian (AFAIK).
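For anyone who wants to see what such runtime detection boils down to
on Linux/powerpc, here is a minimal sketch in Python. It reads the ELF
auxiliary vector from /proc/self/auxv and tests the Altivec bit of
AT_HWCAP. The constants are the ones from <elf.h> and the kernel's
asm/cputable.h; the function names are mine, and this is not how
ffmpeg or OpenBLAS actually implement their detection.

```python
import platform
import struct
import sys

AT_HWCAP = 16                          # auxv key, from <elf.h>
PPC_FEATURE_HAS_ALTIVEC = 0x10000000   # from asm/cputable.h (powerpc only)


def parse_auxv(raw, word_size=8, byteorder="big"):
    """Decode a raw auxiliary vector into a {key: value} dict."""
    fmt = (">" if byteorder == "big" else "<") + ("QQ" if word_size == 8 else "II")
    pair = struct.calcsize(fmt)
    usable = raw[: len(raw) - len(raw) % pair]  # drop any trailing partial entry
    entries = {}
    for key, value in struct.iter_unpack(fmt, usable):
        if key == 0:  # AT_NULL terminates the vector
            break
        entries[key] = value
    return entries


def has_altivec(hwcap):
    """True if the Altivec bit is set in a powerpc AT_HWCAP value."""
    return bool(hwcap & PPC_FEATURE_HAS_ALTIVEC)


def local_hwcap():
    """AT_HWCAP of this process, or None if not on Linux/powerpc."""
    if not platform.machine().startswith("ppc"):
        return None  # the bit has a different meaning on other architectures
    try:
        with open("/proc/self/auxv", "rb") as f:
            raw = f.read()
    except OSError:
        return None
    word_size = struct.calcsize("P")  # pointer size of this interpreter
    return parse_auxv(raw, word_size, sys.byteorder).get(AT_HWCAP)


if __name__ == "__main__":
    hwcap = local_hwcap()
    if hwcap is None:
        print("not a Linux/powerpc machine, nothing to report")
    else:
        print("Altivec available:", has_altivec(hwcap))
```

Running this on a POWER4/POWER5 box and on a G5 or POWER6+ box should
show the difference that matters for the baseline question.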
> Actually OpenBLAS has its own runtime detection mechanism, which is
> used to select the best linear algebra kernel for the current CPU
> (those kernels are mainly written in assembly, and take advantage of
> available ISA extensions). This mechanism is used on several archs,
> including ppc64el (so at runtime, OpenBLAS chooses between a POWER8 and
> a POWER9 kernel; there is even a POWER10 kernel already available).
> However, I cannot enable this mechanism on ppc64 and powerpc, because
> the runtime detection only works for POWER6 and above, and my
> understanding is that for these two ports the baseline is lower. Hence
> on these two archs, only one kernel is included in the package binaries
> (currently POWER4 for ppc64 and PPCG4 for powerpc). For optimal
> performance, users should recompile OpenBLAS locally (as indicated in
> the package description and in README.Debian).
There are plenty of people on this mailing list who could test and
verify that. Is there a quick way to check that your openblas package
is compiled correctly for ppc32 and ppc64 (something like a verbose
mode)? Did you run any experiments on perotto.debian.net?
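To make the question concrete: OpenBLAS exports openblas_get_config()
and openblas_get_corename(), so something along these lines might
already serve as a poor man's verbose mode. This is only a sketch. It
assumes the shared library can be found under the name "openblas", and
the is_dynamic_arch() helper is my own guess at how to read the config
string (I believe it contains a DYNAMIC_ARCH token when the runtime
kernel selection is compiled in).

```python
import ctypes
import ctypes.util


def load_openblas():
    """Return a ctypes handle on the OpenBLAS shared library, or None."""
    name = ctypes.util.find_library("openblas")
    if name is None:
        return None
    try:
        return ctypes.CDLL(name)
    except OSError:
        return None


def openblas_info(lib):
    """Ask the loaded library for its build config and selected core."""
    lib.openblas_get_config.restype = ctypes.c_char_p
    lib.openblas_get_corename.restype = ctypes.c_char_p
    return {
        "config": lib.openblas_get_config().decode(),
        "core": lib.openblas_get_corename().decode(),
    }


def is_dynamic_arch(config):
    """Assumption: a DYNAMIC_ARCH token marks runtime kernel selection."""
    return "DYNAMIC_ARCH" in config.split()


if __name__ == "__main__":
    lib = load_openblas()
    if lib is None:
        print("OpenBLAS shared library not found")
    else:
        try:
            info = openblas_info(lib)
        except AttributeError:
            print("this OpenBLAS build does not export the query functions")
        else:
            print("config:", info["config"])
            print("core in use:", info["core"])
            print("runtime detection built in:", is_dynamic_arch(info["config"]))
```

If that works on the ppc64 and powerpc packages, people here could
simply run it and report which core name they get on their hardware.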
> I am however not sure that my current choices for the ppc64 and powerpc
> baselines are optimal, hence this thread.
> ⢀⣴⠾⠻⢶⣦⠀ Sébastien Villemot
> ⣾⠁⢠⠒⠀⣿⡁ Debian Developer
> ⢿⡄⠘⠷⠚⠋⠀ https://sebastien.villemot.name
> ⠈⠳⣄⠀⠀⠀⠀ https://www.debian.org