
Re: shapeit4 and AVX2





On Mon, 9 Nov 2020 at 17:21, Étienne Mollier <etienne.mollier@mailoo.org> wrote:
Howdy,

I'm filing a wishlist item in the bug tracker, so that the
discussion does not disappear inside mail archives.  I gave a
try to shapeit4 autopkgtest suite with and without FMA & AVX2
support, but it had a run time of 1m25s in both cases on my
machine (Ryzen 5 3600 w/ 6 cores).  It is quite possible I
neglected some other bottlenecks, but the assembler did embed
AVX2 instructions when I checked the build result.  Out of
curiosity, does anyone have figures on the performance gain for
that software when the extensions are available?

Michael R. Crusoe, on 2020-11-05 21:26:30 +0100:
> As documented at
> https://wiki.debian.org/SIMDEverywhere

shapeit4 provides a dedicated code path for "-mfma -mavx2" build
options, and another one for generic builds.  Is it still worth
using SIMDe in this particular situation?  The "use case"
paragraph of the wiki page seems to suggest it is not strictly
needed here.

Since shapeit4 already has a fallback route that doesn't use SIMD, implementing our own is not necessary. However, compiling the FMA+AVX2 path using SIMDe on non-x86 archs may result in a speedup for them.
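
To make the two-path layout concrete, here is a minimal sketch of the general pattern, not shapeit4's actual code. The dot-product kernel and the names dot, dot_generic and paths_agree are made up for illustration. The intrinsics path is only compiled when the compiler defines __AVX2__ and __FMA__ (i.e. with "-mfma -mavx2"); otherwise the scalar fallback is used. With SIMDe, the idea would be to include its AVX2 and FMA headers instead of <immintrin.h> so that the same intrinsics path can also be built on non-x86 archs; that part is left out here so the sketch stays self-contained.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

#if defined(__AVX2__) && defined(__FMA__)
#include <immintrin.h>
#endif

// Generic path (hypothetical kernel): plain scalar loop, valid everywhere.
float dot_generic(const float* a, const float* b, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) sum += a[i] * b[i];
    return sum;
}

// Dispatching entry point: picks the FMA+AVX2 path at compile time when
// the build flags enable it, and the generic path otherwise.
float dot(const float* a, const float* b, std::size_t n) {
#if defined(__AVX2__) && defined(__FMA__)
    __m256 acc = _mm256_setzero_ps();
    std::size_t i = 0;
    // Process 8 floats per iteration with a fused multiply-add.
    for (; i + 8 <= n; i += 8) {
        acc = _mm256_fmadd_ps(_mm256_loadu_ps(a + i),
                              _mm256_loadu_ps(b + i), acc);
    }
    // Horizontal sum of the 8 accumulator lanes.
    float lanes[8];
    _mm256_storeu_ps(lanes, acc);
    float sum = 0.0f;
    for (float v : lanes) sum += v;
    // Scalar tail for the remaining n % 8 elements.
    for (; i < n; ++i) sum += a[i] * b[i];
    return sum;
#else
    return dot_generic(a, b, n);
#endif
}

// Regression check: both paths should agree within a tolerance, since
// FMA and a different summation order can change rounding slightly.
bool paths_agree(const std::vector<float>& a, const std::vector<float>& b,
                 float tol) {
    assert(a.size() == b.size());
    float fast = dot(a.data(), b.data(), a.size());
    float ref = dot_generic(a.data(), b.data(), a.size());
    return std::fabs(fast - ref) <= tol;
}
```

Building this once with and once without "-mfma -mavx2" and calling paths_agree on the same inputs is a cheap way to check for "lack of regression" in the results before looking at run times.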

It would be best to get a bigger training dataset to confirm the benefit, or at least the lack of regression :-)

If a performance benefit is observed, it might be interesting to see whether the AVX-only and "lower" SIMD levels on x86 also see a speedup.
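
For comparing builds, a small timing helper along these lines may be enough. This is a sketch under assumptions: best_seconds is a made-up name, and the workload passed in stands for whatever kernel or run is being measured. It keeps the best of several repeats, which tends to reduce noise from other processes; on a dataset as small as the current autopkgtest one, any SIMD gain may still be hidden by I/O or other bottlenecks.

```cpp
#include <chrono>

// Runs `work` `repeats` times and returns the shortest wall-clock
// duration in seconds (or -1.0 if repeats <= 0).
template <typename F>
double best_seconds(F&& work, int repeats) {
    double best = -1.0;
    for (int r = 0; r < repeats; ++r) {
        auto t0 = std::chrono::steady_clock::now();
        work();
        auto t1 = std::chrono::steady_clock::now();
        double s = std::chrono::duration<double>(t1 - t0).count();
        if (best < 0.0 || s < best) best = s;
    }
    return best;
}
```

The same binary-level comparison can then be repeated per build variant (generic, AVX-only, FMA+AVX2) on the same input.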
