Hello all,
As an experiment I've added support for non-SSE2 CPUs to the libssw library at
https://salsa.debian.org/med-team/libssw/tree/simde using SIMDE, a header-only library that implements many SIMD C/C++ intrinsics either in plain C code or with the equivalent SIMD instructions of other processors (like NEON on arm64).
Upstream has been very responsive to my issues and pull requests, so this might be a nice way to improve portability of the software we package.
Specifically, I'm interested in seeing more of our packages available on the latest Raspberry Pi systems (arm64).
Two downsides of using the SIMDE library:
1) It only works with C/C++ compiler intrinsics (<emmintrin.h> and friends), not with raw assembly.
2) Switching between different SIMD levels (like using SSE fallbacks for an SSE2 operation) is done at compile time, not at run time.
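To make the second downside concrete, here is a minimal sketch of what compile-time selection means in practice. This is not SIMDE's actual code; the function name add4_i32 is made up for illustration. The preprocessor picks one implementation when the file is built, so the resulting binary contains only that one path, whatever CPU it later runs on.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: element-wise sum of four 32-bit integers.
 * Which branch is compiled depends only on the compiler flags
 * (for example -msse2), never on the CPU the binary runs on. */
void add4_i32(const int32_t a[4], const int32_t b[4], int32_t out[4])
{
    assert(a != NULL && b != NULL && out != NULL);
#if defined(__SSE2__)
    /* SSE2 path: one 128-bit add covers all four lanes. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi32(va, vb));
#else
    /* Portable path: unsigned arithmetic gives the same wrap-around
     * behaviour as the SIMD add without signed overflow. */
    for (int i = 0; i < 4; i++)
        out[i] = (int32_t)((uint32_t)a[i] + (uint32_t)b[i]);
#endif
}
```

Built with -msse2 (or on amd64, where SSE2 is the baseline) this uses the intrinsics; built for arm64 or for a plain i386 baseline it uses the loop.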
Questions for you all:
1) Is this a good idea?
2) Should we carry these patches if upstream doesn't accept them?
3) Any ideas about compiling each program with different -m{avx2,avx,sse4.2,sse4.1,ssse3,sse3,sse2,sse,mmx} settings and generating a simple wrapper that picks the right executable at run time?
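For question 3, a wrapper could look roughly like the sketch below. It is only one possible approach and rests on several assumptions: the variants are installed as "<base>-<level>" (a naming scheme I made up), there is a "<base>-generic" build for everything else, and the wrapper itself is built with GCC or clang, whose __builtin_cpu_supports() does the detection on x86. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* Return the best SIMD level the running CPU supports, best first.
 * Falls back to "generic" on non-x86 targets or very old CPUs. */
const char *best_simd_level(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))   return "avx2";
    if (__builtin_cpu_supports("avx"))    return "avx";
    if (__builtin_cpu_supports("sse4.2")) return "sse4.2";
    if (__builtin_cpu_supports("sse4.1")) return "sse4.1";
    if (__builtin_cpu_supports("ssse3"))  return "ssse3";
    if (__builtin_cpu_supports("sse3"))   return "sse3";
    if (__builtin_cpu_supports("sse2"))   return "sse2";
    if (__builtin_cpu_supports("sse"))    return "sse";
    if (__builtin_cpu_supports("mmx"))    return "mmx";
#endif
    return "generic";
}

/* Write "<base>-<level>" into out (assumed naming scheme).
 * Returns 0 on success, -1 if the result does not fit. */
int variant_path(char *out, size_t outlen, const char *base, const char *level)
{
    assert(out != NULL && base != NULL && level != NULL);
    int n = snprintf(out, outlen, "%s-%s", base, level);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Replace the current process with the best variant of base.
 * Only returns (with -1) if the path is too long or execv() fails. */
int exec_best_variant(const char *base, char *const argv[])
{
    char path[4096];
    if (variant_path(path, sizeof path, base, best_simd_level()) != 0)
        return -1;
    execv(path, argv);
    return -1;
}
```

The wrapper's main() would then just call something like exec_best_variant("/usr/lib/libssw/ssw_test", argv), where that path is again only an example, and report an error if the call returns. A shell script parsing /proc/cpuinfo would do the same job; the C version just avoids depending on Linux-specific files.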
Cheers,