Re: fftw: Usage of SSE in 64bit?
On Tuesday 21 June 2011 05:55:40 Steven G. Johnson wrote:
> I am one of the FFTW developers, and wanted to comment on this.
Thanks a lot!
> Yes, you should definitely use --enable-sse/--enable-sse2 flags when
> compiling single/double precision versions of FFTW on all x86 and x86-64
> platforms. This is *not* just a matter of compiler flags -- it enables
> the compilation of special computational kernels in FFTW that explicitly
> use SSE/SSE2 intrinsics.
> In addition to x86-64, note that this is SAFE to enable in general for
> all 32-bit x86 platforms. FFTW checks at runtime to see whether the
> processor supports SSE/SSE2 and disables its SSE/SSE2 code if not.
> (Similarly for Altivec on PowerPC, and similarly in the next release for
> AVX instructions.)
Well, that depends on what you are aiming for. If you want a single 32-bit
x86 package that is guaranteed to work on every x86-compatible CPU out there,
starting at, say, Pentium II level, you have to ensure that it still works
there. In my case I have about 1800 computers doing number crunching, all of
them 64-bit, so this is a different matter than the one Debian faces for
packaging.
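
For anyone following along, the runtime check described in the quoted text can
be pictured roughly like the sketch below. This is NOT FFTW's actual detection
code; the names kernel_kind, choose_kernel, detect_kernel and kernel_name are
made up for illustration, and I am assuming a GCC or Clang compiler, whose
__builtin_cpu_supports() built-in does the CPU query on x86. On any other
compiler or architecture the sketch simply falls back to the scalar path.

```c
#include <stdio.h>

/* Hypothetical kernel selector for illustration; not FFTW's real code. */
typedef enum { KERNEL_SCALAR, KERNEL_SSE, KERNEL_SSE2 } kernel_kind;

/* Pure selection logic: single precision wants SSE, double wants SSE2,
 * mirroring the --enable-sse / --enable-sse2 split mentioned above. */
static kernel_kind choose_kernel(int has_sse, int has_sse2, int want_double)
{
    if (want_double)
        return has_sse2 ? KERNEL_SSE2 : KERNEL_SCALAR;
    return has_sse ? KERNEL_SSE : KERNEL_SCALAR;
}

/* Ask the CPU at runtime which instruction sets it supports.
 * Assumption: GCC/Clang on x86; everything else reports "no SIMD". */
static kernel_kind detect_kernel(int want_double)
{
    int has_sse = 0, has_sse2 = 0;
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    has_sse  = __builtin_cpu_supports("sse")  != 0;
    has_sse2 = __builtin_cpu_supports("sse2") != 0;
#endif
    return choose_kernel(has_sse, has_sse2, want_double);
}

/* Human-readable name, e.g. for logging which path was taken. */
static const char *kernel_name(kernel_kind k)
{
    switch (k) {
    case KERNEL_SSE:  return "sse";
    case KERNEL_SSE2: return "sse2";
    default:          return "scalar";
    }
}
```

Calling kernel_name(detect_kernel(1)) from a small main() and printing the
result shows which double-precision path a given machine would take. A binary
built this way still runs on a pre-SSE CPU as long as the SIMD kernels are
only called after the check, which is the property that matters for a generic
32-bit package.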
> For benchmarking, I would recommend using the "bench" program that comes
> with FFTW. e.g. you can compare for a size-1024 FFT with and without the
> SSE/SSE2 kernels just by doing:
> ./bench -opatient 1024
> ./bench -opatient -onosimd 1024
> On my 64-bit Intel Xeon E5440 running FFTW 3.2.2 and Debian GNU/Linux,
> the SSE/SSE2 version is faster for size 1024 by a factor of 1.7 in
> double precision and by a factor of 3.4 in single precision.
Interesting. I think I need to rerun my tests, but then again it could be
that my earlier results came from just using a 'measured' plan.
Thanks a lot for the insight!