
Re: fftw: Usage of SSE in 64bit?

Hi Steven

On Tuesday 21 June 2011 05:55:40 Steven G. Johnson wrote:
> I am one of the FFTW developers, and wanted to comment on this.

Thanks a lot!
> Yes, you should definitely use the --enable-sse/--enable-sse2 flags when
> compiling single/double precision versions of FFTW on all x86 and x86-64
> platforms.  This is *not* just a matter of compiler flags -- it enables
> the compilation of special computational kernels in FFTW that explicitly
> use SSE/SSE2 intrinsics.
> In addition to x86-64, note that this is SAFE to enable in general for
> all 32-bit x86 platforms.  FFTW checks at runtime to see whether the
> processor supports SSE/SSE2 and disables its SSE/SSE2 code if not.
> (Similarly for Altivec on PowerPC, and similarly in the next release for
> AVX instructions.)

Well, that depends on what you are aiming for. If you want a single 32-bit
x86 package that is guaranteed to work on every x86-compatible CPU out there,
starting at, say, the Pentium II level, you have to make sure that it still
works on those processors. In my case, where I have about 1800 computers doing
number crunching and all of them are 64-bit, this is a different matter than
the one Debian faces for packaging.
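
To make the runtime check mentioned above concrete, here is a minimal sketch
of the general technique (probe the CPU once, then pick a kernel through a
function pointer). This is not FFTW's actual code. The names cpu_has_sse2,
scale_generic, scale_sse2_placeholder and pick_scale_kernel are made up for
illustration, and the "SSE2" kernel is plain C standing in for a real
intrinsics kernel. It assumes GCC or Clang, whose __builtin_cpu_supports
builtin is available on x86; on other compilers or architectures the sketch
simply falls back to the generic path.

```c
#include <stddef.h>

/* Hypothetical helper (not an FFTW function): returns 1 if the running
 * CPU reports SSE2 support, 0 otherwise or if we cannot tell. */
static int cpu_has_sse2(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();                      /* set up the CPU feature data */
    return __builtin_cpu_supports("sse2") ? 1 : 0;
#else
    return 0;                                  /* unknown platform: be conservative */
#endif
}

typedef void (*scale_fn)(double *x, size_t n, double a);

/* Portable kernel that is safe on any CPU. */
static void scale_generic(double *x, size_t n, double a)
{
    for (size_t i = 0; i < n; i++)
        x[i] *= a;
}

/* Placeholder for a kernel that would use SSE2 intrinsics in a real
 * library. Here it is ordinary C that handles two doubles per iteration,
 * so it gives the same result as scale_generic. */
static void scale_sse2_placeholder(double *x, size_t n, double a)
{
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        x[i]     *= a;
        x[i + 1] *= a;
    }
    if (i < n)                                 /* leftover element for odd n */
        x[i] *= a;
}

/* Choose the kernel at runtime, so one binary works on old and new CPUs. */
static scale_fn pick_scale_kernel(void)
{
    return cpu_has_sse2() ? scale_sse2_placeholder : scale_generic;
}
```

A caller would do something like "scale_fn f = pick_scale_kernel();" once and
then call f(data, n, 2.0) as often as needed. The point for packaging is that
the SSE2 path is only ever entered on a CPU that reports the feature.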
> For benchmarking, I would recommend using the "bench" program that comes
> with FFTW. e.g. you can compare for a size-1024 FFT with and without the
> SSE/SSE2 kernels just by doing:
>      ./bench -opatient 1024
>      ./bench -opatient -onosimd 1024
> On my 64-bit Intel Xeon E5440 running FFTW 3.2.2 and Debian GNU/Linux,
> the SSE/SSE2 version is faster for size 1024 by a factor of 1.7 in
> double precision and by a factor of 3.4 in single precision.

Interesting. I think I need to rerun my tests, but then again it could be
that I was just using a 'measured' plan.

Thanks a lot for the insight!


