Re: fftw: Usage of SSE in 64bit?

To: debian-science@lists.debian.org
Subject: Re: fftw: Usage of SSE in 64bit?
From: "Steven G. Johnson" <stevenj@alum.mit.edu>
Date: Mon, 20 Jun 2011 23:55:40 -0400
Message-id: <itp4ns$ate$1@dough.gmane.org>
In-reply-to: <201103231452.27320.carsten.aulbert@aei.mpg.de>
References: <201103231452.27320.carsten.aulbert@aei.mpg.de>

(Reposting, as this message does not seem to have gone through.)

I am one of the FFTW developers, and wanted to comment on this.

Yes, you should definitely use the --enable-sse/--enable-sse2 flags when compiling the single/double precision versions of FFTW on all x86 and x86-64 platforms. This is *not* just a matter of compiler flags -- it enables the compilation of special computational kernels in FFTW that explicitly use SSE/SSE2 intrinsics.
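As a rough sketch of what this means for a build script: in FFTW 3.2.x, --enable-sse applies to the single-precision build, which the installation manual selects with --enable-float, while --enable-sse2 applies to the default double-precision build. The helper name fftw_simd_flags below is made up for illustration and is not part of FFTW; it only prints the configure commands, which would then be run in an unpacked FFTW source tree.

```shell
#!/bin/sh
# Hypothetical helper (not part of FFTW): map a precision name to the
# configure flags discussed above, as documented for FFTW 3.2.x.
fftw_simd_flags() {
    case "$1" in
        single) echo "--enable-float --enable-sse" ;;  # single precision + SSE kernels
        double) echo "--enable-sse2" ;;                # double precision (default) + SSE2 kernels
        *) echo "unknown precision: $1" >&2; return 1 ;;
    esac
}

# Print one configure command per precision; each precision needs its own build.
for p in single double; do
    echo "./configure $(fftw_simd_flags "$p")"
done
```

Each printed command configures a separate build, since the single and double precision libraries are compiled independently.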

In addition to x86-64, note that this is SAFE to enable in general for all 32-bit x86 platforms. FFTW checks at runtime to see whether the processor supports SSE/SSE2 and disables its SSE/SSE2 code if not. (Similarly for Altivec on PowerPC, and similarly in the next release for AVX instructions.)
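To see from the shell what the runtime check would find on a given machine, one can look at the CPU feature flags that Linux reports. This is only an illustration and is not how FFTW performs its own detection (FFTW queries the processor from C code); the helper name cpu_has_flag is hypothetical, and it assumes a Linux-style /proc/cpuinfo with a "flags" line.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named feature flag appears as a whole
# word on the first "flags" line of a cpuinfo-style file.
# $1 = flag name (e.g. sse2), $2 = file to read (default: /proc/cpuinfo).
cpu_has_flag() {
    awk -v f="$1" '
        /^flags/ { for (i = 1; i <= NF; i++) if ($i == f) found = 1; exit }
        END      { exit !found }
    ' "${2:-/proc/cpuinfo}" 2>/dev/null
}

# Report the two instruction sets relevant to the FFTW 3.2.x kernels.
for f in sse sse2; do
    if cpu_has_flag "$f"; then
        echo "$f: reported by the kernel"
    else
        echo "$f: not reported"
    fi
done
```

On a machine where a flag is not reported, an FFTW build configured with the SSE/SSE2 options would simply fall back to its non-SIMD code, as described above.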

In general, I would recommend that the packager read the FFTW installation manual closely, since it documents these options.

    http://fftw.org/doc/Installation-on-Unix.html

For benchmarking, I would recommend using the "bench" program that comes with FFTW. e.g. you can compare for a size-1024 FFT with and without the SSE/SSE2 kernels just by doing:

    ./bench -opatient 1024
    ./bench -opatient -onosimd 1024

On my 64-bit Intel Xeon E5440 running FFTW 3.2.2 and Debian GNU/Linux, the SSE/SSE2 version is faster for size 1024 by a factor of 1.7 in double precision and by a factor of 3.4 in single precision.
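To repeat the same comparison over several transform sizes, the two bench invocations above can be wrapped in a small loop. This is a minimal sketch: the function name compare_simd, the BENCH variable, and the sizes in the usage comment are illustrative assumptions; only the -opatient and -onosimd options come from the commands above.

```shell
#!/bin/sh
# Path to the FFTW bench program; override BENCH if it lives elsewhere.
BENCH=${BENCH:-./bench}

# Hypothetical wrapper: for each size given, run bench once with the SIMD
# kernels available and once with them disabled via -onosimd.
compare_simd() {
    for n in "$@"; do
        echo "== size $n, SIMD kernels enabled =="
        "$BENCH" -opatient "$n"
        echo "== size $n, SIMD kernels disabled =="
        "$BENCH" -opatient -onosimd "$n"
    done
}

# Example (run from the directory containing bench):
#   compare_simd 256 1024 4096
```

The two timings printed for each size can then be compared directly, in the same way as the size-1024 figures quoted above.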


Regards,
Steven G. Johnson
