
Re: Why not 03 ?



On 30.05.2014 09:40, Xavier Roche wrote:
> On Fri, May 30, 2014 at 11:10:29AM +1000, Russell Stuart wrote:
>> In particular -O3 turns on auto-vectorisation.  It can provide a big
>> speed up to programs that can take advantage of it
> [...]
>> As others have pointed our -O3 turns on optimisations that help on some
>> architectures and hinder on others.  Vectorisation sort of falls into
>> that category: hinder becomes "fail with a SIGILL".
> 
> On x86-64, AFAICS, you have at least SSE2 and 16 XMM registers, whatever the processor is. Yes, you can not enable AVX(2), but you still can do interesting vector optimizations with the most common x86-64 processor.
> 
> (*) http://en.wikipedia.org/wiki/X86-64
> (*) http://en.wikipedia.org/wiki/Advanced_Vector_Extensions
> 
> 

To make use of the autovectorizer in non-trivial loops you usually need
more options than just -O3.
The C standard is very strict, particularly about floating-point
semantics: floating-point operations are not associative, there may be
signaling NaNs and errno may need to be set. On top of that, memory can
alias.
This normally prevents autovectorizers from working unless extra flags
tell the compiler about the special circumstances of a loop.
You can do this per function via GCC's optimize function attribute,
e.g. by applying -funsafe-math-optimizations to a function where the
compiler may go crazy (OpenMP 4.0 also introduces "#pragma omp simd"
for this purpose).
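
As a rough sketch of the options mentioned above (the function names are
made up for illustration, and whether a given loop really gets
vectorized depends on the compiler version and the target), the same
reduction can be written strictly, with relaxed math for one function
only, or with an OpenMP SIMD hint. The last function shows the aliasing
side, using restrict:

```c
#include <stddef.h>

/* Strict version: float addition is not associative, so without extra
 * flags the compiler has to keep the additions in source order. */
float sum_strict(const float *a, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* GCC only: relax the floating-point rules for this one function.
 * The guard skips the attribute on compilers that do not know it. */
#if defined(__GNUC__) && !defined(__clang__)
__attribute__((optimize("unsafe-math-optimizations")))
#endif
float sum_relaxed(const float *a, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* OpenMP 4.0 hint: only honoured with -fopenmp or -fopenmp-simd,
 * otherwise the pragma is ignored and the loop stays strict. */
float sum_omp_simd(const float *a, size_t n)
{
    float s = 0.0f;
    #pragma omp simd reduction(+:s)
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* restrict promises that x and y do not overlap, so the compiler does
 * not have to assume a store to y[i] changes a later x[j]. */
void scale_add(float *restrict y, const float *restrict x, float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] += k * x[i];
}
```

Assuming the file is saved as sums.c (a hypothetical name), something
like "gcc -O3 -fopenmp-simd -fopt-info-vec -c sums.c" should report
which of these loops GCC decided to vectorize.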

So enabling -O3 by default will most likely not gain us much in most
cases: applications that profit as a whole from vectorization are not
that common, and those that do are usually too complex to allow
autovectorization without patches.

Also, it would only be effective on amd64, x32, arm64 and ppc64el, as
those are the only platforms that have mandatory SIMD instructions.

