Re: usage of -mtune=core2 ? (Was: Bug#776812: vsearch: FTBFS on non-x86: uses non-portable flags)

To: Debian Mentors List <debian-mentors@lists.debian.org>, Tim Booth <avarus@fastmail.fm>
Cc: 776812@bugs.debian.org
Subject: Re: usage of -mtune=core2 ? (Was: Bug#776812: vsearch: FTBFS on non-x86: uses non-portable flags)
From: Andreas Tille <andreas@an3as.eu>
Date: Mon, 2 Feb 2015 11:53:38 +0100
Message-id: <20150202105338.GR26807@an3as.eu>
In-reply-to: <1422873500.2249.118.camel@gmail.com>
References: <20150202034852.32067.23162.reportbug@ghostwheel.internal.ucko.debian.net> <20150202065128.GP26807@an3as.eu> <1422873500.2249.118.camel@gmail.com>

Hi Gert,

thanks for your helpful comments.

On Mon, Feb 02, 2015 at 11:38:20AM +0100, Gert Wollny wrote:
> Hello, 
> 
> On Mon, 2015-02-02 at 07:51 +0100, Andreas Tille wrote:
> > Hi Mentors,
> 
> > It is very important to build vsearch with the maximum optimisation for speed
> > and thus I wonder whether dropping this option is a good idea or whether
> > I should enable it on i386 and amd64 (the question extends also to
> > freebsd-i386/freebsd-amd64 once another issue in freebsd with this
> > package is solved).
> 
> On amd64 sse/sse2 is enabled by default. 
> 
> Tuning the code for a specific processor (i.e. core2) might not be such
> a good idea; according to the GCC man page one should use -mtune=generic
> instead: 
> 
> "generic: 
> 
>  Produce code optimized for the most common IA32/AMD64/EM64T processors.
> If you know the CPU on which your code will run, then you should use the
> corresponding -mtune or -march option instead of -mtune=generic.  But,
> if you do not know exactly what CPU users of your application will have,
> then you should use this option.
> As new processors are deployed in the marketplace, the behavior of this
> option will change.  Therefore, if you upgrade to a newer version of
> GCC, code generation controlled by this option will change to reflect
> the processors that are most common at the time that version of GCC is
> released."

Tim, could you check with upstream whether they agree that -mtune=generic
is the option that should be used?  In that case the patch I prepared in
advance in svn (x86_spezific_opts.patch) should be dropped.
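
To make the distinction concrete: as far as I understand the GCC
documentation, -mtune only changes how code is scheduled and tuned, while
the set of instructions the compiler may use is decided by -march and
options like -msse2.  A minimal sketch to see what baseline a given set of
flags actually yields is below.  The macros __SSE2__, __x86_64__ and
__i386__ are predefined by GCC; the function name simd_baseline and the
DEMO_MAIN switch are made up for this example and are not part of vsearch.

```c
#include <stdio.h>

/* Describe the SIMD baseline the compiler was allowed to assume.
 * __SSE2__, __x86_64__ and __i386__ are predefined by GCC; the
 * function name is hypothetical and only used in this sketch. */
static const char *simd_baseline(void)
{
#if defined(__SSE2__)
    return "x86 with SSE2";
#elif defined(__x86_64__) || defined(__i386__)
    return "x86 without SSE2";
#else
    return "non-x86";
#endif
}

/* DEMO_MAIN is a made-up switch: build with -DDEMO_MAIN to get a
 * small program that prints the baseline. */
#ifdef DEMO_MAIN
int main(void)
{
    printf("baseline: %s\n", simd_baseline());
    return 0;
}
#endif
```

Building this with -DDEMO_MAIN, once with -mtune=generic and once with
-mtune=core2, should print the same line on a given architecture, since
tuning alone does not add SSE2.  On i386 the result depends on whether
-march or -msse2 raises the baseline.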
 
> In addition, with itksnap I saw that -funroll-loops and -ftree-vectorize
> improved performance a lot, and these are options that do not depend on
> the architecture, but are also not enabled by default.
> 
> -funroll-loops may also slow down the code, you should check this. It is
> especially effective if there are many small loops of fixed size (as is
> the case with ITK's types that are templated over dimensions). 
> 
> -ftree-vectorize may be useless on x86 without SSE but on amd64 it could
> give some speedups.

Tim, could you do some performance checks?  I have no idea whether the
usual upstream test suite is a proper check for this. 
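
In case it helps as a starting point, here is a rough sketch of the kind
of micro-benchmark I have in mind.  It is not taken from vsearch: the
kernel sum_sq_norms, the sizes DIM and N, the repetition count and the
DEMO_MAIN switch are all made up for illustration.  It only mimics the
"small loop of fixed size" case Gert describes, so a real check would
have to time vsearch itself on realistic input.

```c
#include <stddef.h>
#include <stdio.h>
#include <time.h>

#define DIM 3        /* small fixed trip count, as with dimension-templated types */
#define N   100000   /* number of points; arbitrary size for this sketch */

/* Sum of the squared norms of n points with DIM coordinates each. */
static double sum_sq_norms(const float *pts, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; ++i) {
        float acc = 0.0f;
        for (int d = 0; d < DIM; ++d)          /* candidate for unrolling */
            acc += pts[i * DIM + d] * pts[i * DIM + d];
        total += acc;
    }
    return total;
}

/* DEMO_MAIN is a made-up switch: build with -DDEMO_MAIN to get the timing loop. */
#ifdef DEMO_MAIN
static float pts[N * DIM];

int main(void)
{
    for (size_t i = 0; i < (size_t)N * DIM; ++i)
        pts[i] = 1.0f;

    double check = 0.0;
    clock_t start = clock();
    for (int rep = 0; rep < 1000; ++rep) {
        pts[0] = (float)(rep % 2);   /* change the input so the call is not hoisted */
        check += sum_sq_norms(pts, N);
    }
    double secs = (double)(clock() - start) / CLOCKS_PER_SEC;

    /* The checksum is printed so the result is actually used. */
    printf("checksum %.1f, %.3f s CPU time\n", check, secs);
    return 0;
}
#endif
```

Assuming the file is saved as bench.c (a hypothetical name), one would
build it once with "gcc -O2 -DDEMO_MAIN bench.c" and once with
"gcc -O2 -funroll-loops -ftree-vectorize -DDEMO_MAIN bench.c" and compare
the reported CPU time over several runs.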

Kind regards

      Andreas.

-- 
http://fam-tille.de
