Re: Optimized compilation

To: debian-devel@lists.debian.org
Subject: Re: Optimized compilation
From: Steve Dunham <dunham@cse.msu.edu>
Date: Wed, 16 Oct 2002 10:17:28 -0400
Message-id: <20021016141728.GA17835@arctic.cse.msu.edu>
In-reply-to: <20021011201254.A14975@stm.lbl.gov>
References: <20020928023359.GF22876@imperatrice.arcanes> <20020928044533.GB29746@quetzlcoatl.dodds.net> <20020928231930.GA27249@imperatrice.arcanes> <20021006014125.GC5105@doc.ic.ac.uk> <20021011201644.GP5761@imperatrice.arcanes> <20021011202103.GA9739@riva.ucam.org> <20021012082013.50737f2b.bug1@optushome.com.au> <20021011201254.A14975@stm.lbl.gov>

On Fri, Oct 11, 2002 at 08:12:54PM -0700, David Schleef wrote:
> On Sat, Oct 12, 2002 at 08:20:13AM +1000, Glenn McGrath wrote:
> > It really grates on my nerves to hear people arguing that it only makes
> > it a little bit faster so it's not worth doing... It shouldn't be your
> > decision; let the user decide if they want to take full advantage of their
> > hardware, or run it in 1985 emulation mode.

> In case you hadn't noticed, most modern CPUs will execute even
> really crappy assembly code optimally.  Most instruction-level
> bottlenecks come from data dependencies, which is a direct
> result of the way the C code is written.  This is not something
> that the compiler can optimize around without breaking the C
> standard.

> CPU manufacturers have made compiler-based optimizations largely
> irrelevant.  A better approach to doing something really
> meaningful is to find poorly performing inner loops and rewrite
> them with an eye on improving instruction flow through modern
> CPUs.
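
To make the data-dependency point concrete, here is a minimal sketch in
C. The function names are made up for illustration. The first loop has
a single accumulator, so each addition waits on the previous one. The
second splits the work across four independent accumulators. Because
that reorders floating-point additions, the rounding can differ, which
is one reading of why a compiler cannot make the change itself while
staying within the standard.

```c
#include <assert.h>
#include <stddef.h>

/* One accumulator: every addition depends on the previous one, so the
 * loop is limited by the latency of a floating-point add. */
float sum_chained(const float *a, size_t n)
{
    assert(a != NULL || n == 0);
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Four independent accumulators: the four dependency chains can overlap
 * in the pipeline.  The additions are reassociated, so the rounding may
 * differ from sum_chained() on general input. */
float sum_split(const float *a, size_t n)
{
    assert(a != NULL || n == 0);
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)          /* leftover elements */
        s0 += a[i];
    return (s0 + s1) + (s2 + s3);
}
```

Whether the second form is actually faster depends on the CPU and the
compiler; it is only meant to show what "rewriting the inner loop"
could look like.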

This isn't true of SPARC (it is pipelined rather than out-of-order,
and deeply pipelined in the case of the UltraSPARC III).  ia64 is a
VLIW-style processor, which makes its performance extremely dependent
on compiler optimizations.  Dunno where the rest of our archs stand.

Actually, Debian performance on SPARC is _really_ bad because of
backwards compatibility.  gcc, by default, targets a SPARC chip that
doesn't have an integer multiply instruction.  (BenC says this will
change after the gcc 3.2 transition.)
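
As a rough illustration of the cost, the sketch below multiplies using
only shifts and adds, which is roughly the work a software routine has
to do when the target has no multiply instruction. The names soft_umul
and soft_umul_selftest are hypothetical, and this is not the routine
gcc actually links in. I am assuming the chip in question is SPARC V7;
V8 added hardware multiply, and gcc's SPARC backend selects the target
with -mcpu= (for example -mcpu=v8).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: unsigned multiply built from shifts and adds.
 * Up to 32 loop iterations stand in for what a single hardware
 * multiply instruction would do.  The result wraps modulo 2^32,
 * like unsigned multiplication in C. */
uint32_t soft_umul(uint32_t a, uint32_t b)
{
    uint32_t product = 0;
    while (b != 0) {
        if (b & 1u)             /* low bit set: add the shifted multiplicand */
            product += a;
        a <<= 1;
        b >>= 1;
    }
    return product;
}

/* Cross-check against the compiler's own multiply. */
void soft_umul_selftest(void)
{
    assert(soft_umul(6u, 7u) == 6u * 7u);
    assert(soft_umul(123456u, 789u) == 123456u * 789u);
}
```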

(Although, I think Gentoo goes to the other extreme and compiles
everything as 64-bit code on newer SPARCs, which can give a 2x
performance hit in some cases.)
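
The message does not say where the 64-bit slowdown comes from. One
commonly cited cause, assumed here, is that pointers double in size, so
pointer-heavy data takes more memory and fewer objects fit in cache.
The sketch models a hypothetical list node (two pointers plus a 4-byte
int), assuming the struct is padded to a multiple of the pointer size.
The function names and the 16 KB cache figure are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of a hypothetical node { next, prev, 4-byte value }
 * for a given pointer width, padded to a multiple of the pointer size. */
size_t node_bytes(size_t ptr_bytes)
{
    assert(ptr_bytes == 4 || ptr_bytes == 8);
    size_t raw = 2 * ptr_bytes + 4;
    return (raw + ptr_bytes - 1) / ptr_bytes * ptr_bytes;
}

/* How many such nodes fit in a cache of the given size. */
size_t nodes_in_cache(size_t cache_bytes, size_t ptr_bytes)
{
    return cache_bytes / node_bytes(ptr_bytes);
}
```

Under these assumptions a node grows from 12 to 24 bytes, so a 16 KB
cache holds about half as many nodes in 64-bit mode.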

But I agree that it's not worth it to have multiple subarchs for the
entire x86 distribution.  (However, it could be argued that some code,
like MPEG decoders, should be distributed for multiple subarchs.)

Steve
dunham@cse.msu.edu
