
Re: optimization options...



I found this article, and the comments attached to it, very interesting:
http://freshmeat.net/articles/view/730/

Here are a few excerpts (many come from the comments, so they still need to be verified):

[...]
A higher -O does not always mean improved performance. -O3 increases the code size and may introduce cache penalties and become slower than -O2. However, -O2 is almost always faster than -O.
[...]
First, the best generic optimisation target for modern architectures is the Pentium, and you can safely assume everyone has at least an old Pentium. Trying to optimise code (by hand or by genetic mutation) for Pentium 2 and 3, as well as for Athlon, almost never shows more than a 2% speed increase, provided that the Pentium code was really optimal, and taking into account that GCC doesn't effectively use MMX and SSE, except through special vector intrinsics, which are very rarely used in code.
[...]
My experience with GCC 3.2 is that, for fairly well-written (read: optimized, not neat) integer/mem code compiled with some sensible optimization options, the generated code is slightly slower and larger than GCC 2.96 output.
[...]
I have seen gcc 3.2.1 outperform gcc 2.95.3 on
video compression applications by 3-5%, on an
Athlon running Linux and also on Cygwin. Pretty
impressive. (gcc 2.96 is buggy btw, beware -- pull
down mplayer if you doubt this.)
[...]
A few more suggestions as regards optimization -- I've done a bit of benchmarking with different options. Most specific tweaks you can try (above -O3) make very little difference on average code, with the exception of three options.

First, -fomit-frame-pointer can provide a small boost (admittedly, not as much as I'd expect), at least on x86. The drawback is that you will not be able to get backtraces from core dumps or crashing apps. This might be worth using if you have a program that's *almost* fast enough, but not quite, like an emulator or movie player, and you're neither doing development on it nor planning to send in bug reports.

Second, -ffast-math *can* be very helpful, though most programs will not see much of a benefit, since most software doesn't do a ton of floating-point operations. This *can*, as per the gcc man page, break correct software, but I've yet to run into a package that it causes problems with.

Third, -fstrict-aliasing produces a speedup of around 10% in snes9x. While strict ANSI code should not be broken by it, it's relatively easy for someone to write code that *does* break with -fstrict-aliasing. I haven't seen many problems with it.
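
To illustrate the kind of code that can break, here is a minimal sketch, assuming a 32-bit IEEE 754 float. The function names are hypothetical. The first version reads a float through a pointer to an unrelated type, which the aliasing rules forbid; the second does the same job with memcpy(), which is well-defined and which GCC normally compiles to a plain register move.

```c
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: float is 32 bits wide. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Breaks the aliasing rules: a float object is read through a
   uint32_t pointer. With -fstrict-aliasing the optimizer may
   reorder or drop accesses around code like this. */
uint32_t float_bits_unsafe(float f)
{
    return *(uint32_t *)&f;
}

/* Well-defined alternative: copy the bytes instead of casting. */
uint32_t float_bits_safe(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

If a package cannot be fixed this way, building it with -fno-strict-aliasing switches the optimization off.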


Fourth, -DNDEBUG isn't technically an optimization flag, but it defines the NDEBUG macro, which makes assert() expand to nothing, so its conditions are never evaluated. Decent for production builds. Most developers avoid having assert()s in inner loops *anyway*, so this is unlikely to provide a huge speedup. Also, while code with side effects should not be placed in assert() statements, this is easy to do by accident -- and code of this nature will break with -DNDEBUG on. For most software, very minor benefits.
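
A small sketch of the side-effect trap described above. The stack helpers are hypothetical names, not taken from any real package. In pop_buggy() the decrement lives inside the assert(), so it vanishes when the file is built with -DNDEBUG; pop_ok() keeps the side effect outside.

```c
#include <assert.h>
#include <stddef.h>

/* Buggy: the decrement is part of the assert() expression.
   With -DNDEBUG the whole expression is removed, *top is never
   decremented, and the function reads one element past the top. */
int pop_buggy(const int *stack, size_t *top)
{
    assert((*top)-- > 0);
    return stack[*top];
}

/* Correct: the assert() only checks, the side effect is separate,
   so the behaviour is the same with and without -DNDEBUG. */
int pop_ok(const int *stack, size_t *top)
{
    assert(*top > 0);
    --*top;
    return stack[*top];
}
```

Both functions behave the same in a debug build, which is exactly why this kind of bug tends to show up only in production builds.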

Fifth, -DG_DISABLE_ASSERT is a similar flag to -DNDEBUG, but applies to g_assert() in the GLib package (used by GNOME and GTK software). Again, for most software, very minor or nonexistent benefits.

Sixth, there are a few new arch types in gcc 3.2. If you used to use -march=i686 but have a Pentium 2, you should now be using -march=pentium2.

Seventh, while the real-world benefits appear to be minimal, I've written some simple tests to see if the optimizer rips out branches that should obviously be dead code. gcc does not do so without -fexpensive-optimizations. OTOH, while I feel that -fexpensive-optimizations generates more appealing machine language, I haven't seen any huge performance benefits granted by it.
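
The commenter's test programs are not shown, so the following is only a guess at what such a check might look like. The function name scale and the file name dead.c are made up for illustration.

```c
/* Hypothetical test function: the branch below can never be taken,
   so an optimizer that removes dead code should emit no trace of it. */
int scale(int x)
{
    const int debug = 0;

    if (debug) {
        /* Dead code: debug is a constant zero. */
        return -1;
    }
    return x * 2;
}
```

One way to try the claim on your own compiler is to save this as dead.c, generate assembly twice, for instance with "gcc -O2 -S dead.c" and "gcc -O2 -fno-expensive-optimizations -S dead.c", and compare the two dead.s files. Results are likely to differ between GCC versions.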

And just for the heck of it, (while this isn't really optimization-related), always compile with -pipe and -Wall. -Wall *will* help you find bugs, and -pipe will speed up compilation (in some packages, by a lot).
[...]
First, the maximum optimization on modern GCCs is, indeed, -O3. This has not always been the case. Higher optimizations have existed in (much) earlier versions, usually undocumented.

I believe the highest optimization level ever recognized was a massive -O6. We're talking 10 years or so ago, here. At some point, -O6 and -O5 vanished, leaving -O4 as king of the hill, and it too vanished shortly afterwards.
[...]
Since most software isn't cpu-bound, and since memory and disk are also limited resources, why not try -Os?

`-Os'
Optimize for size. `-Os' enables all `-O2' optimizations that do
not typically increase code size. It also performs further
optimizations designed to reduce code size.

Code compiled with this option would run just as fast (in wall-clock time, since it isn't cpu-bound), but would use less memory, leaving more space for disk caches.
[...]

I haven't finished reading it yet.

Bye
Davide

--
Linux User: 302090: http://counter.li.org
Recommended products:
Operating system: Debian: http://www.it.debian.org
Office tools: OpenOffice.org: http://it.openoffice.org
Database: PostgreSQL: http://www.postgres.org
Browser: Firefox: http://texturizer.net/firefox
Mail client: Thunderbird: http://texturizer.net/thunderbird
Encyclopedia: Wikipedia: http://it.wikipedia.org
--
I do not authorize anyone who uses Outlook to store my e-mail
address: I don't want to be flooded with spam

