--- Begin Message ---
- To: Debian Bug Tracking System <submit@bugs.debian.org>
- Subject: gcc-3.4: could optimize FP multiplies better
- From: Peter Cordes <peter@cordes.ca>
- Date: Thu, 26 Aug 2004 03:32:24 -0300
- Message-id: <E1C0Do8-0003Jv-2H@hume.biochem.dal.ca>
Package: gcc-3.4
Version: 3.4.1-7.0.0.1.amd64
Severity: wishlist
Compiling this function:
double baz(double foo, double bar)
{
    return foo*foo*foo*foo*bar*bar*bar*bar;
}
on amd64 with -O6 -ffast-math, gcc emits this code:
foo.o: file format elf64-x86-64
Disassembly of section .text:
... (some similar functions that I was messing around with) ...
0000000000000050 <ddbar>:
  50:   f2 0f 59 c0     mulsd  %xmm0,%xmm0
  54:   f2 0f 59 c0     mulsd  %xmm0,%xmm0
  58:   f2 0f 59 c1     mulsd  %xmm1,%xmm0
  5c:   f2 0f 59 c1     mulsd  %xmm1,%xmm0
  60:   f2 0f 59 c1     mulsd  %xmm1,%xmm0
  64:   f2 0f 59 c1     mulsd  %xmm1,%xmm0
  68:   c3              retq
So gcc notices that it can compute foo*foo*foo*foo with two mulsd
instructions, but it misses the same optimization for bar*bar*bar*bar.
Squaring each operand twice and then multiplying the two results would save
one FP multiply overall (five instead of six):
    mulsd %xmm0, %xmm0
    mulsd %xmm1, %xmm1
    mulsd %xmm0, %xmm0
    mulsd %xmm1, %xmm1
    mulsd %xmm1, %xmm0
    retq
Also, the foo chain and the bar chain don't depend on each other, so their
multiplies could run in parallel.
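
For reference, the five-multiply sequence corresponds to the following regrouping at the source level. This is only a sketch: the function and variable names are made up for illustration, and the per-line instruction comments show the intended mapping, not verified compiler output.

```c
/* Hypothetical names, for illustration only. */

/* Source order: seven multiplies if evaluated strictly left to right. */
double baz_naive(double foo, double bar)
{
    return foo*foo*foo*foo*bar*bar*bar*bar;
}

/* Regrouped by hand: five multiplies in total. */
double baz_regrouped(double foo, double bar)
{
    double foo2 = foo * foo;    /* intended: mulsd %xmm0, %xmm0 */
    double bar2 = bar * bar;    /* intended: mulsd %xmm1, %xmm1 */
    double foo4 = foo2 * foo2;  /* intended: mulsd %xmm0, %xmm0 */
    double bar4 = bar2 * bar2;  /* intended: mulsd %xmm1, %xmm1 */
    return foo4 * bar4;         /* intended: mulsd %xmm1, %xmm0 */
}
```

For inputs whose intermediate products are exactly representable, both versions return the same value; for general inputs they may differ in the last bits, because the rounding happens at different points.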
Without -ffast-math, of course, gcc can't take advantage of the laws of
arithmetic like that, since regrouping floating-point multiplies can change
the result, and it has to do all the multiplies the straightforward way.
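
To illustrate why the regrouping is only allowed under -ffast-math, here is a small sketch showing that the grouping of floating-point multiplies can change the answer. The helper names are hypothetical, and the volatile temporaries are an assumption-driven precaution so that each intermediate is rounded to double even on targets that use wider registers.

```c
/* Hypothetical helper names, for illustration only.
   volatile forces each intermediate product to be stored as a double. */

/* Computes (a*b)*c. */
double mul_left_first(double a, double b, double c)
{
    volatile double t = a * b;
    return t * c;
}

/* Computes a*(b*c). */
double mul_right_first(double a, double b, double c)
{
    volatile double t = b * c;
    return a * t;
}
```

With a = b = 1e200 and c = 1e-200, the first grouping overflows to infinity at a*b, while the second stays finite at roughly 1e200. Less extreme inputs usually differ only in the last bit or two, but that is still enough to rule the transformation out under strict IEEE semantics.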
Anyway, that's what I noticed while poking around waiting for pure64 to
download from alioth to this fancy new dual Opteron I'm setting up for one
of my users :) Correct me if I'm all wrong about this optimization
being possible...
-- System Information:
Debian Release: 3.1
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.8-1-amd64-k8-smp
Locale: LANG=C, LC_CTYPE=C
Versions of packages gcc-3.4 depends on:
ii  binutils      2.15-1                    The GNU assembler, linker and bina
ii  cpp-3.4       3.4.1-7.0.0.1.amd64       The GNU C preprocessor
ii  gcc-3.4-base  3.4.1-7.0.0.1.amd64       The GNU Compiler Collection (base
ii  libc6         2.3.2.ds1-16.0.0.1.amd64  GNU C Library: Shared libraries an
ii  libgcc1       1:3.4.1-7.0.0.1.amd64     GCC support library
-- no debconf information
--- End Message ---