
Bug#268115: marked as done ([PR 18589] could optimize FP multiplies better)



Your message dated Fri, 5 Apr 2013 23:41:51 +0100
with message-id <CAPQ4b8=F1hF=XYKZT9onKfPK7nkJTbaKxvR6+cPwTvK+NBPGNA@mail.gmail.com>
and subject line Re: [PR 18589] could optimize FP multiplies better
has caused the Debian Bug report #268115,
regarding [PR 18589] could optimize FP multiplies better
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
268115: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=268115
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: gcc-3.4
Version: 3.4.1-7.0.0.1.amd64
Severity: wishlist

 compiling this function:
double baz(double foo, double bar)
{
   return foo*foo*foo*foo*bar*bar*bar*bar;
}

 on amd64 with -O6 -ffast-math, gcc emits this code:

foo.o:     file format elf64-x86-64

Disassembly of section .text:

... (some similar functions that I was messing around with) ...
0000000000000050 <ddbar>:
  50:	f2 0f 59 c0          	mulsd  %xmm0,%xmm0
  54:	f2 0f 59 c0          	mulsd  %xmm0,%xmm0
  58:	f2 0f 59 c1          	mulsd  %xmm1,%xmm0
  5c:	f2 0f 59 c1          	mulsd  %xmm1,%xmm0
  60:	f2 0f 59 c1          	mulsd  %xmm1,%xmm0
  64:	f2 0f 59 c1          	mulsd  %xmm1,%xmm0
  68:	c3                   	retq   


 So, it notices that it can do foo*foo*foo*foo with two mulsd instructions,
but it misses the same optimization for bar*bar*bar*bar.  It would save one
FP multiply overall to do:
mulsd %xmm0, %xmm0
mulsd %xmm1, %xmm1
mulsd %xmm0, %xmm0
mulsd %xmm1, %xmm1
mulsd %xmm1, %xmm0
retq
 Also, the two non-dependent muls could run in parallel.
 
 Without -ffast-math, of course, gcc can't take advantage of the laws of
arithmetic like that and has to do all the multiplies the straightforward
way.
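
 For illustration, the regrouping suggested above can be written out by
hand in C. This is only a sketch of the idea, not what any gcc version is
known to emit; the name baz_factored and the temporaries foo2, bar2, foo4
and bar4 are made up for the example. Because floating-point multiplication
is not associative, the two forms may differ in the last bits for general
inputs, which is why the compiler needs -ffast-math to do this on its own.

```c
/* Form from the report: a single left-to-right chain of multiplies. */
double baz(double foo, double bar)
{
    return foo*foo*foo*foo*bar*bar*bar*bar;
}

/* Hand-regrouped form (hypothetical name): five multiplies in total.
 * foo2 and bar2 do not depend on each other, and neither do foo4 and
 * bar4, so each pair could be issued in parallel. */
double baz_factored(double foo, double bar)
{
    double foo2 = foo * foo;    /* foo^2 */
    double bar2 = bar * bar;    /* bar^2, independent of foo2 */
    double foo4 = foo2 * foo2;  /* foo^4 */
    double bar4 = bar2 * bar2;  /* bar^4, independent of foo4 */
    return foo4 * bar4;         /* foo^4 * bar^4 */
}
```

 For small integer-valued inputs such as foo = 2.0 and bar = 3.0, every
intermediate product is exactly representable, so both functions return
the same value.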

 Anyway, that's what I noticed while poking around waiting for pure64 to
download from alioth to this fancy new dual Opteron I'm setting up for one
of my users :)  Correct me if I'm all wrong about this optimization
being possible...

-- System Information:
Debian Release: 3.1
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.8-1-amd64-k8-smp
Locale: LANG=C, LC_CTYPE=C

Versions of packages gcc-3.4 depends on:
ii  binutils        2.15-1                   The GNU assembler, linker and bina
ii  cpp-3.4         3.4.1-7.0.0.1.amd64      The GNU C preprocessor
ii  gcc-3.4-base    3.4.1-7.0.0.1.amd64      The GNU Compiler Collection (base 
ii  libc6           2.3.2.ds1-16.0.0.1.amd64 GNU C Library: Shared libraries an
ii  libgcc1         1:3.4.1-7.0.0.1.amd64    GCC support library

-- no debconf information


--- End Message ---
--- Begin Message ---
fixed -1 1.120exp2
stop

GCC 4.8 has been released to experimental and will probably reach unstable
soon. gcc 1.120exp2 depends on that version on the major architectures.

Cheers.

--- End Message ---
