
Bug#572746: libm: sinf/cosf performance is awful on amd64



On Sun, Mar 07, 2010 at 04:17:08PM +0100, Aurelien Jarno wrote:
> On Sat, Mar 06, 2010 at 11:42:51AM +0100, Jerome Vizcaino wrote:
> > Package: libc6
> > Version: 2.10.2-6
> > Severity: normal
> > 
> > Hi,
> > 
> > After many tests and research I've come to the conclusion that the float
> > variants of sin/cos (and maybe others) are abnormally slow on Debian amd64.
> > The performance loss is really impressive (around 8 to 9 times slower).
> > I've attached the prog used to make my experiments and used it in the
> > following cases.
> > 
> > + Lenny-amd64: sinf/cosf is really slow
> > + Lenny-i386: float performance is ok (faster than the cos/sin using double)
> > + Sid-amd64: sinf/cosf slow
> > + Lenny-amd64 using lenny-i386 binary and 32bits libs: float performance is OK.
> 
> On amd64, only sincos has an optimized version; sincosf uses the
> generic C implementation. On i386, there are optimized versions of both
> sincos and sincosf.
> 
> > + OpenSuse 64 bits (10.3 and 11.1): using the binary compiled on
> >   lenny-amd64, the tests run fine!
> > => The problem is not compiler related.
> > 
> > There seems to be a problem with the way libm is compiled for the amd64
> > architecture on Debian.
> > This is why the OpenSuse test was run: the problem is somewhere in the
> > compile chain or Debian-specific patches.
> > 
> 
> The problem is clearly not Debian-specific, and is also present
> upstream. OpenSuse is probably using a patch to work around the problem.
> 

This is confirmed: they are using an AMD version of the libm library on
x86_64, still coded in C for the sincosf function.

A quick and dirty implementation of sincosf in x86_64 assembly gives me a
speed around 4% slower than sincos. What kind of performance ratio do
you get on SuSE?

The solution seems to be to write each *f function in x86_64 assembly, but
that'll probably take time.

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net


