Bug#572746: libm: sinf/cosf performance is awful on amd64
I cannot say for sure what the difference between sin and sinf (for example)
is on Suse, but the performance ratio I had on 32 bits stayed the same on
64 bits. This is why I was surprised by the dramatic slowdown when moving to
Debian :(
Thanks for pointing out the Suse patch: as we only have Suse or Debian at
work, I could not do more comparisons.
How about including patches from OpenSuse? Is it possible as a quick
Thanks for your help.
On Sunday 07 March 2010, you wrote:
> On Sun, Mar 07, 2010 at 04:17:08PM +0100, Aurelien Jarno wrote:
> > On Sat, Mar 06, 2010 at 11:42:51AM +0100, Jerome Vizcaino wrote:
> > > Package: libc6
> > > Version: 2.10.2-6
> > > Severity: normal
> > >
> > > Hi,
> > >
> > > After many tests and research I've come to the conclusion that the
> > > float variants of sin/cos (and maybe others) are abnormally slow on
> > > Debian amd64.
> > > The performance loss is really impressive (around 8 to 9 times slower).
> > > I've attached the prog used to make my experiments and used it in the
> > > following cases.
> > >
> > > + Lenny-amd64: sinf/cosf is really slow
> > > + Lenny-i386: float performance is ok (faster than the cos/sin using
> > > double)
> > > + Sid-amd64: sinf/cosf slow
> > > + Lenny-amd64 using lenny-i386 binary and 32bits libs: float
> > > performance is OK.
> > On amd64, only sincos has an optimized version; sincosf uses the
> > generic C implementation. On i386, there are optimized versions of both
> > sincos and sincosf.
> > > + OpenSuse 64 bits (10.3 and 11.1): using the binary compiled on
> > > lenny-amd64, the tests run fine !
> > > => The problem is not compiler related.
> > >
> > > There seems to be a problem with the way libm is compiled for the amd64
> > > architecture on Debian.
> > > This is why the OpenSuse test was run: the problem is somewhere in the
> > > compile chain or debian specific patches.
> > The problem is clearly not Debian specific, and is also present
> > upstream. OpenSuse is probably using a patch to work around the problem.
> This is confirmed: they're using an AMD version of the libm library on
> x86_64, still coded in C for the sincosf function.
> A quick and dirty implementation of sincosf in x86_64 assembly gives me a
> speed around 4% slower than sincos. What kind of performance ratio do
> you get on SuSe?
> The solution seems to be to write each *f function in x86_64 assembly, but
> that'll probably take time.