--- Begin Message ---
- To: Debian Bug Tracking System <submit@bugs.debian.org>
- Subject: libm: sinf/cosf performance is awful on amd64
- From: Jerome Vizcaino <vizcaino_jerome@yahoo.fr>
- Date: Sat, 6 Mar 2010 11:42:51 +0100
- Message-id: <201003061142.51926.vizcaino_jerome@yahoo.fr>
Package: libc6
Version: 2.10.2-6
Severity: normal
Hi,
After many tests and some research, I've come to the conclusion that the float
variants of sin/cos (and maybe others) are abnormally slow on Debian amd64.
The performance loss is substantial (around 8 to 9 times slower).
I've attached the program used for my experiments and ran it in the following
cases:
+ Lenny-amd64: sinf/cosf is really slow
+ Lenny-i386: float performance is ok (faster than the cos/sin using double)
+ Sid-amd64: sinf/cosf slow
+ Lenny-amd64 using the lenny-i386 binary and 32-bit libs: float performance is OK.
+ OpenSuse 64 bits (10.3 and 11.1): using the binary compiled on lenny-amd64,
the tests run fine!
=> The problem is not compiler related.
Since the binary compiled on lenny-amd64 runs fine on OpenSuse (which is why
that test was run), there seems to be a problem with the way libm is built for
the amd64 architecture on Debian: it lies somewhere in the build chain or in
the Debian-specific patches.
We use these functions extensively for calculations, so this is a real problem.
Using cos/sin as a temporary workaround would do the trick, but it is still
slower than the sinf/cosf implementations that work so well on 32-bit
computers...
Thank you
Jerome
-- System Information:
Debian Release: squeeze/sid
APT prefers unstable
APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.32-trunk-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.utf8, LC_CTYPE=en_US.utf8 (charmap=UTF-8) (ignored: LC_ALL
set to en_US.utf8)
Shell: /bin/sh linked to /bin/bash
Versions of packages libc6 depends on:
ii libc-bin 2.10.2-6 Embedded GNU C Library: Binaries
ii libgcc1 1:4.4.3-3 GCC support library
libc6 recommends no packages.
Versions of packages libc6 suggests:
ii debconf [debconf-2.0] 1.5.28 Debian configuration management sy
pn glibc-doc <none> (no description available)
ii locales 2.10.2-6 Embedded GNU C Library: National L
-- debconf information excluded
CC=gcc
CFLAGS=-DNDEBUG -O3 -D_ISOC99_SOURCE -Wall -Wextra
LDLIBS=-lm
all: test_trig
clean:
	rm -f test_trig
test_trig: test_trig.c
#include <math.h>
#include <stdio.h>
#include <sys/time.h>

// Elapsed time between two timestamps, in microseconds.
static long elapsed_usec(const struct timeval *start, const struct timeval *end)
{
    return (long)(end->tv_sec - start->tv_sec) * 1000000L
         + (long)(end->tv_usec - start->tv_usec);
}

int main(void)
{
    const int nbElement_i = 10000000;
    int i = 0;
    long usec = 0;
    float f1 = 0.0f, f2 = 0.0f, f3 = 0.0f;
    struct timeval tv1, tv2;

    printf("Testing %d sinf and cosf... ", nbElement_i);
    fflush(stdout);
    gettimeofday(&tv1, NULL);
    for (i = 0; i < nbElement_i; i++) {
        f1 += cosf(i);
        f2 += sinf(i);
    }
    // Use f1 and f2 so that gcc knows the results really matter;
    // otherwise the sinf and cosf calls could be optimized away.
    f3 = f1 + f2;
    gettimeofday(&tv2, NULL);
    // Normalize the duration: the tv_usec difference alone can be negative.
    usec = elapsed_usec(&tv1, &tv2);
    printf("Result: %f, Duration: %ld sec %ld usec\n", f3, usec / 1000000L, usec % 1000000L);

    f1 = 0.0f; f2 = 0.0f;
    printf("Testing %d sin and cos (double versions)... ", nbElement_i);
    fflush(stdout);
    gettimeofday(&tv1, NULL);
    for (i = 0; i < nbElement_i; i++) {
        f1 += cos(i);
        f2 += sin(i);
    }
    // Use f1 and f2 so that gcc knows the results really matter;
    // otherwise the sin and cos calls could be optimized away.
    f3 = f1 + f2;
    gettimeofday(&tv2, NULL);
    usec = elapsed_usec(&tv1, &tv2);
    printf("Result: %f, Duration: %ld sec %ld usec\n", f3, usec / 1000000L, usec % 1000000L);
    return 0;
}
--- End Message ---
--- Begin Message ---
- To: Jerome Vizcaino <vizcaino_jerome@yahoo.fr>, 572746-done@bugs.debian.org
- Subject: Re: Bug#572746: libm: sinf/cosf performance is awful on amd64
- From: Aurelien Jarno <aurelien@aurel32.net>
- Date: Thu, 17 Dec 2015 19:35:14 +0100
- Message-id: <20151217183514.GA22525@aurel32.net>
- In-reply-to: <201003061142.51926.vizcaino_jerome@yahoo.fr>
- References: <201003061142.51926.vizcaino_jerome@yahoo.fr>
Version: 2.17-1
On 2010-03-06 11:42, Jerome Vizcaino wrote:
> Package: libc6
> Version: 2.10.2-6
> Severity: normal
>
> Hi,
>
> After many tests and some research, I've come to the conclusion that the float
> variants of sin/cos (and maybe others) are abnormally slow on Debian amd64.
> The performance loss is substantial (around 8 to 9 times slower).
> I've attached the program used for my experiments and ran it in the following
> cases:
>
> + Lenny-amd64: sinf/cosf is really slow
> + Lenny-i386: float performance is ok (faster than the cos/sin using double)
> + Sid-amd64: sinf/cosf slow
> + Lenny-amd64 using lenny-i386 binary and 32bits libs: float performance is OK.
>
> + OpenSuse 64 bits (10.3 and 11.1): using the binary compiled on lenny-amd64,
> the tests run fine !
> => The problem is not compiler related.
>
> Since the binary compiled on lenny-amd64 runs fine on OpenSuse (which is why
> that test was run), there seems to be a problem with the way libm is built for
> the amd64 architecture on Debian: it lies somewhere in the build chain or in
> the Debian-specific patches.
>
> We use these functions extensively for calculations, so this is a real problem.
> Using cos/sin as a temporary workaround would do the trick, but it is still
> slower than the sinf/cosf implementations that work so well on 32-bit
> computers...
Optimized SSE2-based sinf/cosf routines were added in version 2.17-1,
fixing the performance and precision issues. I am therefore closing
this bug.
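To check at run time whether a system already has a new enough C library, something along these lines might help. It is a rough sketch under assumptions: the helper names version_at_least and running_glibc_is_at_least_2_17 are hypothetical, and it assumes that Debian package version 2.17-1 corresponds to upstream glibc 2.17. gnu_get_libc_version() is the glibc function that returns the version of the running library as a string.

```c
#include <stdio.h>
#ifdef __GLIBC__
#include <gnu/libc-version.h>  /* gnu_get_libc_version(), glibc only */
#endif

/* Hypothetical helper: returns 1 if a version string such as "2.17" or
 * "2.10.2" is at least major.minor, and 0 otherwise (also when the
 * string cannot be parsed). */
static int version_at_least(const char *version, int major, int minor)
{
    int have_major = 0, have_minor = 0;

    if (sscanf(version, "%d.%d", &have_major, &have_minor) != 2)
        return 0;
    return have_major > major
        || (have_major == major && have_minor >= minor);
}

/* Hypothetical helper: returns 1 if the C library in use at run time
 * is glibc 2.17 or later, and 0 otherwise (including non-glibc systems). */
static int running_glibc_is_at_least_2_17(void)
{
#ifdef __GLIBC__
    return version_at_least(gnu_get_libc_version(), 2, 17);
#else
    return 0;
#endif
}
```

A program that depends on fast sinf/cosf could call running_glibc_is_at_least_2_17() at startup and fall back to the double-precision functions when it returns 0.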
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
--- End Message ---