
Re: confused about performance



Giacomo Mulas wrote:
> On Thu, 14 Jun 2007, Leopold Palomo-Avellaneda wrote:
> 
>>> The last test I ran on my Athlon64 X2 4200+ (2.2 GHz) got me about 10
>>> gigaflops in 32-bit arithmetic and about half of that in 64-bit
>>> arithmetic.
>>
>> I don't understand that. Are you saying that the 64-bit version was really
>> worse than the 32-bit one?
> 
> He is saying that the version of the ATLAS linear algebra libraries
> distributed by default in Debian is much better optimised for performance
> in its 32-bit build than in its 64-bit build. However, if you are after
> speed, you are much better off with a more heavily optimised linear
> algebra library, such as the GOTO library (hand-optimised assembler
> written by Kazushige Goto), the ACML libraries provided by AMD for
> AMD processors, or the MKL libraries provided by Intel for Intel
> processors. All of these provide vastly better performance than the
> current incarnations of ATLAS in 64-bit.

How recent is your data? I was under the impression that ATLAS had
caught up with GOTO and ACML, and possibly even the Intel libraries on
the Core 2 Duo. I'm on the ATLAS mailing list -- I can ask there. In any
event, there is enough assembly code in ATLAS that I'd expect it to be
competitive with both GOTO and the vendor libraries on AMD 64-bit and
Intel 64-bit chips. And I think 3.7.32 cleaned out some bottlenecks in
the 64-bit SPARC code as well, so it would be worthwhile for Debian to
package 3.7.32 for SPARC, and probably not worthwhile to package older
versions.
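
For anyone who wants to sanity-check gigaflops figures like the ones
quoted above, here is a minimal sketch of how such a number is usually
derived from a timed matrix multiply. It assumes the conventional count
of 2*n^3 floating-point operations for a dense n x n multiply. The
function names (matmul_flops, gflops, naive_matmul, benchmark) are
made up for illustration, and the pure-Python multiply is only a
stand-in for whatever BLAS routine you actually want to time, so the
number it prints says nothing about ATLAS, GOTO or ACML.

```python
import time


def matmul_flops(n):
    """Conventional flop count for a dense n x n matrix multiply:
    n^3 multiplies plus (roughly) n^3 adds."""
    return 2 * n ** 3


def gflops(n, seconds):
    """Convert a measured wall-clock time into gigaflops."""
    return matmul_flops(n) / seconds / 1e9


def naive_matmul(a, b):
    """Stand-in multiply (assumption: replace with the routine under test)."""
    bt = list(zip(*b))  # transpose b so each inner product pairs a row with a column
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def benchmark(n=100, multiply=naive_matmul):
    """Time one n x n multiply and report the rate in gigaflops."""
    a = [[1.0] * n for _ in range(n)]
    b = [[2.0] * n for _ in range(n)]
    start = time.perf_counter()
    multiply(a, b)
    elapsed = time.perf_counter() - start
    return gflops(n, elapsed)


if __name__ == "__main__":
    print(f"{benchmark():.4f} GFLOPS (pure Python, no BLAS)")
```

To compare libraries, pass a different function as the multiply argument
and keep n the same for every run. A single run is noisy, so repeat it a
few times and take the best figure.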

> ATLAS can be expected to improve
> a lot quickly, but currently is far behind in the 64 bit version. 

As I noted above, that's not the impression I got.

> Also,
> if you are after performance, you should consider using some commercial
> compiler (if you can afford it) instead of the GCC suite, until GNU
> compilers become as good at optimising for x86_64 processors as they are
> for x86.

You're probably right here, at least for GCC 4.1 and older. I haven't
seen anything on GCC 4.2 yet. If you have an *Intel* chip, you
definitely should look at the Intel compilers. They are written by some
good folks and neighbors of mine -- I used to work with a couple of them
in a galaxy long ago and far away. :) And they have access to all the
magic counters on the chip, which, as far as I know, isn't even on the
GCC road map.

Finally, for those of you who love the joy of doing this sort of "land
speed record" chasing, there's an excellent collection of resources at

http://www.agner.org/optimize/



