
Re: alpha versus intel performance?



Danny Heap <danny@gibbs.med.utoronto.ca> writes:

> Our lab purchased a DEC alpha lx164 recently.  We had seen benchmarks
> (and heard comments) that suggested that it would outperform
> similarly-clocked P-II processors, especially in floating point
> operations.
> 
> In fact, our experience has been the reverse: our 350MHz P-IIs
> outperform our 533MHz alpha by at least a factor of 2.  I am including
> a short loop in C, with lots of trig functions (I assume that this
> should showcase floating point operations), plus the output from our
> DEC and Intel boxes.  I am also enclosing the output of `cat
> /proc/cpuinfo` in case that is relevant.
> 
> Any suggestions?  Are there some optimizations for the alpha that are
> (a) still being developed, or (b) that we can pass the compiler?

Suggestion?
Write better code. Your code only shows that an Alpha spends a long
time waiting for the pipeline to fill.

> -----------------------
> This is floating.c:
> -----------------------

> #include <math.h>
> 
> main(){
> double trig1, trig2;
> int ctr;
> 
>  for (ctr=0; ctr<1000000; ctr++){
>    trig1 = sin(ctr);
convert ctr to double, wait.
calculate sin, wait wait wait
>    trig2 = asin(trig1);
calculate asin, wait wait wait
>    trig1 = cos(trig2);
calculate cos, wait wait wait
>    trig2 = acos(trig1);
calculate acos, wait wait wait
>  }
> }

You always wait for the answer from the previous line. Try the
following code:

#include <math.h>

int main(void){
double trig1, trig2, trig3, trig4, trig5, trig6, trig7, trig8;
double ctr;

 /* 250000 passes x 4 chains = the same 1000000 calls of each function */
 for (ctr=0; ctr<1000000;){
   /* four sin calls that do not depend on each other */
   trig1 = sin(ctr++);
   trig3 = sin(ctr++);
   trig5 = sin(ctr++);
   trig7 = sin(ctr++);

   /* each asin only needs its own sin result */
   trig2 = asin(trig1);
   trig4 = asin(trig3);
   trig6 = asin(trig5);
   trig8 = asin(trig7);

   trig1 = cos(trig2);
   trig3 = cos(trig4);
   trig5 = cos(trig6);
   trig7 = cos(trig8);

   trig2 = acos(trig1);
   trig4 = acos(trig3);
   trig6 = acos(trig5);
   trig8 = acos(trig7);
 }
 return 0;
}

I would expect this code to run much faster than your example, although
it makes the same number of sin, asin, cos and acos calls.

It should wait far less in between: it can start 4 sin calculations
before the result of the first is needed.

Another point to mention is that gcc is rather bad at optimising. The
compiler from DEC is said to produce code that is about 30% faster in
general.

May the Source be with you.
			Goswin

