
Re: NEED HELP: beowulf slower than single processor



 
Hi BeoGuys,

I am using Debian Linux 2.1 and have set up an 8-processor
Beowulf cluster with the Portland Group (PGI) compilers.
I compiled mpich-1.2.0 and my application, VASP (a density functional
code). I know that VASP works well on other Beowulf clusters, but
not on mine :-(((((
When I add more processors (mpirun -np MORE), I get a BIGGER total elapsed
time. For instance, with 2 processors the USER time decreases by 70%
(as it should), but the system time explodes by an order of
magnitude. The speed on 2 nodes (2 processors) is almost the same as on
1 node (1 processor), but with 4 processors the total speed is 1/3 of the
original. The system time is huge and the cluster is useless.
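
The figures above can be restated as speedup and parallel efficiency. The sketch below is only illustrative: the wall-clock times are hypothetical placeholders chosen to match the ratios described (about the same speed on 2 processors, one third of the speed on 4), and the function names are not from any library.

```python
def speedup(t_serial, t_parallel):
    """Speedup relative to the single-processor run (above 1 means faster)."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, nprocs):
    """Parallel efficiency: speedup divided by the number of processors."""
    return speedup(t_serial, t_parallel) / nprocs


# Hypothetical wall-clock times in seconds, keyed by processor count.
# Only the ratios reflect the post; the absolute values are made up.
wall = {1: 100.0, 2: 100.0, 4: 300.0}

for n in sorted(wall):
    s = speedup(wall[1], wall[n])
    e = efficiency(wall[1], wall[n], n)
    print(f"np={n}: speedup={s:.2f}, efficiency={e:.0%}")
```

With these ratios the efficiency is 50% on 2 processors and roughly 8% on 4, which is consistent with the extra processors spending their time in system (kernel) work rather than in the computation.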
 
So I suspect a problem in my network, since MPI is able to rsh
everywhere and launch jobs, and the MPI benchmark reports 6 MBytes/second
(for which message size?). I have read the page on poor Linux TCP performance:
http://www.icase.edu/coral/LinuxTCP.html
and applied the patch. No improvement.
My setup: 3Com 905 10/100 cards (driver 0.99L) with a Cabletron 100 switch,
Debian 2.1, the PGI compiler suite, MPICH and LAM compiled with the PGI
compilers, and the VASP source (which has been reported to run fast on a
Beowulf cluster).
NetPIPE gives me reasonable numbers for the bandwidth.
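
For context, the 6 MBytes/second figure can be compared with the nominal rate of a 100 Mbit/s link. This is a rough sketch under stated assumptions: 1 MByte is taken as 10^6 bytes, Ethernet/IP/TCP header overhead is ignored (so the achievable payload rate is somewhat below the nominal value), and the function names are illustrative.

```python
def wire_rate_mbytes(link_mbits):
    """Nominal link rate in MBytes/s for a link speed given in Mbit/s."""
    return link_mbits / 8.0


def link_utilisation(measured_mbytes, link_mbits):
    """Fraction of the nominal link rate reached by a measured throughput."""
    return measured_mbytes / wire_rate_mbytes(link_mbits)


# 100 Mbit/s Fast Ethernet, 6 MBytes/s reported by the MPI benchmark.
nominal = wire_rate_mbytes(100.0)
used = link_utilisation(6.0, 100.0)
print(f"nominal={nominal} MBytes/s, utilisation={used:.0%}")
```

So the reported figure is about half of the nominal 12.5 MBytes/s. Whether that is good or bad depends on the message size used, since small messages are dominated by latency rather than bandwidth.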
 
Thanks

Stefano Curtarolo 

