
Re: mpich problems 1.2.5.2-1



On Wednesday 18 February 2004 06:38 pm, Michael Will wrote:
> In article <200401211037.17368.jaschmidt@uofu.net>, John Schmidt wrote:
> > Since upgrading to mpich 1.2.5.2-1, our big mpi based application has
> > stopped working.  I get this error:
> >
> > mpirun -np 2 sus -mpm inputs/MPM/disks2mat4patch.ups
> > xm_2407:  p4_error: Command-line arguments are missing: 0
> > /usr/bin/mpirun: line 1:  2407 Segmentation
> > fault      /home/jas/SCIRun/NewBC/debug/Packages/Uintah/StandAlone/sus
> > "-mpm" "inputs/MPM/disks2mat4patch.ups"
> > -p4pg /home/jas/SCIRun/NewBC/debug/Packages/Uintah/StandAlone/PI2327
> > -p4wd /home/jas/SCIRun/NewBC/debug/Packages/Uintah/StandAlone
> >
> > To confirm that mpi is working, I compiled the mpich examples found in
> > the documentation and ran them for various numbers of processors, and all
> > worked fine using mpich 1.2.5.2-1.
> >
> > On a different machine, I am running mpich 1.2.5-6, and the above command
> > (mpirun -np 2 sus -mpm inputs/MPM/disks2mat4patch.ups) works without any
> > problems.
>
> Did you recompile your application? Maybe it has different command line
> parameters passed to the executable that the mpi implementation picks up
> on.
>
> mpirun is just a script wrapper - try setting it to verbose (set -x) and
> see what it actually executes.
> I assume it will execute another script wrapper mpirun.ch_p4 which again
> you could follow with 'set -x'. Then see if you can make sure it does not
> place any parameters on your executable that it does not like...
>
> No idea what the -O stands for.
>
> Even better, try this on both systems to see the difference:
>
> $ mpirun /bin/echo
>
> In my case the output was '-p4pg /home/mwill/PI11082 -p4wd /home/mwill'
> for an older Scyld-6 beowulf cluster:
>
> mpich-1.2.5-8.5_Scyld
> mpich-p4-gnu-1.2.5-8.5_Scyld
>
> Michael
> --
> Michael Will, Linux Sales Engineer
> Tel:  415.358.2673  Toll Free:  888-PENGUIN
> PENGUIN COMPUTING - The World's Most Reliable Linux System
> www.penguincomputing.com

Michael,

Thanks for taking the time to respond.  It turned out that our application
had changed how it picks up the mpich arguments, and I didn't catch that in
the source code.  Once I figured that out and fixed the source, everything
started working fine again.

John


