
[Jeff Squyres <jsquyres@lsc.nd.edu>] Re: LAM: MPE support in lam?



Here is another followup from Jeff re: lam/mpich binary
compatibility. Any comments most welcome!

Take care,
-- 
Camm Maguire			     			camm@enhanced.com
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah
------- Start of forwarded message -------
Date: Thu, 11 Jan 2001 13:35:44 -0500 (EST)
From: Jeff Squyres <jsquyres@lsc.nd.edu>
To: <lam@mpi.nd.edu>
Subject: Re: LAM: MPE support in lam?
Message-ID: <Pine.LNX.4.30.0101111311540.8228-100000@queeg.squyres.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Reply-To: lam@mpi.nd.edu

On 11 Jan 2001, Camm Maguire wrote:

> > LAM 6.3.3 (see the CVS page, or check out one of the beta release
> > tarballs) works nicely with recent versions of MPE.  It is unlikely that
> > we will distribute MPE with LAM since our release schedules are
> > independent of each other.  It's already bad enough that we include ROMIO
> > and the C++ bindings packages.  :-)
>
> OK, will check it out!

I literally just put 6.3.3b49 out there; if you downloaded b48 from the
beta page, you might as well get b49.  The differences are very minor, but
you might as well.  :-)

> > LAM and MPICH evolved completely independently, and hence have entirely
> > different inner workings.  Unfortunately, it's not as simple as just
> > getting the enumerated constants in our .h files to agree -- there's a lot
> > of interworkings below the MPI layer in both LAM and MPICH, none of which
> > is common between the two systems.
>
> Is this true even for shared libraries?  If both mpich and lam are

Most likely.  See below.

> setup as shared libs, and if the constants agree, then it would seem
> that all the mpi-relevant stuff a binary would contain would be
> MPI_??? calls, unless the *functions* are redefined in the headers, or
> something.  Just for clarification, what I was thinking about was
>
> 1) compiling foo under {lam,mpich}, with dynamic mpi linking (may have
> 	to name the shared libs something generic)

As of 6.3.3, LAM has "libmpi.whatever" (unavoidable), possibly
"libpmpi.whatever" if you compiled with profiling on (which is now the
default), and a small number of "liblamfoo.whatever" files.  Hence, the
only ones that you have to worry about are libmpi.a and libpmpi.a.

I don't know offhand what MPICH names their libraries; I seem to recall
that they're not plain old libmpi.a anymore, but I could be hallucinating.
Rusty?

> 2) running foo with {mpich,lam}'s mpirun, and with {mpich,lam}'s
> 	shared libs in place of {lam,mpich}'s.

I think it would take a little more than just the constants agreeing in
the .h files.  I'd be willing to bet that some of our function prototypes
are different (e.g., LAM's (MPI_Comm) is actually a typedef for (struct
_comm*), whereas MPICH's is a typedef for (int), IIRC).  ...although in C,
it might not make much of a difference if (sizeof(int) == sizeof(void*)).
Hmm.
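
To make the handle-type point concrete, here is a minimal C sketch.  The
type and function names are hypothetical stand-ins, not the real LAM or
MPICH headers; they only mimic "handle is a pointer to a struct" versus
"handle is an int" as described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT taken from any real <mpi.h>. */
struct fake_comm { int rank; int size; };

typedef struct fake_comm *ptr_style_comm;  /* handle is a pointer to a struct */
typedef int               int_style_comm;  /* handle is a plain integer */

/* Returns 1 if both handle representations occupy the same number of
   bytes, i.e. the (sizeof(int) == sizeof(void*)) condition from the text. */
int handles_same_size(void)
{
    return sizeof(ptr_style_comm) == sizeof(int_style_comm);
}

/* A library built against the pointer-style header dereferences its
   argument.  Handing it an int-style handle would be undefined behavior,
   whether or not the two sizes happen to match. */
int ptr_style_comm_size(ptr_style_comm comm)
{
    assert(comm != NULL);
    return comm->size;
}
```

On a platform with 64-bit pointers and 32-bit ints, handles_same_size()
returns 0, so the two handle styles would not even be passed the same way.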

(after some thought)

Ah.  Now I know why it's bugging me:

- Because of the type issue (i.e., LAM and MPICH have different types for
the same constant), you'd have to disregard type safety.  As mentioned
above, this probably doesn't matter where (sizeof(int) == sizeof(void*)),
but still not a good idea.

- But even worse, some constants are actually #define's.  I don't know how
MPICH does it, but LAM #define's many of its constants in <mpi.h>.
Hence, they are bound to the user's code at compile time -- many constants
are not referred to at run-time.

So not only would our constant *values* have to agree, we'd also have to
agree on what is compile-time bound and what is run-time bound.
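
The compile-time vs. run-time distinction can be sketched in C as follows.
All names and values here are hypothetical (they are not the constants of
either implementation); "library A" is the header the application was
compiled against, and "library B" is the one swapped in at run time.

```c
/* Hypothetical constants, NOT taken from any real <mpi.h>. */
#define LIBA_WILDCARD (-1)   /* value in the header the app was compiled with */
#define LIBB_WILDCARD (-2)   /* value the swapped-in library was built with */

/* Library B: compares against the value from its own header. */
int libb_is_wildcard(int source)
{
    return source == LIBB_WILDCARD;
}

/* Library B also exports the value as a symbol, resolved at link/run time. */
int libb_wildcard_symbol = LIBB_WILDCARD;

/* Compile-time bound: the preprocessor pastes the literal -1 into the
   application, so swapping libraries later cannot change it. */
int app_wildcard_compile_time(void)
{
    return LIBA_WILDCARD;
}

/* Run-time bound: the application reads whatever the loaded library
   exports, so it follows the library that is actually in place. */
int app_wildcard_run_time(void)
{
    return libb_wildcard_symbol;
}
```

Here libb_is_wildcard(app_wildcard_compile_time()) is 0 while
libb_is_wildcard(app_wildcard_run_time()) is 1: only the run-time bound
value survives swapping the library underneath an existing binary.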

This may not be an insurmountable problem; indeed, I don't know much about
how MPICH treats their innards.  Hence, there may really be only a few
places where we clash (on run-time vs. compile-time bindings).  But the
type issue is still problematic; while it may *work*, it may cause hardship
to users because they won't get the type-safety warnings/errors that they
expect from the compiler (welcome to Fortran! ;-).

Just my $0.02.

{+} Jeff Squyres
{+} squyres@cse.nd.edu
{+} Perpetual Obsessive Notre Dame Student Craving Utter Madness
{+} "I came to ND for 4 years and ended up staying for a decade"


_______________________________________________
This list is archived at http://www.mpi.nd.edu/MailArchives/lam/


------- End of forwarded message -------


