
Re: mpicc segfaults when called by fakeroot



Further thought: if it's not possible to detect at run time that we're running in fakeroot, perhaps an environment variable could be set before OMPI's mpicc is launched in the fakeroot. The malloc hook init could then check for that variable and bail out before the stat() calls. In other words, we would effectively provide our own way to know that we're running in fakeroot, and therefore disable all of OMPI's malloc init/OpenFabrics/etc. infrastructure, because OpenFabrics support is *not* required in fakeroot.

In short: perhaps you could setenv OMPI_MCA_disable_memory_allocator to 1, or somesuch. I can easily provide you with a patch (that we'd then also commit upstream, but you'll need the patch until we include this feature in a release) for such a fix.
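
To make the idea concrete, here is a minimal sketch in C. It is not the actual patch: the variable name is the one floated above and may change, and example_malloc_init_hook() and hooks_installed are hypothetical stand-ins for OMPI's real malloc hook init code. The point is only that the environment check comes first, so nothing in the init path calls stat() when the variable is set.

```c
#include <stdlib.h>
#include <string.h>

/* Name taken from the suggestion above; the final name is not settled. */
#define DISABLE_ENV_VAR "OMPI_MCA_disable_memory_allocator"

/* Returns 1 if the variable is set to anything other than "" or "0". */
static int memory_hooks_disabled_by_env(void)
{
    const char *val = getenv(DISABLE_ENV_VAR);
    return val != NULL && val[0] != '\0' && strcmp(val, "0") != 0;
}

/* Hypothetical flag: records whether the hooks were installed. */
static int hooks_installed = 0;

/* Hypothetical stand-in for the malloc hook init routine. */
static void example_malloc_init_hook(void)
{
    if (memory_hooks_disabled_by_env()) {
        /* Bail out before any stat() call is made. */
        return;
    }

    /* The stat() probes for OpenFabrics devices and the installation
     * of the memory hooks would go here. */
    hooks_installed = 1;
}
```

On the packaging side, the build would then export OMPI_MCA_disable_memory_allocator=1 before running mpicc under fakeroot, and leave it unset everywhere else so that normal runs keep the OpenFabrics support.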

Is that too ugly?



On Jun 7, 2009, at 5:51 PM, Manuel Prinz wrote:

Hi Jeff and Steve,

thanks a lot for diving into it! It's much appreciated! (I was not able
to access a computer during the last two days, so sorry for being
unresponsive!)

On Sunday, 07.06.2009, at 11:04 -0500, Steve M. Robbins wrote:
I was able to avoid the segfault simply by ifdef'ing out this section
(patch attached). This should suffice in the short term for Debian on
the theory that OpenMPI compatibility with fakeroot is more important
than OpenMPI compatibility with OpenFabrics.

This is very hard to decide. Of course, we need Open MPI to work with
fakeroot, since our build system relies on that. There's no way around
that. As for OpenFabrics, probably most users will use MPI over fast
interconnects, so we really do need InfiniBand support as well. With the
transition in mind, I would consider disabling InfiniBand as a
short-term and temporary option.

Nevertheless, I will do some more tests tomorrow, hoping to find a less
drastic solution. Jeff's suggestion to disable libltdl sounds like a
reasonable thing. As it seems, we should probably disable it anyway
since Open MPI brings its own copy and does not allow building against
a version already installed on the system. Jeff, can you confirm that?

(Currently, the versions of libltdl of Open MPI and Debian seem to
differ. Though this might not be the reason, it might mean some extra
work for the release and/or security team.)

However, there is clearly a bad interaction between this code, eglibc,
and fakeroot.  Hence the cc's to the various packages.

Thanks for putting them in the loop! I already sent a mail to the libc
maintainers a few days ago but did not test with a downgraded libc.

I'm speculating that memory allocation while in the
__malloc_initialize_hook is a bad thing.  Perhaps the stat() in
fakeroot caused a memory allocation, whereas the regular stat() does
not, as this code doesn't segfault in normal use.

This is what I had in mind as well.

Thanks for your work so far! I'm quite confident that we can sort it out
soon! :)

Best regards
Manuel


--
Jeff Squyres
Cisco Systems

