
Re: MPI debugging workflows



On 2018-09-02 20:24, Drew Parsons wrote:
On 2018-08-31 22:32, Graham Inggs wrote:
Hi

On 30 August 2018 at 10:39, Drew Parsons <dparsons@debian.org> wrote:
It's a comedy of errors with openmpi3. I see 3.1.2 has triggered new RC bugs.

To be clear, 3.1.2 didn't introduce any new RC bugs.  The bugs you see
here [1] affect the version in testing and should not block migration.

What is delaying openmpi's migration is dolfin's autopkgtest failure [2].


That's weird. Looks like the problem comes through PETSc:
  from /tmp/autopkgtest-lxc.g876mqve/downtmp/build.W96/src/dolfin-demo/documented/auto-adaptive-poisson/cpp/main.cpp:14:
  /usr/include/dolfin/common/types.h:24:10: fatal error: petscsys.h: No such file or directory
   #include <petscsys.h>

petscsys.h is located at
/usr/lib/petscdir/petsc3.9/x86_64-linux-gnu-real/include with the dir
registered in dolfin.pc. I think dolfin stopped finding it because of
a failure testing PetscInt during the cmake configuration. It's not
clear if that failure is a consequence of the warning about the path
to openmpi's libevent or not.

I'll ask upstream if it means we need to rebuild dolfin or petsc.

petsc (and slepc) are running fine.

The PetscInt failure seems to be a red herring: it also failed earlier, when the tests were running and passing. It failed because cmake and FindPETSc.cmake didn't have the MPI include paths. I've prepared a patch for it anyway (it prepares the tests with cmake using mpicc).
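
As an illustration only (the patch itself may do this differently), one common way to give cmake the MPI include paths is to point it at the MPI compiler wrappers through the standard CMAKE_C_COMPILER and CMAKE_CXX_COMPILER variables. The helper name and the "demo" directory below are hypothetical.

```python
import shlex


def cmake_configure_command(source_dir, cc="mpicc", cxx="mpicxx"):
    """Build a cmake command line that compiles through the MPI wrappers.

    The wrappers add the MPI include and library paths themselves, so
    configure-time checks no longer depend on cmake knowing those paths.
    """
    return [
        "cmake",
        f"-DCMAKE_C_COMPILER={cc}",
        f"-DCMAKE_CXX_COMPILER={cxx}",
        source_dir,
    ]


if __name__ == "__main__":
    # "demo" is a placeholder source directory, not a path from the test logs.
    print(shlex.join(cmake_configure_command("demo")))
```

The command is returned as a list so it can be passed straight to subprocess.run from inside the build directory.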

I think the petscsys.h error is a symptom, not a cause. The real error is the libevent path error, which is an artifact of the openmpi3 troubles. In openmpi 3.1.1.real-5 we used the internal libevent (and pmix); we restored use of the external libevent in openmpi 3.1.2-1. The libevent error in the test logs refers to the interim version that used the internal library (as recorded in dolfin.pc). When that path fails, cmake doesn't insert the petsc path into the makefile as required, so the build can't find petscsys.h. Rebuilding dolfin resets its record of the openmpi paths, which lets the tests succeed.
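
A stale path of this kind can be spotted by listing the -I and -L directories recorded in a pkg-config file and checking which ones no longer exist. The following is a rough sketch, assuming dolfin.pc follows the usual pkg-config layout (name=value variables, then Cflags: and Libs: fields); the function names are hypothetical and the file's location is left to the caller.

```python
import os
import re
import shlex


def expand(value, variables):
    """Substitute ${name} references with already-defined variables."""
    return re.sub(
        r"\$\{(\w+)\}",
        lambda m: variables.get(m.group(1), m.group(0)),
        value,
    )


def pc_paths(pc_text):
    """Return the -I and -L directories named in a pkg-config file's text."""
    variables = {}
    paths = []
    for line in pc_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # "name=value" lines define variables; "Key: value" lines are fields.
        m = re.match(r"^([A-Za-z_][A-Za-z0-9_.]*)=(.*)$", line)
        if m:
            variables[m.group(1)] = expand(m.group(2), variables)
            continue
        key, sep, value = line.partition(":")
        if sep and key.strip() in ("Cflags", "Libs", "Libs.private"):
            for token in shlex.split(expand(value, variables)):
                if token.startswith(("-I", "-L")) and len(token) > 2:
                    paths.append(token[2:])
    return paths


def missing_paths(paths, exists=os.path.isdir):
    """Return the recorded directories that are not present on this system."""
    return [p for p in paths if not exists(p)]
```

Running missing_paths(pc_paths(text)) on the contents of the .pc file would then list any directories that a dependency upgrade has removed, which is the situation a rebuild clears.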

Since the libevent and pmix problem was a transitory one in openmpi3, I don't think any action is needed beyond rebuilding dolfin.

Drew

