
Re: MPICH as default MPI; WAS: MPI debugging workflows



On 2018-10-04 09:08, Drew Parsons wrote:
On 2018-10-03 21:02, Alastair McKinstry wrote:
Hi,

See thread below.

I've just uploaded 3.1.2-5, which I believe fixes the hangs in OpenMPI
(non-atomic handling of sending a 64-bit tag, occurring mostly on
architectures with 32-bit atomics).

Awkward observation: openmpi 3.1.2-5 now causes dolfin tests to
time out (on amd64):
https://ci.debian.net/packages/d/dolfin/unstable/amd64/

Tracker page pings lammps and liggghts as well.



Hi Alastair, openmpi3 seems to have stabilised now: packages are passing tests, and libpsm2 is no longer injecting 15-second delays.

Nice that the mpich 3.3 release is now finalised. Do we feel confident enough to proceed with switching mpi-defaults from openmpi to mpich?

Are there any known issues with the transition? One that catches my eye is the build failures in scalapack. It has been tuned to pass build-time tests with openmpi but fails many tests with mpich (scalapack builds packages for both MPI implementations). I'm not sure how concerned we should be about those build failures; perhaps upstream should be consulted. Are similar mpich failures expected in other packages? Is there a simple way of setting up a buildd to do a test run of the transition before making it official?

Drew

