Testing parallel execution Re: h5py and hdf5-mpi

To: debian-science@lists.debian.org
Subject: Testing parallel execution Re: h5py and hdf5-mpi
From: Steffen Möller <steffen_moeller@gmx.de>
Date: Wed, 14 Aug 2019 12:39:49 +0200
Message-id: <30ca79ce-fcbc-af18-3597-992582f76c68@gmx.de>
In-reply-to: <eb34a343-62b5-110c-add8-9fa0d36da711@debian.org>
References: <1faee1eb2da43d988e3da96e136765b2@debian.org> <d0608a21afefbca864e7c3325b258dfc@debian.org> <CAFzxpWpNHstOwLWVdLYmo445Wi60+J7Uy4WsFOTvnfUYo1hJ0Q@mail.gmail.com> <1d4e9a6d-6506-36df-4206-3091a0a29edb@gmx.de> <d0836264bc68454b2658129254184dc0@emerall.com> <eb34a343-62b5-110c-add8-9fa0d36da711@debian.org>

How do autotests work for MPI?


We simply configure the test script to invoke the same tests using
mpirun.

This is a bigger issue.  We have test suites that test MPI features
without checking MPI processor counts (e.g. the Magics/Metview code).
One workaround is to enable oversubscription so that the test can run
(inefficiently), though the suites that use MPI should really detect
missing resources and disable such tests. We will always have
features in our codes that our build/test systems aren't capable of
testing: e.g. pmix is designed to work scalably to > 100,000 cores. We
can't test that :-)
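
A minimal sketch of the "detect and disable" idea, for illustration only.
The function name plan_mpi_test and its parameters are hypothetical, and
os.cpu_count() is only a rough stand-in for the number of slots an MPI
launcher would actually grant. The --oversubscribe option is the Open MPI
spelling; other MPI implementations handle this differently.

```python
import os


def plan_mpi_test(requested_ranks, available_cpus=None, allow_oversubscribe=False):
    """Decide how (or whether) to launch an MPI test.

    Returns ("run", argv_prefix) or ("skip", reason).
    The test's own command line would be appended to argv_prefix.
    """
    if available_cpus is None:
        # Assumption: logical CPU count approximates the usable MPI slots.
        available_cpus = os.cpu_count() or 1

    if requested_ranks <= available_cpus:
        return ("run", ["mpirun", "-n", str(requested_ranks)])

    if allow_oversubscribe:
        # Open MPI only: run more ranks than slots, at reduced efficiency.
        return ("run", ["mpirun", "--oversubscribe", "-n", str(requested_ranks)])

    reason = f"needs {requested_ranks} ranks, only {available_cpus} CPUs available"
    return ("skip", reason)
```

A test harness could call this once per test and report a skip rather
than a failure when the machine is too small.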


Maybe the testing for many cores does not need to happen at upload time.
And maybe the testing for behavior in parallel environments does not
need to be performed on all platforms, but on just one. There could then
be a service that Debian provides, analogous to reproducible builds etc.,
that performs testing in parallel environments. The unknown limits on
available cores are something the users of
better-than-what-Debian-decides-to-afford infrastructure can address
themselves. The uploader of a package and the build daemons would just
invoke the parallel run on a single node. Personally, I would like to see
multiple tests, say consecutively on 1, 2, 4, 8, 16, 32, 64, 128, 256
nodes, and stop testing when there is no more speedup. How many packages
would reach beyond 32?
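
The doubling scheme described above could be sketched roughly as follows.
This is an illustration under assumptions, not an existing Debian tool:
scaling_sweep, the run_time callback and the min_gain threshold (5%
faster than the previous step counts as "still speeding up") are all
made up for the example.

```python
def scaling_sweep(run_time, max_nodes=256, min_gain=1.05):
    """Double the node count until the run stops getting faster.

    run_time: hypothetical callable mapping a node count to wall-clock
              seconds, e.g. a wrapper that launches the test and times it.
    min_gain: minimum speedup over the previous step needed to continue.
    Returns the list of (nodes, seconds) pairs that were measured.
    """
    results = []
    previous = None
    nodes = 1
    while nodes <= max_nodes:
        seconds = run_time(nodes)
        results.append((nodes, seconds))
        # Stop once doubling the nodes no longer pays off.
        if previous is not None and previous / seconds < min_gain:
            break
        previous = seconds
        nodes *= 2
    return results


# Usage with canned timings instead of real runs:
fake_timings = {1: 100.0, 2: 52.0, 4: 30.0, 8: 29.0, 16: 29.0}
sweep = scaling_sweep(fake_timings.__getitem__)
```

With the canned timings, the sweep measures 1, 2, 4 and 8 nodes and
stops there, because going from 4 to 8 nodes gains less than 5%.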

There are quite a few packages in our distro that are multithreaded, i.e.
that don't need MPI. Today, we don't test their performance in parallel
either. But we should. We don't have any systematic way to do so yet,
though. I could also imagine that such testing in parallel
environments would help tie our distro to upstream developers a bit more.
Maybe this is something to discuss together with the cloud team, who know
how to spawn an arbitrary number of nodes quickly? And maybe reach out
to phoronix.com and/or their openbenchmarking.org?

Steffen

References:
- h5py and hdf5-mpi
  - From: Drew Parsons <dparsons@debian.org>
- Re: h5py and hdf5-mpi
  - From: Mo Zhou <lumin@debian.org>
- Re: h5py and hdf5-mpi
  - From: Ghislain Vaillant <ghisvail@gmail.com>
- Re: h5py and hdf5-mpi
  - From: Steffen Möller <steffen_moeller@gmx.de>
- Re: h5py and hdf5-mpi
  - From: Drew Parsons <dparsons@emerall.com>
- Re: h5py and hdf5-mpi
  - From: Alastair McKinstry <mckinstry@debian.org>
