Hi Anton,

I suggest this wording for an MPI Performance Testing project.

cc: Lucas: I'm aiming to set up MPI testing protocols that could be run on g5k to validate the parallel performance of Debian packages.
Project: Debian Cloud Computing (MPI) Performance Testing
---------------------------------------------------------

Aim: to characterize and monitor the performance of Debian cloud computing (MPI) packages.

Objective: to establish Standard Operating Procedures for monitoring the performance of Debian MPI packages:

1) protocol for managing test launching on cloud computing installations
2) protocol for managing test results (e.g. a database)
3) protocol for reporting results

Background
----------
Complex (large-scale) scientific computation is facilitated by parallelization of numerical libraries and end-user packages, typically via MPI (Message Passing Interface). The numerical library stack for applications is usually complex. For instance, FEniCS is a library providing automated solution of differential equations using finite element methods. The numerical stack for FEniCS (python3-dolfinx) can be summarized as
  python3-dolfinx
    | numpy - libdolfinx-dev - petsc4py - PETSc
    | hypre - scotch - superlu-dist - mumps
    | scalapack - libhdf5-openmpi - mpi4py - openmpi
    | BLAS - xsimd - lapack - basix

An upgrade of any package, or a change in the configuration of any package, at any point along the chain has the potential to greatly impact the parallel performance of the end-user application, positively or negatively depending on the change. We want to be certain that the integrity of the parallel performance of Debian packages is maintained. This complements the existing CI testing reported at https://ci.debian.net/. CI testing helps ensure packages continue to run correctly. With performance testing, we also want to be confident that they continue to scale satisfactorily in the HPC or cloud computing sense.
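To make "scale satisfactorily" concrete, strong scaling is commonly summarized by parallel efficiency, E(p) = T(1) / (p * T(p)); a drop in E(p) between package versions would flag a regression. A minimal sketch (the wall-time numbers below are invented for illustration, not measured data):

```python
def parallel_efficiency(t_serial, t_parallel, nprocs):
    """Strong-scaling efficiency: E(p) = T(1) / (p * T(p))."""
    return t_serial / (nprocs * t_parallel)

# Invented wall times (seconds), keyed by process count, for illustration.
timings = {1: 100.0, 8: 14.0, 64: 2.5}
t1 = timings[1]
for p in sorted(timings):
    eff = parallel_efficiency(t1, timings[p], p)
    print(f"p={p:3d}  T={timings[p]:7.1f}s  efficiency={eff:.2f}")
```

A monitoring protocol could simply record E(p) per run and alert when it falls below a threshold relative to the previous release.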
Complementary to monitoring MPI package performance over time, we also want to be able to report comparative performance where alternative packages are available. This point is particularly relevant to the Debian BLAS packages. BLAS clients are expected to build against generic BLAS (libblas3) but run against optimized BLAS implementations (e.g. openblas, blis, atlas). In the case of openblas, serial, pthread and openmp threading variants are available. How does the performance of the end-user application compare across these different BLAS implementations?
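A comparison protocol might time the same benchmark once per BLAS alternative and report a ranked table. The sketch below is illustrative only: the variant names are examples, and the stand-in workloads would in practice be replaced by launching the real benchmark after switching the libblas.so.3 alternative (e.g. via update-alternatives) to the named implementation.

```python
import time

def time_variant(fn, repeats=3):
    """Return the best wall time over several runs of fn()."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - t0)
    return best

# Stand-in workloads; each entry would really run the benchmark binary
# against the named BLAS alternative.
variants = {
    "reference": lambda: sum(i * i for i in range(50_000)),
    "openblas":  lambda: sum(i * i for i in range(50_000)),
}

results = {name: time_variant(fn) for name, fn in variants.items()}
for name, t in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{name:10s} {t * 1e3:8.2f} ms")
```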
Task
----
The FEniCS project provides a package for performance testing (fenicsx-performance-tests). This will be used as the starting point for the project.
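For concreteness, a launch protocol might assemble mpirun command lines over a range of process counts. In this sketch the executable name (dolfinx-scaling-test) and its options are assumptions based on the fenicsx-performance-tests source and should be verified against the installed package:

```python
def build_launch_command(nprocs, problem="poisson", scaling="weak", ndofs=500_000):
    """Assemble (but do not run) an mpirun command line for one scaling run.

    Executable name and option names are assumptions for illustration.
    """
    return [
        "mpirun", "-n", str(nprocs),
        "dolfinx-scaling-test",
        "--problem_type", problem,
        "--scaling_type", scaling,
        "--ndofs", str(ndofs),
    ]

# Print the commands a strong-scaling sweep would launch.
for p in (1, 2, 4, 8):
    print(" ".join(build_launch_command(p, scaling="strong")))
```

Choosing the process counts and problem size (ndofs) so that the runs are large enough to be meaningful is exactly question 1 below.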
1) What runtime parameters should fenicsx-performance-tests be launched with in order to provide sufficiently meaningful scaling tests?
2) How should the launch of tests be managed? (Should we use https://reframe-hpc.readthedocs.io/ ?)
2b) How should test machines be managed (e.g. how should g5k machines be configured to run tests)? How should BLAS comparisons be managed?
3) How should the record of test results be managed? (What kind of database or flat file?)
4) How should test results be presented? (FEniCS uses plotly to generate https://fenics.github.io/performance-test-results/. Should we follow this or use a different presentation?)
5) How should we integrate test result pages with Debian websites? (e.g. link from https://ci.debian.net/ or manage elsewhere?)
6) Can we apply these procedures to benchmark other packages, e.g. nwchem or lammps?
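On question 3, one lightweight option is a single SQLite file per test series. The schema below is purely illustrative (the column set is an assumption, and the inserted rows are invented), but it shows the kind of record and query a results protocol would need:

```python
import sqlite3

# Illustrative schema only; the real layout is one of the open questions.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE results (
        package   TEXT,
        version   TEXT,
        nprocs    INTEGER,
        wall_time REAL,
        run_date  TEXT
    )
""")
rows = [
    ("fenicsx-performance-tests", "0.3.0", 1, 812.0, "2022-02-22"),
    ("fenicsx-performance-tests", "0.3.0", 8, 117.5, "2022-02-22"),
]
conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?, ?)", rows)
conn.commit()

# Scaling history for one package, ordered by process count.
for nprocs, wall in conn.execute(
    "SELECT nprocs, wall_time FROM results "
    "WHERE package = ? ORDER BY nprocs",
    ("fenicsx-performance-tests",),
):
    print(nprocs, wall)
```

A flat file (CSV/JSON) would serve equally well at small scale; the trade-off between the two is part of the task.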
GSoC Candidate
--------------
The candidate is expected to be competent in shell scripting, general coding (e.g. Python, C), managing the installation of Debian packages, and building software.
MPI and HPC experience is not mandatory but would be an advantage.

Drew

On 2022-02-22 13:06, Anton Gladky wrote:
Hello Drew,

It is a very good idea! Though I would separate this task from QA work on Debian Science packages. If you want, we could apply with one more project (something like HPC testing of MPI-based science packages) and specify particular requirements for possible applicants. Feel free to propose a text for that. Thanks again!

Regards

Anton

On Tue, 22 Feb 2022 at 12:52, Drew Parsons <dparsons@debian.org> wrote:

On 2022-02-21 17:42, Anton Gladky wrote:
> Dear all,
>
> Google Summer of Code call for Debian is announced [1].
> I am going to apply Debian Science Team as one of the projects.
>
> Main topic is QA-Work: Autopkgtests for high-popcon packages,
> gitlab-CI for most of packages, bringing not-in-testing packages
> into the proper shape to let them migrate to testing.
>
> If somebody wants to be a co-mentor or if you have better ideas
> for the project, please let me know.
>
> [1] https://lists.debian.org/debian-devel-announce/2022/02/msg00002.html

It would be helpful to run parallel/HPC performance testing for our MPI numerical packages. This would be a type of CI testing that we would set up to run regularly and report. Lucas Nussbaum is in charge of an academic supercomputing cluster that we can access to run such tests.

Some packages have benchmarks already at hand. The FEniCS project for instance offers fenicsx-performance-tests (both prebuilt and source).

The project would determine how to set up MPI CI testing (how to activate it on Lucas' system), and what parameters (test size etc.) to use to get meaningful numbers. A suggested tool for managing test parameters and results might be https://reframe-hpc.readthedocs.io/en/stable/. The report format could be similar to https://fenics.github.io/performance-test-results/, or perhaps the GSoC worker could come up with a better way of presenting results.
It would be useful to be able to quantify how well our HPC packages actually scale (in cloud computing environments) and to monitor whether there is any drop in performance (e.g. with version updates). It would also be useful to report their performance with the various BLAS alternatives.

This would be a valuable GSoC project, I think.

Drew