
Re: Google Summer of Code, Debian Science



Hi Anton, I suggest the following wording for an MPI Performance Testing project.

cc Lucas: I'm aiming to set up MPI testing protocols that could be run on g5k to validate the parallel performance of Debian packages.



Project: Debian Cloud Computing (MPI) Performance Testing
-------

Aim: to characterize and monitor the performance of Debian cloud computing (MPI) packages
---

Objective: to establish Standard Operating Procedures for monitoring the performance of Debian MPI packages
---------
1) protocol for managing test launching on cloud computing installations
2) protocol for managing test results (e.g. database)
3) protocol for reporting results

Background
----------

Complex (large-scale) scientific computation is facilitated by parallelization of numerical libraries and end-user packages, typically via MPI (Message Passing Interface). The numerical library stack for applications is usually complex. For instance, FEniCS is a library providing automated solution of differential equations using Finite Element Methods. The numerical stack for FEniCS (python3-dolfinx) could be summarized as

python3-dolfinx
       |
     numpy - libdolfinx-dev - petsc4py - PETSc
                                          |
        hypre - scotch - superlu-dist - mumps
            |
       scalapack - libhdf5-openmpi - mpi4py - openmpi
                                                |
                      BLAS - xsimd - lapack - basix


An upgrade of any package, or a change in the configuration of any package, at any point along the chain has the potential to greatly impact the parallel performance of the end-user application, either positively or negatively depending on the change. We want to be certain that the integrity of the parallel performance of Debian packages is maintained. This complements the existing CI testing reported at https://ci.debian.net/. CI testing helps ensure packages continue to run correctly. With Performance Testing, we also want to be confident that they continue to scale satisfactorily in the HPC or cloud computing sense.
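To make "scale satisfactorily" measurable, each run can be reduced to speedup and parallel efficiency relative to the smallest process count. A minimal sketch of that reduction (the timing numbers below are illustrative placeholders, not real benchmark data):

```python
# Compute speedup and parallel efficiency from strong-scaling timings.
# The timings used in the example are illustrative only.

def scaling_summary(timings):
    """timings: {n_processes: wall_time_seconds} for a fixed problem size.

    Returns {n_processes: (speedup, efficiency)} relative to the
    smallest process count in the series.
    """
    base_n = min(timings)
    base_t = timings[base_n]
    summary = {}
    for n, t in sorted(timings.items()):
        speedup = base_t / t
        # Ideal speedup relative to the smallest run is n / base_n.
        efficiency = speedup / (n / base_n)
        summary[n] = (speedup, efficiency)
    return summary

# Hypothetical wall times for 1, 2 and 4 MPI processes.
for n, (s, e) in scaling_summary({1: 100.0, 2: 55.0, 4: 31.25}).items():
    print(f"{n} procs: speedup {s:.2f}, efficiency {e:.2f}")
```

A monitoring protocol would then flag version-to-version drops in efficiency at a given process count.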

Complementary to monitoring MPI package performance over time, we also want to be able to report comparative performance where alternative packages are available. This point is particularly relevant to the Debian BLAS packages. BLAS clients are expected to build against the generic BLAS (libblas3) but run against optimized BLAS implementations (e.g. openblas, blis, atlas). In the case of openblas, serial, pthread and openmp threading variants are available. How does the performance of the end-user application compare across these different BLAS implementations?
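One way to drive such a comparison on Debian is to re-time the same benchmark under each alternative (the runtime BLAS is selected with update-alternatives, e.g. the libblas.so.3 alternative on amd64). The harness below is only a sketch: the workload function is a placeholder for where the real benchmark launch (e.g. fenicsx-performance-tests via mpirun) would go.

```python
import timeit

def time_workload(workload, repeat=3, number=1):
    """Return the best-of-repeat wall time (seconds) for a callable."""
    return min(timeit.repeat(workload, repeat=repeat, number=number))

# Placeholder workload: in the real procedure this would launch the MPI
# benchmark after selecting a BLAS alternative with update-alternatives.
def workload():
    sum(i * i for i in range(100_000))

print(f"best wall time: {time_workload(workload):.4f} s")
```

Taking the best of several repeats reduces noise from shared machines, which matters if results from different BLAS variants are to be compared fairly.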

Task
----

The FEniCS project provides a package for performance testing (fenicsx-performance-tests). This will be used as the starting point for the project.

1) What runtime parameters should fenicsx-performance-tests be launched with in order to provide sufficiently meaningful scaling tests?
2) How should the launch of tests be managed? (should we use https://reframe-hpc.readthedocs.io/ ?)
2b) How should test machines be managed (e.g. how to configure g5k machines to run tests)? How to manage BLAS comparisons?
3) How should the record of test results be managed? (what kind of database or flatfile?)
4) How should test results be presented? (FEniCS uses plotly to generate https://fenics.github.io/performance-test-results/. Should we follow this or use a different presentation?)
5) How should we integrate test result pages with Debian websites? (e.g. link from https://ci.debian.net/ or manage elsewhere?)
6) Can we apply these procedures to benchmark other packages, e.g. nwchem, lammps?
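As one possible answer to point 3, a single SQLite file per test series would be enough to start with. The schema and record below are purely illustrative, not an agreed format:

```python
import sqlite3

# Illustrative schema for storing scaling-test results; the table and
# column names (and the sample record) are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE results (
        test_name   TEXT,
        package     TEXT,
        version     TEXT,
        n_processes INTEGER,
        wall_time   REAL,
        run_date    TEXT
    )
""")
conn.execute(
    "INSERT INTO results VALUES (?, ?, ?, ?, ?, ?)",
    ("poisson-weak", "fenicsx-performance-tests", "0.5.1-1",
     4, 31.25, "2022-03-01"),
)
rows = conn.execute(
    "SELECT n_processes, wall_time FROM results WHERE test_name = ?",
    ("poisson-weak",),
).fetchall()
print(rows)  # [(4, 31.25)]
```

A flat file (CSV/JSON) would serve equally well at small scale; the important part is recording package versions alongside timings so regressions can be traced to uploads.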


GSoC Candidate
--------------

The candidate is expected to be competent in shell scripting, general coding (e.g. python, C), managing installation of Debian packages and building software.
MPI and HPC experience is not mandatory but would be an advantage.


Drew




On 2022-02-22 13:06, Anton Gladky wrote:
Hello Drew,

It is a very good idea!

Though I would separate this task from the QA work on Debian Science
packages. If you want, we could add one more project (something
like HPC testing of MPI-based science packages) and specify special
requirements for possible applicants. Feel free to propose a text
for that. Thanks again!

Regards

Anton

Am Di., 22. Feb. 2022 um 12:52 Uhr schrieb Drew Parsons <dparsons@debian.org>:

On 2022-02-21 17:42, Anton Gladky wrote:
> Dear all,
>
> Google Summer of Code call for Debian is announced [1].
> I am going to apply Debian Science Team as one of the projects.
>
> Main topic is QA-Work: Autopkgtests for high-popcon packages,
> gitlab-CI for most of packages, bringing not-in-testing packages
> into the proper shape to let them migrate to testing.
>
> If somebody wants to be a co-mentor or if you have better ideas
> for the project, please let me know.
>
> [1]
> https://lists.debian.org/debian-devel-announce/2022/02/msg00002.html


It would be helpful to run parallel/HPC performance testing for our MPI
numerical packages.

This would be a type of CI testing that we would set up to run regularly
and report.
Lucas Nussbaum is in charge of an academic supercomputing cluster that
we can access to run such tests.

Some packages have benchmarks already at hand.  The FEniCS project for
instance offers fenicsx-performance-tests (both prebuilt and source).

The project would determine how to set up MPI CI testing (how to
activate it on Lucas' system), and what parameters (test size etc.) to
use to get meaningful numbers.
A suggested tool for managing test parameters and results might be
https://reframe-hpc.readthedocs.io/en/stable/

The report format could be similar to
https://fenics.github.io/performance-test-results/
or perhaps the GSoC worker could come up with a better way of presenting
results.

It would be useful to be able to quantify how well our HPC packages
actually scale (in cloud computing environments) and monitor if there's
any drop in performance (e.g. with version updates)

Also useful to report their performance with the various BLAS
alternatives.

This would be a valuable GSoC project, I think.

Drew

