
Re: HDF5 1.10 transition, netcdf 4.4.1 co-installable




On 07/10/2016 18:44, Sebastiaan Couwenberg wrote:
> On 07/10/16 16:30, Alastair McKinstry wrote:
>> Are we going to try and do the HDF5 transition before the freeze, which
>> is the end of the month for transitions?
> I'll leave that to Gilles who has done a marvellous job on hdf5,
> although I'm not happy with the Debian specific changes to HDF5 we need
> to deal with (non-standard include & library paths mostly).
Agreed.
> Since netcdf 4.4.1 is in testing/unstable for some time, that's no
> longer a blocker for the hdf5 transition.
>
>> I'm testing out the last of my changes for co-installable netcdf which I
>> hope to have ready by the beginning of next week.  It would be worth
>> thinking about doing both.
> I'm aware of the outdated dev-coinstallable branch in the netcdf
> repository, and no offence to your effort, but I don't like what I see
> there. The symbols version script will be a pain to maintain as our
> experience with gdal has shown for example. I'm still very much against
> patching the netcdf source to make it build with HDF5 serial and its MPI
> variants. That needs to be solved upstream. I don't want to require
> changes to reverse dependencies to select a Debian specific netcdf
> variant as we do for hdf5. The situation we want to create in Debian
> should be something supported out of the box by upstream. Has there been
> any discussion with NetCDF (and HDF5) upstream about this?
>
> Kind Regards,
>
> Bas
>
I've been working on the dev-coinstallable branch so that it no longer
requires a transition.

Yes, I've been talking to upstream about this (mostly the HDF5 people
but also netcdf), and my understanding is that the problem is basically
HDF5 / netcdf compression: HDF5 (and hence netcdf) can do either
compression (SZIP, etc.) or parallel read/write, but not both
simultaneously.
Fixing this is on the todo list, but has been for many years without
progress. Estimates of 6-12 months' work have been quoted, as HDF5 is
effectively becoming a high-performance filesystem-within-a-file on HPC
systems with deep memory hierarchies, and any such changes are not
trivial. This development can really only be written and tested on
top-end HPC systems like the national labs; patches written by
developers on PCs would probably hurt performance and not be accepted
upstream. What I'm proposing is a temporary workaround until this is
done, designed to go away later.

(What this means, technically: parallel reads/writes work (on POSIX) by
dividing the file to be written into even-sized chunks handled by e.g.
MPI-IO. With compression, we don't know in advance the size on disk a
given write will be until after we've compressed it; given a chunk of
memory of fixed size, we don't know the eventual byte range it will map
to in a 'serial bunch of bytes' file representation. In practice,
however, people are moving to a non-POSIX representation of HDF5 files
on 'modern object-based' APIs: no longer treating the file as a serial
bunch of bytes but as a set of possibly different-sized blocks handled
as objects on e.g. an S3-style object-based filesystem. In this picture
compression can be added. But the high-performance work is more
important to the HDF5 developers' funders than compression at the
moment, while on the netcdf front, compression is important for those
of us storing and archiving large files long-term.)

We need to be able to handle both cases. Eventually, in a 'deep'
software stack like Debian, we will have applications such as VisIt,
ParaView, CDAT etc. that will need to be able both to (1) read
compressed files and (2) read in parallel, in different workflows.
These work using netcdf, adios and xdmf plugins for IO, and currently
cannot provide parallelism on Debian because of the lack of parallel
netcdf. Given this incompatibility, 'serial' is the right default for
Debian, but it handicaps us on large systems. Where I'm working, for
example, we build portals on Debian for HPC, but can't do so due to
this lack of MPI support.

So, the solution I'm proposing: we retain one 'master' netcdf version,
libnetcdf11 and libnetcdf-dev. Co-installable libnetcdf-mpi-11 and
libnetcdf-pnetcdf-11 exist, but are not used by most libraries /
applications. While parallel netcdf Fortran and C++ libraries are also
required, I do not propose or expect any applications above the netcdf
stack to provide serial and parallel versions; there will be no
combinatorial explosion of packages. A handful of libraries and
applications may be linked to the MPI version of netcdf instead of the
serial one, and in particular two higher-level IO libraries I maintain
will be linked to both: ADIOS and XDMF (XDMF would provide both xdmf.py
and xdmf_mpi.py modules, and the user selects which; ADIOS provides an
interface where it can decide at runtime whether to use serial or MPI).

The libraries are as follows: libnetcdf11 is as before, with NETCDF_*
symbols and the library in /usr/lib/$arch/libnetcdf.so.11.3.0; the
include files are in /usr/include/netcdf, etc.
For the MPI version, the library is /usr/lib/$arch/libnetcdf_mpi.so*,
with symbols NETCDF_MPI_*. Note: mpi, not openmpi; the MPI dependencies
are abstracted away by this layer.
The Fortran and C++ netcdf packages would ship both libnetcdff.so and
libnetcdff_mpi.so, etc.
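As a quick sanity check, the versioned symbols a built binary
references would reveal which flavour it was linked against. A sketch
(the version string NETCDF_MPI_4.4 and the sample output are
hypothetical; on a real binary you would pipe objdump -T through the
same grep):

```shell
# Hypothetical check of which netcdf flavour a binary was linked against,
# based on the proposed NETCDF_* vs NETCDF_MPI_* symbol versions.
# 'sample' imitates objdump -T output; on a real binary you would run:
#   objdump -T myprog | grep -o 'NETCDF[A-Z_0-9.]*' | sort -u
sample='0000 DF *UND* 0000 NETCDF_MPI_4.4 nc_open_par
0000 DF *UND* 0000 NETCDF_MPI_4.4 nc_close'
versions=$(echo "$sample" | grep -o 'NETCDF[A-Z_0-9.]*' | sort -u)
echo "$versions"
```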

pkgconfig files will be of the form netcdf-$flavor.pc, with an
alternatives default netcdf.pc -> netcdf_serial.pc
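For a package that wants a specific flavour, selection would look
something like this (a sketch; the netcdf-$flavor.pc names are the
proposal above, not yet in the archive):

```shell
# Sketch: selecting a netcdf flavour at build time via the proposed
# netcdf-$flavor.pc files. Plain 'netcdf.pc' is the alternatives-managed
# default, currently pointing at the serial flavour.
FLAVOR="${NETCDF_FLAVOR:-serial}"     # serial, mpi or pnetcdf
case "$FLAVOR" in
    serial) PC="netcdf" ;;            # use the default alternative
    *)      PC="netcdf-$FLAVOR" ;;
esac
echo "pkg-config --cflags --libs $PC"
```

Most packages never set the flavour and transparently get the serial
default.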

Similarly for pnetcdf: /usr/lib/$arch/libnetcdf_pnetcdf.so.*
(pnetcdf is parallel netcdf; there are two flavours of MPI netcdf:
netcdf4 using HDF5 for its parallelism, and pnetcdf using MPI but
writing the old nc3 format. Some applications, currently outside
Debian, find the latter more performant).

There is a directory structure of symlinks,
/usr/lib/$arch/netcdf/$flavor/{lib,include,cmake,pkgconfig}, as per
HDF5. If you use this as your location directory when building, it all
does the right thing; if you don't, you get the default (currently
serial) version. As only a handful of packages are expected to use the
MPI version, no build changes would be needed for most.
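Concretely, pointing a build at the MPI flavour would be one line in
debian/rules (a sketch with hypothetical paths, following the proposed
layout; $ARCH stands in for the multiarch triplet, obtainable with
dpkg-architecture -qDEB_HOST_MULTIARCH):

```shell
# Sketch: building a package against the proposed per-flavour tree.
ARCH="x86_64-linux-gnu"    # placeholder; use dpkg-architecture on Debian
FLAVOR="mpi"
PREFIX="/usr/lib/$ARCH/netcdf/$FLAVOR"
# cmake-based packages would pass the tree as a prefix:
echo "cmake -DCMAKE_PREFIX_PATH=$PREFIX .."
# autoconf-based packages would point the preprocessor and linker at it:
echo "CPPFLAGS=-I$PREFIX/include LDFLAGS=-L$PREFIX/lib ./configure"
```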

Eventually, in a new release, if compression+parallelism is implemented
upstream, this can all be transitioned away with a single rebuild of
the "MPI netcdf" packages.

So, in summary: for all but three or four packages, this has no effect:
binary compatibility remains intact (symbols, versioning, etc.). Third
party binaries will link with Debian netcdf libs and vice versa. When
the "proper upstream" changes are made, these changes will transition
away in Debian.

Alastair

-- 
Alastair McKinstry, <alastair@sceal.ie>, <mckinstry@debian.org>, https://diaspora.sceal.ie/u/amckinstry
Misentropy: doubting that the Universe is becoming more disordered. 

