Since netcdf 4.4.1 has been in testing/unstable for some time, that's
no longer a blocker for the hdf5 transition.
I'm testing out the last of my changes for co-installable netcdf,
which I hope to have ready by the beginning of next week. It would be
worth thinking about doing both.
I'm aware of the outdated dev-coinstallable branch in the netcdf
repository, and no offence to your effort, but I don't like what I see
there. The symbols version script will be a pain to maintain, as our
experience with gdal has shown. I'm still very much against patching
the netcdf source to make it build with both serial HDF5 and its MPI
variants; that needs to be solved upstream. I don't want to require
changes to reverse dependencies to select a Debian-specific netcdf
variant, as we do for hdf5. The situation we want to create in Debian
should be something supported out of the box by upstream. Has there
been any discussion with NetCDF (and HDF5) upstream about this?
Kind Regards,
Bas
I've been working on the dev-coinstallable branch so that it no
longer requires a transition.
Yes, I've been talking to upstream about this (mostly the HDF5 people,
but also netcdf), and my understanding is that the problem is basically
HDF5/netcdf compression: HDF5 (and hence netcdf) can do either
compression (SZIP, etc.) or parallel read/write, but not both
simultaneously. Fixing this is on the todo list, but has been for many
years without progress. Estimates of 6-12 months of work have been
quoted, as HDF5 is effectively becoming a high-performance filesystem
within a file on HPC systems with deep memory hierarchies, and any such
changes are not trivial. This development can only really be written
and tested on top-end HPC systems such as those at the national labs;
patches written by developers on PCs would probably hurt performance
and not be accepted by upstream. What I'm proposing is a temporary
workaround until this is done, one that is designed to go away later.
(What this means, technically: parallel reads and writes work (on
POSIX) by dividing the file into even-sized chunks handled by e.g.
MPI-IO. With compression we don't know the on-disk size of a given
write until after we've compressed it; given a fixed-size chunk of
memory, we don't know which byte range of a 'serial bunch of bytes'
file representation it will map to. In practice, however, people are
moving to non-POSIX representations of HDF5 files on 'modern
object-based' APIs: no longer treating the file as a serial bunch of
bytes, but as a set of possibly different-sized blocks handled as
objects on e.g. an S3-style object-based filesystem. In this picture
compression can be added. But the high-performance work currently
matters more to the HDF5 developers' funders than compression does,
while on the netcdf side compression is important for those of us
storing and archiving large files long-term.)
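
To make the conflict concrete, here is a minimal sketch in C using the
netcdf-4 parallel API (this assumes a parallel-enabled libnetcdf and
uses the NC_MPIIO flag of the 4.4.x era; treat the exact error
returned as illustrative):

    #include <mpi.h>
    #include <netcdf.h>
    #include <netcdf_par.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int ncid, dimid, varid, ret;

        MPI_Init(&argc, &argv);

        /* Create a netCDF-4 file for parallel access over MPI-IO. */
        ret = nc_create_par("demo.nc", NC_NETCDF4 | NC_MPIIO,
                            MPI_COMM_WORLD, MPI_INFO_NULL, &ncid);
        if (ret != NC_NOERR) {
            fprintf(stderr, "create: %s\n", nc_strerror(ret));
            MPI_Finalize();
            return 1;
        }

        nc_def_dim(ncid, "x", 1024, &dimid);
        nc_def_var(ncid, "data", NC_FLOAT, 1, &dimid, &varid);

        /* Requesting zlib compression on a parallel file is rejected
         * in current releases: the compressed size of each chunk is
         * unknown before the ranks must agree on byte ranges. */
        ret = nc_def_var_deflate(ncid, varid, 0 /* shuffle */,
                                 1 /* deflate */, 6 /* level */);
        if (ret != NC_NOERR)
            fprintf(stderr, "deflate on parallel file: %s\n",
                    nc_strerror(ret));

        nc_close(ncid);
        MPI_Finalize();
        return 0;
    }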
We need to be able to handle both cases. Eventually, in a 'deep'
software stack like Debian, we will have applications such as VisIt,
ParaView, CDAT etc. that need to either (1) read compressed files or
(2) read in parallel, depending on the workflow. These work using
netcdf, adios and xdmf plugins for IO, and currently cannot provide
parallelism on Debian because of the lack of parallel netcdf. Given
this incompatibility, serial is the right default for Debian, but it
handicaps us on large systems. Where I work, for example, we build
portals on Debian for HPC, but are blocked by this lack of MPI
support.
So, the solution I'm proposing: we retain one 'master' netcdf version,
libnetcdf11 and libnetcdf-dev. Co-installable libnetcdf-mpi-11 and
libnetcdf-pnetcdf-11 exist, but are not used by most libraries or
applications. While parallel libnetcdf Fortran and C++ libraries are
also required, I do not propose or expect that applications above the
netcdf stack provide both serial and parallel versions; there will be
no combinatorial explosion of packages. A handful of libraries and
applications may be linked to the MPI version of netcdf instead of the
serial one, and in particular the two higher-level IO libraries I
maintain, ADIOS and XDMF, will be linked to both (XDMF would provide
both xdmf.py and xdmf_mpi.py modules, and the user selects which;
ADIOS provides an interface where it can decide at runtime whether to
use serial or MPI).
The libraries are as follows: libnetcdf11 is as before, with NETCDF_*
symbols and the library at /usr/lib/$arch/libnetcdf.so.11.3.0; the
include files are in /usr/include/netcdf, etc.

For the MPI version, the library is /usr/lib/$arch/libnetcdf_mpi.so*,
with NETCDF_MPI_* symbols. Note: mpi, not openmpi; the MPI
dependencies are abstracted away by this layer.

The Fortran and C++ netcdf packages would ship both libnetcdff.so and
libnetcdff_mpi.so, etc.
pkg-config files will be of the form netcdf-$flavor.pc, with an
alternatives-managed default netcdf.pc -> netcdf_serial.pc.
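
As a sketch of how a consumer would then pick a flavour (the
netcdf-mpi.pc name follows the netcdf-$flavor.pc pattern above, and
the compile lines are assumptions rather than tested commands):

    /*
     * The same trivial source builds against either flavour; only
     * the pkg-config name changes, e.g. (hypothetical invocations):
     *
     *   cc    demo.c $(pkg-config --cflags --libs netcdf)       serial
     *   mpicc demo.c $(pkg-config --cflags --libs netcdf-mpi)   MPI
     *
     * Packages that never mention a flavour keep linking the serial
     * libnetcdf.so.11 exactly as they do today.
     */
    #include <stdio.h>
    #include <netcdf.h>

    int main(void)
    {
        printf("linked against: %s\n", nc_inq_libvers());
        return 0;
    }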
Similarly for pnetcdf: /usr/lib/$arch/libnetcdf_pnetcdf.so.*.

(pnetcdf is parallel-netcdf: there are two flavours of MPI netcdf,
netcdf4 using HDF5 for its parallelism, and pnetcdf using MPI but
writing the old nc3 format. Some applications currently outside Debian
find the latter more performant.)
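
In API terms the difference between the two MPI flavours is just the
create mode passed to the same call; a sketch (flags as in the 4.4.x
API, file names arbitrary):

    #include <mpi.h>
    #include <netcdf.h>
    #include <netcdf_par.h>

    int main(int argc, char **argv)
    {
        int ncid;

        MPI_Init(&argc, &argv);

        /* HDF5-based parallelism: a netCDF-4 file over MPI-IO. */
        if (nc_create_par("nc4.nc", NC_NETCDF4 | NC_MPIIO,
                          MPI_COMM_WORLD, MPI_INFO_NULL, &ncid) == NC_NOERR)
            nc_close(ncid);

        /* pnetcdf-based parallelism: the classic nc3 format. */
        if (nc_create_par("nc3.nc", NC_PNETCDF,
                          MPI_COMM_WORLD, MPI_INFO_NULL, &ncid) == NC_NOERR)
            nc_close(ncid);

        MPI_Finalize();
        return 0;
    }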
There is a directory structure for symlinks,
/usr/lib/$arch/netcdf/$flavor/{lib,include,cmake,pkgconfig}, as per
HDF5. If you use this as your location directory when building, it all
does the right thing; if you don't, you get the default (currently
serial) version. As only a handful of packages are expected to use the
MPI version, no build changes would be needed for most.
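
Concretely, the mpi flavour's directory would look something like this
(illustrative; I am assuming the symlink targets mirror the HDF5
arrangement):

    /usr/lib/$arch/netcdf/mpi/
        lib/libnetcdf.so        -> ../../../libnetcdf_mpi.so  (assumed)
        include/                   flavour headers
        cmake/                     flavour CMake config
        pkgconfig/netcdf.pc     -> netcdf-mpi.pc              (assumed)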
Eventually, in a new release, if compression+parallelism is
implemented upstream, this can all be transitioned away with a single
rebuild of the "MPI netcdf" packages.
So, in summary: for all but three or four packages this has no effect;
binary compatibility remains intact (symbols, versioning, etc.).
Third-party binaries will link with Debian netcdf libs and vice versa.
When the proper upstream changes are made, these changes will be
transitioned away in Debian.
Alastair