
RFC: Organizing HDF5 filter plugin packages



Hello,

I've been working on packaging some HDF5 filter plugins. There are now
several of them in Debian, and the area seems in need of some
organization. Here's a first proposal: my intention is to kickstart a
discussion with some practical options already on the table.

These are the packages containing filter plugins at the moment:

- hdf5-plugin-lzf: LZF (openmpi + serial)
- hdf5-filter-plugin: LZ4, BZip2, Bitshuffle (serial only)
- bitshuffle: LZF, Bitshuffle (openmpi + serial)
- hdf5-filter-plugin-blosc: Blosc (serial only)
- hdf5-plugin-zfp: ZFP (serial only)


# Naming conventions

At the moment we have "hdf5-plugin-*", "hdf5-filter-plugin[-*]", and
plugins packaged as part of another binary (bitshuffle).

We could standardize on one. I'd open the discussion proposing
"hdf5-filter-plugin-*-{serial,openmpi}" as it seems to match how HDF5
documentation refers to them.
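
To make the proposed pattern concrete, here is a minimal Python sketch
that checks binary package names against it. The regular expression and
the function name parse_plugin_package are only my illustration of how
"hdf5-filter-plugin-*-{serial,openmpi}" could be read, not an existing
tool or an agreed rule.

```python
import re

# Assumed reading of the proposed scheme:
#   hdf5-filter-plugin-<filter>-<serial|openmpi>
# The lazy <filter> group lets filter names contain hyphens themselves.
_PATTERN = re.compile(
    r"hdf5-filter-plugin-(?P<filter>[a-z0-9][a-z0-9+.-]*?)-(?P<flavor>serial|openmpi)"
)


def parse_plugin_package(name):
    """Return (filter, flavor) if name follows the proposed scheme, else None."""
    match = _PATTERN.fullmatch(name)
    if match is None:
        return None
    return match["filter"], match["flavor"]


# The package names listed at the top of this mail; none of them
# follows the proposed scheme yet, so each one is reported as None.
for current in ("hdf5-plugin-lzf", "hdf5-filter-plugin", "bitshuffle",
                "hdf5-filter-plugin-blosc", "hdf5-plugin-zfp"):
    print(current, "->", parse_plugin_package(current))
```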

It can also make sense to use virtual package names to declare that a
plugin is being provided, for packages like bitshuffle that don't only
package a plugin, or hdf5-filter-plugin that packages multiple ones.

Note: I checked what Fedora is doing to try not to reinvent the wheel,
and I didn't recognize packages that specifically ship hdf5 plugins at
all.


# Plugins packaged multiple times

- The LZF filter plugin is currently packaged both in hdf5-plugin-lzf
  and in bitshuffle. They use different library names, though:
  liblzf_filter vs libH5LZF. Are they the same implementation?
- The bitshuffle plugin is currently packaged in hdf5-filter-plugin and
  in bitshuffle. This currently causes an unpack error when coinstalling
  them.
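
Overlaps like these can be found mechanically by comparing the file
lists of the packages. Here is a minimal sketch, assuming the lists have
already been collected (for instance from "dpkg -L" or
"dpkg-deb --contents") and reduced to regular files, since directories
are legitimately shared. The function name find_shared_paths and the
paths in the example are made up for illustration.

```python
from collections import defaultdict


def find_shared_paths(contents):
    """Map each path shipped by more than one package to the sorted owners.

    contents maps a package name to an iterable of the file paths it ships.
    """
    owners = defaultdict(set)
    for package, paths in contents.items():
        for path in paths:
            owners[path].add(package)
    return {path: sorted(pkgs) for path, pkgs in owners.items() if len(pkgs) > 1}


# Hypothetical file lists: the real paths may well differ.
example = {
    "hdf5-filter-plugin": ["/plugins/libh5bshuf.so", "/plugins/libh5lz4.so"],
    "bitshuffle": ["/plugins/libh5bshuf.so", "/plugins/libh5LZF.so"],
}
for path, pkgs in find_shared_paths(example).items():
    print(path, "is shipped by", ", ".join(pkgs))
```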

I could try to locate the reference versions of those plugins, and file
bugs for stripping them when they are bundled, in favour of the
reference version.


# openmpi versions of plugins

Should the two versions of plugins be packaged in the same package or as
separate packages?

hdf5 itself uses separate packages; bitshuffle packages both plugins,
and therefore pulls in both hdf5 libraries.

This can be solved by using *-serial/*-openmpi package names (real or
virtual), unless the extra complexity is not needed and users would
rather expect to always have both plugins provided.


# testing openmpi versions of plugins

Is this a good enough way to test if the openmpi version of a plugin
works?

```
H5PY_ALWAYS_USE_MPI=1 python3
>>> import h5py
>>> h5py.h5z.filter_avail(<code>)
```

If so, I'd document it, possibly also with ready made autopkgtest
recipes, along the lines of:

```
# Test serial:    
python3 -c 'import h5py; import sys; sys.exit(0 if h5py.h5z.filter_avail(<code>) else 1)'
# Test openmpi:
H5PY_ALWAYS_USE_MPI=1 python3 -c 'import h5py; import sys; sys.exit(0 if h5py.h5z.filter_avail(<code>) else 1)'
```
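
The one-liners above could also be folded into a small script that
checks several filters at once. This is only a sketch: the filter IDs
are the ones I understand to be in the HDF Group's list of registered
filters and should be verified before relying on them, and the names
missing_filters and main are made up here. The availability check is
passed in as a parameter so that the helper can be exercised without
h5py installed.

```python
import sys

# Assumed filter IDs (to be verified against the HDF Group's list of
# registered filters before use).
FILTER_IDS = {
    "bzip2": 307,
    "lzf": 32000,
    "blosc": 32001,
    "lz4": 32004,
    "bitshuffle": 32008,
    "zfp": 32013,
}


def missing_filters(names, is_available):
    """Return the names whose filter ID is reported as unavailable."""
    return [name for name in names if not is_available(FILTER_IDS[name])]


def main(argv):
    """Check the filters named in argv[1:]; return 0 if all are available."""
    # Imported here so the helper above stays usable without h5py.
    import h5py

    names = argv[1:] or sorted(FILTER_IDS)
    missing = missing_filters(names, h5py.h5z.filter_avail)
    for name in missing:
        print(f"missing HDF5 filter: {name} ({FILTER_IDS[name]})", file=sys.stderr)
    return 1 if missing else 0
```

A test would end the script with sys.exit(main(sys.argv)) and run it
once as is, and once with H5PY_ALWAYS_USE_MPI=1 set, as in the
one-liners above.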


# Packaging the plugins as libraries

HDF5 filters can either be dynamically loaded as plugins or linked
directly into code.

With hdf5-plugin-lzf I tried to package both plugin and library, for
both serial and openmpi, but couldn't figure out install locations for
the serial and openmpi versions of headers and static library that
wouldn't conflict.

Is this a problem that needs solving at all, or can we just skip
packaging headers and static libraries for plugins for now?


# An HDF5 filter plugin mini-policy?

I offer to distill all that comes out of this thread into a mini-policy.



Thoughts? Does this all make sense?


Enrico

-- 
GPG key: 4096R/634F4BD1E7AD5568 2009-05-08 Enrico Zini <enrico@enricozini.org>
