
Re: Reproducibility



Hi,

On Fri, Apr 30, 2010 at 10:01:23AM +0200, Teemu Ikonen wrote:
> On Fri, Apr 30, 2010 at 2:08 AM, Michael Hanke <michael.hanke@gmail.com> wrote:
> > Debian: The ultimate platform for neuroimaging research
> [...]
> > However, it is hard to blame the respective developers, because the
> > sheer number of existing combinations of operating systems, hardware,
> > and library versions makes it almost impossible to verify that a
> > particular software is working as intended.  Restricting the
> > ``supported'' runtime environment is one approach of making
> > verification efforts feasible.
> 
> Dear list,
> 
> This nice abstract inspired me to think about reproducibility of
> program runs. If one runs e.g. Debian unstable the OS code which can
> potentially affect the results of calculations can change almost
> daily. Reproducing results later can be close to impossible unless
> versions of all the related libraries etc. are written down somewhere.

This is not just a potential problem -- we have seen it happen already.
Part of the problem is that in Debian we prefer (for good reasons, of
course) dynamic linking against up-to-date shared libs from separate
packages -- instead of statically linking against ancient versions with
known behavior.
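
Writing down the versions of all related libraries, as suggested in the
quoted paragraph, can at least be automated. The following is only a
minimal sketch, assuming a Debian system where dpkg-query is available;
the function names are made up for illustration. It relies on the default
output of "dpkg-query -W", which is one "package<TAB>version" line per
package.

```python
import subprocess


def parse_dpkg_listing(text):
    """Turn 'name<TAB>version' lines into a {name: version} dict."""
    versions = {}
    for line in text.splitlines():
        name, sep, version = line.partition("\t")
        # Skip malformed lines and entries without a version
        # (e.g. removed packages that left config files behind).
        if sep and name and version:
            versions[name] = version
    return versions


def snapshot_packages():
    """Ask dpkg-query for every known package and its version."""
    out = subprocess.run(
        ["dpkg-query", "-W"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_dpkg_listing(out)


def changed_packages(old, new):
    """List packages whose version differs between two snapshots."""
    return sorted(p for p in old.keys() | new.keys()
                  if old.get(p) != new.get(p))
```

Storing the result of snapshot_packages() next to each set of results
would make it possible to call changed_packages() later and see which
libraries have moved since the original run. It does not restore the old
state, of course; it only documents it.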

> Does anyone here have good ideas on how to ensure reproducibility in
> the long term? The only thing that comes to my mind is to run all
> important calculations in a virtual machine image which is then signed
> and stored in case the results need verification. But, maybe there are
> other options?

IMHO, rather than relying on a snapshot of the OS and a particular
software state to get constant results, projects should have
comprehensive regression tests that ensure proper behavior. The problem,
however, is that we cannot run them at package build time, since they
tend to require large datasets and run for many hours. Users would
therefore have to run them, but nobody does.
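
To illustrate what such a regression test boils down to, here is a
rough sketch: results from the current installation are compared against
stored reference values within a tolerance. The function name, the flat
name-to-number layout of the results, and the default tolerance are all
assumptions for the sake of the example, not taken from any particular
project.

```python
import math


def compare_to_reference(results, reference, rel_tol=1e-9):
    """Return the names of results that deviate from the reference.

    Both arguments map a result name to a floating point value.
    A result that is missing entirely counts as a deviation.
    """
    mismatches = []
    for name, expected in reference.items():
        actual = results.get(name)
        if actual is None or not math.isclose(actual, expected,
                                              rel_tol=rel_tol):
            mismatches.append(name)
    return mismatches
```

An empty list means the current software stack still reproduces the
reference numbers. Choosing the tolerance is the hard part: too strict
and every harmless library update fails the test, too loose and real
regressions slip through.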


Michael


-- 
GPG key:  1024D/3144BE0F Michael Hanke
http://mih.voxindeserto.de

