
Re: Reproducibility



Teemu Ikonen <tpikonen@gmail.com> writes:

> Does anyone here have good ideas on how to ensure reproducibility in
> the long term? 

Regression testing, as mentioned, or running some fixed analysis and
statistically comparing the results to past runs.
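As a rough illustration of the second option, here is a minimal Python
sketch.  Everything in it is hypothetical: run_analysis() stands in for
whatever fixed analysis you have, and the 3-sigma cut on the difference
of the sample means is just one simple choice of test, not a
recommendation.

```python
import math
import random
import statistics


def run_analysis(seed, n=10_000):
    """Hypothetical stand-in for a fixed analysis: returns simulated values."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def z_score(reference, current):
    """Difference of the sample means in units of its standard error."""
    se_ref = statistics.stdev(reference) / math.sqrt(len(reference))
    se_cur = statistics.stdev(current) / math.sqrt(len(current))
    diff = statistics.mean(current) - statistics.mean(reference)
    return diff / math.hypot(se_ref, se_cur)


def statistically_consistent(reference, current, n_sigma=3.0):
    """True if the two means agree to within n_sigma standard errors."""
    return abs(z_score(reference, current)) < n_sigma


if __name__ == "__main__":
    past_run = run_analysis(seed=1)   # would normally be loaded from storage
    new_run = run_analysis(seed=2)    # the same analysis, rerun today
    print("z =", z_score(past_run, new_run))
    print("consistent:", statistically_consistent(past_run, new_run))
```

In practice you would store the past run's values (or summary
statistics) next to the code version that produced them, and compare
more than just the mean.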

We worry about reproducibility in my field of particle physics.  We run
on many different Linux and Mac platforms and strive for statistical
consistency (see below), not bit-for-bit identical results.  I don't
recall there ever being an issue with different versions of, say, Debian
system libraries.  Any inconsistencies we have found have been due to
version skew between different copies of our own code.

[Aside: I have seen gross differences between Debian and RH-derived
platforms.  In a past experiment I was the only collaborator working on
Debian; almost everyone else was using Scientific Linux (a RHEL
derivative).  I kept getting bitten by our code crashing on me.  It
seems that, for some reason, my builds tended to leave garbage in
uninitialized pointers, whereas on SL they tended to be NULL.  So I was
the lucky one who got to find and fix a lot of programming mistakes.
This could have just been a fluke; I have no explanation for it.]

> The only thing that comes to my mind is to run all
> important calculations in a virtual machine image which is then signed
> and stored in case the results need verification. But, maybe there are
> other options?

We have found that running the exact same code on the same Debian OS but
on different CPUs leads to different results.  They differ because the
IEEE FP "standard" isn't implemented exactly the same on all CPUs.  The
results differ only in the least significant digits.  But if your
simulation consumes random numbers and compares them against FP values,
a last-digit difference can flip a comparison, and from there the runs
can diverge much more grossly.  However, with a large enough sample the
results are all statistically consistent.
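To make the mechanism concrete, here is a small Python sketch.  It is
only an emulation: I change the summation order on one machine as a
stand-in for two platforms rounding differently, and the names
(threshold_a, accept, accepted_fraction) are made up for the example.

```python
import random

# Two mathematically equal thresholds that round differently.  Reordering
# the sum is an assumption used to mimic a platform difference.
threshold_a = (0.1 + 0.2) + 0.3   # 0.6000000000000001
threshold_b = 0.1 + (0.2 + 0.3)   # 0.6


def accept(u, threshold):
    """Typical Monte Carlo decision: keep the event if u is below threshold."""
    return u < threshold


def accepted_fraction(threshold, seed, n=100_000):
    """Fraction of n uniform random draws that pass the threshold."""
    rng = random.Random(seed)
    return sum(accept(rng.random(), threshold) for _ in range(n)) / n


if __name__ == "__main__":
    # The thresholds differ only in the last digit.
    print(threshold_a == threshold_b)
    # A draw that lands between them takes a different branch on each
    # "platform"; everything computed after that point can differ.
    u = threshold_b
    print(accept(u, threshold_a), accept(u, threshold_b))
    # Over many draws the accepted fractions still agree statistically.
    print(accepted_fraction(threshold_a, seed=1),
          accepted_fraction(threshold_b, seed=2))
```

A single flipped decision is rare, but once it happens the two runs
consume their random numbers differently, which is why the event-by-event
output can look nothing alike while the distributions still agree.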

I don't know how that translates to virtual machines running on
different host CPUs, but if you care about bit-for-bit identical
results, these FP differences may percolate up through the VM and ruin
that.  Anyway, in the end all CPUs give the "wrong" results, since FP
calculations are not infinitely precise, so striving for bit-for-bit
consistency is kind of a pipe dream.
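If you do keep reference numbers around, one alternative to a bit-for-bit
check is a tolerance check.  A minimal Python sketch follows; the
function names and the tolerances are assumptions for illustration, and
the right tolerance depends entirely on the calculation.

```python
import math
import struct


def bits(x):
    """Hex string of the 64-bit IEEE 754 encoding of a Python float."""
    return struct.pack(">d", x).hex()


def matches_reference(values, reference, rel_tol=1e-12, abs_tol=1e-15):
    """True if every value agrees with its reference within tolerance."""
    return len(values) == len(reference) and all(
        math.isclose(v, r, rel_tol=rel_tol, abs_tol=abs_tol)
        for v, r in zip(values, reference)
    )


if __name__ == "__main__":
    reference = [0.6, 1.0]            # stored from a past run
    values = [0.1 + 0.2 + 0.3, 1.0]   # same quantity, rounded differently
    print(bits(values[0]), bits(reference[0]))   # encodings differ
    print(values == reference)                   # bit-for-bit check fails
    print(matches_reference(values, reference))  # tolerance check passes
```

The abs_tol argument matters for values near zero, where a purely
relative tolerance would reject tiny absolute differences.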


-Brett.
