
Re: Reproducibility



Those of you interested in reproducibility might be interested in
VisTrails. There is a start-up commercializing the software, but most
of it is free and development is open source, available from
http://www.vistrails.org/index.php/Downloads. As I recall, the
software keeps track of the libraries, OS, and CPU used to produce
the results.
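
As a rough illustration of the idea (this is not the VisTrails API,
just a minimal sketch of the kind of provenance one could record
alongside a result):

    # Hypothetical sketch, not VisTrails code: capture the OS, CPU,
    # Python, and library versions used to produce a result.
    import json
    import platform

    import numpy  # stands in for whatever libraries the analysis uses

    provenance = {
        "os": platform.platform(),
        "cpu": platform.processor(),
        "python": platform.python_version(),
        "libraries": {"numpy": numpy.__version__},
    }

    with open("provenance.json", "w") as f:
        json.dump(provenance, f, indent=2)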

Best,
António Rafael C. Paiva
Post-doctoral fellow
SCI Institute, University of Utah
Salt Lake City, UT



On Fri, Apr 30, 2010 at 8:51 AM, Brett Viren <bv@bnl.gov> wrote:
> Teemu Ikonen <tpikonen@gmail.com> writes:
>
>> Does anyone here have good ideas on how to ensure reproducibility in
>> the long term?
>
> Regression testing, as mentioned, or running some fixed analysis and
> statistically comparing the results to past runs.
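>
> For example (a minimal sketch; the file names and the 1% threshold
> are made up):
>
>     # Compare a fresh run of a fixed analysis against an archived
>     # reference run with a two-sample Kolmogorov-Smirnov test.
>     import numpy as np
>     from scipy import stats
>
>     reference = np.loadtxt("reference_run.txt")  # archived results
>     current = np.loadtxt("current_run.txt")      # fresh results
>
>     statistic, pvalue = stats.ks_2samp(reference, current)
>     if pvalue < 0.01:
>         raise RuntimeError("current run is statistically "
>                            "inconsistent with the reference")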
>
> We worry about reproducibility in my field of particle physics.  We run
> on many different Linux and Mac platforms and strive for statistical
> consistency (see below), not bit-for-bit identical results.  I don't
> recall there ever being an issue with different versions of, say,
> Debian system libraries.  Any inconsistencies we have found have been
> due to version skew between different copies of our own code.
>
> [Aside: I have seen gross differences between Debian and RH-derived
> platforms.  In a past experiment I was the only collaborator working on
> Debian while almost everyone else was using Scientific Linux (an RHEL
> derivative).  I kept getting bitten by our code crashing on me.  For
> some reason, my builds tended to leave garbage in uninitialized
> pointers where on SL they tended to end up NULL.  So, I was the lucky
> one who got to find and fix a lot of programming mistakes.  This could
> have just been a fluke; I have no explanation for it.]
>
>> The only thing that comes to my mind is to run all
>> important calculations in a virtual machine image which is then signed
>> and stored in case the results need verification. But, maybe there are
>> other options?
>
> We have found that running the exact same code on the same Debian OS
> but on differing CPUs will lead to differing results.  They differ
> because the IEEE FP "standard" isn't implemented exactly the same way
> on all CPUs.  The results differ only in the least significant digits,
> but if you run simulations that consume random numbers and compare
> them against FP values, this can lead to much grosser divergences.
> However, with a large enough sample the results are all statistically
> consistent.
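>
> To make the mechanism concrete (illustrative values only, not real
> analysis code):
>
>     # A survival probability as two CPUs might round it, differing
>     # only in the last bits:
>     p_cpu_a = 0.7
>     p_cpu_b = 0.7 + 2**-52
>
>     # A random draw landing between the two values flips an
>     # accept/reject decision:
>     draw = 0.7000000000000001
>     print(draw < p_cpu_a, draw < p_cpu_b)   # False True
>
>     # From that flip onward the two runs consume different numbers
>     # of random values, so every later draw differs and the runs
>     # diverge grossly event by event, even though large samples
>     # remain statistically consistent.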
>
> I don't know how that translates when using virtual machines on
> different host CPUs, but if you care about bit-for-bit identical
> results, these FP differences may percolate up through the VM and ruin
> that.  Anyway, in the end, all CPUs give "wrong" results since FP
> calculations are not infinitely precise, so striving for bit-for-bit
> consistency is something of a pipe dream.
>
>
> -Brett.
>
>

