Re: numerical comparisons or not? (platform compatibility during testing)

To: Debian Science List <debian-science@lists.debian.org>
Subject: Re: numerical comparisons or not? (platform compatibility during testing)
From: Stephen Sinclair <radarsat1@gmail.com>
Date: Wed, 19 Feb 2020 11:24:13 +0100
Message-id: <CADopUHboKZGoLxKEMpsCQZck-y18maAtwjS-pD15PDt3q2zxgg@mail.gmail.com>
In-reply-to: <CALF6qJnE_4a-wtc0L2ig+zF8cVGtBvFFivL4sM6nmLSMdKNBKw@mail.gmail.com>
References: <CADopUHaB81-3CUZP-vjPObwr72qZ_DGgrkJ2ZwdD535puD+x8A@mail.gmail.com> <CALF6qJnE_4a-wtc0L2ig+zF8cVGtBvFFivL4sM6nmLSMdKNBKw@mail.gmail.com>

Thanks for your answer Anton,

On Tue, Feb 18, 2020 at 9:23 PM Anton Gladky <gladk@debian.org> wrote:
>
> Hi Steve,
>
> my personal suggestion would be to change the threshold for failing tests.
> Numerical tests can be very sensitive to the platform or even sometimes
> to the versions of dependent libraries.

That could be the way, but since it would probably require some
tuning, is there a way to test on different buildd platforms before
the package is actually deployed?  So far, updates to my package have
been uploaded through mentors, and only afterwards can I check the
buildd logs.  It would be nice to be able to try different thresholds
before bothering sponsors.
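
To make the "tuning" concrete, here is a minimal sketch of a
tolerance-based comparison whose threshold can be changed without
touching the reference files.  It is not how siconos compares results
today; the function names and the TEST_RTOL environment variable are
hypothetical, chosen only for illustration.

```python
import math
import os


def max_rel_error(computed, reference):
    """Return the largest relative deviation between two sequences."""
    if len(computed) != len(reference):
        raise ValueError("length mismatch")
    worst = 0.0
    for c, r in zip(computed, reference):
        scale = max(abs(c), abs(r))
        if scale > 0.0:
            worst = max(worst, abs(c - r) / scale)
    return worst


def matches_reference(computed, reference, rtol=None, atol=1e-12):
    """Compare element-wise with a relative and an absolute tolerance.

    If rtol is not given, it is read from the (hypothetical) TEST_RTOL
    environment variable, falling back to 1e-9.
    """
    if rtol is None:
        rtol = float(os.environ.get("TEST_RTOL", "1e-9"))
    if len(computed) != len(reference):
        return False
    return all(
        math.isclose(c, r, rel_tol=rtol, abs_tol=atol)
        for c, r in zip(computed, reference)
    )
```

Logging max_rel_error() in the test output would let the buildd logs
show how far each architecture actually deviates, which gives a basis
for picking a threshold rather than guessing one.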

Steve


> On Tue, 18 Feb 2020 at 16:24, Stephen Sinclair
> <radarsat1@gmail.com> wrote:
> >
> > Hi folks,
> >
> > I've got my package, siconos, working nicely now on the platform I use it on (amd64) but some inspection of the build logs has revealed that buildd fails on all other platforms.
> >
> > https://buildd.debian.org/status/package.php?p=siconos&suite=sid&comaint=yes
> >
> > In fact it does build fine, but the software includes some numerical tests that compare results of the various solvers against expected results.  These apparently yield slightly different values on other platforms, and the tests fail.
> >
> > To get the software running on other architectures, would it be better to,
> >
> > 1. eliminate such tests.
> > 2. change the numerical threshold for failing a test.
> > 3. have different thresholds or even reference files per platform.
> >
> > In some sense (1) or (2) could be acceptable since it seems overly brittle to depend on specific numerical outcomes.  The intent of these tests is really for the developers to detect any unexpected changes in the solver outputs; they are not meant to gate deployment of the software.
> >
> > So perhaps it would be better for tests for buildd to focus mostly on whether the software runs without crashing, rather than comparing specific and very precise numerical outputs.  In fact if I simply remove the reference files from the software, the tests will still run and pass without comparing the numerical output, but I am not sure if this is wise.
> >
> > regards,
> > Steve
> >
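
For reference, option (3) in the list above could be sketched roughly
as follows, assuming the test harness is free to pick a tolerance from
the machine type.  The table values are made up for illustration and
are not measured siconos deviations; rtol_for() and RTOL_BY_MACHINE
are hypothetical names.

```python
import platform

# Fallback used for any machine type not listed below.
DEFAULT_RTOL = 1e-9

# Illustrative values only, not measured on real buildd machines.
RTOL_BY_MACHINE = {
    "x86_64": 1e-12,
    "aarch64": 1e-10,
    "i686": 1e-8,
}


def rtol_for(machine=None):
    """Return the relative tolerance to use for a given machine type.

    With no argument, the machine type of the running interpreter is
    used, as reported by platform.machine().
    """
    if machine is None:
        machine = platform.machine()
    return RTOL_BY_MACHINE.get(machine, DEFAULT_RTOL)
```

A single table like this keeps one set of reference files and avoids
maintaining per-platform copies, at the cost of having to justify each
entry.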

Follow-Ups:
- Re: numerical comparisons or not? (platform compatibility during testing)
  - From: ucko@debian.org (Aaron M. Ucko)

References:
- numerical comparisons or not? (platform compatibility during testing)
  - From: Stephen Sinclair <radarsat1@gmail.com>
- Re: numerical comparisons or not? (platform compatibility during testing)
  - From: Anton Gladky <gladk@debian.org>
