
Re: Auto reject if autopkgtest of reverse dependencies fail or cause FTBFS

On Mon, 16 Jan 2017 at 10:38:42 +0200, Lars Wirzenius wrote:
> A failing test means there's a bug. It might be in the test itself, or
> in the code being tested. It might be a bug in the test environment.

Nobody is disputing this, but we have bug severities for a reason:
not every bug is release-critical. If we gated packages on "has no
known bugs" we'd never release anything.

> Personally, I'd really rather have unreliable tests fixed.

Of course, but it isn't always feasible to drop everything and fix an
unreliable test, or the bug that the test illustrates - the cause of an
intermittent bug is often hard to determine. Until that happens, I'd
rather have the test sometimes or always fail, ideally reported as
XFAIL or TODO or something similar (distinguishing it from "significant"
failures), so I can still use the information that it produces.

For example, several of the ostree tests intermittently failed for a
long time, which turned out to be (we think) a libsoup thread-safety
bug. If I had disabled those tests on ci.debian.net altogether, then
I wouldn't have been able to tell upstream "those tests have stopped
failing since fixing libsoup, so that fix is probably all we need".

> Apart from social exclusion, unreliable tests waste a lot of time,
> effort, and mental energy.

Yes, and in an ideal world they wouldn't exist. This world is demonstrably
not ideal, and the code we release is not perfect (if it was, we wouldn't
need tests). Would you prefer it if packages whose tests are not fully
reliable just stopped running them altogether, or even deleted them?

I would very much prefer that we run tests, even the imperfect ones,
because CPU time is cheap and more information is better than less.
I've opened:

autopkgtest: define Restrictions for tests that aren't suitable for gating CI

and sent a patch to:

autopkgtest: should be possible to ignore test restrictions by request

in the hope that we can use those as a way to mark certain tests as
"failure is non-critical".
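
As a rough sketch, such a marking could look something like this in
debian/tests/control (the restriction name "flaky" here is illustrative
only - the actual name would be whatever the autopkgtest maintainers
accept):

    Tests: soup-threading
    Depends: @
    # hypothetical restriction: this test intermittently fails for
    # reasons believed to be outside the package's control, so its
    # failures should not gate migration
    Restrictions: flaky

A gating CI like ci.debian.net could then still run and report such
tests, but treat their failures as informational rather than blocking.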
