Re: [UDD] Is there any information about failed autopkgtest in UDD?
On Thu, 16 Apr 2020 at 10:09:36 +0200, Andreas Tille wrote:
> Hmmmm, what exactly means "superficial".
Originally proposed in:
The design was originally called "trivial" but was renamed to
"superficial" during implementation.
> Are all those
> Testsuite: autopkgtest-pkg-*
Most of them, yes: if they fail, that's Very Bad, but if they
succeed, they don't give us a whole lot of confidence that the package
works as intended.
For example, if you have a package python3-dbus, the most that can be
done with knowledge of Python but no knowledge of dbus is to "import dbus"
and see what happens. If that fails, obviously the dbus module is
unusable (or missing a dependency or something); but if that succeeds,
that fact doesn't tell us whether it can actually do D-Bus successfully,
which is its real purpose. If I replaced all the code in python3-dbus
with print("hello"), obviously it wouldn't be implementing its intended
API any more, but it would still pass the "import" test. That's why the
"import" test is considered to be superficial.
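A test like that would typically be declared along these lines in
debian/tests/control (the package name here is just an example, echoing
the python3-dbus case above):

```
Test-Command: python3 -c "import dbus"
Depends: python3-dbus
Restrictions: superficial
```

The "superficial" restriction is what tells the infrastructure that a
pass from this test should only count as neutral.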
If a family of packages has a convention for how to discover and
run a real test suite with non-trivial coverage (for example GNOME-style
installed-tests in /usr/share/installed-tests/**/*.test), then an
autopkgtest-* helper that detected and used *that* convention would usually
*not* be superficial.
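For reference, a GNOME-style installed-tests description is a small
desktop-style keyfile; something like this (package and test names here
are made up for illustration):

```
# /usr/share/installed-tests/mypackage/basic.test
[Test]
Type=session
Exec=/usr/share/installed-tests/mypackage/test-basic
```

A helper that enumerated and ran these would be exercising the package's
real test suite, not just checking that it loads.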
> Do they qualify for early testing migration or not?
If *all* the tests are superficial, passing them doesn't speed up testing
migration.
Packages get faster testing migration if they have at least one test that
results in a pass status (passing and not superficial), *and* all of their
tests result in either a pass or neutral status.
Normal tests: pass -> pass, fail -> fail
"superficial" tests: pass -> neutral, fail -> fail
(the test passing doesn't really give us much confidence that it works)
"flaky" tests: pass -> pass, fail -> neutral
(the fail doesn't give us much confidence that it *doesn't* work, because
maybe it just failed randomly)
"flaky, superficial" tests: pass -> neutral, fail -> neutral
(no machine-readable effect, should normally only be used when monitoring
whether a new test is reliable on Debian infrastructure or not)
"skippable" tests: pass -> as above, exit 77 -> neutral, fail -> as above
Tests that can't be run due to restrictions: neutral