
Bug#1041603: automake-1.16: flaky autopkgtest on arm*: regularly times out



Source: automake-1.16
Version: 1:1.16.5-1.3
Severity: important
User: debian-ci@lists.debian.org
Usertags: flaky
X-Debbugs-Cc: debian-ci@lists.debian.org
Control: affects -1 + src:glib2.0 src:vala src:python3-defaults

The automake test suite routinely takes more than 2h30 on slower
CPU architectures (armel and armhf), which is uncomfortably close to
autopkgtest's arbitrary limit of 10k seconds (a bit more than 2h45).
If it runs a bit slower for whatever reason - perhaps extra load on the
testbed machine - then the test fails with a timeout.
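To put numbers on that margin, here is a rough sketch of the arithmetic. The 2h30 figure is an assumption taken from the "more than 2h30" estimate above, not a measured value, and the names are purely illustrative.

```python
def fmt_hms(seconds: int) -> str:
    """Format a duration in seconds as e.g. 2h46m40s."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}h{minutes:02d}m{secs:02d}s"


TEST_TIMEOUT_S = 10_000             # the 10k second limit quoted above
TYPICAL_RUN_S = 2 * 3600 + 30 * 60  # assumed typical run: 2h30

headroom_s = TEST_TIMEOUT_S - TYPICAL_RUN_S

print(fmt_hms(TEST_TIMEOUT_S))  # 2h46m40s
print(fmt_hms(headroom_s))      # 0h16m40s
print(f"{headroom_s / TYPICAL_RUN_S:.0%} slowdown is enough to time out")
```

In other words, under these assumptions a slowdown of roughly 11% on a 2h30 run is enough to hit the limit.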

This means that when packages that the test suite depends on,
such as glib2.0, python3-defaults and vala, are trying to migrate to
testing, the test will randomly pass or fail, causing those packages to
be detected as having caused a regression when in fact they have not.

For example, in the recent history of
<https://ci.debian.net/packages/a/automake-1.16/testing/armel/>, we can
see these:

* testing only (migration reference): pass, 2h50
* testing only (migration reference): fail, 2h51
* testing + src:glib2.0 -3 from unstable: fail, 2h52
* testing + src:glib2.0 -3 from unstable: fail, 2h51
* testing + src:glib2.0 -2 from unstable: pass, 2h37
* testing + src:glib2.0 -2 from unstable: fail, 2h52
* testing + src:vala from unstable: pass, 2h27
* testing + src:vala from unstable: fail, 2h53

and this is getting in the way of glib2.0's migration. I am quite
confident that there is no runtime regression between glib2.0/2.76.4-2
and glib2.0/2.76.4-3: they are the same codebase, with only build-time
changes to work around a Meson bug.

The times shown on ci.debian.net are a bit longer than the test timeout
of 10k seconds, which is normal: they include activities like installing
dependencies and reporting results, which are not included in the test
timeout.

automake maintainers: can this test suite be made a bit faster without
sacrificing too much coverage, perhaps by skipping some repetitive or slow
tests, or by breaking it up into batches that each take 1 hour or less?

CI infrastructure maintainers: can we mitigate this by having debci run
autopkgtest with --timeout-factor=2 on the slower architectures, or by
only testing automake-1.16 on the faster architectures? I see riscv64 is
already using --timeout-factor=2.
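As a rough illustration of what that mitigation would buy, the sketch below assumes that --timeout-factor simply multiplies the 10k second test timeout, and again assumes a typical run of 2h30. The helper names are hypothetical and are not part of autopkgtest or debci.

```python
BASE_TEST_TIMEOUT_S = 10_000        # the 10k second limit quoted above
TYPICAL_RUN_S = 2 * 3600 + 30 * 60  # assumed typical run: 2h30


def scaled_timeout(base_s: int, factor: float) -> int:
    """Timeout in seconds after scaling, assuming a plain multiplication."""
    return int(base_s * factor)


def tolerated_slowdown(timeout_s: int, typical_s: int) -> float:
    """How many times slower than typical a run can be before timing out."""
    return timeout_s / typical_s


for factor in (1, 2):
    timeout_s = scaled_timeout(BASE_TEST_TIMEOUT_S, factor)
    hours, rem = divmod(timeout_s, 3600)
    minutes, secs = divmod(rem, 60)
    ratio = tolerated_slowdown(timeout_s, TYPICAL_RUN_S)
    print(f"factor {factor}: {hours}h{minutes:02d}m{secs:02d}s, "
          f"tolerates a {ratio:.2f}x slowdown")
```

With a factor of 2 the limit would be 20000 seconds (5h33m20s), so under these assumptions a typical run could take more than twice as long before failing.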

    smcv

