Re: Dropping live image generation and testing for oldstable ?
[Reducing Cc to debian-cd as suggested]
On Sun, 11 Jan 2026 at 21:25:23 +0100, Roland Clobus wrote:
> I understand that the testing effort (8 live images for amd64) is
> huge. Many of the tests are really slow to perform (i.e. they take
> more than 10 minutes each, with lots of waiting in-between).
Each live-image test done by the images testing group is really 7 steps,
after the images have been built:
1. download image and write it to bootable media (can be slow depending
   on network connection and media available)
2. boot the live image (quick) and check the desktop environment works
3. install with Calamares (slow), reboot and check the installed desktop
4. reinstall with the included copy of d-i (slow), reboot and check the
   installed desktop
5-7. repeat 2-4 with BIOS rather than UEFI boot mode, often done in
   parallel if enough test machines and/or testers are available
So, yes, this is certainly time-consuming (and considerably more
time-consuming than the tests we do on each debian-cd installer image,
which normally only get one install per image per boot mode).
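To put a rough number on that, here is a minimal sketch of the step count,
assuming the 8 amd64 live images mentioned above and the 7 steps as listed.
The function and constant names are purely illustrative and do not come from
any images-team tooling.

```python
# Illustrative only: counts manual test steps as described in this thread.
BOOT_MODES = ("UEFI", "BIOS")
PER_MODE_STEPS = (
    "boot live image and check desktop",
    "install with Calamares, reboot, check desktop",
    "reinstall with d-i, reboot, check desktop",
)


def live_image_steps(boot_modes=BOOT_MODES):
    """Return the list of manual steps for one live image."""
    steps = ["download image and write it to bootable media"]
    for mode in boot_modes:
        steps.extend(f"{step} ({mode})" for step in PER_MODE_STEPS)
    return steps


def total_steps(image_count, boot_modes=BOOT_MODES):
    """Total manual steps across all live images for one point release."""
    return image_count * len(live_image_steps(boot_modes))


if __name__ == "__main__":
    print(len(live_image_steps()))  # steps per live image
    print(total_steps(8))           # steps for 8 live images
```

With 8 images this comes to 56 individual steps per point release, several
of which are the slow install-and-reboot kind.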
*Generating* the live images is also quite time-consuming, particularly
if it fails and has to be retried. As you mentioned elsewhere in your
message, by the time we get to the 13th point release, I wouldn't expect
there to be many surprises remaining - and yet, this time, the live image
builds failed.
One factor potentially contributing to that is that the live images
appear to be built by the latest live-build from git, and not from a
stable branch of live-build into which only relevant fixes are
cherry-picked. While trying to help to diagnose the failing build, I
looked at the diff between the live-build that was used for 12.12 and
the live-build that was used for 12.13. Nothing jumped out at me as a likely
root cause for the failure, but I did notice that some of the changes
seemed like the sort of change that, if it was up to me, I wouldn't be
applying to a stable branch (for example fixes for bugs that only affect
unofficial/customized images and not our official images, or for build
environments other than the official one).
debian-cd has a semi-frozen production branch for each Debian major
release, with the bar for backporting changes becoming increasingly high
as the branches become older (for example if I understand correctly,
13.3 images were built with
https://salsa.debian.org/images-team/debian-cd/-/tree/buildd/trixie?ref_type=heads
but the 12.13 images produced the same day were built with
https://salsa.debian.org/images-team/debian-cd/-/tree/buildd/bookworm?ref_type=heads
which contains fewer commits). Could live-build do the same? I think
that would be good for robustness.
I also wonder whether building (but not publishing!) a set of 12.x live
images during the "quiet period" in the week before the point release
(perhaps with bookworm-proposed-updates included in its apt sources to
smoke-test the pending changes) would have already exhibited the build
regression that we saw on point release day, allowing it to be
detected and investigated before it was too late to do anything about it.
> Or just a handful of random tests on real hardware could be performed
> instead of all of them.
>
> I am aware that the virtual environment used by openQA will not catch
> hardware-related issues (especially hardware requiring kernel modules
> and/or firmware).
The kernel is one of the components most likely to be updated in an
oldstable point release, so that's significant.
The images testing group specifically doesn't use virtual machines to
test live images, precisely because in the past there have been
regressions that broke them on real hardware but didn't affect
installing into a VM. This is unlike the debian-cd (d-i) images, for
which testing "most" images in a VM is usually considered to be
sufficient (the only ones that are always tested on real hardware are
those with text-to-speech, I think).
> Also, for bookworm the point release is number 13. Would it be
> possible for the release team to do the full manual tests for the
> first few point releases instead of for all of them?
(I assume you mean the images team rather than the release team.)
My suggestion would be to produce live images and do manual testing for
the lifetime of the stable release, but stop when it becomes oldstable.
Concretely, for bookworm, this would have meant generating and testing
these releases:
12.0: debian-cd + live
12.1: debian-cd + live
...
12.11: debian-cd + live
(at this point, 13.0 was released and 12.x became oldstable)
12.12: debian-cd only
12.13: debian-cd only
12.14 (hasn't happened yet): debian-cd only
12.15 (hasn't happened yet): debian-cd only
(at this point, we expect bookworm to be handed over to the LTS team)
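The rule behind that list can be written down as a small sketch. This is
only an illustration of the proposal, not existing tooling: the function
name and the `last_stable_point_release` parameter are hypothetical, and the
value 11 for bookworm is taken from the list above.

```python
# Illustrative only: encodes the proposed policy, not any real build script.
def artifacts_for(point_release, last_stable_point_release):
    """Return which image sets would be produced for a given point release.

    point_release: the x in 12.x
    last_stable_point_release: the last x released before the suite
        became oldstable (11 for bookworm, per the list above)
    """
    if point_release <= last_stable_point_release:
        return ("debian-cd", "live")
    return ("debian-cd",)


if __name__ == "__main__":
    for x in range(0, 16):
        print(f"12.{x}: {' + '.join(artifacts_for(x, 11))}")
```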
My reasoning for this is that our stated reason for continuing to
support oldstable for 1 year after the stable release is to give Debian
users a grace period of 1 year to upgrade from oldstable to stable - but
live images aren't really required for that.
smcv