Re: Including build metadata in packages

To: debian-devel@lists.debian.org
Subject: Re: Including build metadata in packages
From: Simon McVittie <smcv@debian.org>
Date: Wed, 16 Feb 2022 16:51:38 +0000
Message-id: <Yg0rmoRodGEAgSQC@momentum.pseudorandom.co.uk>
In-reply-to: <cf5e0b4a47522591d3fc814ca1e031113870be5b.camel@debian.org>
References: <Ygz+w/Wo7ZCRGq/9@momentum.pseudorandom.co.uk> <cf5e0b4a47522591d3fc814ca1e031113870be5b.camel@debian.org>

On Wed, 16 Feb 2022 at 23:25:46 +0800, Paul Wise wrote:
> Simon McVittie wrote:
> > handling build logs is not dak's job (and I don't think handling
> > things like the binutils test results should be dak's job either).
> 
> It has always felt weird to me that build logs are entirely separate to
> the archive off in a side service rather than first-class artefacts
> that people occasionally need to look at. Also that the maintainer
> build logs don't end up anywhere and are probably just deleted. I think
> the same applies to the buildinfo files and also these test results
> and other artefacts that are mentioned in this thread.

If the maintainers of dak (our eternally overworked ftp team) want to
pick up build logs as first-class artifacts produced by both failed
and successful builds, they're welcome to do so (and then handling my
prototype of test artifacts would be a matter of adding another glob
pattern to be stored, for the tarball of artifacts that accompanies the
log); but I don't want to block on them doing that, because that seems
like a recipe for it never happening.

I am also not sure that it would be appropriate for dak to be doing
any processing on *failed* builds, which currently fail and get diverted
off into other code paths long before they get to dak.

If you are trying to solve the problem "we cannot see into the logs of
maintainer-built binaries that exist in the archive", I think a better
answer to that would be to stop letting maintainer-built binaries into the
archive, as the release team are already pushing us towards. That way,
we don't have to worry about whether maintainers' build logs and/or test
artifacts would be leaking personal or sensitive information that they
would prefer not to have shared.

> IIRC last time the build artefact discussion came up I was cycling
> between having the artefact handling in the sbuild configs on the
> buildds for quick implementation vs having it in debian/ dirs for
> distributed maintenance by maintainers.

I'm reasonably sure that the sbuild configuration is the wrong place
to specify what the artifacts are, because the interesting artifacts
depend on the build system (Autotools, Meson, etc.) and on how the
package uses it (in-tree vs. out-of-tree build, single vs multiple builds,
and so on), as well as on the package itself (for example GTK's ad-hoc
mechanism to store reftest results as PNG files is entirely GTK-specific).
This is something that the package maintainer already needs to know, so
that they can debug failing builds locally.

I tested my prototype with a Meson package, which has the advantage that
it's very consistent: whatever your build directory is, it will have
a meson-logs subdirectory and that's where all the logs are. However,
even Meson is not always done identically: the most obvious example
is that most Meson-built packages use the dh default build directory
./obj-${multiarch}, but if you do two builds (perhaps one for the .deb
and one for the .udeb, like GLib does), you have to find somewhere else
to put the second build.
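
To illustrate the point about Meson's consistency, here is a minimal
sketch (not part of the prototype) that finds every meson-logs directory
under a source tree without assuming any particular build directory name.
The function name find_meson_logs is hypothetical.

```python
import os


def find_meson_logs(srcdir):
    """Return every directory named meson-logs below srcdir, relative to srcdir."""
    found = []
    for dirpath, dirnames, _filenames in os.walk(srcdir):
        if "meson-logs" in dirnames:
            logs = os.path.join(dirpath, "meson-logs")
            found.append(os.path.relpath(logs, srcdir))
    return sorted(found)
```

A package that does two builds would get two entries back, which is
exactly the case where a single hard-coded path falls short.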

> I think there is a fundamental question here that needs answering
> definitively: who is the audience for the artefact feature?
> 
>  * Is it individual package maintainers who want test result details?
>  * Is it build tool maintainers who want data on tool use/failures?
>  * Is it porters who want more detailed logs in case of failure?
>  * Is it buildd maintainers for some reason?
>  * Is it RC bug fixers?
>  * Is it all of the above?

As an individual package maintainer, I certainly want this feature.
The exact artifacts that I want vary between packages, which is why
I prototyped it as a new field in d/control.

When toolchain packages like binutils and gcc collect their test
results, I think that's also their maintainer acting as an individual
package maintainer. Obviously they're very important core packages,
but collecting their test results doesn't seem like it fundamentally
differs from me wanting to collect GTK test results.

If the other groups get a benefit from this too, then that's a welcome
bonus, but I think solving it for individual package maintainers and
ignoring everyone else would be a net improvement.

Porters and RC bug fixers can benefit from this information in the
same way package maintainers do; if they're looking at fixing a bug,
they are going to have to change the package *anyway* (to apply the
bug fix), so changing it to collect artifacts (if it doesn't already)
doesn't seem like a huge cost.

I am not aware of buildd maintainers having asked for more detailed
logs. Indeed, buildd maintainers are in the unique position that they
can run arbitrary privileged code on buildds, so they are in a better
position to collect information from a half-built package than mere DDs,
and presumably have less need for this feature.

Build tool maintainers seem like the only one of the groups you've named
that isn't necessarily well-served by my prototype: they don't want to
modify everyone else's packages to get more information about how their
build tool is working.

Perhaps it would make sense to have a hybrid of what I prototyped, and
something more like substvars:

- the package maintainer can write a list of patterns into
  debian/build-artifacts (or a field in d/control, as in my prototype)
- the package's build system (d/rules, debhelper or whatever) can write
  additional patterns into debian/extra-build-artifacts at runtime
- anything listed in either or both places is collected into the
  -artifacts.tar.gz
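
The hybrid scheme above could be sketched roughly as follows. This is an
illustration under stated assumptions, not the prototype's actual code:
it assumes each file holds one glob pattern per line, that blank lines
and lines starting with "#" are ignored, and that patterns are relative
to the top of the source tree. The function names are hypothetical.

```python
import glob
import os
import tarfile


def read_patterns(path):
    """Return the glob patterns in path; a missing file means no patterns."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        lines = [line.strip() for line in f]
    # Assumed format: one pattern per line, "#" starts a comment line.
    return [line for line in lines if line and not line.startswith("#")]


def collect_artifacts(srcdir, tarball):
    """Pack every regular file matched by either pattern list into tarball."""
    patterns = (
        read_patterns(os.path.join(srcdir, "debian", "build-artifacts"))
        + read_patterns(os.path.join(srcdir, "debian", "extra-build-artifacts"))
    )
    root = os.path.realpath(srcdir)
    matched = set()
    for pattern in patterns:
        for path in glob.glob(os.path.join(srcdir, pattern), recursive=True):
            real = os.path.realpath(path)
            # Only keep regular files that really live inside the source tree.
            if os.path.isfile(real) and real.startswith(root + os.sep):
                matched.add(os.path.relpath(path, srcdir))
    with tarfile.open(tarball, "w:gz") as tar:
        for rel in sorted(matched):
            tar.add(os.path.join(srcdir, rel), arcname=rel)
    return sorted(matched)
```

Because the collector runs separately from the build, it can run after a
failed build just as well as after a successful one, and the package only
has to declare what is interesting.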

What I definitely want to avoid is a system that requires collecting
the artifacts imperatively rather than declaratively, e.g. converting

    dh_auto_test -- --parameters

into

    if ! dh_auto_test -- --parameters; then \
        cp _build/meson-logs/* debian/build-artifacts/ || :; \
        cp tests/foo/bar/*.png debian/build-artifacts/ || :; \
        exit 1; \
    fi

(with all the right makefile escaping) in every package that has its
own ad-hoc artifacts. That scales really poorly, and conflicts quite
badly with the philosophy of failing with a fatal error as soon as a
sufficiently bad problem is seen.

> Once that is answered, then we can think about how to accommodate how
> and where the list(s?) of files are to be maintained?
...
>  * in wanna-build
>  * in sbuild
>  * in sbuild.conf in dsa-puppet
>  * in sbuild overrides on buildds

I think those are a non-starter: as a maintainer of an individual package,
I do not want to have to ask the Debian sysadmins' permission to collect
test results (or, worse, ask the sbuild maintainer's permission and then
wait 2 years for the change to be in a stable release).

However, if you think those people will genuinely want to use it, then
it seems fine to have an sbuild option with the semantics "always behave
as though the package's list of build artifacts had these extra patterns
in it".
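
The semantics of such an option amount to a simple merge. A minimal
sketch, assuming the builder's extra patterns are just appended to the
package's own list (no such sbuild option exists as far as this thread
says; the function name is hypothetical):

```python
def effective_patterns(package_patterns, builder_patterns):
    """Package patterns first, then builder-supplied extras, without duplicates."""
    seen = set()
    merged = []
    for pattern in list(package_patterns) + list(builder_patterns):
        if pattern not in seen:
            seen.add(pattern)
            merged.append(pattern)
    return merged
```

A package that declares nothing would then still have the builder's
patterns collected, and a package that declares its own loses nothing.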

I think part of being a do-ocracy is that if there isn't an important
reason for a small and usually overworked group to be in a position to
block other people's work, then we should avoid putting extra load on them.

    smcv
