Re: source-only builds and .buildinfo
Hi, Ximin. Thanks for your attention.
Ximin Luo writes ("Re: source-only builds and .buildinfo"):
> Also the man page for dpkg-buildpackage is out-of-date:
I think maybe you should file a bug about these ?
> >> So I think for `dgit push-source', there should be no .buildinfo ?
> >> At least, unless dgit ran the clean target.
> >> Alternatively dgit could strip out the .buildinfo, depending on
> >> whether it ran rules clean.
> > What do you think ?
> > (The background here is that `dgit push-source' wants to verify for
> > itself that the .changes file it is uploading is really source-only.
> > Because of the possible presence of extraneous (eg BY-HAND) build
> > artefacts in .changes, Guillem suggested comparing the .changes to the
> > .dsc. But of course the .changes contains not only the .dsc and the
> > files named in it, but also the .buildinfo.)
> There are a few other options for you:
> - Add a --no-buildinfo flag to dpkg-genchanges, then call dpkg-buildpackage --changes-option=--no-buildinfo
dgit would have to work around the lack of the flag anyway.
> - Ignore the buildinfo entry in the .changes file.
> - Verify that the buildinfo file contains only ".dsc" entries and that they match up with the ones in the changes file.
I did an experimental dpkg-buildpackage -S and I got a .buildinfo
containing the following fields:
Because of the weirdness with `debian/rules clean', it is logically
possible for things like the build depends and the environment to
affect the generated source package.
But, I'm not sure what this buildinfo means in the context of
reproducible builds. Is it an assertion that if the b-deps etc. are
as specified, this source package will reproduce itself (ie, will be a
fixed point) ?
That doesn't seem very useful. Sane build machinery which consumes
Debian sources will transport (and, if necessary, modify) those
sources without invoking them to regenerate themselves, so will not
mind source packages which are not a fixed point under
dpkg-buildpackage -S. (By this definition of `sane' many of our
normal tools are not; but I think any tool that is trying to do build
reproduction must be sane by this definition because otherwise it will
be constantly tripping over buggy packages.)
And of course only pretty bad packages are not a fixed point with any
reasonable combination of the build-deps. In practice bugs where the
package is simply broken will far outweigh situations where rules
clean works properly only with certain versions of the dependencies.
Nothing normally actually verifies the fixed-point-ness. So if the
.buildinfo is such an assertion, it will be a lie in any situation
where the information in it might be useful.
Finally in the context of dgit, the information seems even less likely
to be useful. Much of the time the person generating the source
package will have avoided the use of rules clean at all. In such a
situation the build-deps were not involved in generating the source
package. And dgit does check that the .dsc being uploaded corresponds
to the source the maintainer intended; so with dgit a situation cannot
arise where what is Uploaded = S(Intended) != Intended (where S is the
transformation "unpack, run dpkg-buildpackage -S, grab out the
resulting source package"). With dgit, if S(Intended) != Intended,
either dgit will upload Intended, oblivious to the bug because it
never runs rules clean; or it will run rules clean, discover the
discrepancy, and bomb out.
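
To make the two outcomes concrete, here is a rough model in Python. It is only a sketch and not dgit's code: a source tree is modelled as a dict from path to contents, `run_clean` is a hypothetical stand-in for the transformation S, and `choose_upload` and `FixedPointError` are names invented for illustration.

```python
class FixedPointError(Exception):
    """Raised when S(Intended) != Intended is discovered before upload."""


def choose_upload(intended, run_clean=None):
    """Return the tree to upload, or raise if cleaning changed it.

    intended:  dict mapping path -> file contents (a toy source tree).
    run_clean: optional callable modelling S; None means rules clean
               was never run, so any discrepancy goes unnoticed.
    """
    if run_clean is None:
        # Case 1: no clean was run; upload Intended, oblivious to any bug.
        return intended

    # Case 2: run the clean on a copy and compare with what was intended.
    regenerated = run_clean(dict(intended))
    if regenerated != intended:
        changed = sorted(
            path
            for path in set(intended) | set(regenerated)
            if intended.get(path) != regenerated.get(path)
        )
        raise FixedPointError("clean changed the source: " + ", ".join(changed))
    return intended
```

In neither branch does the function return anything other than `intended`, which is the point: Uploaded = S(Intended) != Intended has no path through it.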
> I'm actually not sure what your main problem is.
Well, we tripped over this anomaly while trying to decide what dgit
push-source should do.
dgit push-source definitely needs to verify that the .changes file it
is uploading is a source-only upload. That is a safety catch against
unintended binaryful uploads (for example caused by some
miscommunication in the stacks of build machinery, or the user
manually specifying the wrong .changes file). That means dgit
push-source needs to account for every file in the .changes.
The obvious algorithm is to check that every file in the .changes is
either the .dsc itself, or named in the .dsc. But we discover that
there's a .buildinfo there too. So we need to decide what to do about
it.
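
As an illustration of that algorithm, here is a minimal Python sketch. It is not dgit's implementation (dgit is written in Perl). It assumes the .changes and .dsc text has already been stripped of any PGP armour, and that the last whitespace-separated token on each line of the Files field is the file name. The names `files_field`, `unaccounted_files` and `allow_buildinfo` are made up for this example.

```python
def files_field(control_text):
    """Return the file names listed in the Files field of a .dsc or .changes."""
    names = []
    in_files = False
    for line in control_text.splitlines():
        if line.startswith("Files:"):
            in_files = True
        elif in_files and line[:1] in (" ", "\t"):
            # Continuation line; the file name is the last token.
            if line.strip():
                names.append(line.split()[-1])
        else:
            # Any other field (or a blank line) ends the Files field.
            in_files = False
    return names


def unaccounted_files(changes_text, dsc_name, dsc_text, allow_buildinfo=False):
    """List files in the .changes that are neither the .dsc nor named in it.

    An empty result means the upload looks source-only. With
    allow_buildinfo=True, a *.buildinfo entry is tolerated (the
    "ignore it" workaround); otherwise it is reported like any other
    unexpected file.
    """
    allowed = {dsc_name} | set(files_field(dsc_text))
    leftover = []
    for name in files_field(changes_text):
        if name in allowed:
            continue
        if allow_buildinfo and name.endswith(".buildinfo"):
            continue
        leftover.append(name)
    return leftover
```

With the default `allow_buildinfo=False`, a .changes from dpkg-buildpackage -S would report its .buildinfo as unaccounted for, which is exactly the anomaly described here.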
Ignoring the .buildinfo seems like an easy workaround but 1. I don't
understand the implications 2. this seems like it's leaving a bug (the
.buildinfo generation) unfixed and unreported 3. the .buildinfo
contains information which ought not to be disseminated (and
published!) unless necessary (or at least, useful).
Particularly (3) means I'm leaning towards arranging for the
.buildinfo to be stripped out (or not generated). But then I am
dismantling Chesterton's fence.
Is there a downside to having dgit make source-only uploads which do
not contain .buildinfo ? Is, indeed, there any downside to having
dpkg-buildpackage not generate the .buildinfo in source-only builds ?
> Does dgit by default checkout a previously-build .dsc from git?
I'm not sure what you mean, but I think not.
dgit manipulates _source trees_ in git. The .dsc is not represented
directly in git (and in general cannot be regenerated from git because
there may be missing origs etc., and also there may be deviations in
behaviour of tools like dpkg-source). dgit may need to construct a
source package, in which case the intent is that in similar
circumstances dgit will produce .dscs which are semantically
equivalent. I don't think it's necessary to generate an identical
.dsc, because actual builds take (or can take) a source directory tree
as input, not a .dsc; and an "appropriate" .dsc by this definition
implies the same source tree (which is a property dgit does check - at
least, as far as the local dpkg-source is concerned).
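
For what "implies the same source tree" might mean in practice, here is a small Python sketch that compares two unpacked trees by content. This is an assumption-laden illustration, not a description of how dgit performs its check: `tree_manifest` and `same_source_tree` are invented names, and the comparison deliberately ignores file modes, timestamps and empty directories.

```python
import hashlib
import os


def tree_manifest(root):
    """Map each file's path (relative to root) to a digest of its contents."""
    manifest = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            if os.path.islink(full):
                # Record the link target rather than following it.
                manifest[rel] = "link:" + os.readlink(full)
            else:
                with open(full, "rb") as fh:
                    manifest[rel] = hashlib.sha256(fh.read()).hexdigest()
    return manifest


def same_source_tree(root_a, root_b):
    """True if both trees contain the same files with the same contents."""
    return tree_manifest(root_a) == tree_manifest(root_b)
```

Two .dscs that differ byte-for-byte (different compression, say) but unpack to trees for which `same_source_tree` holds would count as equivalent in the sense meant above.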