Re: Building Debian source packages reproducibly (was: Re: [RFC] Proposal for new source format)

To: Helmut Grohne <helmut@subdivi.de>
Cc: debian-devel@lists.debian.org
Subject: Re: Building Debian source packages reproducibly (was: Re: [RFC] Proposal for new source format)
From: Ian Jackson <ijackson@chiark.greenend.org.uk>
Date: Tue, 29 Oct 2019 12:54:57 +0000
Message-id: <23992.13985.680883.197476@chiark.greenend.org.uk>
In-reply-to: <20191028203440.GB7536@alf.mars>
References: <20191022032257.t6dq3xbvys547qar@shell.thinkmo.de> <7248482.4r3n416ZZV@odyx.org> <20191028134536.GA4404@mit.edu> <3619148.rgn0uvbNJu@l5580> <23991.11004.364270.639495@chiark.greenend.org.uk> <20191028203440.GB7536@alf.mars>

Helmut Grohne writes ("Re: Building Debian source packages reproducibly (was: Re: [RFC] Proposal for new source format)"):
> In other words, I want these formats (source package and tagged git
> tree) to be isomorphic (minus history). This requirement is too strong
> since not every source package will have a corresponding tag, but when
> there is a tag, I want to safely go from source package to tag and back
> again and arrive where I started from.

I wonder if I have misunderstood you, because:

The tag2upload proposal is based on dgit, which already provides this.
dgit indeed defines an isomorphism between source packages and git
trees, and dgit clone gives a git branch that is thus-isomorphic to
the .dsc.  This is fundamental to dgit's design.

With `dgit push', the isomorphism is checked on the maintainer's
machine.  With tag2upload it is ensured by the tag2upload service.
(When the uploader didn't use dgit, dgit clone does a .dsc import,
thus ensuring the isomorphism.)

> This property allows me to start from a git tree that is
> authenticated by dak rather than a random git tree on a random git
> server of questionable origin.

You do not need to talk to any random git servers.  The git tree is
available on a single official Debian server, the dgit git server.
The Dgit: field in the .dsc identifies the commitid.  The .dsc is of
course available via the signed apt repositories, as well as being
available from the ftpmaster data API.
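
As a rough illustration of the lookup described above, here is a
minimal Python sketch that pulls the commitid out of the Dgit: field
of a .dsc.  The function name and the sample .dsc are made up for this
example, and it assumes the commitid is the first whitespace-separated
token of the field and is a 40-hex-digit SHA-1; anything after it on
the line is ignored.  It does not verify the signature on the .dsc;
that has to be done separately before trusting the result.

```python
import re


def dgit_commit_from_dsc(dsc_text):
    """Return the commit id from the Dgit: field of a .dsc, or None if absent.

    Hypothetical helper for illustration; it is not part of dgit.
    """
    for line in dsc_text.splitlines():
        # Field names in Debian control files are case-insensitive.
        if line.lower().startswith("dgit:"):
            tokens = line.split(":", 1)[1].split()
            if not tokens:
                return None
            commit = tokens[0]
            # Assumption: the commit id is a full 40-hex-digit SHA-1.
            if re.fullmatch(r"[0-9a-f]{40}", commit):
                return commit
            raise ValueError(f"unexpected Dgit value: {commit!r}")
    return None


# Made-up sample; the trailing tokens after the commit id are ignored.
SAMPLE_DSC = """\
Format: 3.0 (quilt)
Source: example
Version: 1.0-1
Dgit: 0123456789abcdef0123456789abcdef01234567 debian archive/debian/1.0-1
"""

print(dgit_commit_from_dsc(SAMPLE_DSC))
```

The commitid obtained this way is what you would then look up on the
dgit git server.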

It is true that this doesn't give you precisely the *tag* object -
just the commit.  Adding the objectid of the tag object to the .dsc
Dgit: field would be easy, if that would be helpful to you.  (Please
file a wishlist bug against dgit if so.)  Alternatively, dak could
publish the tag object (in a similar way to how it publishes binary
buildinfos).

Note that there are *two* tag objects:
 1. the canonical view tag (the dgit view tag), which is
    simply-isomorphic to the source package;
 2. the maintainer tag, which is on the maintainer's branch and
    refers to a commit in maintainer branch format.

With dgit push these are both made with the maintainer's key.  With
tag2upload the canonical view tag is made by the tag2upload service,
because it is that service which performs the maintainer->canonical
conversion.

Each maintainer workflow defines a different mapping between
maintainer views and canonical views.  The (currently supported[1])
workflows are all isomorphisms.  So it is possible in principle to
reverse the maintainer->canonical transformation (if you know the
workflow, which can be found in the tags) but there is currently no
code to do that.  I don't get the impression, however, that this is a
thing you feel you need?  (Some form of reverse transformation would
be needed to automatically and workflow-agnostically handle MRs whose
submitter is using the canonical view.)

> This backwards-connection seems to be missing thus far, but I do find it
> important for the reasons above. Adding it would easily allow dak to
> validate the signature on the tag.

So, I'm not sure I understand what you think is missing.

Ian.

[1] I think with monorepo workflows the maintainer->canonical
conversion is generally irreversible, because it discards information
about source packages other than the one in question.  This wouldn't
block MR processing because MRs are deltas and by definition the other
parts of the monorepo aren't edited in the MR.  It does mean you
couldn't reconstruct the whole monorepo given just the canonical view.
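
A toy Python sketch of why that conversion is not reversible.  The
data model (a dict of package name to file tree) and the function name
are invented for this illustration and have nothing to do with how
dgit actually represents things: two different monorepos can yield the
same canonical view for one package, so the canonical view alone
cannot tell you which monorepo it came from.

```python
def canonical_view(monorepo, package):
    """Toy maintainer->canonical conversion: keep only one package's tree."""
    return dict(monorepo[package])


# Two made-up monorepos that differ only in the "bar" package.
mono_a = {
    "foo": {"debian/changelog": "foo 1.0-1"},
    "bar": {"debian/changelog": "bar 2.0-1"},
}
mono_b = {
    "foo": {"debian/changelog": "foo 1.0-1"},
    "bar": {"debian/changelog": "bar 2.1-1"},
}

# Same canonical view for "foo", although the monorepos differ.
same_view = canonical_view(mono_a, "foo") == canonical_view(mono_b, "foo")
print(same_view, mono_a == mono_b)  # True False
```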

(Arguably this means that the .dsc representation of a source package
from a git monorepo is not a PFM (preferred form for modification).
See arguments on -legal and -project, passim.  But the canonical view
dgit branch does contain the whole of the monorepo in its history, in
a discoverable way, so doesn't have this issue.)

-- 
Ian Jackson <ijackson@chiark.greenend.org.uk>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.
