Summary of the current state of the tag2upload discussion
Over the past week and a half, we've had a sprawling thread of nearly
unreadable volume that I am sure many people have stopped reading for very
understandable reasons. As probably the largest single contributor to
that volume, I will attempt to do penance and summarize the thread for
everyone else.
I am not one of the tag2upload developers, but I do think tag2upload
should be deployed and I don't think the FTP team should block it.
Although I am trying to be factual, this is in no way an unbiased summary.
It is also quite possible that the summary below contains factual errors,
as I am painfully aware after having spent a couple of days posting utter
nonsense about the git-debrebase workflow. (Apologies again.) This is an
open invitation for people to correct me and post their own perspective.
Since this may therefore turn into another long thread, I encourage
people who disagree to post their own summaries close to the root of the
thread so that people who are just trying to track the discussion can get
a multi-faceted update without digging through a lot of back-and-forth.
# Progress
Two FTP team delegates (Joerg and Ansgar) have been participating in a
detailed discussion of the architecture and merits. That discussion is
still ongoing.
## Authentication protocol change
The original tag2upload proposal had the tag2upload server performing the
OpenPGP signature check on the Git tag, and called for dak to then trust
that signature. This was one of the primary objectionable parts of the
design from the FTP team perspective. Based on a suggestion by Jessica
Clarke to turn that check into an API, there is a modified proposal on the
table for tag2upload to add the signed Git tag to its uploaded files and
for dak to redo the signature check as well as the authorization check
before accepting the upload. This ensures uploads made either directly to
dak or via tag2upload go through the same dak authentication and access
control logic, albeit via a different parser.
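As a rough illustration of that shape (not dak's actual code), the sketch below runs the same two checks on an upload regardless of how it arrived. The `Upload` type, `accept_upload`, and the two injected callbacks are hypothetical names invented for this example; real signature verification and access control are considerably more involved.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Upload:
    """Hypothetical stand-in for an upload as the archive software might see it."""
    source_package: str
    signed_object: bytes  # signed upload file, or the signed Git tag from tag2upload
    via_tag2upload: bool


def accept_upload(
    upload: Upload,
    verify_signature: Callable[[bytes], Optional[str]],
    is_authorized: Callable[[str, str], bool],
) -> bool:
    """Apply the same signature and authorization checks to every upload."""
    # Assumption: the verifier returns the signer's identity, or None on failure.
    signer = verify_signature(upload.signed_object)
    if signer is None:
        return False
    # The authorization decision does not depend on upload.via_tag2upload.
    return is_authorized(signer, upload.source_package)
```

The point of the sketch is only that the upload path does not change which checks run; what differs between the two paths is how the signed object is parsed.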
Compared to the original tag2upload proposal, this does require some code
changes to dak. I believe the corresponding change to tag2upload is not
possible today because dak would reject the additional file. But I think
both the FTP team and the tag2upload developers agree with this change.
There is further nice work that could be done here with a proper API, but
that raises other questions about how to do APIs between Debian project
infrastructure systems, and I think it can be deferred until later.
## dak requirements for the upload
Two FTP team delegates have agreed (I believe) that the uploader does not
need to sign the final source package, and that a signature over a hash of
the files that will be included in the source package would be sufficient.
This is not enough to unblock tag2upload (discussed below), but I think it
indicates a good-faith attempt to find a compromise and I greatly
appreciate it.
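To make "a signature over a hash of the files" concrete, here is a minimal sketch of one way such a hash could be built: hash each file, then hash a sorted manifest of those hashes. This is purely an assumption for illustration, not a scheme anyone in the thread has proposed, and `manifest_digest` is a hypothetical name. It deliberately ignores file modes, symlinks, unusual file names, and files that are generated during source package construction.

```python
import hashlib
from typing import Mapping


def manifest_digest(files: Mapping[str, bytes]) -> str:
    """Return one SHA-256 hex digest covering every path and its contents."""
    outer = hashlib.sha256()
    for path in sorted(files):  # sort so the result ignores insertion order
        inner = hashlib.sha256(files[path]).hexdigest()
        # One manifest line per file: "<content hash>  <path>".
        outer.update(f"{inner}  {path}\n".encode("utf-8"))
    return outer.hexdigest()
```

The uploader would sign the resulting digest, and the receiving side would recompute it from the files it was given and compare. Any change to a file's content or path changes the digest.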
# Security review
Multiple people have gone over my accompanying security review (thank you
very much!) and have provided feedback on my analysis and conclusions.
Simon Josefsson found a Git hash collision attack that I had not
considered, one that avoids the hash hardening added after the SHAttered
attack by exploiting Git's laziness about revalidating hashes. It's not clear to me
how feasible that attack is against Salsa even assuming that construction
of a Git hash collision is possible, but it's worth considering setting
transfer.fsckObjects to true (possibly with a whitelist of acceptable
trivial errors that have no security significance) in tag2upload in order
to ensure that such an attack is rejected.
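For readers unfamiliar with what "revalidating hashes" means here, the following sketch recomputes the object ID of a Git blob from its content rather than trusting the name it arrived under. It illustrates the general idea only: it uses plain SHA-1 from the Python standard library with no collision detection, it is not what transfer.fsckObjects does internally, and the function names are made up for this example.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the SHA-1 object ID Git assigns to a blob with this content."""
    # Git hashes a small header ("blob <size>" plus a NUL byte) followed by the data.
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def matches_claimed_id(claimed_id: str, content: bytes) -> bool:
    """Recompute the ID instead of trusting the name the object arrived under."""
    return git_blob_id(content) == claimed_id.lower()
```

For example, `git_blob_id(b"")` returns `e69de29bb2d1d6434b8b29ae775ad8c2e48c5391`, the well-known ID of the empty blob.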
Not every reviewer agrees with my belief that Git hash collisions are not
worth worrying much about for the next few years. However, I believe each
reviewer who raised some objections to that conclusion also said that they
didn't think it was a blocking problem for tag2upload deployment.
# Remaining points of disagreement
The FTP team's primary concern, as I understand it, is that tag2upload
uploads will not contain an uploader signature over the exact files that
comprise the source package. The uploader signs a Git tag, and the
transformation from the signed Git tree to the source package may include
synthesizing or modifying files in ways that the uploader does not have in
their Git tree in a hashable form. The tag2upload developers consider
this a critical and necessary degree of freedom in order to handle the
variety of Git workflows in use in Debian. The FTP team want the exact
contents of every file in the source package other than the constructed
*.dsc file (and maybe the *.orig.tar.gz; I'm not clear on that point) to
be covered by an uploader signature.
There is a stark disagreement over the importance of that signature, and
it appears to be the remaining blocking issue. I have argued that it
makes little difference from a security standpoint whether the source
package construction step happens before the uploader signature (the
current dak upload process) or after the uploader signature (the
tag2upload process), given that the uploader doesn't (and can't,
realistically) check the output in either case. I believe at least one
FTP team delegate disagrees. I personally don't understand the
disagreement.
Assuming that some mechanism could be found to handle all useful Git
workflows without synthesizing or modifying files (an assumption that I
don't believe is true), this proposed approach of adding an additional
hash or set of hashes to the uploader-signed object would still require a
complete redesign of tag2upload, significant new work in dak, and the
development of a hashing protocol for the files of the future source
package that I believe is considerably trickier than it may appear. (I
believe at least one FTP team delegate disagrees with me about the
difficulty.) So far as I know, there are no volunteers to do that work.
This approach would also require an additional local step before upload to
perform this new hash, which, depending on the details of the hash
construction, may undermine the tag2upload design goal of requiring
minimal local software apart from Git and an OpenPGP implementation.
# Other resolved questions
There were several other questions that came up in the course of the
discussion that I think have been resolved to everyone's satisfaction.
* There was considerable early confusion about the role of the dgit-repos
Git archive and whether it was a competitor to Salsa. This is an
append-only Git server that holds only Debian packages as uploaded to
the archive. Its role is to provide a permanent archive of the
corresponding signed Git tree, akin to snapshot.debian.org for the
archive. It does not support the features that Salsa supports and is
complementary to, not competitive with, Salsa.
* There was some confusion about the relationship between dgit and
tag2upload. tag2upload uses dgit for some operations and is developed
in the same repository, but it is unrelated from the uploader
perspective. One does not have to use dgit to use tag2upload.
* There was disagreement over the extent to which tag2upload would have
helped against the xz-utils attack. I believe the conclusion that
discussion reached is that the critical workflow change to protect
against the attack mechanism used there is to only trust upstream Git
tags rather than tarball releases. tag2upload by design does not
require that workflow change, although it does make it easier.
# Disclaimer
Again, this summary is only my opinion. I am not a tag2upload maintainer,
nor is the draft GR my GR. I cannot speak for any of the parties involved
here, only for myself.
--
Russ Allbery (rra@debian.org) <https://www.eyrie.org/~eagle/>