Matthias Urlichs <matthias@urlichs.de> writes:

> On 01.07.24 12:46, Aigars Mahinovs wrote:
>> Yes and no. See what the git tag actually contains and what the GPG
>> signature actually signs is just the one hash of the commit object.
>> This commit object then refers to the other files of the repo, but the
>> GPG signature does not directly sign those.
> So it signs them indirectly instead. I don't consider that to be a problem.
>
> There's no material difference whether the tag signs a commit that
> hashes a tree that (eventually) hashes the files, or a list of the
> files plus their hashes, or a tarball of the files in question (except
> that the way we do the latter is too brittle – it depends on the file
> order and compression used).
>
> The single advantage of including a file list would be if it included
> the files' SHA256-or-better hashes, but given the difficulty of
> finding *and* exploiting a SHA1 collision it's a judgment call whether
> that's worth the effort.

I believe you only need a SHA1 collision to corrupt the t2u scheme, and
those are not difficult.

Unless I'm missing something, and to help everyone get on the same page
of this analysis, and to get corrections from others if my analysis is
wrong, here are the concrete steps for a malicious upstream maintainer
or a malicious Salsa git committer:

0) Gain commit access to the target git repository.

1) Create one new commit with a SHA1 collision HEAD object (using a git
   carefully modified to not use SHA1CD), with two different source code
   files (one malicious and one harmless).

2) Add some harmless new commit with a SHA1CD-safe commit id.

3) Push that into the public git repository.

4) Over time add many other unrelated commits (which shouldn't touch the
   same content as the malicious commit - hint: put the colliding
   content in an opaque binary self-test file...).

5) Create a t2u signed git tag on the then-current HEAD object.
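
For illustration, here is a minimal Python sketch of how git derives object ids in a SHA-1 repository, which is why the tag signature covers file contents only through a chain of SHA-1 hashes. The function names are made up for this example, and hashlib.sha1 is plain SHA-1 without the collision detection that stock git uses (SHA1CD), so this shows the id computation only, not git's checks.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Object id as used by SHA-1 git repositories:
    SHA-1 over "<type> <length>\\0" followed by the object body."""
    header = f"{obj_type} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def blob_id(content: bytes) -> str:
    """Id of a file's content (a "blob" object)."""
    return git_object_id("blob", content)


def tree_id(entries: dict[str, bytes]) -> str:
    """Id of a flat directory holding only regular, non-executable files.

    Each tree entry is "<mode> <name>\\0" plus the raw 20-byte blob id,
    so the tree refers to file contents solely by their SHA-1 ids.
    Subdirectories, symlinks and executables are left out of this sketch.
    """
    body = b""
    for name in sorted(entries):
        raw_id = bytes.fromhex(blob_id(entries[name]))
        body += b"100644 " + name.encode() + b"\0" + raw_id
    return git_object_id("tree", body)


if __name__ == "__main__":
    files = {"README": b"hello world\n", "selftest.bin": b"\x00\x01\x02"}
    # A commit object names this tree id, and the signed tag names the
    # commit id; nothing in that chain hashes the files with anything
    # stronger than SHA-1.
    print(tree_id(files))
```

Two blobs with colliding SHA-1 ids would yield the same tree id, and therefore the same commit id and the same signed tag payload.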

We now have a situation where an attacker can provide a separate git
repository with different content than the intended version, while the
signed git tags still verify correctly.

I think this will be hard to make use of successfully in any reasonable
scenario in practice, but it appears cryptographically possible.

You can mitigate this by re-validating all commit hashes using a SHA1CD
git implementation before trusting a git repository. I have not seen
confirmation that 'git fsck' actually does that.

If some new attack implementation on SHA1 appears that isn't detected by
your SHA1CD variant, your validation can be bypassed. I don't think many
researchers bother attacking SHA1 and publishing details publicly at
this point, as it is proven broken already. I suppose that a SHA1
collision-generating algorithm that bypasses SHA1CD still has market
value.

Note: I don't think this problem is a deal-breaker for the t2u scheme,
nor that its design even has to change due to this. We already live
decently with many theoretical risks.

> If we do decide that a second hash is worth the effort, I *strongly*
> recommend to simply add an (optional) field with the output of "git
> ls-files -z | xargs -0 sha512sum | sort | sha512sum" to the tag. This
> has the exact same security implications as a list of paths and their
> sha512sum but is a heap of orders of magnitude smaller.

Something like this adds more strength, I like it.

I think we should compare how other distributions handle this; I believe
Guix hashes the file content of source tarballs instead of hashing the
tarball itself. Maybe the details of how to compute these are generic
enough to be reusable by Debian.

/Simon
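
A minimal Python sketch of the file-list digest quoted above, in case it helps the comparison. It approximates the "sha512sum | sort | sha512sum" pipeline under a few assumptions that are mine, not part of the proposal: it walks a directory and skips ".git" instead of asking "git ls-files", it sorts lines bytewise (like sort under LC_ALL=C), and it does not reproduce the escaping sha512sum applies to unusual file names. The names file_sha512 and tree_digest are hypothetical.

```python
import hashlib
import os


def file_sha512(path: str) -> str:
    """SHA-512 of one file, read in chunks to keep memory use flat."""
    h = hashlib.sha512()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digest(root: str) -> str:
    """One SHA-512 over the sorted "<hash>  <relative path>" lines."""
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption: everything outside .git counts as tracked content.
        if ".git" in dirnames:
            dirnames.remove(".git")
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            # Same line shape as sha512sum output: hash, two spaces, path.
            line = file_sha512(full).encode() + b"  " + os.fsencode(rel) + b"\n"
            lines.append(line)
    lines.sort()  # bytewise order, independent of directory walk order
    return hashlib.sha512(b"".join(lines)).hexdigest()
```

Calling tree_digest() on an unpacked source tree gives a single 128-hex-digit value that changes if any file's content or path changes, and it does not depend on SHA-1 anywhere.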