Re: t2u in the archive

To: Matthias Urlichs <matthias@urlichs.de>
Cc: debian-vote@lists.debian.org
Subject: Re: t2u in the archive
From: Simon Josefsson <simon@josefsson.org>
Date: Mon, 01 Jul 2024 18:10:06 +0200
Message-id: <87cynxf601.fsf@kaka.sjd.se>
In-reply-to: <4324a3f6-f47c-4edd-a597-7b4ca98ed57d@urlichs.de> (Matthias Urlichs's message of "Mon, 1 Jul 2024 14:36:20 +0200")
References: <87h6dehn4x.fsf@melete.silentflame.com> <87a5j249xi.fsf@hope.eyrie.org> <CABpYwDX73cT1RmxWAEmjOOky3s80esFRBZcvZR_wxEC4mDajcw@mail.gmail.com> <1999443.CcadfifcXU@zini-1880> <CABpYwDWvRuxVnhgoFRUOAv2kc7d=c+dNsUQiBJ0Q1b-ZCBH16w@mail.gmail.com> <d9cbad61-c8ef-4378-ae89-19c30689488e@urlichs.de> <CABpYwDV2p1iqr6b7q1KMYXb_nVZqz9bgbB1SYmWU8kFzowbY8g@mail.gmail.com> <4324a3f6-f47c-4edd-a597-7b4ca98ed57d@urlichs.de>

Matthias Urlichs <matthias@urlichs.de> writes:

> On 01.07.24 12:46, Aigars Mahinovs wrote:
>> Yes and no. See what the git tag actually contains and what the GPG
>> signature actually signs is just the one hash of the commit object.
>> This commit object then refers to the other files of the repo, but the
>> GPG signature does not directly sign those.
> So it signs them indirectly instead. I don't consider that to be a problem.
>
> There's no material difference whether the tag signs a commit that
> hashes a tree that (eventually) hashes the files, or a list of the
> files plus their hashes, or a tarball of the files in question (except
> that the way we do the latter is too brittle – it depends on the file
> order and compression used).
>
> The single advantage of including a file list would be if it included
> the files' SHA256-or-better hashes, but given the difficulty of
> finding *and* exploiting a SHA1 collision it's a judgment call whether
> that's worth the effort.

I believe you only need a SHA1 collision to corrupt the t2u scheme, and
those are not difficult.  To get everyone on the same page, and to invite
corrections from others if my analysis is wrong, here are the concrete
steps (unless I'm missing something) for a malicious upstream maintainer
or a malicious Salsa git committer:

0) Gain commit access to target git repository.

1) Create one new commit with a SHA1 collision HEAD object (using a git
carefully modified to not use SHA1CD), with two different source code
files (one malicious and one harmless).

2) Add some harmless new commit with SHA1CD safe commit id.

3) Push that into the public git repository.

4) Over time add many other unrelated commits (which shouldn't touch the
same content as the malicious commit - hint: put them in an opaque
binary self-test file...).

5) Create a t2u signed git tag on what is then the HEAD object.

We now have a situation where an attacker can provide a separate git
repository with different content than the intended version, while the
signed git tags still verify correctly.  I think this will be hard to
make use of successfully in any reasonable scenario in practice, but it
appears cryptographically possible.
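
To make the indirection concrete, here is a minimal Python sketch of how
git derives object ids, assuming the classic SHA1 object format.  The
helper names (git_object_id, blob_id, tree_id) are made up for
illustration and are not part of git, and the tree helper only handles a
flat directory of regular files.  The tag signature covers only the id at
the top of this chain, so two blobs with the same SHA1 would yield the
same tree id, and therefore the same commit id.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Return the SHA1 id of a git object: sha1("<kind> <len>\\0" + body)."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def blob_id(content: bytes) -> str:
    """Id of a file's content as stored by git."""
    return git_object_id("blob", content)


def tree_id(entries: dict[str, bytes]) -> str:
    """Id of a flat tree of regular files (mode 100644), name -> content.

    Simplification: no subdirectories, symlinks or executables.
    """
    body = b""
    # Entries are stored in byte order of their names.
    for name, content in sorted(entries.items(), key=lambda kv: kv[0].encode()):
        raw = bytes.fromhex(blob_id(content))  # 20 raw bytes, not hex
        body += b"100644 " + name.encode() + b"\0" + raw
    return git_object_id("tree", body)


if __name__ == "__main__":
    print(tree_id({"README": b"harmless\n"}))
    print(tree_id({"README": b"malicious\n"}))
```

The two printed ids differ because the file contents hash differently.
Only if the two contents collided under SHA1 would the tree id, and
everything signed above it, stay identical.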

You can mitigate this by re-validating all commit hashes using a SHA1CD
git implementation before trusting a git repository.  I have not seen
confirmation that 'git fsck' actually does that.  If some new SHA1 attack
appears that your SHA1CD variant does not detect, your validation can be
bypassed.  I don't think many researchers bother attacking SHA1 and
publishing details publicly at this point, as it is already proven
broken.  I suppose that a SHA1 collision-generating algorithm that
bypasses SHA1CD still has market value.

Note: I don't think this problem is a deal-breaker for the t2u scheme,
nor that its design even has to change due to this.  We already live
decently with many theoretical risks.

> If we do decide that a second hash is worth the effort, I *strongly*
> recommend to simply add an (optional) field with the output of "git 
> ls-files -z | xargs -0 sha512sum | sort | sha512sum" to the tag. This
> has the exact same security implications as a list of paths and their 
> sha512sum but is a heap of orders of magnitude smaller.

Something like this adds more strength, I like it.
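
For anyone who wants to experiment, here is a rough Python sketch of what
that pipeline computes.  It is an approximation under stated assumptions:
the function name manifest_digest is made up, the caller supplies the
file list (for instance from "git ls-files -z"), the sort is byte-wise as
with LC_ALL=C, and file names that sha512sum would escape (those
containing a backslash or newline) are not handled.

```python
import hashlib
from pathlib import Path


def manifest_digest(root: str, paths: list[str]) -> str:
    """SHA512 over the sorted "<sha512>  <path>" lines of the given files."""
    lines = []
    for rel in paths:
        digest = hashlib.sha512((Path(root) / rel).read_bytes()).hexdigest()
        # Same "<hex>  <path>" layout sha512sum prints for ordinary names.
        lines.append(f"{digest}  {rel}\n".encode())
    # Byte-wise sort, so the result does not depend on listing order.
    return hashlib.sha512(b"".join(sorted(lines))).hexdigest()


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "a.c").write_bytes(b"int main(void) { return 0; }\n")
        (Path(tmp) / "README").write_bytes(b"demo\n")
        print(manifest_digest(tmp, ["a.c", "README"]))
```

The result is a single 128-character hex string regardless of how many
files are listed, which is what keeps the extra tag field small.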

I think we should compare how other distributions handle this; I believe
Guix hashes the file contents of source tarballs instead of hashing the
tarball itself.  Maybe the details of how to compute these hashes are
generic enough to be reusable by Debian.

/Simon

