On Fri Aug 22, 2025 at 9:45 PM CEST, Otto Kekäläinen wrote:
> Uploads to the Debian archive, and what is stored in the Debian archive is still tarballs. Same applies for Ubuntu, and various tooling we have around uploads. Running dput to ftp-master, security, mentors, Launchpad etc will push tarballs and those checksums better match or uploads get rejected or the archive ends up having two versions of the same upstream tarball with different checksums.
Sure. I was discussing the security implications of pristine-tar. I actually wrote a paragraph mentioning the archive, but deleted it for the sake of brevity.
I argue that even for this use case, pristine-tar does not make *much* sense. Yes, it's convenient, but you can still download the orig from the archive if it's already there.
Moreover, and I want to emphasize that **I'm not sure about this**, but looking more closely at pristine-tar while integrating it into t2u, I have the impression that the point about keeping the hash equal to the one in the archive does not always hold: pristine-tar applies a binary diff *on top of* what git-archive creates. If git-archive changes and generates a slightly different tarball, the stored delta no longer applies, and pristine-tar will not help. At that point, pristine-tar isn't useful for this purpose.
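To illustrate why a binary delta is needed at all, here's a minimal sketch (assuming only git and tar are available; the file and directory names are made up): archiving the same single file with plain tar and with git-archive yields archives with identical *contents* but different *bytes*, because tar headers, mtimes, and gzip metadata differ between producers. pristine-tar's stored diff is exactly what bridges that gap, which is also why it only keeps working as long as git-archive's output stays stable.

```shell
set -eu
tmp=$(mktemp -d); cd "$tmp"
mkdir src
echo 'hello' > src/file.txt

# Archive the file with plain tar, as an upstream release script might:
tar -czf plain.tar.gz -C src file.txt

# Archive the same content via git-archive:
git init -q src
git -C src add file.txt
git -C src -c user.name=demo -c user.email=demo@example.com commit -qm init
(cd src && git archive --format=tar.gz -o ../git.tar.gz HEAD)

# Same file contents, but the archive bytes (and hence checksums) differ,
# because of differing mtimes, ownership fields, and gzip metadata:
if cmp -s plain.tar.gz git.tar.gz; then echo identical; else echo different; fi
```

Running this prints `different`: the archives would fail the checksum comparison dak does on upload even though every unpacked file is identical.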
As I understand it, pristine-tar's goal is to generate a tarball which is equal to a tarball which *was not* generated by git-archive, because you wouldn't be able to generate it otherwise (think of empty directories, which are not representable in Git and hence impossible to create with git-archive).
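The empty-directory limitation is easy to see first-hand (a sketch assuming only git and tar; the repo and file names are made up): git never tracks an empty directory, so git-archive has no way to emit one, while an upstream tarball built with plain tar can contain it.

```shell
set -eu
tmp=$(mktemp -d); cd "$tmp"
git init -q repo; cd repo
mkdir empty-dir                 # present in the working tree...
echo content > file.txt
git add -A                      # ...but git tracks only files, so empty-dir is silently skipped
git -c user.name=demo -c user.email=demo@example.com commit -qm init

# List what git-archive would put in a tarball: file.txt only, no empty-dir
git archive --format=tar HEAD | tar -tf -
```

The listing contains only `file.txt`; the directory is simply absent, so a tarball that shipped it can never be reproduced from the Git history alone.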
>> Also, by not having to work with tarballs, you not only stop needing the pristine-tar branch, but you also no longer need the upstream/latest branch.
>
> This is true. But it includes a very large assumption that all upstreams use git and all imports can be based on git tags.
Sorry, my point wasn't clear. I mean that, *when using upstream vcs tags*, pristine-tar isn't useful, etc. etc.
For cases where upstream doesn't use Git, I'm a strong supporter of the use of pristine-tar.
> Even for upstreams that use git, some of them still publish tarballs that have for example documentation added that is from a separate repository, or large binary test files removed as the project does not intend to ship them to end users.
I've actually seen the opposite in many projects! For example, mbedtls ships huge generated test files in their release tarball, but not in Git (even though they actually recommended that I use Git tags and avoid tarballs). Or think of the various autotools-generated files, which are absent in Git.
In other packages, yes, the tarball contains extra things like documentation or test files which would otherwise be downloaded on the fly during the build (which is of course undesirable). One such case I personally maintain is muon, which has a simple script that concatenates three tarballs into one: https://salsa.debian.org/debian/muon-meson/-/blob/archive/debian/0.4.0-1/tools/ci/prepare_tarball.sh?ref_type=tags
Different packages, different optimal choice!
> For the time being, a tarball with files is the only common denominator across all upstream software.
I fully agree with this statement. It is the common denominator. For many upstreams, though, it is far from the best option.
Maybe it'd be better to optimize for the majority of upstreams (promoting a Git-first workflow), at the expense of less common cases?
On the topic of the Go Team workflow specifically, it should be noted that for Go's dependency management tools to work, upstreams have to use a VCS.