
Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs

On 2021-06-26 02:04:40 +0000 (+0000), Paul Wise wrote:
> On Fri, Jun 25, 2021 at 11:42 PM Jeremy Stanley wrote:
> > 2. Cryptographically signed tarballs of the file tree corresponding
> >    to a tag in the Git repository, with versioning, revision
> >    history, release notes and authorship extracted into files
> >    included directly within the tarball.
> I would like to see #2 split into two separate tarballs, one for the
> exact copy of the git tree and one containing the data about the other
> tarball. Then use dpkg-source v3 secondary tarballs to add the data
> about the git repo to the Debian source package.

You might like to see them split, but why is the exact copy of the
work tree the only legitimate way to export data from a Git
repository? Adding egg-info to the tarball creates a *Python Source
Distribution* which is a long-standing standard method for
distributing source code of Python software. Those files could even
be checked directly into the repository, so that the work tree was
itself also a valid sdist. The only reason the projects I work on
don't do that is because some of it would be redundant with the
metadata from the revision control system.

You could of course create your own split tarballs of the work tree
and the additional metadata files, but to what end? If upstream is
already delivering them together in a release tarball, how is making
your own beneficial when it still has to be done by the package
maintainer before assembling the source package? Users of Debian
don't benefit, because they still can't recreate your split tarball
if they wanted to without also having a copy of the upstream Git
repository. It just seems like make-work.

> Probably we should start systematically comparing upstream VCS repos
> with upstream sdists and reacting to the differences. So far, I've
> reacted by ignoring the sdists completely.

I highly recommend it. We explicitly test that our sdists don't omit
files from the Git worktree (sans .git* files like .gitignore and
.gitreview which make no sense outside the context of a Git
repository). On the other hand, I've found at least one case where a
copyright statement in a Debian package refers to an AUTHORS file
shipped as part of the sdist, but since the maintainer chose to
package it from Git instead and did not generate that file when
doing so, it's not included in the packaged version distributed in
Debian. (Not linking the bug report here as I don't want it to seem
like I'm picking on the maintainer.)
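
For anyone wanting to try that kind of comparison, here is a minimal
sketch in Python. It is not our actual test; the function names, the
use of `git ls-tree`, and the assumption that the sdist unpacks into a
single top-level directory are all mine, for illustration only.

```python
import fnmatch
import subprocess
import tarfile


def sdist_members(sdist_path):
    """Return file paths inside an sdist, top-level directory stripped."""
    names = set()
    with tarfile.open(sdist_path) as tar:
        for member in tar.getmembers():
            if member.isfile():
                # Assumption: everything lives under one "name-version/" dir.
                _, _, relative = member.name.partition("/")
                if relative:
                    names.add(relative)
    return names


def tracked_files(repo_dir, ref="HEAD"):
    """Return the paths Git tracks at the given ref (e.g. a release tag)."""
    out = subprocess.run(
        ["git", "-C", repo_dir, "ls-tree", "-r", "--name-only", "-z", ref],
        check=True, capture_output=True, text=True,
    ).stdout
    return {path for path in out.split("\0") if path}


def missing_from_sdist(tracked, shipped, ignore=(".git*",)):
    """List tracked files absent from the sdist, skipping Git-only files."""
    def ignored(path):
        basename = path.rsplit("/", 1)[-1]
        return any(fnmatch.fnmatch(basename, pattern) for pattern in ignore)

    return sorted(p for p in tracked if p not in shipped and not ignored(p))
```

Calling `missing_from_sdist(tracked_files(repo, tag), sdist_members(tarball))`
should return an empty list when the sdist omits nothing. Files that
exist only in the sdist (generated metadata, for example) are
deliberately not reported, since those are expected additions.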

Just to reiterate, as an upstream we don't consider the work trees
of our Git repos to be complete source distributions. They can be
used along with the versioning and history tracked as part of the
repository to generate a complete source distribution, and that's
what we officially release. Downstream distributions are encouraged
to either use our release tarballs or clones of our Git repositories
to recreate the same files we would release, but if you choose to do
neither of those you're likely to miss something.
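
As an illustration of the sort of file that has to be derived from
history rather than from the work tree, here is a rough sketch of
generating an AUTHORS file from a clone. This is not the tooling our
projects use; the function names and the output format (one
"Name <email>" per line, sorted) are assumptions for the example.

```python
import subprocess


def unique_authors(log_lines):
    """Deduplicate 'Name <email>' lines and return them sorted."""
    return sorted({line.strip() for line in log_lines if line.strip()})


def write_authors(repo_dir, dest="AUTHORS"):
    """Write an AUTHORS file at dest from the commit history of repo_dir."""
    # %aN and %aE are the author name and email, with .mailmap applied.
    log = subprocess.run(
        ["git", "-C", repo_dir, "log", "--format=%aN <%aE>"],
        check=True, capture_output=True, text=True,
    ).stdout
    authors = unique_authors(log.splitlines())
    with open(dest, "w", encoding="utf-8") as handle:
        handle.write("\n".join(authors) + "\n")
    return authors
```

A bare export of the work tree has no history to run this against,
which is why such files go missing when packaging from a tree snapshot
alone.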

Jeremy Stanley
