Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs

On 2021-06-25 18:29:19 -0400 (-0400), Nicholas D Steeves wrote:
> A recommendation is non-binding, and the intent of this proposal is to
> say that the most "sourceful" form of source is the *most* suitable for
> Debian packages.  The inverse of this is that `make dist` is less
> suitable for Debian packages.  Neither formulation of this premise
> applies to a scope outside of Debian.  In other words, just because a
> particular form of source packaging and distribution is not considered
> ideal in Debian does not in any way comment on its suitability for other
> purposes.  Would you prefer to see a note like "PyPi is a good thing for
> the Python ecosystem, but sdists are not the preferred form of Debian
> source tarballs"?

To reset this discussion, take the case of an upstream like the one
I'm involved with. For each project, two forms of source release are
made available:

1. Cryptographically signed tags in a Git repository, with
   versioning, revision history, release notes and authorship either
   embedded within or tied to the Git metadata.

2. Cryptographically signed tarballs of the file tree corresponding
   to a tag in the Git repository, with versioning, revision
   history, release notes and authorship extracted into files
   included directly within the tarball.

If some alternative mechanism is used to grab only the work tree
from a checkout of the Git repository, critical information about
the software is lost, making it uninstallable in some cases (can't
figure out its own version), or even illegal to redistribute
(missing authors list referenced from the copyright license).
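
As a rough illustration of the version problem, here is a simplified
sketch of how a package might work out its own version. This is not
PBR's actual logic; the function names are made up for the example,
and it assumes the sdist carries a standard PKG-INFO file at its top
level.

```python
import subprocess
from email.parser import Parser
from pathlib import Path


def version_from_pkg_info(tree):
    """Return the Version field from PKG-INFO, or None if the file is absent."""
    pkg_info = Path(tree) / "PKG-INFO"
    if not pkg_info.is_file():
        return None
    # PKG-INFO uses RFC 822 style headers, so the email parser can read it.
    headers = Parser().parsestr(pkg_info.read_text(encoding="utf-8"))
    return headers.get("Version")


def version_from_git(tree):
    """Return `git describe --tags` output, or None without a usable checkout."""
    if not (Path(tree) / ".git").exists():
        return None
    try:
        result = subprocess.run(
            ["git", "-C", str(tree), "describe", "--tags"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip() or None


def resolve_version(tree):
    """Prefer metadata shipped in the sdist, then fall back to Git."""
    version = version_from_pkg_info(tree) or version_from_git(tree)
    if version is None:
        # A bare work-tree export has neither source of truth.
        raise RuntimeError("no PKG-INFO and no Git metadata: version unknown")
    return version
```

A bare export of the work tree has neither PKG-INFO nor a .git
directory, so resolve_version() raises there, while the same call on
an unpacked sdist or a Git checkout succeeds.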

So in this case you have a few options: package from upstream's Git
repository, package from upstream's "release tarball" (which happens
to be in Python sdist format because the egg-info is used to hold
information extracted from their Git metadata), or use something
which is neither of those and then have to rely on one of them
anyway to supply the missing bits.

> It's also worth mentioning that upstream's "official release"
> preference is not necessarily relevant to a Debian context.  Take
> for example the case where upstream exclusively supports a Flatpak
> and/or Snap package...

The problem is that you seem to want to talk in absolutes. Sure, some
(I'll wager many) Python projects can be reasonably packaged from a
flat dump of the file content in their revision control. There are
many which can't. Sure, some upstreams may only want to release
Flatpaks or Snaps, or may even be openly hostile to getting packaged
in distributions at all. There are also quite a few which don't host
their revision control on platforms which provide raw tarball
exports generated on the fly. Some sdist tarballs leave out files, I
agree, but they don't have to (ours don't; we only add more in order
to supply the exported revision control metadata).
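
Whether a given sdist leaves files out can be checked per upstream
rather than assumed. The following is a minimal sketch with
hypothetical helper names. It assumes the tarball unpacks into a
single top-level directory and that the list of tracked paths is
obtained separately (for example from `git ls-files` at the release
tag).

```python
import tarfile


def sdist_members(tarball_path):
    """Return regular-file paths in the tarball, minus the top-level directory."""
    paths = set()
    with tarfile.open(tarball_path) as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            # sdists unpack into a single "name-version/" directory; drop it.
            _, _, relative = member.name.partition("/")
            if relative:
                paths.add(relative)
    return paths


def compare_with_tracked(tarball_path, tracked_paths):
    """Return (missing, added): tracked files absent from the sdist, and
    sdist files which are not tracked in revision control."""
    in_sdist = sdist_members(tarball_path)
    tracked = set(tracked_paths)
    return sorted(tracked - in_sdist), sorted(in_sdist - tracked)
```

An empty "missing" list with a non-empty "added" list corresponds to
the case described above: nothing left out, only generated metadata
files added.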

Saying that a raw dump of the file content from a revision control
system is recommended over using upstream's sdists presumes all
upstreams are the same. They're not, and which is preferable (or
doable, or even legal) differs from one to another. Just because
some sdists, or even many, are not suitable as a basis for packaging
doesn't mean that sdists are a bad idea to base packages on. Yes,
basing packages on bad sdists is bad; it's hard to disagree with that.

> Thinking about an ideal solution, and the interesting PBR case, I
> remember that gbp is supposed to be able to associate gbp tags with
> upstream commits (or possibly tags), so maybe it's also possible to do
> this:
> 1. When gbp import-orig finds a new release
> 2. Fetch upstream remote as well
> 3. Run PBR against the upstream release tag
> 4. Stage this[ese] file[s]
> 5. Either append them to the upstream tarball before committing to the
>    pristine-tar branch, or generate the upstream tarball from the
>    upstream branch (intent being that the upstream branch's HEAD should
>    be identical to the contents of that tarball)
> 6. Gbp creates upstream/x.y tag
> 7. Gbp merges to Debian packaging branch.

You'll either need a copy of the upstream Git repository or at least
some of the files generated from that repository's metadata which
has been embedded in the release tarball. I understand the desire to
not put files into Debian source packages which can be generated at
package build time from other files in Debian, but when those files
can't be generated without the presence of the Git repository itself,
which *isn't* among the files in Debian, using the generated copies
supplied (and signed!) by upstream seems no different from many other sorts
of data which get shipped in Debian source packages.

Jeremy Stanley
