Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs



Hi Simon,

Simon McVittie <smcv@debian.org> writes:

> On Fri, 25 Jun 2021 at 16:42:42 -0400, Nicholas D Steeves wrote:
>> I feel like there is probably consensus against the use of PyPi-provided
>> upstream source tarballs in preference for what will usually be a GitHub
>> release tarball
>
> This is not really consistent with what devref says:
>
>     The defining characteristic of a pristine source tarball is that the
>     .orig.tar.{gz,bz2,xz} file is byte-for-byte identical to a tarball
>     officially distributed by the upstream author
>
>     — https://www.debian.org/doc/manuals/developers-reference/best-pkging-practices.en.html#best-practices-for-orig-tar-gz-bz2-xz-files
>
> Sites like Github and Gitlab that generate tarballs from git contents
> don't (can't?) guarantee that the exported tarball will never change -

I agree 100%.

> I'm fairly sure `git archive` doesn't try to make that guarantee - so it
> seems hard to say that the official source code release artifact is always
> the one that appears as a side-effect of the upstream project's git hosting
> platform.
>

Also agreed 100%.  This line of inquiry is actually why I think using
upstream git tags is best, though even then upstream can delete a tag
and push a new one under the same name.  Does PyPI provide immutable
releases?  If so, then yes, I agree there's a strong argument, with
respect to DevRef, for using PyPI within a DPT context where upstream
git tags (and history) are not merged :-)
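
To make DevRef's "byte-for-byte identical" criterion concrete, here is a
minimal sketch of how one might check it by comparing SHA-256 digests of
two tarballs.  The function names and the file names in the usage note
are hypothetical, chosen only for illustration; this is not an existing
Debian tool.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so large tarballs need not fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_byte_identical(orig_tarball, upstream_tarball):
    """True if both files have the same SHA-256 digest.

    Equal digests are treated here as a practical stand-in for
    byte-for-byte identity.
    """
    return sha256_of(orig_tarball) == sha256_of(upstream_tarball)
```

One would call it as, say, is_byte_identical("foo_1.0.orig.tar.gz",
"foo-1.0.tar.gz") with whatever was downloaded from upstream.  A check
like this only says whether the bytes match today; it cannot say
whether a hosting platform will serve the same bytes tomorrow.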

> That doesn't *necessarily* mean that the equivalent of a `git archive`
> is always the wrong thing (and indeed there are a lot of packages where
> it's the only reasonably easily-obtained thing that is suitable for our
> requirements), but I don't think it's as simple or clear-cut as you
> are implying.
>

Also agreed 100%, but I've learned that people often treat comprehensive
proposals as tl;dr, so I wanted to try a discussion-based approach ;-)

> devref also says:
>
>     A repackaged .orig.tar.{gz,bz2,xz} ... should, except where impossible
>     for legal reasons, preserve the entire building and portability
>     infrastructure provided by the upstream author. For example, it is
>     not a sufficient reason for omitting a file that it is used only
>     when building on MS-DOS. Similarly, a Makefile provided by upstream
>     should not be omitted even if the first thing your debian/rules does
>     is to overwrite it by running a configure script.
>
> I think devref goes too far on this - for projects where the official
> upstream release artifact contains a significant amount of content we
> don't want (convenience copies, portability glue, generated files, etc.),
> checking the legal status of everything can end up being more work than
> the actual packaging, and that's work that isn't improving the quality of
> our operating system (which is, after all, the point).
>

I agree, and will support a proposal to modify DevRef to this end,
because as far as I know the source tarballs in our archive aren't part
of a secondary project to archive upstream tarballs as released (e.g. a
kind of "ark" or source bank, like a seed bank, for DFSG software)... but
maybe that is a secondary objective?

> However, PyPI sdist archives are (at least in some cases) upstream's
> official source code release artifact, so I think a blanket recommendation
> that we ignore them probably goes too far in the other direction.
>
> I'd prefer to mention both options and have "use your best judgement,
> like you have to do for every other aspect of the packaging" as a
> recommendation :-)
>

So far the text I've been able to come up with to address this is
something like:

    In some cases PyPI sdist archives may be the most appropriate
    upstream source tarball (then your "use your best judgement..."
    as a conclusion) :-)

It would be really nice to include technical reasons that describe cases
where PyPI is more appropriate, but I don't know of any.  My experience
in Debian thus far has been that "what most closely fulfils Debian
ideals" is always preferable to upstream preference.  Yes, that's
arguably insular, but I thought there was consensus on this.

And yes, I agree a moderate position is better, but I must sadly confess
ignorance of the technical reasons why PyPI is sometimes more
appropriate.  Without technical reasons it seems like a case of
ideological compromise (based on the standards I've been mentored to and
the feedback I've received over the years).

Thanks!
Nicholas
