Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs



On Fri, 25 Jun 2021 at 16:42:42 -0400, Nicholas D Steeves wrote:
> I feel like there is probably consensus against the use of PyPi-provided
> upstream source tarballs in preference for what will usually be a GitHub
> release tarball

This is not really consistent with what devref says:

    The defining characteristic of a pristine source tarball is that the
    .orig.tar.{gz,bz2,xz} file is byte-for-byte identical to a tarball
    officially distributed by the upstream author

    — https://www.debian.org/doc/manuals/developers-reference/best-pkging-practices.en.html#best-practices-for-orig-tar-gz-bz2-xz-files

Sites like GitHub and GitLab that generate tarballs from git contents
don't (can't?) guarantee that the exported tarball will never change
(I'm fairly sure `git archive` doesn't try to make that guarantee), so it
seems hard to say that the official source code release artifact is always
the one that appears as a side-effect of the upstream project's git hosting
platform.
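
As a rough illustration of what "byte-for-byte identical" means in
practice, here is a minimal Python sketch that compares the SHA-256 of
a local .orig tarball against a freshly downloaded upstream tarball. The
function names and the file paths in the usage note are hypothetical, not
part of any Debian tool; if a hosting platform regenerates its tarball
differently at some point, a check like this would start failing.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large tarballs do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_pristine(orig_path, upstream_path):
    """True if the two files have the same SHA-256 digest."""
    return sha256_of(orig_path) == sha256_of(upstream_path)
```

For example, `is_pristine("foo_1.0.orig.tar.gz", "foo-1.0.tar.gz")` (both
file names made up) would compare the packaged tarball with one just
fetched from upstream.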

That doesn't *necessarily* mean that the equivalent of a `git archive`
is always the wrong thing (and indeed there are a lot of packages where
it's the only reasonably easily-obtained thing that is suitable for our
requirements), but I don't think it's as simple or clear-cut as you
are implying.

devref also says:

    A repackaged .orig.tar.{gz,bz2,xz} ... should, except where impossible
    for legal reasons, preserve the entire building and portablility
    infrastructure provided by the upstream author. For example, it is
    not a sufficient reason for omitting a file that it is used only
    when building on MS-DOS. Similarly, a Makefile provided by upstream
    should not be omitted even if the first thing your debian/rules does
    is to overwrite it by running a configure script.

I think devref goes too far on this - for projects where the official
upstream release artifact contains a significant amount of content we
don't want (convenience copies, portability glue, generated files, etc.),
checking the legal status of everything can end up being more work than
the actual packaging, and that's work that isn't improving the quality of
our operating system (which is, after all, the point).

However, PyPI sdist archives are (at least in some cases) upstream's
official source code release artifact, so I think a blanket recommendation
that we ignore them probably goes too far in the other direction.
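
One way to inform that judgement is to look at which files each artifact
actually contains. The following is only a sketch: the helper names are
hypothetical, and it assumes both tarballs wrap their contents in a
single top-level directory (as sdists and `git archive --prefix` exports
commonly do), which it strips before comparing.

```python
import tarfile


def member_names(path, strip_components=1):
    """Return the set of regular-file paths in a tarball.

    Assumption: the first strip_components path components are a
    wrapper directory such as "pkg-1.0/" and can be discarded.
    """
    names = set()
    with tarfile.open(path) as tar:
        for member in tar.getmembers():
            parts = member.name.split("/")[strip_components:]
            if parts and member.isfile():
                names.add("/".join(parts))
    return names


def compare_tarballs(sdist_path, vcs_path):
    """Return (files only in the sdist, files only in the VCS export)."""
    sdist = member_names(sdist_path)
    vcs = member_names(vcs_path)
    return sorted(sdist - vcs), sorted(vcs - sdist)
```

The first list tends to show generated files, while the second shows
things like tests or documentation sources that the sdist left out.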

I'd prefer to mention both options and have "use your best judgement,
like you have to do for every other aspect of the packaging" as a
recommendation :-)

    smcv