
Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide


First, thanks for your reply, and for taking the time to respond to every
point. This really is helpful.

While I believe all of your arguments are correct, I am still not
convinced about reproducibility, which is my main issue here. Could
you please reply to that point, and that one only? I've removed from the
quote everything that doesn't concern it, because I feel the rest is a
distraction in this thread.

On 06/14/2015 05:46 AM, Guillem Jover wrote:
> Hi,
> On Sun, 2015-06-14 at 01:08:29 +0200, Thomas Goirand wrote:
>> On 06/13/2015 10:55 AM, Paul Wise wrote:
>>> On Sat, Jun 13, 2015 at 4:23 PM, Thomas Goirand wrote:
>>>> I've been using xz compression for a long time, but I see a big defect
>>>> which is today pushing me to turn it off for the .orig.tar file. The
>>>> issue is that depending on the version of xz-utils, it produces a
>>>> different output.
> Well if you want reproducible output, then use the same tool version.

That's not possible: Jessie, Sid and Trusty don't have the same version,
and we need to generate the orig.tar file in all of them. The
contributors for the Debian OpenStack packaging are mostly using Ubuntu,
and they need to keep a workflow with the orig.tar file in the Git

I did tell them to just get the file from the Debian archive, but it
doesn't work. One of the reasons it doesn't is that sometimes they
upload first to Ubuntu, and then I upload to Debian, and we end up with
different orig.tar.xz files, meaning it's hard for them to sync back
with Debian. I would like this to not be an issue anymore.

> That's the equivalent of expecting that using a different gcc version
> will give you the same output.

I fail to see what gcc and a lossless compressor have in common.

> As long as the bitstream is compatible with previous versions, I don't
> see it as a problem

As I just explained, the problem is that I can't use xz in a pristine-tar like
workflow, because it wouldn't reproduce the same output. And I'd like to
use something better than the 20-year-old gzip.
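
To illustrate the distinction I'm making, here is a minimal sketch. It
assumes Python's standard lzma module, and it uses two different
compression presets as a stand-in for two different xz-utils versions (I
can't run Jessie's and Sid's xz side by side in one snippet). The function
name and the payload are made up for the example. The decompressed payload
is the same in both cases, but the .xz bytes, and therefore the checksum of
the orig.tar.xz file, are not:

```python
import hashlib
import lzma


def xz_digests(payload: bytes, preset: int) -> tuple[str, str]:
    """Return (sha256 of the .xz bytes, sha256 of the decompressed bytes)."""
    compressed = lzma.compress(payload, format=lzma.FORMAT_XZ, preset=preset)
    restored = lzma.decompress(compressed)
    return (
        hashlib.sha256(compressed).hexdigest(),
        hashlib.sha256(restored).hexdigest(),
    )


# Hypothetical payload standing in for the uncompressed orig.tar.
payload = b"upstream source tree contents\n" * 4096

# Two presets play the role of two xz-utils versions (an assumption made
# purely for illustration).
fast_xz, fast_raw = xz_digests(payload, preset=1)
best_xz, best_raw = xz_digests(payload, preset=9)

print("payload identical: ", fast_raw == best_raw)
print(".xz bytes identical:", fast_xz == best_xz)
```

Lossless only guarantees the first comparison. A pristine-tar like
workflow depends on the second one.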

>>>> We use "git archive" within the PKG OpenStack team to generate this
>>>> tarball (which is more or less the same as pristine-tar, except we use
>>>> upstream tags rather than a pristine-tar branch). The fact that xz
>>>> produces a different result makes it not reproducible. As a
>>>> consequence, it is very hard for us to use this system across
>>>> distributions (ie: use that in both Debian and Ubuntu, or in Sid &
>>>> Jessie). We need consistency.
> If you generate it once, as part of the release process, why do you
> need to generate it on different systems with different versions?

Because I'd like the Git repository to contain it, without the need to
pick up the file from the Debian archive. And to be exact: that's mostly
a need from contributors. I could live with the issue, but they can't.
This is mostly a need expressed by the Ubuntu/Canonical server team working
with me on OpenStack packaging on Alioth.

> And how does that have anything to do with what gets packaged in Debian.
> For Debian you only need to generate it once, why would you want to
> generate it anew every time you build a new Debian revision instead
> of just reusing the same tarball that is on the archive, if you don't
> keep source tarball releases around?

See above. It's a pristine-tar like workflow. Your question is
equivalent to: "why do people use pristine-tar?". The answer is: because
it's convenient to just use git, without having to look into the Debian
archive. And by the way, xz wouldn't be usable with pristine-tar for the
same reason.
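
In practice, "usable" here means that a tarball regenerated from Git has
the same checksum as the one already uploaded (the .dsc lists it under
Checksums-Sha256). A minimal sketch of that check, assuming Python's
hashlib, with function names that are only illustrative:

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large tarballs need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_recorded_checksum(path: str, expected_sha256: str) -> bool:
    """True only if the regenerated tarball is byte-identical to the upload."""
    return sha256_of_file(path) == expected_sha256.strip().lower()
```

If this returns False for a regenerated orig.tar file, the upload is
rejected, even though the unpacked content is the same.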

>>>> So it'd be super nice to have LZIP support in dpkg, and use that
>>>> instead of xz, archive wide.
>>>> Your thoughts everyone? Is there any reason why we wouldn't do that?
> Yes, replacing xz with lzip on .deb or .dsc packages does not make any
> sense.

That isn't what I care about. I only care about the orig.tar file here.

> Adding lzip support for source packages *might* make some sense, as
> I pointed out in the bug report. But doing so does have a very high cost:
>   <https://wiki.debian.org/Teams/Dpkg/FAQ#Q:_Can_we_add_support_for_new_compressors_for_.dsc_packages.3F>

I do understand the cost. But there's a valid reason. If you believe
there's something better than lzip, with the same properties, and
we had support for it, I'd happily adopt it. It is just that xz doesn't
work right now, and most likely will break again in future versions of

> Whenever considering to add a new compressor, all surrounding tools need
> to be modified to support it as well:
>   <https://wiki.debian.org/Teams/Dpkg/DebSupport>
>   <https://wiki.debian.org/Teams/Dpkg/DscSupport>
> That's a non-zero amount of work and time, and that does not take into
> account external tools and users. It would also not be usable until the
> next stable release. Also notice that for example there are still tools
> that do not support data.tar.xz in .deb, which has been the default for
> a while, which should give you an idea of what it takes.
> Adding a new compressor, that does not bring any significant benefit in
> compression ratio, speed or container format, that is either not widely
> used or widely available in many systems, just for the benefit of very
> few packages that might be releasing as well in other formats, or that
> can be easily recompressed, still does not seem worth it, no.

Well, xz can't be used for pristine-tar, and gzip is old and doesn't
compress as well. This alone is IMO a good reason.

Thomas Goirand (zigo)

P.S.: I'd prefer a consensus here to a CTTE bug.
