
Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide



Thanks for the detailed explanations.

Russ Allbery wrote:
Inversely, by not accepting .lz source tarballs Debian is sending the
message that lzip is not a good format to use,

While it's possible that people may decide to read such messages into our
decisions, that's not really something we have control over, and I don't
believe we should alter our decisions on that basis.

IMHO, if two options are equally good for the Debian use case but one of them is better for most of the users, Debian should choose the one that is better for most of the users. Otherwise, what is the use of the social contract?

"We will be guided by the needs of our users and the free software community. We will place their interests first in our priorities.
[...]
we will provide an integrated system of high-quality materials with no legal restrictions that would prevent such uses of the system."

I can assure you that the first interest of most users is to not lose their data. I am also pretty sure that neither lzma-alone nor xz counts as "high-quality materials".


(I believe the whole decision-making process for xz required a year or
two.)

In a year or two, nobody noticed (or cared about) the deficient design of xz? Wow!

I agree with what you said in another message that it is depressing to have so many compression formats. But the bad decisions of Debian (among others) certainly contribute to this situation. A selection process that chooses a defective format (lzma-alone), then skips over a flawless format that improves on gzip and bzip2 (lzip), and later replaces the first defective format with another defective format (xz), really needs some improvement.

It is also depressing for me to have to point out the defects in both the xz format and the Debian decision-making process. The current situation should never have happened. Xz should have been evaluated by an expert and rejected outright, instead of wasting a year or two on popularity contests just to reach the wrong decision. If high quality is desired, then democracy is not the best way of deciding technical questions. Anybody thinking that popular equals good should be using Windows instead of bringing crappy Windows formats to POSIX systems.


So, there are two things in Debian that could switch to lzip, and two
separate paths forward for making an argument for those two things.  I
don't think this thread is really helping with either path, and would
recommend that you focus gathering of evidence (backed by real numbers and
reproducible statistics, the way the xz discussion previously was)

I agree that this thread is not helping, but I think it is because Debian is not the place I thought it was.

In particular, I am very surprised by the insistence on statistics in this thread. IMHO, statistics, especially of the "popularity contest" kind, are out of place in this decision. This is an ethical and technical decision about using in Debian packaging either: a) a defective format implemented by public domain code that can be easily made non-free, or b) a flawless format implemented by truly free code that will remain free[1].

[1] http://www.debian.org/intro/free

But I'll gladly provide some real numbers:

1) LZMA2 is just plain LZMA wrapped in yet another container format. The variant of LZMA implemented by lzip is better than xz for general-purpose compression (for example, of heterogeneous tarballs like those distributed by Debian). Vincent Lefevre already provided a summary[2] of the differences between lzip and xz. More examples from gnu.org can be found in the lzip benchmark[3].

[2] http://lists.debian.org/debian-devel/2015/06/msg00256.html
[3] http://www.nongnu.org/lzip/lzip_benchmark.html

Here are a couple of tests from Debian packages:

-rw-r--r-- 1   2058 Jul 29 17:11 gddrescue_1.19-2_control.tar.gz
-rw-r--r-- 1   2038 Jul 29 17:11 gddrescue_1.19-2_control.tar.lz
-rw-r--r-- 1 101365 Jul 29 17:12 gddrescue_1.19-2_data.tar.lz
-rw-r--r-- 1 101608 Jul 29 17:12 gddrescue_1.19-2_data.tar.xz
-rw-r--r-- 1   1124 Jul 29 17:04 lzip_1.17-1_control.tar.gz
-rw-r--r-- 1   1125 Jul 29 17:04 lzip_1.17-1_control.tar.lz
-rw-r--r-- 1  70343 Jul 29 17:05 lzip_1.17-1_data.tar.lz
-rw-r--r-- 1  70440 Jul 29 17:05 lzip_1.17-1_data.tar.xz
-rw-r--r-- 1  60240 Jul 29 16:30 lzip_1.17.orig.tar.lz
-rw-r--r-- 1  60484 Jul 29 16:30 lzip_1.17.orig.tar.xz
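
As a rough check of the listing above, the relative size differences can be worked out directly from the byte counts. This is only a minimal sketch: the names `sizes` and `saving_percent` are illustrative and not part of the original message, and the numbers are copied from the listing as given.

```python
# Sizes in bytes, copied from the listing above.
# Each entry: (base name, size of the .lz file, size of the other file, other format)
sizes = [
    ("gddrescue_1.19-2_control.tar", 2038, 2058, "gz"),
    ("gddrescue_1.19-2_data.tar", 101365, 101608, "xz"),
    ("lzip_1.17-1_control.tar", 1125, 1124, "gz"),
    ("lzip_1.17-1_data.tar", 70343, 70440, "xz"),
    ("lzip_1.17.orig.tar", 60240, 60484, "xz"),
]


def saving_percent(lz_size, other_size):
    """Percentage by which the .lz file is smaller than the other file.

    A negative result means the .lz file is larger.
    """
    return 100.0 * (other_size - lz_size) / other_size


for name, lz_size, other_size, other_fmt in sizes:
    diff = other_size - lz_size
    pct = saving_percent(lz_size, other_size)
    print(f"{name}: .lz is {diff:+d} bytes ({pct:+.2f}%) smaller than .{other_fmt}")
```

A positive figure means the .lz file is the smaller of the pair; a negative one means it is larger.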

2) Because lzip by default requires much less memory than xz to decompress small files, it would be possible for lzip to replace both gzip and xz in .deb packages, simplifying the format and the tools managing it. The memory required to decompress the files above is:

( 11 kB)   gddrescue_1.19-2_control.tar.gz
( 82 kB)   gddrescue_1.19-2_control.tar.lz
(377 kB)   gddrescue_1.19-2_data.tar.lz
(8.4 MB)   gddrescue_1.19-2_data.tar.xz
( 12 kB)   lzip_1.17-1_control.tar.gz
( 82 kB)   lzip_1.17-1_control.tar.lz
(180 kB)   lzip_1.17-1_data.tar.lz
(8.4 MB)   lzip_1.17-1_data.tar.xz
(377 kB)   lzip_1.17.orig.tar.lz
(8.4 MB)   lzip_1.17.orig.tar.xz
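
The memory figures above can be turned into ratios with a few lines of arithmetic. This is a sketch under one stated assumption: 1 MB is taken as 1000 kB (the message does not say whether the units are decimal or binary, and the ratios shift slightly with 1024). The names `memory_kb` and `ratio_to_lz` are illustrative only.

```python
# Assumption: 1 MB = 1000 kB. The original message does not specify this.
KB_PER_MB = 1000

# Decompression memory in kB, copied from the list above.
memory_kb = {
    "gddrescue_1.19-2_control.tar": {"gz": 11, "lz": 82},
    "gddrescue_1.19-2_data.tar": {"lz": 377, "xz": 8.4 * KB_PER_MB},
    "lzip_1.17-1_control.tar": {"gz": 12, "lz": 82},
    "lzip_1.17-1_data.tar": {"lz": 180, "xz": 8.4 * KB_PER_MB},
    "lzip_1.17.orig.tar": {"lz": 377, "xz": 8.4 * KB_PER_MB},
}


def ratio_to_lz(entry):
    """Return (other format, memory of other format / memory of lz)."""
    other_fmt = next(fmt for fmt in entry if fmt != "lz")
    return other_fmt, entry[other_fmt] / entry["lz"]


ratios = {name: ratio_to_lz(entry) for name, entry in memory_kb.items()}

for name, (other_fmt, ratio) in ratios.items():
    print(f"{name}: {other_fmt} needs {ratio:.2f}x the memory of lz")
```

A ratio above 1 means the other format needs more memory than lzip for that file; a ratio below 1 means it needs less.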

3) In spite of being the official lzma successor and receiving a lot of publicity (xz has 2.5 pages on Wikipedia), xz is not that much more popular than lzip. For example, about 15% of GNU projects distribute xz tarballs vs 5% that distribute lzip tarballs. If xz were any good, that percentage would be 100%, and I would be using it just like everybody else.

In conclusion:

From 1) and 2) it can be concluded that lzip is ideal for the use case of Debian packaging. Because lzip requires little memory to decompress small files it could also replace gzip in other uses like compressing man pages, maybe allowing Debian to use just one format for all its needs.


Best regards,
Antonio.

