Re: /usr/share/doc/ files and gzip/xz/no compression
* Andreas Barth [2011-08-15 23:59 +0200]:
> * Lars Wirzenius (firstname.lastname@example.org) [110815 23:27]:
> > On Mon, Aug 15, 2011 at 11:04:51PM +0200, Carsten Hey wrote:
> > > * Lars Wirzenius [2011-08-15 18:33 +0100]:
> > > >  raw   gz   xz
> > > >  584  163  134  file sizes (MiB)
> > > >    0  421  450  savings compared to raw (MiB)
> > > > -421    0   29  savings compared to current gz (MiB)
> > In other words, it's 130 MiB against xz's 134 MiB. I'll leave it to
> > others to decide if it's a significant difference.
> bzip2 is definitely a more conservative choice than xz. If it's
> smaller, then it's superior to xz.
bzip2 compresses better on average for some file types, xz compresses
better on average for others:
                  gzip      bzip2         xz   bzip2+xz
text files    94312922   73496587   77783076   73496587
other files   16577181   14609893   14275484   14275484
sum          110890103   88106480   92058560   87772071
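
As a quick cross-check of the arithmetic in the table above, here is a
minimal Python sketch. The byte counts are copied from the table; the
dictionary layout and the function names are only illustrative
assumptions, not part of the original measurement.

```python
# Byte counts copied from the table above.
SIZES = {
    "text files":  {"gzip": 94312922, "bzip2": 73496587, "xz": 77783076},
    "other files": {"gzip": 16577181, "bzip2": 14609893, "xz": 14275484},
}

MIB = 1024 * 1024


def column_sums(sizes):
    """Sum each compressor's column over all file classes."""
    totals = {}
    for row in sizes.values():
        for tool, size in row.items():
            totals[tool] = totals.get(tool, 0) + size
    return totals


def best_per_class(sizes):
    """The 'bzip2+xz' column: the smaller of bzip2 and xz per class."""
    return {name: min(row["bzip2"], row["xz"]) for name, row in sizes.items()}


totals = column_sums(SIZES)
mixed = sum(best_per_class(SIZES).values())

print(totals)
print("bzip2+xz:", mixed)
# Differences between the totals, in MiB.
print("xz - bzip2:       %.2f MiB" % ((totals["xz"] - totals["bzip2"]) / MIB))
print("bzip2 - bzip2+xz: %.2f MiB" % ((totals["bzip2"] - mixed) / MIB))
```

The totals reproduce the "sum" row, and the difference between xz and
bzip2 comes out a little under 4 MiB.
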
The "other files" also include a lot of text files. If we compressed
Debian packages instead, xz would presumably win.
Anyway, I don't think this difference of 4 MiB on a desktop system is
significant.
I would prefer to avoid bloating the set of pseudo-essential packages
without a good reason, and I think users should be able to decompress all
files in /u/s/d. There are plans to let dpkg depend on liblzma2 instead
of xz, and it already depends on libbz2-1.0. If dpkg's dependency on
libbz2 is planned to be removed in the future, I would prefer to let
libbz2 vanish from the pseudo-essential set and use xz for /u/s/d as
well; otherwise I would prefer bzip2 over xz for /u/s/d.
Notes on the table:
xz: I used neither -e nor -9, but the difference should not be that big
on files in /usr/share/doc.
text files:
 find ... -regex '.*\(changelog\|copyright\|README\|TODO\|NEWS\).*[.]gz'
bzip2+xz: bzip2 for text files and xz for other files. This is of course
nothing we should consider doing.
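
For anyone who wants to repeat a comparison like this on their own
system, a rough Python sketch follows. It is not the script used for
the numbers above. It assumes every *.gz file below a directory is
decompressed and then recompressed in memory, that symlinks are
skipped, and that the levels are gzip -9, bzip2 -9 and xz at its
default preset 6. The function names are hypothetical; the regex is the
find(1) pattern above translated to Python syntax.

```python
import bz2
import gzip
import lzma
import os
import re
import zlib

# The find(1) -regex from above in Python syntax. find matches the
# whole path, hence fullmatch() below.
TEXT_RE = re.compile(r".*(changelog|copyright|README|TODO|NEWS).*[.]gz")


def classify(path):
    """Return the table row a path belongs to."""
    return "text files" if TEXT_RE.fullmatch(path) else "other files"


def compressed_sizes(data):
    """Compressed size in bytes of `data` for each compressor.

    Levels are assumptions: gzip -9, bzip2 -9, xz default preset 6.
    """
    return {
        "gzip": len(gzip.compress(data, compresslevel=9)),
        "bzip2": len(bz2.compress(data, compresslevel=9)),
        "xz": len(lzma.compress(data, preset=6)),
    }


def survey(root):
    """Sum compressed sizes of all readable *.gz files below `root`."""
    totals = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Assumption: only regular *.gz files are counted.
            if not name.endswith(".gz") or os.path.islink(path):
                continue
            try:
                with gzip.open(path, "rb") as fh:
                    data = fh.read()  # uncompressed content
            except (OSError, EOFError, zlib.error):
                continue  # unreadable or not really gzip: skip it
            row = totals.setdefault(
                classify(path), {"gzip": 0, "bzip2": 0, "xz": 0}
            )
            for tool, size in compressed_sizes(data).items():
                row[tool] += size
    return totals
```

Calling survey("/usr/share/doc") and printing the result gives numbers
in the same shape as the table. They will not match the figures above,
since the set of installed packages differs from system to system.
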