Re: /usr/share/doc/ files and gzip/xz/no compression

On Mon, Aug 15, 2011 at 11:59:07PM +0200, Andreas Barth wrote:
> * Lars Wirzenius (liw@liw.fi) [110815 23:27]:
> > On Mon, Aug 15, 2011 at 11:04:51PM +0200, Carsten Hey wrote:
> > > * Lars Wirzenius [2011-08-15 18:33 +0100]:
> > > >      raw     gz      xz
> > > >      584    163     134     file sizes (MiB)
> > > >        0    421     450     savings compared to raw (MiB)
> > > >     -421      0      29     savings compared to current gz (MiB)
> > In other words, it's 130 MiB against xz's 134 MiB. I'll leave it to
> > others to decide if it's a significant difference.
> bzip2 is definitely a more conservative choice than xz. If it's
> smaller, then it's superior to xz.

AFAIK, bzip2 has much worse decompression performance than xz. I took
dpkg's changelog, concatenated it to itself 10 times (11MB), and timed
each tool:

gzip: 0.377s, down to 2.7MB
gunzip: 0.077s

bzip2: 1.45s, down to 1.8MB
bunzip2: 0.420s

xz: 4.4s(!), down to 204K(!)
xz -d: 0.035s

So here bzip2 is an order of magnitude slower than xz at decompression.
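
For anyone who wants to repeat this kind of comparison, here is a rough
sketch using Python's standard gzip, bz2 and lzma modules rather than the
command-line tools. The bench() and run() helpers and the sample input are
made up for illustration, and the library defaults are not guaranteed to
match the CLI defaults, so the absolute numbers will differ from the ones
above; only the relative ordering is of interest.

```python
import bz2
import gzip
import lzma
import time

# (name, compress function, decompress function), all from the stdlib.
CODECS = [
    ("gzip", gzip.compress, gzip.decompress),
    ("bzip2", bz2.compress, bz2.decompress),
    ("xz", lzma.compress, lzma.decompress),
]


def bench(name, compress, decompress, data):
    """Time one in-memory compress/decompress round trip (hypothetical helper)."""
    t0 = time.perf_counter()
    packed = compress(data)
    t1 = time.perf_counter()
    unpacked = decompress(packed)
    t2 = time.perf_counter()
    assert unpacked == data  # sanity check: the round trip must be lossless
    return {
        "name": name,
        "size": len(packed),
        "compress_s": t1 - t0,
        "decompress_s": t2 - t1,
    }


def run(data):
    """Benchmark every codec in CODECS on the same input."""
    return [bench(name, c, d, data) for name, c, d in CODECS]


if __name__ == "__main__":
    # Stand-in for a repetitive changelog; replace with real file contents.
    sample = b"package (1.0) unstable; urgency=low\n  * Example entry.\n" * 20000
    for r in run(sample):
        print(f"{r['name']:6s} {r['size']:9d} bytes  "
              f"compress {r['compress_s']:.3f}s  "
              f"decompress {r['decompress_s']:.3f}s")
```

A single in-memory run like this is noisy; repeating each measurement a few
times and taking the minimum would give steadier figures.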

I've repeated the test on incompressible data (10MB from /dev/urandom),
and the numbers (compression / decompression time) are even worse for
bzip2:

gzip:  0.410s / 0.060s
bzip2: 2.400s / 0.960s
xz:    4.040s / 0.027s

So while xz is costly for compression, it's faster than even gzip for
decompression. bzip2's (huge!) decompression cost is what kept me
personally from using it seriously before xz appeared.
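
To put numbers on the decompression gap, the timings from the two tests can
be turned into ratios relative to xz. This is only arithmetic on the figures
quoted in this mail; the TIMES table and the slowdown_vs_xz() helper are
names made up for this sketch.

```python
# Decompression times in seconds, copied from the two tests in this mail.
TIMES = {
    "changelog x10 (11MB)": {"gzip": 0.077, "bzip2": 0.420, "xz": 0.035},
    "/dev/urandom (10MB)": {"gzip": 0.060, "bzip2": 0.960, "xz": 0.027},
}


def slowdown_vs_xz(times):
    """Return each tool's decompression time as a multiple of xz's time."""
    return {tool: t / times["xz"] for tool, t in times.items()}


for label, times in TIMES.items():
    ratios = slowdown_vs_xz(times)
    summary = ", ".join(f"{tool} {ratio:.1f}x" for tool, ratio in ratios.items())
    print(f"{label}: {summary}")
```

On these figures bzip2 takes about 12 times as long as xz to decompress the
changelog data and about 36 times as long on the random data, while gzip is
a little over twice as slow as xz in both cases.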

There is also information on Wikipedia about various compression
benchmarks, but IMHO if we want to switch from gzip then bzip2
doesn't make sense for /usr/share/doc.

