
Re: get debian smaller



> Linux. After all there comes the file contents. BZip2 compresses the
> whole archive. At the beginning I thought to reach a much better
> compression rate than before but gzip is not so bad as I thought.

Bzip2 is indeed good, but being based on the Burrows-Wheeler transform it
probably performs worse than current 'state of the art' PPM compressors. It
would be really interesting to see the effect of your patch with a PPM
compressor. A while ago I ran a test of several different compressors on the
source archive:

Packer  total size  compression  improvement from gz  loss from best  time (s)
tar     25764.5Mb   0            -2.42284             0.786226        0
gzip    7527.25Mb   0.707845     0                    0.268287        41010.3
zzip    5987.13Mb   0.767621     0.204605             0.0800635       441934
szip    6501.28Mb   0.747666     0.1363               0.152816        37523.5
bzip2   6479.37Mb   0.748516     0.139211             0.149951        43998.6
PPMd2   8631.81Mb   0.664973     -0.146742            0.36192         36657.1
PPMd3   7199.65Mb   0.72056      0.0435215            0.234993        39515.3
PPMd4   6549.61Mb   0.74579      0.129879             0.159067        41225.7
PPMd5   6223.04Mb   0.758465     0.173265             0.114937        43005
PPMd6   6027.49Mb   0.766055     0.199245             0.0862218       44711.3
PPMd7   5892.44Mb   0.771297     0.217185             0.0652795       46276.5
PPMd8   5796.15Mb   0.775034     0.229978             0.0497504       47621.4
PPMd9   5731.01Mb   0.777562     0.238631             0.0389513       48880.5
PPMd10  5688.98Mb   0.779193     0.244215             0.0318506       50777.1
PPMd11  5661.65Mb   0.780254     0.247846             0.0271773       50990.6
PPMd12  5638.57Mb   0.78115      0.250912             0.0231944       51868.9
PPMd13  5625.2Mb    0.781669     0.252688             0.0208738       52697.2
PPMd14  5613.63Mb   0.782118     0.254225             0.0188553       53441
PPMd15  5609.34Mb   0.782285     0.254795             0.0181046       54141.2
PPMd16  5605.05Mb   0.782451     0.255366             0.0173525       54776.9

bzip2               = bzip2 -9
PPMdx               = ppmd -o x -m 220
compression         = relative compression
improvement from gz = relative improvement from gzip compression
loss from best      = relative size difference compared to picking, for each
                      package, whichever compressor does best on that package
time (s)            = total compression time
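
For anyone who wants to check the columns, here is a rough sketch of how they
appear to be derived from the sizes. The formulas are inferred from the numbers
in the table, not taken from the script that produced it, and the function
names are made up for illustration. The "best per package" total is not listed
in the table, so the sketch backs it out of the tar row, which is an
assumption.

```python
# Sketch only: formulas inferred from the table, names are hypothetical.

def compression(size, original):
    """Fraction of the uncompressed size that was saved."""
    return 1.0 - size / original

def improvement_over(size, baseline):
    """Relative size reduction compared to a baseline compressor (gzip here)."""
    return 1.0 - size / baseline

def loss_from_best(size, best):
    """Fraction of this result that the per-package best choice would save."""
    return 1.0 - best / size

# Sizes in Mb, copied from the table.
TAR = 25764.5
GZIP = 7527.25
ZZIP = 5987.13
PPMD16 = 5605.05

# Assumption: recover the "best per package" total from the tar row,
# where the loss from best is listed as 0.786226.
BEST = TAR * (1.0 - 0.786226)

if __name__ == "__main__":
    for name, size in [("gzip", GZIP), ("zzip", ZZIP), ("PPMd16", PPMD16)]:
        print(name,
              round(compression(size, TAR), 6),
              round(improvement_over(size, GZIP), 6),
              round(loss_from_best(size, BEST), 6))
```

Since the sizes in the table are rounded, the recomputed values can differ
from the listed ones in the last digit or so.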
 
PPM compressors were too slow to use until recently [0]. PPMd is a
demonstration program based on that article, and I even got as far as creating
a demo package [1]. However, a more usable program that contains PPMd code is
7zip [2]. Since it was originally written for Windows, I'm not sure about the
current state of the port, if there even is one now (comments, Radim?).

0. http://DataCompression.info/Miscellaneous/PPMII_DCC02.pdf
1. oxtan.campus.luth.se/debian/ppmd
2. http://www.7-zip.org/
-- 
Magnus Ekdahl 0739-287181 magnus@debian.org maguno@ludd.luth.se
public key available at http://oxtan.campus.luth.se/magnus.public
ftp://ftp.se.debian.org/debian-non-US/pool/non-US/main/d/debian-keyring/
Key fingerprint = 18DE CB62 8A86 374E 824E  09ED 1987 4B18 1213 79F6


