Re: compressor
On Fri, Aug 24, 2012 at 11:04:56AM +0200, Gaël DONVAL wrote:
> On Thursday, 23 August 2012 at 20:24 +0800, lina wrote:
> >
> > Sorry, here you mean,
> >
> > once tar -Jcf a.tar.xz a
> >
> > again
> > tar -Jcf a.tar.xz a.tar.xz
> > ?
> No, I think this was a joke :)
Yes, it was a joke :) but it was based on a recent article in which someone
expressed surprise that multiple manual passes of a compressor (gzip, I
think) resulted in smaller file sizes. (I couldn't find a copy of the
article to link to.)
> In most programs, there is a "depth" or "pass number" parameter that
> does just this already. If you try to compress again, the overhead
> induced by the container (headers and such) will ultimately increase the
> file size.
Most compressors work on a block-based, limited-window model in order to
support stream operation, so the compressor never has a global view of the
data being compressed. That's why subsequent manual passes can (sometimes)
have a good effect, especially with e.g. enormous log files with a lot of
repetition: local areas of the file are treated in isolation, but the
resulting compressed blocks themselves contain a lot of (compressed!)
repetition that a second pass can pick up. In practice it's almost
certainly very rarely worth bothering.
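A quick way to see the effect (my own sketch, not from the article): gzip's
DEFLATE window is only 32 KiB, so on a pathologically repetitive input the
first pass emits a near-periodic stream of back-reference codes, which a
second pass can compress further. The "log line" below is made up for
illustration.

```python
import gzip

# Hypothetical, highly repetitive "log file": one line repeated many
# times, far exceeding gzip's 32 KiB window.
line = b"2012-08-24 11:04:56 GET /index.html 200\n"
data = line * 200_000  # ~8 MB

once = gzip.compress(data, compresslevel=9)   # first pass
twice = gzip.compress(once, compresslevel=9)  # second pass, on the .gz itself

print(len(data), len(once), len(twice))

# Round-tripping two decompressions recovers the original bytes.
assert gzip.decompress(gzip.decompress(twice)) == data
```

On input like this the second pass shrinks the file further; on already
well-mixed data it instead just adds container overhead, as noted above.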