
Re: Solving the compression dilemma when rsync-ing Debian versions



On Mon, Jan 08, 2001 at 08:27:53AM +1100, Sam Couter wrote:
> Otto Wyss <otto.wyss@bluewin.ch> wrote:
> > 
> > So why not solve the compression problem at the root? Why not try to
> > change the compression so that it produces a compressed result with
> > the same (or similar) difference rate as the source?
> 
> Are you going to hack at *every* different kind of file format that you
> might ever want to rsync, to make it rsync-friendly?
> 
> Surely it makes more sense to make rsync able to deal with different
> formats more efficiently.

I think you reach the right conclusion, but for the wrong reason.

Either you fix rsync for each of n file formats, or you fix n file formats
for rsync :)

The advantage of doing it in rsync-land is that you can do a better job: you
apply the inverse of the compression at both ends, calculate the differences,
and re-apply compression (probably gzip rather than the original algorithm,
but it depends) to the differences.  Trying to hack compression algorithms to
fit rsync is in general a bad idea.  Rusty could probably get away with it
for gzip, because gzip is very simple - decompressing it is just interpreting
codes like "repeat the 17 characters you saw 38 characters ago".

Other, more sophisticated algorithms, like bzip2 (go and read about the
Burrows-Wheeler Transform, it's amazing ;) would be much harder to hack in any
reasonable way.
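The reason it's harder: the BWT sorts all rotations of a whole block, so a
one-byte change in the input can reshuffle the sort order and scramble the
entire output block - there is no local code you can patch the way you can
with gzip's back-references.  A toy forward transform (naive rotation sort;
real bzip2 adds run-length and Huffman stages on top):

    def bwt(block):
        # A sentinel that sorts before everything else marks the
        # original rotation, which makes the transform invertible.
        block = block + "\0"
        rotations = sorted(block[i:] + block[:i]
                           for i in range(len(block)))
        # The transform is the last column of the sorted rotations.
        return "".join(row[-1] for row in rotations)

    print(bwt("banana"))   # -> "annb\0aa"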

--

|> |= -+- |= |>
|  |-  |  |- |\

Peter Eckersley
(pde@cs.mu.oz.au)
http://www.cs.mu.oz.au/~pde
	
for techno-leftie inspiration, take a look at
http://www.computerbank.org.au/


