
Re: Debian's problems, Debian's future

On Tue, 2002-04-09 at 17:25, Martijn van Oosterhout wrote:
> What you are suggesting is that the server store checksums for precalculated
> blocks on the server. This would be 4 bytes per 1k in the original file or
> so. The transaction proceeds as follows:
> 1. Client asks for checksum list off server
> 2. Client calculates checksums for local file
> 3. Client compares list of server with list of client
> 4. Client downloads changed regions.
> Note, this is not the rsync algorithm, but the one that is possibly
> patented.
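
For reference, the four quoted steps can be sketched roughly as below. This is only an illustration, not anyone's actual implementation: CRC32 stands in for the unspecified 4-byte checksum, the function names are made up, and both files are held in memory instead of being fetched over the network.

```python
import zlib

BLOCK_SIZE = 1024  # the quoted message suggests roughly 1k blocks


def block_checksums(data: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """One 4-byte checksum per fixed-size block (CRC32 is an assumption)."""
    return [zlib.crc32(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


def changed_blocks(server_sums: list[int], client_sums: list[int]) -> list[int]:
    """Indices of server blocks that the client is missing or has different."""
    return [i for i, s in enumerate(server_sums)
            if i >= len(client_sums) or client_sums[i] != s]


def update(client_data: bytes, server_data: bytes,
           block_size: int = BLOCK_SIZE) -> tuple[bytes, list[int]]:
    """Hypothetical end-to-end run; returns the new file and the fetched indices."""
    # 1. Client asks for the checksum list (precalculated on the server).
    server_sums = block_checksums(server_data, block_size)
    # 2. Client calculates checksums for its local file.
    client_sums = block_checksums(client_data, block_size)
    # 3. Client compares the two lists.
    todo = changed_blocks(server_sums, client_sums)
    # 4. Client downloads only the changed regions and patches its copy.
    blocks = [client_data[i * block_size:(i + 1) * block_size]
              for i in range(len(server_sums))]
    for i in todo:
        blocks[i] = server_data[i * block_size:(i + 1) * block_size]
    return b"".join(blocks), todo
```

Because the blocks sit at fixed offsets, inserting a single byte near the start of the file shifts every later block, so all of them get downloaded again. Avoiding that is what rsync's rolling checksum is for, which is one way to see that this is not the rsync algorithm.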

This looks like an interesting algorithm, so I decided to compare it to
the diff scheme analyzed in 

The above message also gives my analysis methodology.

The results:

- The following table summarizes the performance of the checksum-based
scheme and the diff-based scheme under the assumption that users tend to
perform apt-get update often.  I think disk space is cheap and bandwidth
is expensive, so 20 days of diffs is the best choice.

Scheme                         Disk space         Bandwidth
Checksums (bwidth optimal)            26K               81K
diffs (4 days)                        32K              331K
diffs (9 days)                        71K               66K
diffs (20 days)                      159K               27K

- The analysis is unfairly favorable to the checksum scheme, because I
do not count the bandwidth required to request all the changed blocks,
only the bandwidth used to transmit the changed blocks.

- For the user model in the message above, the optimal block size for
this algorithm is around 245 bytes.

- In the diff-based scheme, each mirror can decide on a
diskspace/bandwidth tradeoff by simply keeping more old diffs or
deleting some old diffs.  The checksum-based scheme doesn't really
support tweaking at the mirror.

- I tend to update every day.  For people who update every day, the
diff-based scheme only needs to transfer about 8K, but the
checksum-based scheme needs to transfer 45K.  So for me, diffs are
better. :)
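
On the block-size point above: a rough way to see why an optimum exists at all is the toy model below. It is not the user model from the analysis. It simply assumes every change lands in its own block, that the whole checksum list is downloaded on each update, and that a checksum is 4 bytes. The function names and the numbers in the usage note are hypothetical.

```python
def transfer_cost(file_size: int, n_changes: int, block_size: int,
                  checksum_bytes: int = 4) -> int:
    """Bytes transferred under the toy model: checksum list plus changed blocks."""
    n_blocks = -(-file_size // block_size)  # ceiling division
    return checksum_bytes * n_blocks + n_changes * block_size


def best_block_size(file_size: int, n_changes: int,
                    max_block: int = 4096) -> int:
    """Brute-force search for the block size with the lowest transfer cost."""
    return min(range(1, max_block + 1),
               key=lambda b: transfer_cost(file_size, n_changes, b))
```

For a hypothetical 1,000,000-byte file with 100 scattered changes, best_block_size(1_000_000, 100) picks 200 bytes, which matches the closed form sqrt(4 * file_size / n_changes) for this model. Small blocks make the checksum list expensive and large blocks make each change expensive, so the minimum sits in between.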


