
Re: effectiveness of rsync and apt



Goswin von Brederlow <brederlo@informatik.uni-tuebingen.de> wrote:
> Bittorrent has a per chunk hash so it can validate each chunk when it
> receives it instead of waiting for the full file. It won't see if a
> chunk is present at some other position in the file, not even if that
> position is also on chunk boundaries.
> 
> Rsync has a per chunk Adler-32 and md4 checksum. Those chunk checksums
> are compared to a chunk at every byte position in the file. The
> Adler-32 checksum is fairly weak but it can be updated from one
> position to the next with minimal work. Only when it matches does
> rsync compute the expensive md4 checksum for the block.
> 
> 
> The only thing that is similar is the "per block" when generating the
> checksum, which is basically nothing.

	Actually it's quite a bit of similarity... but you're right, they
still are very different. From the article, it sounds like the author is
suggesting storing these checksum values for quick retrieval, which gets
closer to what BitTorrent is doing. If an rsync daemon were to hand out the
IP addresses of clients that are mirroring the exact same thing (which is
technically feasible, given that an rsync client could easily send its
relevant command-line arguments upstream), then rsync clients could talk to
each other, which would significantly lower the bandwidth requirements of
the top-level Debian mirrors.
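
	To make the difference Goswin describes concrete, here is a rough
Python sketch. It is not rsync's or BitTorrent's actual code: the weak sum is
a simplified Adler-32-style checksum, md5 stands in for md4 (which many
hashlib builds no longer ship), and the function names are made up for
illustration. SHA-1 is used for the per-piece comparison, as BitTorrent v1
does.

```python
import hashlib

MOD = 1 << 16  # both halves of the weak sum are kept modulo 2^16


def weak_sum(block: bytes) -> tuple[int, int]:
    """Adler-32-style weak checksum of a block (simplified, illustrative)."""
    n = len(block)
    a = sum(block) % MOD
    b = sum((n - i) * x for i, x in enumerate(block)) % MOD
    return a, b


def roll(a: int, b: int, out_byte: int, in_byte: int, n: int) -> tuple[int, int]:
    """Slide the window one byte to the right in constant time."""
    a = (a - out_byte + in_byte) % MOD
    b = (b - n * out_byte + a) % MOD
    return a, b


def find_block(data: bytes, block: bytes) -> int:
    """Return the first byte offset where block occurs in data, or -1.

    The cheap weak sum is checked at every offset; the strong hash
    (md5 here, as a stand-in for md4) is computed only on a weak match.
    """
    n = len(block)
    if n == 0 or len(data) < n:
        return -1
    target_weak = weak_sum(block)
    target_strong = hashlib.md5(block).digest()
    a, b = weak_sum(data[:n])
    pos = 0
    while True:
        if (a, b) == target_weak:
            if hashlib.md5(data[pos:pos + n]).digest() == target_strong:
                return pos
        if pos + n >= len(data):
            return -1
        a, b = roll(a, b, data[pos], data[pos + n], n)
        pos += 1


def same_index_pieces(old: bytes, new: bytes, n: int) -> list[int]:
    """Piece indices whose SHA-1 matches at the same position in both files.

    This mimics the per-piece comparison: piece i is only ever compared
    against piece i, never against other offsets.
    """
    matches = []
    for i in range(0, min(len(old), len(new)), n):
        if hashlib.sha1(old[i:i + n]).digest() == hashlib.sha1(new[i:i + n]).digest():
            matches.append(i // n)
    return matches


old = b"AAAABBBBCCCC"
new = b"x" + old  # one byte inserted at the front shifts every piece
shifted_offset = find_block(new, b"BBBB")
aligned_pieces = same_index_pieces(old, new, 4)
```

	With one byte inserted at the front, the per-piece comparison finds
nothing to reuse, while the rolling search still locates the old block at its
shifted offset. That is the part a cached-checksum scheme would have to keep
if it wants rsync's behaviour rather than BitTorrent's.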

> MfG

	?

	Cheers,
		Tyler


