Re: Apt & rsync
On Sat, 16 Oct 1999, Jason Gunthorpe wrote:
> Part of the problem is the wide dispersion in load that different rsync
> users can cause. Someone using rsync for just a single file would
> not cause much server loading, but someone transferring a whole archive, or
> doing file lists or doing md5's or whatever generates huge amounts of
> load.
>
> Since they are grouped all together it is really hard to fairly balance
> things. Keep in mind that debian.org FTP sites do a huge amount of
> traffic, so small load issues are quite important. Right now most sites
> limit to around 10 rsync connections at once.
Let me see if I understand the issue. rsync is optimized to make a single
transfer run as quickly as possible, and it does so with bandwidth- and
CPU-hungry techniques such as running many transfers at once. For Debian
package distribution to end users, all the machinery for handling
directories correctly is irrelevant; the only thing we might care about is
the base algorithm for transferring files, one or maybe two at a time. So
the rsync daemon is not really the right starting point. Is that right?
> > You need access to the previous _compressed_ version, yes. So people
> > tracking unstable would probably want to do this, while people sticking
> > with stable distributions probably wouldn't. It's a disk space/bandwidth
> > tradeoff.
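It occurred to me after I wrote this that with tools like dpkg-repack it's
unnecessary to keep the original compressed archives around. Repacking
creates more client-side load, which is acceptable. Note that it's not
necessary to recreate the initial archive exactly: the rsync algorithm only
needs a starting file that shares most of its content with the target, and
whatever differs is sent as literal data.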
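To make the "doesn't have to be exact" point concrete, here is a minimal
sketch of the block-matching idea behind the rsync algorithm. It is not the
rsync protocol or its wire format; the function names, the 16-byte block
size, and the use of MD5 as the strong checksum are my own assumptions for
illustration.

```python
import hashlib

def weak_sum(data):
    """Rolling-style weak checksum: two 16-bit sums over the block."""
    a = sum(data) & 0xFFFF
    b = sum((len(data) - i) * x for i, x in enumerate(data)) & 0xFFFF
    return a, b

def make_signatures(basis, block):
    """Client side: checksum every full block of the file it already has."""
    sigs = {}
    for off in range(0, len(basis) - block + 1, block):
        chunk = basis[off:off + block]
        sigs.setdefault(weak_sum(chunk), []).append(
            (off, hashlib.md5(chunk).digest()))
    return sigs

def make_delta(sigs, new, block):
    """Server side: describe `new` as copies from the basis plus literals."""
    ops, literal = [], bytearray()
    n, i = len(new), 0
    a, b = weak_sum(new[:block]) if n >= block else (0, 0)
    while i + block <= n:
        match = None
        candidates = sigs.get((a, b))
        if candidates:
            strong = hashlib.md5(new[i:i + block]).digest()
            for off, digest in candidates:
                if digest == strong:
                    match = off
                    break
        if match is not None:
            if literal:
                ops.append(("data", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", match))
            i += block
            if i + block <= n:
                a, b = weak_sum(new[i:i + block])
        else:
            literal.append(new[i])
            if i + block < n:
                # Slide the window one byte without rescanning the block.
                out_byte, in_byte = new[i], new[i + block]
                a = (a - out_byte + in_byte) & 0xFFFF
                b = (b - block * out_byte + a) & 0xFFFF
            i += 1
    literal.extend(new[i:])
    if literal:
        ops.append(("data", bytes(literal)))
    return ops

def apply_delta(basis, ops, block):
    """Client side: rebuild the new file from its basis and the delta."""
    out = bytearray()
    for kind, value in ops:
        out += basis[value:value + block] if kind == "copy" else value
    return bytes(out)
```

The client would call make_signatures() on its repacked .deb, the server
would answer with make_delta(), and apply_delta() reconstructs the current
file. Only the regions where the repacked file differs from the mirror's
copy travel as "data" entries; everything else is a short "copy" reference.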
> The person generating the .deb would need to use this, not the person
> downloading.
I'm not sure we're talking about the same thing here. Let me summarize
the algorithm as I currently see it:
a) Debian developer creates the .deb file, using a modified rsync-friendly
gzip. (Let's call it 'rzip' for now.)
b) Developer uploads the .deb to incoming using ftp.
c) Archive is distributed to the mirrors using the current rsync setup,
which would run faster because of the friendly compression.
d) User uses dpkg-repack to create a .deb close to the .deb currently on
the mirror.
e) User uses the rsync algorithm (but not the current rsync protocol, for
the reasons you point out) to update .deb to current .deb.
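For step (a), one way such an rsync-friendly compressor could work is to
reset the compressor's state at boundaries chosen from the content itself,
so that a local change in the input only disturbs the output locally. The
sketch below does this with Python's zlib module and Z_FULL_FLUSH. The
function name, window size, and modulus are made up for illustration, and
this is not an existing 'rzip' tool.

```python
import zlib

WINDOW = 32    # bytes in the rolling window (assumed value)
MODULUS = 64   # boundary roughly every MODULUS bytes; tiny, for demonstration

def rsync_friendly_compress(data, window=WINDOW, modulus=MODULUS):
    """Raw-deflate `data`, resetting the compressor at content-defined points."""
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)  # -15: raw deflate, no header
    out = []
    rolling = 0
    start = 0
    for i, byte in enumerate(data):
        # Maintain the sum of the last `window` bytes.
        rolling += byte
        if i >= window:
            rolling -= data[i - window]
        # The boundary depends only on nearby content, not on file position,
        # so it lands in the same place after an earlier insertion or deletion.
        if i + 1 >= window and rolling % modulus == 0:
            out.append(comp.compress(data[start:i + 1]))
            # A full flush byte-aligns the output and discards the history,
            # so what follows no longer depends on what came before.
            out.append(comp.flush(zlib.Z_FULL_FLUSH))
            start = i + 1
    out.append(comp.compress(data[start:]))
    out.append(comp.flush())
    return b"".join(out)
```

The result is still an ordinary deflate stream, so
zlib.decompress(blob, -15) reads it back unchanged. The price is a somewhat
worse compression ratio, since every reset throws away the dictionary; a
real tool would use a much larger modulus than this demonstration value.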
Do you think it's infeasible to get developers to switch to the new
compression technique?
--Dylan Thurston
dpt@math.berkeley.edu