
Synching mirrors and clients (was: Re: apt-proxy v2 and rsync)



> > Can anyone explain why rsync is no longer considered an appropriate
> > method for fetching Packages files?
> 
> IIRC the problem is that rsync is quite CPU-heavy on the servers, so while
> the mirrors have the (network) resources to feed downloads to 100s of
> users, they don't have the (CPU) resources for a few dozen rsyncs.
> 
Why do you keep saying this without providing _any_ figures?

Why always sync the full mirror when only about 1% of the files change
daily? Just switch to single-file syncing and most of your CPU load is
gone. Single-file rsync needs no CPU power to discover the changed
files. It touches only the changed files, about 1%, so disk access is
much lower, and the CPU load probably drops as well.
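
A rough sketch of how a client could find that 1% by itself, assuming it
keeps yesterday's Packages index and has already fetched today's. The
function names and the simplified stanza parsing are hypothetical
illustrations, not taken from any existing mirror script:

```python
def parse_packages(text):
    """Map Filename -> MD5sum for each stanza of a Packages index.

    Simplified: continuation lines (multi-line fields) are ignored.
    """
    index = {}
    for stanza in text.strip().split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Skip continuation lines and anything without a "Key: value" shape.
            if line[:1].isspace() or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        if "Filename" in fields:
            index[fields["Filename"]] = fields.get("MD5sum", "")
    return index


def changed_files(old_text, new_text):
    """Return the files that are new or whose checksum differs."""
    old = parse_packages(old_text)
    new = parse_packages(new_text)
    return sorted(name for name, md5 in new.items() if old.get(name) != md5)
```

Each path returned by changed_files() could then be fetched on its own,
so the server never has to walk the whole tree to find out what changed.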

If gzip --rsyncable were used, the CPU load would drop dramatically,
much lower than with _any_ other syncing method. As a side effect, the
bandwidth used would drop as well. IMO rsync is very useful
if done right.

Proof of concept
================
To finally produce some figures and prove this concept, two servers are
needed: the first an ordinary source mirror, the second a secondary
mirror with a separate mirror directory for each test case. The CPU load
is measured on the first server while the different sync scripts are run
on the second:

- Ordinary full-mirror rsync as in use today
- DpartialMirror sync script ("http://dpartialmirror.sourceforge.net/")
- Deb-mirror sync script
- ???
- Sync with wget, etc.

IMO this will show which is the best solution for full mirrors. 
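
One possible way to take the CPU measurement on the source mirror,
sketched under the assumption that it runs Linux and exposes the usual
aggregate "cpu" line in /proc/stat. Take one sample before a sync run
and one after; the helper names are hypothetical:

```python
def parse_cpu_line(line):
    """Return (busy, total) jiffies from the aggregate 'cpu' line."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Use at most the first eight counters (user .. steal); the guest
    # counters that newer kernels append are already included in user/nice.
    values = [int(v) for v in parts[1:9]]
    total = sum(values)
    # The 4th counter is idle, the 5th (if present) is iowait.
    idle = values[3] + (values[4] if len(values) > 4 else 0)
    return total - idle, total


def cpu_load(before, after):
    """Fraction of CPU time spent busy between two samples."""
    busy = after[0] - before[0]
    total = after[1] - before[1]
    return busy / total if total else 0.0


def sample(path="/proc/stat"):
    """Read the current counters; the first line is the aggregate one."""
    with open(path) as f:
        return parse_cpu_line(f.readline())
```

Calling sample() before and after each test case and feeding both
results to cpu_load() gives one comparable figure per sync method,
provided nothing else is loading the machine during the run.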

Now limit the secondary mirror to support only one architecture and do
the test above again. This will show the best solution for the commonly
used mirrors.
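
Restricting the file list to one architecture could look roughly like
this. It relies on the usual name_version_arch.deb naming of binary
packages and ignores source files; the function name is hypothetical:

```python
def for_architecture(filenames, arch):
    """Keep packages built for `arch` plus architecture-independent ones."""
    wanted = ("_%s.deb" % arch, "_all.deb")
    # str.endswith accepts a tuple and matches any of the suffixes.
    return [name for name in filenames if name.endswith(wanted)]
```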

In a third step, limit the packages to what an ordinary user has
installed; just use popularity-contest, or I could provide my
dpkg --get-selections output. This will show which solution is best for
servers under the load that clients cause.
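
A minimal sketch of that third restriction, assuming the selections are
available as the text printed by dpkg --get-selections (one package name
and one state per line). All function names here are hypothetical:

```python
def installed_packages(selections_text):
    """Return the names of packages whose selection state is 'install'."""
    names = set()
    for line in selections_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1] == "install":
            # Newer dpkg versions may print "name:arch"; keep only the name.
            names.add(fields[0].split(":")[0])
    return names


def package_name(filename):
    """Extract the package name from a name_version_arch.deb path."""
    base = filename.rsplit("/", 1)[-1]
    return base.split("_", 1)[0]


def for_selection(filenames, selections_text):
    """Keep only the files that belong to selected packages."""
    wanted = installed_packages(selections_text)
    return [name for name in filenames if package_name(name) in wanted]
```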

Now, if you feel adventurous, repack as many packages as possible on the
source mirror with gzip --rsyncable and note the difference.

O. Wyss

-- 
See a huge pile of work at "http://wyodesktop.sourceforge.net/"


