
Re: apt repository usage



On 4/24/06, Goswin von Brederlow <brederlo@informatik.uni-tuebingen.de> wrote:
> "Brian Eaton" <eaton.lists@gmail.com> writes:
> > The main objection to using rsync seems to be that it increases the
> > CPU usage on the file server.  However, full-fledged rsync shouldn't
> > be necessary.  If you know the two files to be synchronized in
> > advance, then you could do the CPU intensive work up front.  You could
> > prepare a patch-like binary format that apt clients could then use to
> > update an older copy of the package.
>
> No, you can't with rsync. rsync generates the checksums on the client
> side and the server then runs a block-sized window over the file
> looking for any matching block. Caching that would require 20 times
> the file size.
>
> You have to reverse roles to get block checksums precached and that is
> what zsync does.
>
> Another thing that has been suggested is to provide patch packages
> that only contain the differences between two versions of a
> package. You would generate them once on ftp-master or something and
> get apt-get to pick either a full package or the patch package
> depending on the already installed version. But it looks like you have
> to change quite a bit in apt, aptitude and dpkg for this as well as
> figure out how exactly to build those patch debs.

Thanks for the pointer to zsync; that's exactly the kind of tool I was
talking about.  I'm glad to see someone has already written it.
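
To make sure I have the reversed roles right, here is a rough sketch of
the idea in Python.  It is not zsync's actual file format or checksum
scheme: the 4-byte block size, the function names, the additive weak
checksum and the use of SHA-1 are all my own assumptions for
illustration.  The server checksums each block of the new file once, up
front; the client slides a window over its old copy and only fetches
the blocks it cannot find locally.

```python
import hashlib

BLOCK = 4  # tiny block size, chosen only to keep the example readable


def weak(data: bytes) -> int:
    # Cheap additive checksum. A real tool would update this incrementally
    # as the window slides instead of recomputing it at every offset.
    return sum(data) & 0xFFFF


def make_control(new: bytes, block: int = BLOCK):
    """Server side, done once per file: (weak, strong) checksum per block."""
    return [
        (weak(new[i:i + block]), hashlib.sha1(new[i:i + block]).hexdigest())
        for i in range(0, len(new), block)
    ]


def plan(old: bytes, control, block: int = BLOCK):
    """Client side: map new-file block index -> offset of that block in old."""
    wanted = {}
    for idx, (w, strong) in enumerate(control):
        wanted.setdefault(w, []).append((idx, strong))
    have = {}
    for off in range(len(old) - block + 1):
        window = old[off:off + block]
        for idx, strong in wanted.get(weak(window), []):
            # The weak checksum only filters; the strong one confirms.
            if idx not in have and hashlib.sha1(window).hexdigest() == strong:
                have[idx] = off
    return have


def rebuild(old: bytes, new_on_server: bytes, control, block: int = BLOCK):
    """Reassemble the new file; return (data, bytes fetched from server)."""
    have = plan(old, control, block)
    out = bytearray()
    fetched = 0
    for idx in range(len(control)):
        if idx in have:
            out += old[have[idx]:have[idx] + block]
        else:
            # Stands in for a range request to a plain file server.
            chunk = new_on_server[idx * block:(idx + 1) * block]
            fetched += len(chunk)
            out += chunk
    return bytes(out), fetched


old = b"aaaabbbbccccdddd"
new = b"aaaaXXXXccccdddd"
data, fetched = rebuild(old, new, make_control(new))
print(data == new, fetched)
```

In this toy case only the one changed block is fetched, and all the
window scanning happens on the client.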

Do you know whether anyone has ever tried to measure the benefits of
shipping the patch packages instead of the full packages?  If the
development effort would only save 5% of the bandwidth on the apt
repository, it probably isn't worth the effort.
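
The sort of measurement I have in mind could be as simple as the sketch
below.  The function name and the sizes are made up for illustration;
they are not measurements from any real repository.  It assumes a
client takes the patch only when it is smaller than the full package.

```python
def bandwidth_saved(updates):
    """updates: list of (full_size, patch_size) in bytes, one per upgrade.

    Returns the fraction of bytes saved compared with always sending
    the full package.
    """
    full = sum(f for f, _ in updates)
    actual = sum(min(f, p) for f, p in updates)
    return 1 - actual / full


# Made-up sizes, purely illustrative.
sample = [(1_000_000, 100_000), (500_000, 600_000), (2_000_000, 400_000)]
print(f"{bandwidth_saved(sample):.1%}")
```

A real estimate would also have to count fresh installs, where no older
version exists to patch against, so the true figure would be lower than
what upgrades alone suggest.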

Regards,
Brian


