
Re: IDEA to SERIOUSLY reduce download times!



Jason Gunthorpe <jgg@ualberta.ca> writes:

> On Wed, 7 Jul 1999, John Travers wrote:

> > Is there a way to make patches between package
> > updates/upgrades. i.e. the way the kernel is done? So instead of
> > downloading whole new binaries, we could just download source
> > patches and recompile the package...

> > I have sent this idea before...

> > Any possibility?

> There is a faint hope.. The best way to go about this is to design a new
> transfer protocol that can do the rsync method on the INTERIOR of gzip
> compressed files, this is not a trivial thing, but if someone can
> demonstrate it working then there is hope :> 

(Actually, the original poster is asking to download diffs of the
source, which we essentially provide already.)

Also, xdelta already does binary diffs of gzipped files (it used to do
.deb and .rpm directly, but that feature was removed).  It is available
as a library.
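For reference, a hedged sketch of the xdelta command-line usage (this is
xdelta 1.x syntax, the version current when this thread was written; the
later xdelta3 uses `xdelta3 -e` / `xdelta3 -d` instead, and the file
names here are just placeholders):

```shell
# Skip gracefully if xdelta 1.x isn't installed.
command -v xdelta >/dev/null 2>&1 || { echo "xdelta not installed"; exit 0; }

printf 'version one of some payload\n' > old.bin
printf 'version two of some payload\n' > new.bin

# Emit a compressed binary delta; like diff, xdelta may exit nonzero
# when the inputs differ, so don't treat that as failure.
xdelta delta old.bin new.bin changes.xd || true

# Apply the delta to the old file to reconstruct the new one.
xdelta patch changes.xd old.bin rebuilt.bin
cmp rebuilt.bin new.bin   # byte-identical reconstruction
```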

> Some variations are rsyncing the files in a gzip .deb using the local
> files and other stuff like that - all very doable, but difficult.
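The underlying difficulty is easy to demonstrate: deflate is stateful, so
a small change at the start of the input perturbs nearly every compressed
byte after it, and rsync's rolling checksums find almost nothing to reuse.
A quick illustration with plain gzip (the `--rsyncable` patch, which
periodically resets the compressor state, later became the standard
workaround):

```shell
# Two inputs that differ only in their first line.
seq 1 20000 > old.txt
sed '1s/.*/CHANGED/' old.txt > new.txt

gzip -c old.txt > old.gz
gzip -c new.txt > new.gz

# Count differing byte positions: nearly the whole stream has changed,
# even though the uncompressed inputs are 99.99% identical.
cmp -l old.gz new.gz 2>/dev/null | wc -l
```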

At one point, as an experiment, I wrote a perl script that extracted
two rpm files, generated lists of new, modified, and deleted files, and
stored the list of deleted files along with a tar.gz of the new files
and a tar.gz of xdeltas of the modified files.  (I think I also tried
detecting moved files via md5sums.)  I stuffed the resulting pieces
into an ar archive and compared the sizes to the new packages.  (My
sample was the then-current updates for Red Hat - I can't remember the
exact version.)  I ended up with a net savings of about 50%.  Not too
shabby, but to me it didn't justify moving away from the simpler
status quo of just distributing new packages.

According to Google, my original post is in the April 1998 archives of
rpm-list, but the links are dead because archive.redhat.com is being
reorganized.

OK, a power search for "dunham xdelta rpm" on Dejanews does the trick.

:  unpacks the cpio part of two rpm archives
:  constructs a file list with md5sum's in perl hashes
:  identifies deleted, renamed, new and changed files
:
:  generates an xdelta of each changed file
:  generates a list of delete and rename operations (gzipped)
:  builds a cpio.gz of the new files
:  builds a cpio of the xdelta files (xdelta files are already compressed)
:
:  makes an "ar" archive of the results of the previous three steps

: Package        Orig Ver   O.Size  New Ver   N.Size  Diff Size
: =============  =========  ======  ========  ======  =========
: ncurses        1.9.9e-6     524k  1.9.9e-8    524k        47k
: ncurses-devel  1.9.9e-6     382k  1.9.9e-8    382k         4k
: perl           5.004-1     3128k  5.004-4    3260k      1186k
: util-linux     2.7-11       297k  2.7-15      344k        57k
: pine           3.96-3       895k  3.96-7      897k       281k
: mh             6.8.4-4     1152k  6.8.4-4    1156k       489k
: glibc          2.0.5c-10   2559k  2.0.7-6    3855k      2149k
: =============================================================
: Total                      8937k            10418k      4213k
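The file-classification step in that recipe can be sketched in shell
(toy directories stand in here for the two unpacked cpio trees, and
rename detection via matching md5sums is omitted for brevity):

```shell
# Toy stand-ins for the two unpacked trees (in the real script these
# would come from something like `rpm2cpio pkg.rpm | cpio -id`).
mkdir -p old new
printf same > old/a.txt; printf same > new/a.txt
printf v1   > old/b.txt; printf v2   > new/b.txt
printf gone > old/c.txt; printf new  > new/d.txt

# Per-tree manifests: md5sum plus path, sorted by path so join(1) works.
( cd old && find . -type f -exec md5sum {} + | sort -k2 ) > old.md5
( cd new && find . -type f -exec md5sum {} + | sort -k2 ) > new.md5

# Same path, different hash -> changed (candidates for xdelta);
# path only in the old tree -> deleted; only in the new tree -> new.
join -j2 old.md5 new.md5 | awk '$2 != $3 {print $1}' > changed.list
join -v1 -j2 old.md5 new.md5 | awk '{print $1}' > deleted.list
join -v2 -j2 old.md5 new.md5 | awk '{print $1}' > new.list
```

From there the real script just tarred up the new files, xdelta'd the
changed ones, and bundled everything with the deletion list into an ar
archive.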

Anyway, this could be expanded to make patches that bring multiple
older versions of a package up to the current one (I had envisioned a
collection of patches wrapped in a self-extracting shell script, used
like Solaris patch clusters or MS service packs).  In the end, I
decided that it wasn't worth the effort - a collection of normal
package files is a perfectly fine solution to system updates.


Steve
dunham@cse.msu.edu

