
Binary patch update


  This is a follow-up to the earlier thread about binary patches ("Idea to
SERIOUSLY reduce download times!!").  As promised, I have written a new version
in a higher-level language than shell script (Python).  I've compressed
and attached the files.  (I'd include them verbatim, but they were a little
large before I compressed them -- 15K-18K -- that's what I get for using a
wordy language :) )  They untar into a subdirectory called 'pkgpatch'
containing three files: debfile.py, debdiff.py and debpatch.py.

  debfile.py is a [hopefully] general-purpose library that I wrote to facilitate
the other two programs (in fact if you look you'll see that they're currently
just wrappers around calls into it).  It provides proper Python objects for
Debian binary archives and my form of binary patches, with a number of useful

  debdiff.py is an analog to the 'diff' program.  It operates on two .deb
files and generates a patch between them in the current directory.  (I could
trivially let the source or destination package be a reference to a package
installed on the system; see debpatch.py.)
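
  To make the idea concrete, here is a rough sketch of the first thing a
package diff has to work out: which files were added, removed, changed or left
alone.  This is NOT the code in debdiff.py; it assumes both packages have
already been unpacked into plain directories, it only looks at regular files
(no symlinks, permissions or maintainer scripts), and the names file_digests()
and classify_files() are made up for the example.

```python
import hashlib
import os


def file_digests(root):
    """Map each regular file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                continue  # this sketch ignores symlinks entirely
            with open(full, "rb") as fh:
                digests[os.path.relpath(full, root)] = hashlib.sha256(fh.read()).hexdigest()
    return digests


def classify_files(old_root, new_root):
    """Hypothetical helper: sort paths into added/removed/changed/unchanged."""
    old = file_digests(old_root)
    new = file_digests(new_root)
    common = set(old) & set(new)
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "changed": sorted(p for p in common if old[p] != new[p]),
        "unchanged": sorted(p for p in common if old[p] == new[p]),
    }
```

  Only the "added" and "changed" files need to travel in the patch, which is
where the savings would come from when most of a package stays the same.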

  debpatch.py is an analog to the 'patch' program.  It operates on a patchfile
and either a .deb file or a package installed on the system (in the second
case it calls dpkg-repack), i.e., either
'debpatch <patchfile> package_ver_arch.deb' or 'debpatch <patchfile> package'.
It generates both a new package and a reverse patch (from the new package back
to the old one).  The second step might be removed, or it might be more
tightly integrated with the first.  We can probably infer the reverse patch
more efficiently from information available while we're applying the forward
one, but doing it with the builddiff() routine is safer and efficient enough,
so I'm inclined to decouple it even more.
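
  A toy model of the "apply, then re-diff to get the reverse patch" approach
might look like the sketch below.  It is only an illustration under loud
assumptions: a package is modelled as a dict mapping path to file contents,
the patch layout is invented for the example, and build_diff() here is a
stand-in, not the real builddiff() routine in debfile.py.

```python
def build_diff(old, new):
    """Return a toy patch turning `old` into `new` (both map path -> bytes)."""
    return {
        # Paths present in the old package but absent from the new one.
        "remove": sorted(p for p in old if p not in new),
        # Paths that are new or whose contents differ, with their new contents.
        "write": {p: data for p, data in new.items() if old.get(p) != data},
    }


def apply_patch(old, patch):
    """Apply a toy patch to a copy of `old` and return the result."""
    new = dict(old)
    for path in patch["remove"]:
        del new[path]
    new.update(patch["write"])
    return new


def patch_with_reverse(old, forward):
    """Apply `forward`, then derive the reverse patch by diffing new -> old."""
    new = apply_patch(old, forward)
    reverse = build_diff(new, old)
    return new, reverse
```

  The reverse patch never looks inside the forward one; it only compares the
two finished packages, so a bug in patch application can't silently produce a
matching but wrong reverse patch.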

  I don't have many versions of packages lying around on my system, so I'd
appreciate it if people could test these and see whether (a) they generate
sufficiently small patches most of the time and (b) they correctly regenerate
the patched files.  (Note that gzipped files may end up with a different
compression level after patching, so you may have to run 'zdiff' to compare
them.)  With (a) I'm especially interested in changes just between Debian
builds; I suspect that new upstream releases will often make enough changes
that the patches get really big.  Data on the topic is of course welcome :)
I imagine that, for example, games with lots of datafiles that don't change
between versions will generate nice compact patches from one version to the
next.
  The only pending backend feature that I haven't implemented yet is detecting
files which are identical but have moved to a different place.  It shouldn't
be too hard to do.
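
  One way this could work (a sketch of the idea only, since nothing like it is
implemented yet) is to index the old package's files by a hash of their
contents and look each new file up in that index.  As an assumption for the
example, a package is again modelled as a dict mapping path to file contents,
and find_relocated() is a made-up name.

```python
import hashlib


def find_relocated(old, new):
    """Map each new path to an old path holding identical contents.

    `old` and `new` map path -> bytes.  Files that are unchanged at the same
    path are skipped, since an ordinary diff already handles them.
    """
    old_by_digest = {}
    for path in sorted(old):
        digest = hashlib.sha256(old[path]).hexdigest()
        # Keep the first path (in sorted order) for each distinct content.
        old_by_digest.setdefault(digest, path)

    relocated = {}
    for path, data in new.items():
        if old.get(path) == data:
            continue  # same contents at the same path: nothing to do
        source = old_by_digest.get(hashlib.sha256(data).hexdigest())
        if source is not None:
            relocated[path] = source
    return relocated
```

  A patch could then record "copy from <old path>" for those files instead of
shipping their contents again.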

   Crossing my fingers and hoping this program is a good idea :),

  Whoever created the human body left in a fairly basic design flaw.  It has a
tendency to bend at the knees.

             -- Terry Pratchett, _Men at Arms_

Attachment: pydeb.tar.gz
Description: Binary data
