Hello,

This is a follow-up to the earlier thread about binary patches ("Idea to SERIOUSLY reduce download times!!"). I have, as promised, created a new version in a higher-level language than shell scripts (Python). I've compressed and attached the files. (I'd include them verbatim, but they were a little large before I compressed them, 15K-18K; that's what I get for using a wordy language :) )

They'll untar into a subdirectory called 'pkgpatch' with three files: debfile.py, debdiff.py and debpatch.py.

debfile.py is a [hopefully] general-purpose library that I wrote to facilitate the other two programs (in fact, if you look, you'll see that they're currently just wrappers around calls into it). It provides proper Python objects for Debian binary archives and for my form of binary patches, with a number of useful features.

debdiff.py is an analog to the 'diff' program. It operates on two .deb files and generates a patch between them in the current directory. (I could trivially let the source or destination package be a reference to a package installed on the system; see debpatch.py.)

debpatch.py is an analog to the 'patch' program. It operates on a patchfile and either a .deb file or a package installed on the system (in the second case it calls dpkg-repack), i.e., either 'debpatch <diffile> package_ver_arch.deb' or 'debpatch <diffile> package'. It generates both a new package and a reverse patch (from the new package to the old one).
(The second step, generating the reverse patch, might be removed, or it might be more tightly integrated with the first. The thing is that we can probably infer the reverse patch more efficiently from information available while we're applying the forward one, but doing it with the builddiff() routine is safer and efficient enough, so I'm inclined to decouple it even more.)

I don't have lots of versions of packages hanging around on my system, and I'd appreciate it if people could test these and see whether (a) they generate sufficiently small patches most of the time and (b) they correctly regenerate the patched files. (Note that gzipped files may have a different compression level after patching, so you may have to run 'zdiff' to compare them.)

With (a) I'm especially interested in changes just between Debian builds; I suspect that new upstream releases will often make enough changes that the patches get really big. Data on the topic is of course welcome :) I imagine that, for example, games with lots of data files that don't change between versions will generate nice compact patches from one version to the next.

The only currently pending backend feature that I haven't implemented is looking for files which are identical but not in the same place. It shouldn't be too hard to do.

Crossing my fingers and hoping this program is a good idea :),
Daniel

-- 
Whoever created the human body left in a fairly basic design flaw. It has a tendency to bend at the knees.
  -- Terry Pratchett, _Men at Arms_
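
For the pending feature mentioned above (finding files that are identical but not in the same place), one possible approach is to index both unpacked package trees by content hash. The following is only a minimal sketch under that assumption; it is not taken from debfile.py, and the names hash_file, index_tree and find_moved_files are hypothetical. It assumes both packages have already been unpacked into directories.

```python
import hashlib
import os


def hash_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def index_tree(root):
    """Map content digest -> list of paths (relative to root) with that content."""
    index = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            # Skip symlinks and special files; only regular files are hashed.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            rel = os.path.relpath(full, root)
            index.setdefault(hash_file(full), []).append(rel)
    return index


def find_moved_files(old_root, new_root):
    """Return {new_path: old_path} for files whose content exists in the old
    tree, but not at the same relative path."""
    old_index = index_tree(old_root)
    moved = {}
    for digest, new_paths in index_tree(new_root).items():
        old_paths = old_index.get(digest)
        if not old_paths:
            continue  # content is new; it needs a real diff or a full copy
        for new_path in new_paths:
            if new_path not in old_paths:
                # Pick a deterministic source when several old copies exist.
                moved[new_path] = sorted(old_paths)[0]
    return moved
```

A patch generator could then record each entry as "copy from old_path" instead of storing the file contents again. Files that are unchanged at the same path are deliberately not reported, since an ordinary per-path comparison would already catch them.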
Attachment:
pydeb.tar.gz
Description: Binary data