Guys (especially Richard),

I've been looking for a while at ways to make jigdo run faster when
generating template files from ISO images. (See
http://lists.debian.org/debian-cd/2003/12/msg00041.html for my
original mail.) The core problem is that once we have an ISO image,
jigdo essentially has to brute-force that image back into a binary
blob (template file) and a list of files needed to rebuild the image
(jigdo file). On my home system with very good disks and a reasonable
processor, creating jigdo files for a DVD ISO image can take several
hours. Multiply that up by 11 architectures...

There are a few ways to improve this that I can see:

 1. Modify jigdo so it knows about the internals of ISO images and can
    efficiently scan them (bad, not very generic for jigdo)

 2. Write a helper tool to dump extra information for jigdo to use
    alongside the ISO image (helper tool written, but modifying jigdo
    to use this looks HARD)

 3. Patch mkisofs to write .jigdo and .template files alongside the
    ISO image

I've now done #3, and the patch for mkisofs is at

  http://www.einval.com/~steve/software/CD/mkisofs-JTE.patch.gz

In the same directory I have a tool to dump the contents of (and
rebuild images from) .jte files and another one to dump the contents
of .template files.

How to use it:
==============

To use this code, specify the location of the output .jigdo, .template
and .jte files alongside the ISO image. The .jte file is an
intermediate helper file that I'll probably lose for the next
release. You can also specify the minimum size beneath which files
will just be dropped into the binary template file data rather than
listed as separate files to be found on the mirror.

For example:

  mkisofs -J -r -o /home/steve/test1.iso \
          -jigdo-helper /home/steve/test1.jte \
          -jigdo-jigdo /home/steve/test1.jigdo \
          -jigdo-template /home/steve/test1.template \
          -jigdo-min-file-size 16384 \
          /mirror/jigdo-test

If the -jigdo-* options are not used, the normal mkisofs execution
path is not affected.
The above invocation will create 4 output files. I've tested
extensively with various input data and I can recreate ISO images
using jigdo-file and the wrapper jigdo-mirror.

How it works:
=============

I've hooked all the places in mkisofs where it will normally write
image data. All the normal data write calls (dir entries etc.) I
simply pass through and build into the template file. Any *file* data
entries are passed through with information about the original
file. If that file is large enough, I grab the filename and the MD5 of
the file's data so I can just write a file match record into the
template file (and then the jigdo file).

How fast is it?
===============

On my *laptop* (600MHz P3, slow laptop disk) I can make a template
file in parallel with the ISO image from a typical 500MB data set in
about 2 minutes. By simply not creating the ISO (-o /dev/null), this
time halves again. The data set I'm using here is a copy of the woody
i386 r2 update CD, as it's a handy image I had lying around.

What's left to do?
==================

 1. Testing! :-) This is where you lot come in! Please play with this
    some more and let me know if you have any problems, especially
    with data corruption.

More features:

 2. Add support for -jigdo-exclude option(s), so that we can exclude
    (from the jigdo) README.* etc. and other files that go on Debian
    CDs but often change on the mirrors. Reasonably easy to do, and
    I'm playing with this now.

 3. Add pattern-matching in the .jigdo file (e.g. /mirror/debian ->
    Debian:). Again, should be easy.

 4. Cosmetic cleanup of the .jigdo output. Easy.

 5. MUCH harder: re-reading and re-encoding .iso images that have been
    modified since they were first written. This is necessary for the
    boot code used on several architectures in debian-cd. I see how to
    do it - basically diff the image on disk to the one we would
    recreate from the .template file and write a new template file to
    match that. It's going to take some work...
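
As an illustration of the scheme described under "How it works", here
is a minimal Python sketch of the per-write decision. It is not the
code from the mkisofs patch and does not reproduce the real .template
or .jigdo formats: the names RawData, FileMatch and build_records are
hypothetical, and each write is assumed to arrive as a (filename,
data) pair, with the filename set to None for non-file data such as
directory entries. The threshold mirrors the -jigdo-min-file-size
value from the example invocation.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple, Union

# Assumption: same threshold as -jigdo-min-file-size in the example above.
MIN_FILE_SIZE = 16384


@dataclass
class RawData:
    """Bytes copied straight into the template (hypothetical record type)."""
    data: bytes


@dataclass
class FileMatch:
    """Reference to a file to be fetched from a mirror (hypothetical record type)."""
    name: str
    size: int
    md5: str


Record = Union[RawData, FileMatch]


def build_records(
    writes: Iterable[Tuple[Optional[str], bytes]],
    min_size: int = MIN_FILE_SIZE,
) -> List[Record]:
    """Turn a stream of image writes into template records.

    Each write is (filename, data); filename is None for non-file data
    such as directory entries.
    """
    records: List[Record] = []
    for name, data in writes:
        if name is not None and len(data) >= min_size:
            # Large enough file: record only its name, size and MD5.
            digest = hashlib.md5(data).hexdigest()
            records.append(FileMatch(name, len(data), digest))
        elif records and isinstance(records[-1], RawData):
            # Merge consecutive raw chunks into one record.
            records[-1].data += data
        else:
            # Metadata or a file below the threshold: keep the bytes inline.
            records.append(RawData(data))
    return records
```

With this sketch, a directory entry followed by a 10-byte file ends up
as one inline chunk, while a 20000-byte file becomes a single match
record holding only its name, size and checksum.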

I hope people find this useful - at the moment I shudder at the
thought of releasing sarge (10+ CDs, netinst, business card, 2 DVDs
per arch) without making this kind of change. It'll take a week to
generate the release images otherwise...

-- 
Steve McIntyre, Cambridge, UK.                         steve@einval.com
"It's actually quite entertaining to watch ag129 prop his foot up on
 the desk so he can get a better aim." [ seen in ucam.chat ]