Hi,

I've been watching debian-cd, running on open, struggle with the trivia of generating things like Packages files and md5sums, and it occurs to me that much of this could do with some work.

For example, the md5sum files could be generated a lot quicker by a perl script that picks the pre-calculated md5sums out of the ftp archive's md5sum.gz file. If we replaced the whole for loop with a perl script, it would only need to read the md5sums once, could load them into a hash, and could then simply write them out on the basis of the files found in the CD directory.

Before I start this, does anyone have any reasons not to do this?

IMO it's actually better to use the master md5sums, because it gives a simple end-to-end sanity check. If the CD building machine had a tendency to corrupt data coming off its disks (as happened with a previous incarnation of open), or the mirroring run had failed in a subtle way, then loop mounting each image and running md5sum -c on it should catch that straight away, provided the md5sums were not recalculated locally.

A slightly more ambitious replacement would be to generate the Packages files by taking the records out of the main archive's Packages files.

Another thing that might help, but would need a significant redesign, would be to do all this for all the CD directories from all architectures at the same time (i.e. reorder the loop nesting), so that the perl would only need to look at the md5sums and Packages files once before splurging out the individual CD files.

Opinions?

In the meantime, I'll knock up an md5sum file generator and give it a try.

Cheers, Phil.

-- 
Say no to software patents!  http://petition.eurolinux.org/
|)| Philip Hands [+44 (0)20 8530 9560]  http://www.hands.com/
|-| HANDS.COM Ltd.                      http://www.uk.debian.org/
|(| 10 Onslow Gardens, South Woodford, London E18 1NE ENGLAND
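To make the md5sum idea above concrete, here is a rough sketch, written in Python rather than the perl the message has in mind. It is only an illustration under assumptions: the function names are made up, the master file is assumed to be a gzip-compressed listing in ordinary md5sum format, and the paths in it are assumed to match the layout under the CD directory. None of this is taken from debian-cd itself.

```python
import gzip
import os


def load_master_md5sums(path):
    """Read a gzip-compressed md5sum listing into a dict: relative path -> digest.

    Assumes the usual md5sum line format: "<digest>  <path>" or "<digest> *<path>".
    """
    sums = {}
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            parts = line.rstrip("\n").split(None, 1)
            if len(parts) != 2:
                continue  # skip blank or malformed lines
            digest, name = parts
            if name.startswith("*"):  # binary-mode marker used by md5sum
                name = name[1:]
            if name.startswith("./"):
                name = name[2:]
            sums[name] = digest
    return sums


def write_cd_md5sums(cd_dir, master, out_path):
    """Write md5sum lines for every file under cd_dir that the master list knows.

    Returns the relative paths with no master entry, so the caller can decide
    what to do with them (for example, hash them locally).
    """
    missing = []
    out_abs = os.path.abspath(out_path)
    with open(out_path, "w", encoding="utf-8") as out:
        for root, dirs, files in os.walk(cd_dir):
            dirs.sort()  # visit subdirectories in a stable order
            for fname in sorted(files):
                full = os.path.join(root, fname)
                if os.path.abspath(full) == out_abs:
                    continue  # do not list the output file itself
                rel = os.path.relpath(full, cd_dir).replace(os.sep, "/")
                digest = master.get(rel)
                if digest is None:
                    missing.append(rel)
                else:
                    out.write(f"{digest}  ./{rel}\n")
    return missing
```

The master list is read once and could be reused for every CD directory. Files that exist only on the CD, such as locally generated index files, come back in the returned list because the master list has no digest for them.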