Bug#855357: rsyncable tweaks for find|cpio|gzip; add cpio --reproducible
Package: initramfs-tools
Version: 0.120+deb8u2
Severity: wishlist
File: /usr/sbin/mkinitramfs
Several times a month I build Debian Live images and rsync them to
remote sites that use them on diskless kiosk farms.
What annoys me is that when there have been only a few security updates,
rsync takes a few seconds to upload 300MB of fs.squashfs,
but a few MINUTES to upload 15MB of initrd.img.
I did some proof-of-concept testing of some hacks to improve rsync's
ability to detect similarities between near-identical initrd.img files:
* 3x speedup: call gzip with --rsyncable.
* 11x speedup: sort the CPIO contents by mtime (oldest first).
* 5x speedup: both together.
* <inconclusive>: try to align rsync --block-size and cpio --io-size
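For reference, the first two experiments combine roughly as below. This is a minimal sketch, not the mkinitramfs code: the helper names list_oldest_first and build_initrd are made up for illustration, and it assumes GNU find, sort and sed, plus a gzip build that accepts --rsyncable.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of mkinitramfs.

# Print every path under directory $1 (relative to it), oldest mtime
# first, NUL-terminated so unusual filenames survive.
list_oldest_first() {
    ( cd "$1" && find . -printf '%T@ %p\0' ) |
        sort -z -n |
        sed -z 's/^[^ ]* //'    # strip the leading "mtime " sort key
}

# Pack directory $1 into a gzip-compressed newc cpio archive at $2.
build_initrd() {
    list_oldest_first "$1" |
        ( cd "$1" && cpio --quiet -0 -o -H newc ) |
        gzip --rsyncable >"$2"
}
```

Something like "build_initrd /path/to/unpacked-tree initrd.img" would then stand in for the find|cpio|gzip step. The idea behind oldest-first ordering is that files which rarely change end up at the front of the archive.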
My proof-of-concept patch (which breaks error detection) was this:
--- /usr/sbin/mkinitramfs 2016-04-18 02:58:00.000000000 +1000
+++ - 2017-02-17 16:04:36.984798989 +1100
@@ -353,1 +353,1 @@
- find . 4>&-; echo "ec1=$?;" >&4
+ find . -printf "%T@ %p\\n" | sort -n | cut -d" " -f2 4>&-; echo "ec1=$?;" >&4
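The error detection breaks because $? now reports the status of cut, the last command in the pipeline, instead of find. One possible way to keep it, sketched here outside the mkinitramfs context, is to group find with the echo and pipe the group's output onward. The function name is hypothetical, fd 4 is assumed to be open as in the patch, and cut uses -f2- rather than -f2 so that paths containing spaces are kept whole.

```shell
#!/bin/sh
# Sketch only: the function name is hypothetical. The caller must have
# fd 4 open; it receives the status line in the same "ec1=N;" form.

list_oldest_first_checked() {
    {
        find "$1" -printf '%T@ %p\n' 4>&-
        echo "ec1=$?;" >&4      # $? is still find's exit status here
    } | sort -n 4>&- | cut -d' ' -f2- 4>&-
}
```

A caller could capture the status line with, for example, status=$(list_oldest_first_checked . 4>&1 >filelist). Like the patch, this variant is newline-delimited, so it still mishandles filenames containing \n.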
I also compared how Dracut does the find|cpio|gzip, and
these improvements look like they are worth stealing for mkinitramfs:
find . -print0 | sort -z | cpio -0 --reproducible
* more determinism (--reproducible),
* guarantees lexicographic ordering (sort), and
* doesn't choke on \n in filenames (-print0 / -z / -0).
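A minimal sketch of how that could look as a standalone helper follows. The function names are hypothetical, and it assumes GNU find and sort plus a cpio build that accepts --reproducible. LC_ALL=C is my own addition, not something taken from Dracut: without it the sort order depends on the builder's locale.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical. Assumes GNU find/sort and
# a cpio that supports --reproducible.

# Print every path under directory $1 in byte-wise sorted order,
# NUL-terminated so filenames containing newlines are handled.
list_sorted() {
    ( cd "$1" && find . -print0 ) | LC_ALL=C sort -z
}

# Write a newc cpio archive of directory $1 to stdout.
archive_sorted() {
    list_sorted "$1" |
        ( cd "$1" && cpio --quiet -0 -o -H newc --reproducible )
}
```

The output of archive_sorted would then be piped into the compressor as before.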
Are any of these worth baking into the Debian defaults for mkinitramfs?
The main benefit I see for "regular" users is when updating debian-installer ramdisks, e.g.
rsync://mirror.internode.on.net/debian/dists/stable/main/installer-amd64/current/images/netboot/debian-installer/amd64/initrd.gz
...although maybe these aren't built with mkinitramfs.
(And most people seem to prefer the giant DVD .ISO installers anyway.)