Bug#855357: rsyncable tweaks for find|cpio|gzip; add cpio --reproducible



Package: initramfs-tools
Version: 0.120+deb8u2
Severity: wishlist
File: /usr/sbin/mkinitramfs

Several times a month I build Debian Live images and rsync them to
remote sites that use them on diskless kiosk farms.

This is annoying: when there have been only a few security updates
since the last build, rsync takes a few seconds to upload 300MB of
fs.squashfs, but a few MINUTES to upload 15MB of initrd.img.

I did some proof-of-concept testing of some hacks to improve rsync's
ability to detect similarities between near-identical initrd.img files:

  * 3x speedup: call gzip with --rsyncable.

  * 11x speedup: sort the CPIO contents by mtime (oldest first).

  * 5x speedup: both together.

  * <inconclusive>: try to align rsync --block-size and cpio --io-size
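
For reference, a minimal sketch of the gzip half of these hacks. The
helper name compress_rsyncable is hypothetical and not part of
mkinitramfs; the -n flag (keep the name and timestamp out of the gzip
header) is an addition that was not part of the testing above, and the
fallback assumes that a gzip supporting --rsyncable lists it in its
--help output.

```shell
#!/bin/sh
# Hypothetical helper (not part of mkinitramfs): compress stdin to stdout,
# using --rsyncable only when the installed gzip advertises the option.
compress_rsyncable() {
    if gzip --help 2>&1 | grep -q -- '--rsyncable'; then
        # -n omits the original name and timestamp from the gzip header.
        gzip -n --rsyncable
    else
        # Fall back to plain gzip so the pipeline still produces output.
        gzip -n
    fi
}
```

It would be used at the end of the pipeline, e.g.
"... | cpio -o -H newc | compress_rsyncable > initrd.img".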

My proof-of-concept (breaks error detection) patch was this:

    --- /usr/sbin/mkinitramfs	2016-04-18 02:58:00.000000000 +1000
    +++ -	2017-02-17 16:04:36.984798989 +1100
    @@ -353,1 +353,1 @@
    -		find . 4>&-; echo "ec1=$?;" >&4
    +		find . -printf "%T@ %p\\n" | sort -n | cut -d" " -f2- 4>&-; echo "ec1=$?;" >&4
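
A variant of the same idea that uses NUL-terminated records, so that
file names containing newlines survive the sort. This is only a sketch:
the function name list_by_mtime is hypothetical, it assumes GNU find,
sort and cut (cut -z needs a reasonably recent coreutils), and LC_ALL=C
is an addition so that sort -n parses the fractional seconds the same
way in every locale. It does not restore the error detection either.

```shell
#!/bin/sh
# Hypothetical helper: list everything under the current directory,
# oldest mtime first, as NUL-terminated paths (GNU tools assumed).
list_by_mtime() {
    # Emit "<mtime> <path>" records terminated by NUL instead of newline.
    find . -printf '%T@ %p\0' |
        # Numeric sort on the leading timestamp, NUL-delimited records.
        LC_ALL=C sort -z -n |
        # Drop the timestamp; "-f 2-" keeps paths that contain spaces.
        cut -z -d ' ' -f 2-
}
```

The output is meant to be fed to "cpio -0" rather than plain cpio.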



I also looked at how Dracut does its find|cpio|gzip, and
these improvements look worth stealing for mkinitramfs:

  find . -print0 | sort -z | cpio -0 --reproducible

    * more determinism (--reproducible),
    * guarantees lexicographic ordering (sort), and
    * doesn't choke on \n in filenames (-print0 / -z / -0).
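
Spelled out a little more, that pipeline might look like the sketch
below. The function names are hypothetical; LC_ALL=C (byte-wise sort
order regardless of locale) and the "-o -H newc --quiet" options are
assumptions added to make the example complete, and --reproducible is
only passed when the installed cpio lists it in its --help output,
since not every cpio build has it.

```shell
#!/bin/sh
# Hypothetical helpers sketching a Dracut-style find | sort | cpio pipeline.

# NUL-terminated, byte-wise sorted list of everything under the current dir.
list_sorted() {
    find . -print0 | LC_ALL=C sort -z
}

# Write a newc-format cpio archive of the current directory to stdout,
# adding --reproducible only if this cpio build advertises the option.
build_archive() {
    if cpio --help 2>&1 | grep -q -- '--reproducible'; then
        list_sorted | cpio -0 -o -H newc --quiet --reproducible
    else
        list_sorted | cpio -0 -o -H newc --quiet
    fi
}
```

Run from inside the staging directory, "build_archive | gzip -n" would
then produce the compressed image on stdout.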



Are any of these worth baking into the Debian defaults for mkinitramfs?
The main benefit I see for "regular" users is when updating debian-installer ramdisks, e.g.

    rsync://mirror.internode.on.net/debian/dists/stable/main/installer-amd64/current/images/netboot/debian-installer/amd64/initrd.gz

...although maybe these aren't built with mkinitramfs.
(And most people seem to prefer the giant DVD .ISO installers anyway.)

