
Please use gzip "--no-name" flag on ftp.debian.org



The timestamp stored in the gzip file header makes the output non-deterministic even when the input is identical. Many files (e.g. under indices/) are frequently regenerated with identical contents, but their compressed versions end up slightly different. This unnecessarily inflates the number of unique file hashes that snapshot.debian.org, for example, has to deal with. It may also make mirror updates less efficient.
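To illustrate the effect, here is a minimal sketch using Python's standard gzip module rather than the gzip command itself (the archive's actual tooling is not shown in this message, so treat this purely as a demonstration). The payload and the two timestamps are made-up values. With different header timestamps, only the four MTIME bytes of the header differ; with the timestamp fixed to 0, the output is byte-for-byte reproducible.

```python
import gzip

def gzip_bytes(data: bytes, mtime: int) -> bytes:
    """Compress data, storing the given timestamp in the gzip header."""
    return gzip.compress(data, mtime=mtime)

# Hypothetical payload standing in for a regenerated index file.
payload = b"identical index contents\n"

# Two runs a minute apart: same input, different header timestamp.
first = gzip_bytes(payload, mtime=1_000_000_000)
second = gzip_bytes(payload, mtime=1_000_000_060)

# Timestamp fixed to 0, comparable in effect to what "gzip --no-name" stores.
stable_a = gzip_bytes(payload, mtime=0)
stable_b = gzip_bytes(payload, mtime=0)

# MTIME is a 4-byte little-endian field at offset 4 of the gzip header.
print("timestamped outputs equal:", first == second)
print("only MTIME differs:", first[:4] == second[:4] and first[8:] == second[8:])
print("fixed-mtime outputs equal:", stable_a == stable_b)
```

Note that "--no-name" also omits the original file name from the header, which this sketch does not need to handle because gzip.compress never stores one.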

I ran into this while trying to find out when certain changes were made: I compared the checksums of indices/files/components/suite-stable.list.gz across updates and found that the checksum changes on every update. The same obviously applies to many other compressed files.

The quick fix is to add "--no-name" to the gzip command (or GZIP=-n to the environment). A better fix would be to generate a temporary file, compare it to the current file, and replace the current file only if the two differ. This would preserve the timestamp of the original file and should help some mirroring protocols.
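The "better fix" could look roughly like the following sketch, again in Python and under stated assumptions: the function name replace_if_changed is hypothetical, the compressed data is produced with a zero header timestamp, and the temporary file is created in the same directory as the target so that the final rename stays on one filesystem. It is not meant to describe how the archive scripts actually work.

```python
import filecmp
import gzip
import os
import tempfile

def replace_if_changed(path: str, data: bytes) -> bool:
    """Write gzip-compressed data to path only if the result differs.

    Returns True if path was created or replaced, False if it was left
    untouched (so its modification time is preserved).
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Temporary file next to the target, so os.replace is a same-filesystem rename.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            # mtime=0 keeps the compressed bytes reproducible.
            tmp.write(gzip.compress(data, mtime=0))
        # Byte-for-byte comparison against the current file, if there is one.
        if os.path.exists(path) and filecmp.cmp(tmp_path, path, shallow=False):
            return False
        os.replace(tmp_path, path)
        return True
    finally:
        # Remove the temporary file if it was not renamed into place.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
```

One caveat of this sketch: tempfile.mkstemp creates the file readable only by its owner, so a real deployment would probably need to set the intended permissions before the rename.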

Oren

