
Bug#831759: fixed in backup-manager 0.7.12-2



On Wed, Aug 17, 2016 at 11:55:03AM +0200, Maximiliano Curia wrote:

> > + /usr/bin/nice -n 10 /bin/tar -p -c -z -f ./repository/uranio1-tmp-BM-backup-manager-0.7.12-t-testdir.20160815.master.tar.gz /tmp/BM/backup-manager-0.7.12/t/testdir
> > ++ get_md5sum ./repository/uranio1-tmp-BM-backup-manager-0.7.12-t-testdir.20160815.master.tar.gz
> > ++ md5='ee176eaec8040e7230c18d2b8404d313  ./repository/uranio1-tmp-BM-backup-manager-0.7.12-t-testdir.20160815.master.tar.gz'
> 
> > + debug '/usr/bin/nice -n 10 /bin/tar     -p -c -z -f ./repository/uranio1-tmp-BM-backup-manager-0.7.12-t-testdir.20160816.master.tar.gz "/tmp/BM/backup-manager-0.7.12/t/testdir" > /tmp/backup-manager/bm-tarball.log.FwLAvP 2>&1'
> > ++ md5='f39797a9ee7c033e291d34b4304386eb  ./repository/uranio1-tmp-BM-backup-manager-0.7.12-t-testdir.20160816.master.tar.gz'
> 
> Mmh, the script is creating two "identical" tarballs but they get different
> md5sums (which is what's used to detect the duplicated tarballs). tar is known
> to introduce some "undefined" bits in the files, that's what pristine-tar's
> delta files manage.

Actually, I think tar is currently deterministic if you give it the same
files with the same timestamps and the files are in the same order.

(But if you tar a directory, then the order is undefined and may be
anything).

> From the tar invocation, I would suspect that a difference might occur if for
> any reason the file order in which tar processes the directory varies. This
> could be the case if a filesystem reorders/rebalances its directories after
> the first traversal, for example.

Oh, I see. There is simply no canonical order in which to traverse a
directory, and people should not assume that traversing a directory twice
in a row will yield the files in the same order.

You don't need to imagine reorders or rebalances in the filesystem:
there is simply no guarantee anywhere that two traversals will return
the files in the same order.
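
To see the difference between directory order and sorted order, something
like the following can be used. It is only a sketch: it assumes GNU ls
(for the -U option, which lists entries unsorted, in directory order), and
the file names and the demo_dir variable are made up for illustration.
On many filesystems the first listing will not be alphabetical, but
nothing guarantees that either way.

```shell
#!/bin/sh
# Illustration only: the names below are invented for this demo.
set -eu

demo_dir=$(mktemp -d)
for name in zeta alpha mike bravo yankee charlie; do
    : > "$demo_dir/$name"
done

# GNU ls -U: do not sort, list entries in directory order.
raw_order=$(ls -U "$demo_dir")

# The same entries in a fixed, locale-independent order.
sorted_order=$(ls "$demo_dir" | LC_ALL=C sort)

echo "directory order:"
printf '%s\n' "$raw_order"
echo "sorted order:"
printf '%s\n' "$sorted_order"
```

Both listings always contain the same names; only the sorted one is
something a script can rely on.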

So, making a tar from the "contents of a directory" and always
expecting the same result is just wrong.

This is already a known problem for people working on reproducible
builds. Please take a look at this:

https://wiki.debian.org/ReproducibleBuilds/FileOrderInTarballs

There was even a FUSE filesystem called "disorderfs" which intentionally
randomized the order of files in directories. I think it was
disabled because it was not stable enough, but I think it helped
to catch quite a few bugs like this one.

So, I really believe that's the problem. Just follow the wiki page
and you will have a fix. I don't see the need to perform more tests.
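
As a rough sketch of the idea (not a patch for backup-manager): hand tar
an explicitly sorted list of files instead of letting it walk the
directory itself. The function name make_sorted_tarball and the test
files are hypothetical, and I'm assuming GNU tar, GNU find and GNU sort.
I'm also using "gzip -n" instead of "tar -z" so that the gzip header
carries no name or timestamp, and --format=gnu so that access times are
not recorded in the archive.

```shell
#!/bin/sh
# Sketch only: make_sorted_tarball is a hypothetical helper, not
# backup-manager code. Assumes GNU tar, GNU find and GNU sort.
set -eu

make_sorted_tarball() {
    src_dir=$1
    out_file=$2
    # List every path NUL-separated, sort bytewise, and give tar that
    # fixed list. --no-recursion stops tar from walking directories
    # on its own, so only the sorted list decides the member order.
    find "$src_dir" -print0 \
        | LC_ALL=C sort -z \
        | tar --format=gnu --no-recursion --null -T - -p -c -f - \
        | gzip -n > "$out_file"
}

# Small demo with invented files.
work=$(mktemp -d)
cd "$work"
mkdir -p testdir/sub
echo one   > testdir/zeta
echo two   > testdir/alpha
echo three > testdir/sub/mike

make_sorted_tarball testdir first.tar.gz
make_sorted_tarball testdir second.tar.gz

sum1=$(md5sum < first.tar.gz)
sum2=$(md5sum < second.tar.gz)
echo "first:  $sum1"
echo "second: $sum2"
```

This only pins the order of the members. If timestamps, ownership or
file contents change between two runs, the checksums will still differ.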

But for completeness, I'll answer your questions:

> [...]
> How many times did you need to run the test to trigger a failure? What kind
> of filesystem are you using? Is that filesystem using any special mount
> options?

A lot, maybe 30 or 40. I'm using ext4, nothing special, and I'm not
using any special mount options.

But that's not relevant, really. You have the key to the problem
and the wiki page has the solution.

Thanks.

