
Re: Bug#1004557: man-db: please make index.db installations reproducible



On Mon, Jan 31, 2022 at 10:23:53PM +0100, Johannes Schauer Marin Rodrigues wrote:
> Quoting Colin Watson (2022-01-31 03:28:07)
> > Another approach might be to modify filesystem timestamps after postinsts
> > have finished running but before mandb runs to clamp timestamps to
> > SOURCE_DATE_EPOCH; a bit like your proposed patch, but actually modifying the
> > filesystem timestamps as well.  I'm not sure where that could go, though.  It
> > can't be in mandb because the postinst deliberately doesn't run mandb as
> > root; and of course mandb is itself run from a postinst.  Maybe some kind of
> > dpkg hook, or maybe it would be simplest to just run a post-processing step
> > that clamps all the filesystem timestamps and then runs the equivalent of
> > "sudo -u man mandb -cq"?  (This might be more palatable with man-db 2.10.0,
> > where this will take more like 10 seconds rather than several minutes; see
> > #1003089.)
> 
> I don't like the idea of moving functionality like that into chroot-creating
> scripts. If we want the chroot to have a certain property, we should add that
> to the packages involved using declarative methods.
> 
> So another way to fix this would be to add a "touch" call to every maintainer
> script calling update-alternatives involving man pages and let them set the
> symlink mtime to SOURCE_DATE_EPOCH if that variable is set. But I think that's
> a bad idea and we should rather do this centrally.

Fair enough.
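
For concreteness, the "clamp the filesystem timestamps" idea could look
roughly like the sketch below.  This is only an illustration: the helper
names clamp_mtime and clamp_tree are made up, nothing in man-db or dpkg
does this today, and it assumes a platform where os.utime can operate on
a symlink itself (follow_symlinks=False).

```python
import os


def clamp_mtime(path, epoch):
    """Clamp the mtime of path itself (not a symlink's target) to epoch.

    Hypothetical helper for illustration. Returns True if the mtime was
    lowered, False if it was already at or below epoch.
    """
    st = os.lstat(path)
    epoch_ns = epoch * 10**9
    if st.st_mtime_ns <= epoch_ns:
        return False
    # Keep the atime as it was; only the mtime is clamped.
    os.utime(path, ns=(st.st_atime_ns, epoch_ns), follow_symlinks=False)
    return True


def clamp_tree(root, epoch):
    """Clamp root and everything below it; return how many entries changed."""
    changed = int(clamp_mtime(root, epoch))
    # os.walk does not descend into symlinked directories by default, but
    # it still lists them in dirnames, so the symlinks themselves are clamped.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            changed += int(clamp_mtime(os.path.join(dirpath, name), epoch))
    return changed


if __name__ == "__main__":
    # Assumption: only act when SOURCE_DATE_EPOCH is set in the environment.
    value = os.environ.get("SOURCE_DATE_EPOCH")
    if value is not None:
        print(clamp_tree("/usr/share/man", int(value)))
```

Something like this would have to run after all the postinsts have
finished and before the final mandb run, which is exactly the part with
no obvious home.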

> > This puzzled me for a while too, but it's because
> > /usr/share/man/man7/builtins.7.gz is a symlink created by
> > update-alternatives and references bash-builtins in its NAME, which
> > provoked https://bugs.debian.org/691643.  I've now fixed that upstream:
> > 
> >   https://gitlab.com/cjwatson/man-db/-/commit/37ab864354c1d0ac09e27d2346a1221bf4628509
> > 
> > This may cause your comparisons to show more differences, but it should
> > mean that they're more reliably the *same* differences.  Previously, the
> > behaviour depended on directory iteration order (actually usually the
> > location of the first physical extent of each file on disk, since mandb sorts
> > by that for improved performance on rotational disk drives).
> 
> Thanks for the fix!
> 
> I talked with Guillem about the possibility of changing update-alternatives to
> produce reproducible mtimes. I'm adding debian-dpkg@lists.debian.org to discuss
> having a reproducible index.db by changing update-alternatives.
> 
> Reading the commit you quote above, it seems that using the symlink's mtime is
> on purpose? I think the problem would not exist if the mtime of the link target
> were used instead. But there is probably a reason why this is not done already?

I'm not saying there are no other ways things could work, but just
storing the mtime of the link target would definitely cause problems.
If mandb did that, then it would fail to detect when the symlink changed
to point to something else (at least in its current model where it
compares the mtime against the inode to see whether it needs to rescan a
page, and doesn't store the symlink data itself in the DB).

> Guillem also brought up that using SOURCE_DATE_EPOCH is wrong in this context
> because this is about runtime behaviour.
[...]
> Guillem was thinking about introducing a new variable in addition to
> SOURCE_DATE_EPOCH to indicate that some software should produce reproducible
> output in scenarios like this.

I don't think I have much of an opinion about the details of variable
naming there.

> We also thought about letting update-alternatives use the mtime of the symlink
> target as the mtime of the symlink. But this would be a bad idea because backup
> software will likely not notice a change of the symlink in case the symlink
> switches to a target with a lower mtime.

Yes, definitely a bad idea for that sort of reason.  Depending on the
exact details, it might also cause some backup software to think that
there's filesystem corruption (compare
https://bugs.launchpad.net/ubuntu/+source/man-db/+bug/1411633 /
https://bugs.debian.org/1004355).

-- 
Colin Watson (he/him)                              [cjwatson@debian.org]

