[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: Bug#1004557: man-db: please make index.db installations reproducible



Control: reassign -1 dpkg 1.21.1

Quoting Colin Watson (2022-02-01 11:37:34)
> > I talked with Guillem about the possibility of changing update-alternatives to
> > produce reproducible mtimes. I'm adding debian-dpkg@lists.debian.org to discuss
> > having a reproducible index.db by changing unattended-upgrades.
> > 
> > Reading the commit you quote above it seems that using the symlink's mtime is
> > on purpose? I think the problem would not exist if the mtime of the link target
> > would be used. But there is probably a reason why this is not done already?
> 
> I'm not saying there are no other ways things could work, but just
> storing the mtime of the link target would definitely cause problems.
> If mandb did that, then it would fail to detect when the symlink changed
> to point to something else (at least in its current model where it
> compares the mtime against the inode to see whether it needs to rescan a
> page, and doesn't store the symlink data itself in the DB).
> 
> > Guillem also brought up that using SOURCE_DATE_EPOCH is wrong in this context
> > because this is about runtime behaviour.
> [...]
> > Guillem was thinking about introducing a new variable in addition to
> > SOURCE_DATE_EPOCH to indicate that some software should produce reproducible
> > output in scenarios like this.
> 
> I don't think I have much of an opinion about the details of variable
> naming there.
> 
> > We also thought about letting unattended-upgrades use the mtime of the symlink
> > target as the mtime of the symlink. But this would be a bad idea because backup
> > software will likely not notice a change of the symlink in case the symlink
> > switches to a target with a lower mtime.
> 
> Yes, definitely a bad idea for that sort of reason.  Depending on the
> exact details, it might also cause some backup software to think that
> there's filesystem corruption (compare
> https://bugs.launchpad.net/ubuntu/+source/man-db/+bug/1411633 /
> https://bugs.debian.org/1004355).

Thanks for your input! It then seems that we should indeed adjust the mtime of
the symlinks instead.

I'm thus reassigning this bug to dpkg and attached a tested patch which fixes
the problem.

Thanks!

cheers, josch
>From f50d1f95c32854520a5652e77576661b4781e3b2 Mon Sep 17 00:00:00 2001
From: Johannes Schauer Marin Rodrigues <josch@mister-muffin.de>
Date: Tue, 1 Feb 2022 16:02:32 +0100
Subject: [PATCH] utils/update-alternatives.c: respect SOURCE_DATE_EPOCH when
 u-a is used in the context of building a reproducible chroot

---
 utils/update-alternatives.c | 55 +++++++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

diff --git a/utils/update-alternatives.c b/utils/update-alternatives.c
index 15aab25ec..b2638faf1 100644
--- a/utils/update-alternatives.c
+++ b/utils/update-alternatives.c
@@ -540,6 +540,59 @@ log_msg(const char *fmt, ...)
  * Filesystem access for alernative handling.
  */
 
+static void
+fsys_lutimes(const char *filename)
+{
+#ifndef HAVE_LUTIMES
+	return;
+#else
+	/*
+	 * If SOURCE_DATE_EPOCH is set, then this means that update-alternatives
+	 * is run in the context of building a reproducible system image. During
+	 * normal operation, the variable must not be set because otherwise
+	 * modification times might be set to a lower value than the current one
+	 * which will confuse backup software.
+	 *
+	 * We use the SOURCE_DATE_EPOCH environment variable to distinguish
+	 * between update-alternatives being used in a reproducible chroot
+	 * bootstrap build (the variable is set) and it being used normally (the
+	 * variable is unset).
+	 *
+	 * Reproducible symlink timestamps are necessary for reproducible man-db
+	 * index.db which stores manual page mtimes. See #1004557 for context.
+	 */
+	const char *sde = getenv("SOURCE_DATE_EPOCH");
+	if (!sde) {
+		return;
+	}
+	unsigned long long epoch;
+	char *endptr;
+	errno = 0;
+	epoch = strtoull(sde, &endptr, 10);
+	if ((errno == ERANGE && (epoch == ULLONG_MAX || epoch == 0))
+			|| (errno != 0 && epoch == 0)) {
+		syserr(_("Environment variable $SOURCE_DATE_EPOCH: strtoull: %s"), strerror(errno));
+	}
+	if (endptr == sde) {
+		syserr(_("Environment variable $SOURCE_DATE_EPOCH: No digits were found: %s"), endptr);
+	}
+	if (*endptr != '\0') {
+		syserr(_("Environment variable $SOURCE_DATE_EPOCH: Trailing garbage: %s"), endptr);
+	}
+	if (epoch > ULONG_MAX) {
+		syserr(_("Environment variable $SOURCE_DATE_EPOCH: value must be smaller than or equal to: %lu but was found to be: %llu"), ULONG_MAX, epoch);
+	}
+	struct timeval tv[2];
+	tv[0].tv_sec = epoch;
+	tv[0].tv_usec = 0;
+	tv[1].tv_sec = epoch;
+	tv[1].tv_usec = 0;
+	if (lutimes(filename, tv)) {
+		syserr(_("Error setting timestamp of %s"), filename);
+	}
+#endif
+}
+
 static char *
 fsys_get_path(const char *pathpart)
 {
@@ -641,6 +694,8 @@ fsys_symlink(const char *filename, const char *linkname)
 	if (symlink(filename, root_linkname))
 		syserr(_("error creating symbolic link '%.255s'"), root_linkname);
 
+	fsys_lutimes(root_linkname);
+
 	free(root_linkname);
 }
 
-- 
2.33.0

Attachment: signature.asc
Description: signature


Reply to: