Re: Second take at DEP17 - consensus call on /usr-merge matters



Hi Sam,

Thanks for trying to wrap your head around the complexity.

On Sat, Jul 08, 2023 at 07:57:40AM -0600, Sam Hartman wrote:
> So for me, a 3C proposal would have two components:
> 
> 1) An explanation of what the archive looks like at time of bootstrap
> (and changes to any bootstrap programs) so I can reason about whether
> bootstrap works.

I hope this one is simple.
 * All packages ship all of their files in canonical locations.
 * base-files ships all of the aliasing symbolic links and their target
   directories.
 * Given that base-files installs the symbolic links, all programs are
   immediately working after unpack prior to running any maintainer
   scripts.
 * Consequently, cdebootstrap and mmdebstrap just work without any
   modification.
 * An unmodified debootstrap fails (unpacking base-files due to -k).
   + We modify debootstrap such that it first unpacks and then merges.
   OR
   + We stop passing the -k flag to tar. (Though we need a better
     understanding of why that was added post jessie.)

> 2) An argument of safety of upgrades focused on the changes and why
> those changes are safe both for unstable upgrades and for bookworm
> upgrades.

As far as I understand the question here, it is about those aspects that
are specific to the 3C solution as opposed to a 3B solution. In both
cases, we move all of the files to their canonical locations. I'm not
sure whether the protective diversions for aliasing links (DEP17-M4) are
something we ultimately need in all scenarios, but in case of 3C, we
quite certainly need them to make upgrades safe, so that's an aspect to
consider here. The other aspect of course is shipping the symlinks in
base-files (DEP17-M11). So what could go wrong here?

In an upgrade scenario from unstable or from bookworm, we'd have to
unpack and configure usrmerge-support before unpacking base-files, since
that becomes a Pre-Depends of base-files. usrmerge-support.preinst would
verify that the filesystem is merged already (much like usr-is-merged
does) except that it does not tolerate
/etc/unsupported-skip-usrmerge-conversion anymore, so any system using
that mechanism will fail this preinst. Then usrmerge-support.preinst
would install the protective diversions (DEP17-M4) on behalf of
base-files. Since these are --no-rename, the filesystem is not modified
and since we just verified that all the affected locations really are
symbolic links rather than directories, dpkg-divert wouldn't error out
about diverting a directory. In any case, usrmerge-support is eventually
configured (without a postinst), which allows unpacking base-files.
Whenever we unpack base-files (now or for subsequent upgrades), dpkg
will create each aliasing symlink with a temporary name and rename(2) it
to the final destination. Since rename(2) is atomic, the aliasing symlinks
will never go missing. When upgrading or removing any other package,
dpkg may consider removing an aliasing symlink as that package may be
the last package to ship an aliased file. When this happens, the removal
of the directory will be redirected to an innocent location via the
protective diversion. Since diversions only match exactly (they are not
meant to be used for directories), files installed "below" the diverted
aliasing links (i.e. aliased files) will be entirely unaffected by the
protective diversions and dpkg will operate on them as usual.
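
To illustrate the create-then-rename step described above, here is a
minimal shell sketch. It is not dpkg's implementation; the function name
is made up for illustration, and it assumes GNU coreutils, where mv -T
treats the destination as a plain name and does not descend into a
symlinked directory.

```shell
#!/bin/sh
# Illustration only: replace the symlink at $1 so that it points to $2
# without any window in which $1 is missing. The new link is created
# under a temporary name (dpkg uses the .dpkg-new suffix while unpacking)
# and then renamed over the final name.
# Assumption: GNU mv, whose -T option never moves the source into a
# directory that the destination symlink points to.
replace_symlink_atomically() {
	ln -s "$2" "$1.dpkg-new" || return 1
	mv -T "$1.dpkg-new" "$1"
}
```

For example, `replace_symlink_atomically "$TARGET/bin" usr/bin` leaves
"$TARGET/bin" resolvable at every point in time: observers see either
the old link or the new one.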

The most common failure mode during upgrades seen by users likely will
be when /etc/unsupported-skip-usrmerge-conversion exists and the system
isn't actually merged.
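
The check that would trip in this failure mode could look roughly like
the following sketch. usrmerge-support does not exist yet, so the
function names and the diversion target names are assumptions; only the
dpkg-divert options (--package, --no-rename, --divert, --add) and the
directory list from the attached patch are real. The diversion commands
are printed rather than executed so that the sketch can be tried on a
scratch tree without root.

```shell
#!/bin/sh
# Hypothetical sketch of the checks in usrmerge-support.preinst.

# Succeed only if every aliased directory present below root $1 is a
# symlink to its counterpart in usr/. The flag file
# /etc/unsupported-skip-usrmerge-conversion is deliberately not consulted.
is_merged() {
	root="${1%/}"
	for dir in bin lib lib32 lib64 libo32 libx32 sbin; do
		if test -h "$root/$dir"; then
			case "$(readlink "$root/$dir")" in
				"usr/$dir"|"/usr/$dir") continue ;;
				*) return 1 ;;
			esac
		fi
		# A real directory (or any other non-symlink) means unmerged.
		test -e "$root/$dir" && return 1
	done
	return 0
}

# Print one protective diversion per aliasing symlink found below root $1.
# The diversions are registered on behalf of base-files; the target name
# /.<dir>.usr-is-merged is an invented placeholder.
print_diversions() {
	root="${1%/}"
	for dir in bin lib lib32 lib64 libo32 libx32 sbin; do
		test -h "$root/$dir" || continue
		printf 'dpkg-divert --package base-files --no-rename --divert /.%s.usr-is-merged --add /%s\n' "$dir" "$dir"
	done
}
```

A preinst built on this would call `is_merged /` first, abort with an
error if it fails, and only then register the diversions.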

I have a hard time figuring out what else could go wrong here and that's
probably because I'm biased towards 3C. On the other hand, the reason
for me to like it is because I see very little that could go wrong (in
addition to what can already go wrong due to moving all the files). I
hope that others can use this detailed description of what happens to
construct possible failure cases such that we can better understand the
risk here.

If this procedure sounds risky to you anyway, please be aware of two
other aspects. For one thing, this protective diversion mechanism will
be required for other files (DEP17-M8) if we follow the current
consensus of moving files (DEP17-M2), so we'll bear the general risks of
protective diversions in any case - not just when doing 3C. Therefore it
is questionable whether we can attribute these risks to the choice of 3C
at all. Then, if you consider DEP17-M4 too risky in general, my current
understanding is that the only safe alternative mitigation for the loss
of aliasing symlinks (DEP17-P9) is shipping all of the aliasing symlinks
as directories in some package (DEP17-M5). If going that route, we must
distribute those links to multiple packages to escape from Pre-Depends
loops and we technically violate the mostly agreed principle that no
packages ship any aliased paths (DEP17-M2) for these few cases in
trixie. Is that really less complex and less risky?

> * Your debootstrap changes seem overly complicated and would in and of
>   themselves push me against 3C.  First, you don't seem to be thinking
>   about buster, which also needs to bootstrap usrmerged, doesn't it.

I'm not sure in which way you think about buster here. Do you refer to
using a buster system for bootstrapping a trixie chroot? I think we do
not have to support this. Very likely, a buster kernel will not run a
trixie glibc at all. Do you concur here? Do you refer to using a
trixie/sid system for bootstrapping buster? That is supposed to work in
the very same (merged) way as bootstrapping bullseye and bookworm (by
merging after unpack).

>   Second, is there a way we could simply change how debootstrap calls
>   tar?

I captured this possibility above due to your question here. Thank you.
Additionally, I dug into the history of debootstrap to figure out why
and when the -k flag was added. The first commit mentioning it was
https://salsa.debian.org/installer-team/debootstrap/-/commit/6b79352a205a96cee441ae0c6247c4616097a517

    Pass -k to tar when extracting packages

    When installing with a merged /usr, the symlinks in / should not be
    replaced with real directories when extracting the packages.

in 2016. As far as I understand it, dropping -k for any of buster,
bullseye or bookworm would be broken. In the absence of -k, tar would
replace the aliasing symlinks with actual directories. In the new world
proposed by 3C, this aspect no longer is a problem as no package
installs any directory onto an aliasing link anymore. So we must pass -k
as long as any (essential) package ships any aliased location and we
must not pass -k whenever base-files ships the aliasing symlinks in its
data.tar.  While it would probably be possible to detect this somehow, it
feels very wrong.  The alternative of merging after unpack works with -k
both before and after.

Do you see some other way to fix debootstrap for this that I don't see
here?

>   I think asking debootstrap to not create the symlinks before is a big
>   ask.

Would you be able to go into detail as to why you think so? The way I
currently see it, this is a very logical consequence of shipping the
symlinks in base-files, which in turn gained quite some agreement.

And really, you got me hooked as to how hard it could be. So rather than
arguing about the feasibility of modifying debootstrap in the proposed
way, a patch seems to be the easiest way to settle that question. Hence
I'm attaching one and note that the post-merging approach also removes
the complexity of having a per-architecture list of aliased directories.

At this point, I'm really interested in understanding that additional
complexity and the involved risk that is attributed to 3C. The more I
dig into this approach, the more it seems to be the safest approach
that also removes complexity in my (biased) view.

Helmut
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,11 @@
+debootstrap (1.0.128+nmu3) UNRELEASED; urgency=medium
+
+  * Non-maintainer upload.
+  * Implement merged-/usr by merging after initial extraction to allow
+    shipping the aliasing symlinks in a binary package's data.tar.
+
+ -- Helmut Grohne <helmut@subdivi.de>  Sun, 09 Jul 2023 22:13:37 +0200
+
 debootstrap (1.0.128+nmu2) unstable; urgency=low
 
   * Non-maintainer upload.
--- a/functions
+++ b/functions
@@ -1358,15 +1358,40 @@
 	esac
 }
 
-# Find out where the runtime dynamic linker and the shared libraries
-# can be installed on each architecture: native, multilib and multiarch.
-# This data can be verified by checking the files in the debian/sysdeps/
-# directory of the glibc package.
-#
-# This function must be updated to support any new architecture which
-# either installs the RTLD in a directory different from /lib or builds
-# multilib library packages.
-setup_merged_usr() {
+merge_usr_entry() {
+	local entry canon
+	canon="$TARGET/usr/${1#"$TARGET/"}"
+	test -h "$canon" &&
+		error 1 USRMERGEFAIL "cannot move %s as its destination exists as a symlink" "${1#"$TARGET"}"
+	if ! test -e "$canon"; then
+		mv "$1" "$canon" >/dev/tty 2>&1
+		return 0
+	fi
+	test -d "$1" ||
+		error 1 USRMERGEFAIL "cannot move non-directory %s as its destination exists" "${1#"$TARGET"}"
+	test -d "$canon" ||
+		error 1 USRMERGEFAIL "cannot move directory %s as its destination is not a directory" "${1#"$TARGET"}"
+	for entry in "$1/"* "$1/."*; do
+		# Some shells return . and .. on dot globs.
+		test "${entry%/.}" != "${entry%/..}" && continue
+		# Absolute symlinks and relative in-tree symlinks are fine.
+		if test -h "$entry"; then
+			case "$(readlink "$entry")" in
+				../*)
+					error 1 USRMERGEFAIL "cannot move relative cross-directory symlink %s" "${entry#"$TARGET"}"
+				;;
+			esac
+		fi
+		# Ignore glob match failures
+		if test "${entry%'/*'}" != "${entry%'/.*'}" && ! test -e "$entry"; then
+			continue
+		fi
+		merge_usr_entry "$entry"
+	done
+	rmdir "$1"
+}
+
+merge_usr() {
 	if doing_variant buildd && [ -z "$MERGED_USR" ]; then
 		MERGED_USR="no"
 	fi
@@ -1389,30 +1414,16 @@
 	    return 0;
 	fi
 
-	local link_dir=""
-	case $ARCH in
-	    amd64)	link_dir="lib32 lib64 libx32" ;;
-	    i386)	link_dir="lib64 libx32" ;;
-	    mips|mipsel)
-			link_dir="lib32 lib64" ;;
-	    mips64*|mipsn32*)
-			link_dir="lib32 lib64 libo32" ;;
-	    loongarch64*)
-			link_dir="lib32 lib64" ;;
-	    powerpc)	link_dir="lib64" ;;
-	    ppc64)	link_dir="lib32 lib64" ;;
-	    ppc64el)	link_dir="lib64" ;;
-	    s390x)	link_dir="lib32" ;;
-	    sparc)	link_dir="lib64" ;;
-	    sparc64)	link_dir="lib32 lib64" ;;
-	    x32)	link_dir="lib32 lib64 libx32" ;;
-	esac
-	link_dir="bin sbin lib $link_dir"
-
 	local dir
-	for dir in $link_dir; do
-		ln -s usr/"$dir" "$TARGET/$dir"
-		mkdir -p "$TARGET/usr/$dir"
+	# This list includes all possible multilib directories. It must be
+	# updated when new multilib directories are being added. Hopefully,
+	# all new architectures use multiarch instead, so we never get to
+	# update this.
+	for dir in bin lib lib32 lib64 libo32 libx32 sbin; do
+		test -h "$TARGET/$dir" && continue
+		test -e "$TARGET/$dir" || continue
+		merge_usr_entry "$TARGET/$dir"
+		ln -s "usr/$dir" "$TARGET/$dir"
 	done
 }
 
--- a/scripts/amber
+++ b/scripts/amber
@@ -54,8 +54,8 @@
 	MERGED_USR="yes"
 	EXTRACT_DEB_TAR_OPTIONS="$EXTRACT_DEB_TAR_OPTIONS -k"
 
-	setup_merged_usr
 	extract $required
+	merge_usr
 
 	mkdir -p "$TARGET/var/lib/dpkg"
 	: >"$TARGET/var/lib/dpkg/status"
--- a/scripts/debian-common
+++ b/scripts/debian-common
@@ -42,9 +42,9 @@
 	esac
 
 	# On suites >= bookworm, either we set up a merged-/usr system
-	# via setup_merged_usr, or we deliberately avoided that migration
-	# by creating the flag file. This means there's no need for the
-	# live migration 'usrmerge' package and its extra dependencies:
+	# via merge_usr, or we deliberately avoid that migration by creating
+	# the flag file. This means there's no need for the live migration
+	# 'usrmerge' package and its extra dependencies:
 	# we can install the empty 'usr-is-merged' metapackage to indicate
 	# that the transition has been done.
 	case "$CODENAME" in
@@ -73,8 +73,8 @@
 		MERGED_USR="no"
 	fi
 
-	setup_merged_usr
 	extract $required
+	merge_usr
 
 	mkdir -p "$TARGET/var/lib/dpkg"
 	: >"$TARGET/var/lib/dpkg/status"
--- a/scripts/gutsy
+++ b/scripts/gutsy
@@ -138,8 +138,8 @@
 			;;
 	esac
 
-	setup_merged_usr
 	extract $required
+	merge_usr
 
 	mkdir -p "$TARGET/var/lib/dpkg"
 	: >"$TARGET/var/lib/dpkg/status"