
Bug#1071462: installing/upgrading libc6 does not work in sbuild when systemd is installed as ischroot declines



Package: sbuild,debianutils,libc6,systemd-sysv
Severity: important

Hello lots of maintainers,

I am faced with a very crazy interaction bug. Roughly speaking, when you
use sbuild to build a package and your build-depends happen to include
systemd-sysv and you happen to install (cross building) or upgrade
libc6, installing build-depends reliably fails. Since upgrading libc6 is
a thing, I guess that this now affects buildds, which is why I am filing
this at important severity. Regenerating buildd chroots will "heal"
buildds, so it is self-recovering there.

Without further ado, let's dive into the details. The issue is
reproducible using mmdebstrap:

mmdebstrap unstable --verbose --architectures amd64,arm64 --variant=apt /dev/null --include=systemd-sysv,libc6:arm64 --essential-hook='ln -sf /bin/false $1/usr/bin/ischroot'

This is using a cross build setting, because libc6 is installed early
during bootstrap and reproducing the bug takes configuring libc6 after
systemd-sysv has been unpacked. So I simply install a foreign libc6 and
apt happens to configure it late enough in my tests. So we now look into
libc6.postinst. We take the "$1" = "configure" branch. We eventually run
into:

|     # Restart init.  Currently handles chroots, systemd and upstart, and
|     # assumes anything else is going to not fail at behaving like
|     # sysvinit:
|     TELINIT=yes
|     if ischroot 2>/dev/null; then
|         # Don't bother trying to re-exec init from a chroot:
|         TELINIT=no

I note that mmdebstrap creates a number of namespaces and then
externally runs apt. If I understand things correctly, it also runs an
external dpkg --root ... without --force-scripts-chrootless. Hence dpkg
performs a chroot for every maintainer script and ischroot correctly
detects this, so we would be setting TELINIT=no if I were not replacing
it in the --essential-hook.

In sbuild, the namespace setup is different. apt is entirely run inside
the namespace. ischroot compares /proc/1/mountinfo to
/proc/self/mountinfo. If both are readable and equal, it concludes that
we're not in a chroot. If they differ, it concludes that we are in a
chroot. For mmdebstrap, pid 1 happens to be a mmdebstrap process in the
initial namespace and the ischroot process sees fewer mounts. Hence it
exits successfully (chroot detected) there. For sbuild, pid 1 is a
runuser process already running chrooted. Hence the mountinfo files are
equal and ischroot concludes that we are not running in a chroot.
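
To make the comparison concrete, here is a rough shell approximation of
the check as I described it above. This is only a sketch: the real
ischroot is not a shell script, and the function name and its two
optional path arguments are made up for illustration. The return codes
are chosen to mirror what I understand ischroot(1) to document.

```shell
# Hypothetical helper approximating the mountinfo comparison.
# Returns 0 if the two files differ ("in a chroot"),
#         1 if they are identical ("not in a chroot"),
#         2 if they cannot be compared (missing or unreadable).
looks_like_chroot() {
    init_info=${1:-/proc/1/mountinfo}
    self_info=${2:-/proc/self/mountinfo}
    rc=0
    # cmp -s exits 0 on equal files, 1 on differing files, >1 on error.
    cmp -s "$init_info" "$self_info" || rc=$?
    case $rc in
        0) return 1 ;;
        1) return 0 ;;
        *) return 2 ;;
    esac
}
```

In the sbuild setup, both files describe the same mounts, so a check of
this shape takes the "not in a chroot" path.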

|     elif [ -n "${DPKG_ROOT:-}" ]; then
|         # Do not re-exec init if we are operating on a chroot from outside:
|         TELINIT=no

DPKG_ROOT is empty in both cases.

|     elif [ -d /run/systemd/system ]; then
|         # Restart systemd on upgrade, but carefully.
|         # The restart is wanted because of LP: #1942276 and Bug: #993821
|         # The care is needed because of https://bugs.debian.org/753725
|         # (if systemd --help fails the system might still be quite broken but
|         # that seems better than the kernel panic that results if systemd
|         # cannot reexec itself).
|         TELINIT=no

/run/systemd/system does not exist in either case.

|         if systemd --help >/dev/null 2>/dev/null; then
|             systemctl daemon-reexec
|         else
|             echo "Error: Could not restart systemd, systemd binary not working" >&2
|         fi
|     fi
|     if [ "$TELINIT" = "yes" ]; then
|         telinit u 2>/dev/null || true ; sleep 1
|     fi
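
Condensed, the branches quoted above reduce to the following decision.
This is my own simplification for illustration, not code from
libc6.postinst: the function name and its three arguments are
hypothetical, and the systemd branch only records the outcome instead of
actually re-executing systemd.

```shell
# decide_telinit IN_CHROOT DPKG_ROOT_VALUE SYSTEMD_DIR
#   IN_CHROOT        "yes" if ischroot reported a chroot, anything else otherwise
#   DPKG_ROOT_VALUE  the value of $DPKG_ROOT (possibly empty)
#   SYSTEMD_DIR      path to test, normally /run/systemd/system
# Prints "yes" if the script would go on to run "telinit u", else "no".
decide_telinit() {
    if [ "$1" = yes ]; then
        echo no
    elif [ -n "$2" ]; then
        echo no
    elif [ -d "$3" ]; then
        # The real script attempts "systemctl daemon-reexec" here.
        echo no
    else
        echo yes
    fi
}
```

With ischroot declining, an empty DPKG_ROOT and no /run/systemd/system,
this prints "yes", which is the path both test setups end up on.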

And finally we run telinit u when running inside sbuild or faking
ischroot in mmdebstrap. Running telinit u doesn't go well. This actually
has been a known problem with different symptoms recently. Earlier,
cross build nodes would get stuck in libc6.postinst hanging in telinit
forever. The reason was that telinit was re-executing itself over and
over again attempting to forward to another init system but always
returning back to itself. This has been fixed by Luca Boccassi:

https://github.com/systemd/systemd/pull/31251 and #1063147

telinit no longer reexecs itself and instead does what it is supposed to
do: kill(1, SIGTERM). Sadly this doesn't go well. In case of sbuild, we
kill the runuser process. It exits non-zero and sbuild considers this a
failure to install Build-Depends. This is bad.

So I'm not exactly sure which part is broken here. We might argue that
sbuild is setting up a container that looks too much like a container
and should have pid 1 outside the chroot area or that the init process
should handle SIGTERM more like an init system would handle that. We
might argue that ischroot should detect init-less application container
environments. We might argue that ischroot is not meant for detecting
application containers, that libc6.postinst asks the wrong question and
that it should be skipping telinit for such environments as well. We
might argue that telinit should not kill a pid 1 that isn't systemd.
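
To illustrate the last option only, a check of roughly the following
shape could tell a systemd pid 1 apart from something like runuser. This
is a sketch under assumptions, not a proposed patch: telinit is not a
shell script, the function name and its optional path argument are
hypothetical, and I am assuming that the process name in /proc/1/comm is
a good enough indicator.

```shell
# Hypothetical helper: succeeds only if the process name of pid 1,
# as recorded in /proc/1/comm, is exactly "systemd".
pid1_is_systemd() {
    comm_file=${1:-/proc/1/comm}
    # Treat a missing or unreadable file as "not systemd".
    name=$(cat "$comm_file" 2>/dev/null) || return 1
    [ "$name" = systemd ]
}

# Possible use:
#   pid1_is_systemd || echo "pid 1 is not systemd, not signalling it" >&2
```

In the sbuild case described above, pid 1 is runuser, so such a check
would fail and the signal would not be sent.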

At this time, I am really unsure which of these four packages we
consider at fault. Possibly, we select multiple options to harden things
in depth. I am now seeking feedback from the various maintainers:
 - debianutils
 - glibc
 - sbuild
 - systemd

Do you think that your package handles this situation correctly and that
some other package should change or do you see your package behaving
wrongly?

Thanks in advance for replying

Helmut

