Re: System-critical package management
[ CCing #910377 for some context. ]
Hi!
On Thu, 2023-09-07 at 11:59:47 +0900, Simon Richter wrote:
> > The lack of any system of recognition for packages that are critical to system operation impedes the reliability of Debian-based systems. For example, a reboot during a background package upgrade process on critical system packages unbeknownst to the user may result in the system unable to boot as expected, with little readily-available feedback to the user as to the cause.
>
> Locking out reboots while the package manager is active is a policy that
> needs to be provided by the policy layer that allows ordinary users to
> reboot -- so this is the responsibility of the desktop environment.
>
> The base system and package manager require superuser privileges for both
> reboot and invoking the package manager. For single-user systems, it is the
> responsibility of the administrator to not issue a reboot command while a
> package upgrade is in progress, which is not an onerous requirement because
> the package upgrade must be manually commanded as well.
I guess one case that this does not cover is when a local user
initiates a shutdown/reboot while a remote superuser is performing
package management on that system.
But in that case, that same superuser can use a frontend that already
inhibits those actions (such as apt), or on systems using systemd can
run dpkg via systemd-inhibit.
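As a rough illustration of the systemd-inhibit route, here is a minimal
sketch that only builds the command line. The helper name and the
--who/--why strings are made up for this example; the systemd-inhibit
options themselves (--what, --who, --why, --mode) are the documented
ones.

```python
import shlex


def inhibit_wrap(command, why="Package management in progress"):
    """Return an argv that runs `command` under a shutdown inhibitor lock."""
    return [
        "systemd-inhibit",
        "--what=shutdown",  # blocks user-requested power-off and reboot
        "--who=dpkg",       # illustrative label, not a fixed convention
        f"--why={why}",     # reason shown to whoever lists the inhibitors
        "--mode=block",     # block the request instead of only delaying it
        *command,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so this is safe to try anywhere.
    print(shlex.join(inhibit_wrap(["dpkg", "--configure", "-a"])))
```

The lock is held only for as long as the wrapped command runs, so
nothing needs to be cleaned up afterwards.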
> > A potential middle-ground solution to this is to allow packages to
> > be marked as "system-critical" to DPKG by external system components
> > - for example a standard desktop Ubuntu system might mark the Gnome
> > Display Manager, Networking drivers, and others in this way during
> > installation. These system-critical packages could then be protected
> > by DPKG in the following ways:
>
> > - They are automatically reverted to a known good state on upgrade
> > failure (e.g. previous version)
>
> Generally, packages are expected to go from one functional state to another
> in a very quick operation after verifying that the operation can be
> performed.
Well, dpkg is supposed to be resilient against system crashes, abrupt
shutdowns or reboots anyway. Breakage in dpkg in those cases (but not
breakage due to specific handling from within some other package)
should be considered a bug in dpkg.
> > - They cannot be removed without being unmarked as "system-critical"
>
> We have "Essential: yes", which dpkg protects, and "Protected: yes", which
> are protected by apt.
Both are protected by dpkg and apt; dpkg just uses a different
--force-<foo> option for each, so that you can control which one to
let through.
> > - The system could check during every shutdown that system-critical
> > packages are in a consistent state, reverting to a known good
> > state if not
>
> Again, this would need to be inside the policy layer that defines "shutdown"
> -- there are many of those, and most of them are outside the Debian system
> (e.g. if you run Debian in a container under Kubernetes, then Kubernetes is
> the policy layer that would be responsible for that).
Agreed.
> On desktop systems, systemd is the appropriate policy layer to decide about
> reboots, and (if I remember correctly) packagekit is the policy layer that
> invokes dpkg, so packagekit would need to inhibit reboots while it is
> working, and it can do so easily because it can assume systemd to be present
> and running.
apt already does this, so any other frontend using libapt or apt
directly will get this automatically.
> > I am interested in knowing the communities' thoughts on this, and if
> > these ideas have any merit to them.
>
> On the lower levels, what can be reasonably implemented already is. The
> lockout you describe belongs into the desktop system, but it would require
> new UI to be developed to be useful -- rejecting the reboot is easy, but
> indicating to the user why the reboot was rejected or disabling the option
> requires a new communication channel, and without that functionality, the
> user experience would be "I tried to reboot and it didn't do anything."
>
> Breaking the layer separation would be a horrible complicated mess -- adding
> new low level errors means adding appropriate error handlers to all
> intermediate layers until the error can bubble up to the user. This is
> something component systems have historically struggled with -- every time
> Windows displays some "error code c0312313" type dialog, this is a missing
> handler chain.
While dpkg on systems using systemd _could_ by default take a
system inhibitor lock, and could provide a good enough reason such as
"Packaging system upgrade" or whatever, my concern has been with the
added dependency chain, and after reading your mail and thinking about
this now, I have to agree this seems like a higher level policy.
(Of course dpkg could also do that and grow a new --no-inhibit,
or --refuse-inhibit or similar option, but still.)
But then I recalled I had a git branch adding a dpkg-db-lock command
with a --wait-lock option, which I could recover and polish in order to
provide an example pre-hook script. The script would call that command
via a background systemd-inhibit if systemd is running and the command
is available, and an admin who wanted this for their system or fleet
of systems could hook it into the dpkg config. I've done that locally,
and will check whether that's viable and probably merge it for 1.22.1
or 1.22.2, so that people who want to do it can easily do so.
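To make the idea concrete, here is a minimal sketch of such a hook,
written in Python purely for illustration. It rests on several
assumptions: dpkg-db-lock and --wait-lock are the command and option
from the unmerged branch mentioned above and may not exist in any
released dpkg; --wait-lock is assumed to block until the dpkg database
lock is free again; and the function names are hypothetical.

```python
import os
import shutil
import subprocess


def systemd_is_running(run_dir="/run/systemd/system"):
    # Same test as sd_booted(3): the directory exists only when systemd is init.
    return os.path.isdir(run_dir)


def build_hook_command(which=shutil.which, booted=systemd_is_running):
    """Return the argv for the background inhibitor, or None if unavailable."""
    if not booted():
        return None
    if which("systemd-inhibit") is None or which("dpkg-db-lock") is None:
        return None
    return [
        "systemd-inhibit",
        "--what=shutdown",
        "--who=dpkg",
        "--why=Packaging system upgrade",
        # Hypothetical command and option; assumed to wait on the dpkg db lock.
        "dpkg-db-lock", "--wait-lock",
    ]


def main():
    argv = build_hook_command()
    if argv is None:
        return 0  # nothing to do; the hook must never make dpkg itself fail
    # Detach, so the hook returns at once while the inhibitor stays in place.
    subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,
    )
    return 0


if __name__ == "__main__":
    main()
```

An admin could then point dpkg's pre-invoke configuration option at
the installed script, for example from a file under
/etc/dpkg/dpkg.cfg.d/ (the script path is theirs to choose).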
Thanks,
Guillem