
Re: Bug#904558: What should happen when maintscripts fail to restart a service



Someone asked for an example. Here is one I've often seen when doing a release upgrade on the many webservers I administer: Apache fails to start. I don't recall whether that currently causes Apache's postinst to fail, but either way, the upgrade really ought to continue.

Apache has a complicated config, and upstream makes backwards-incompatible changes often enough that every Debian release seems to have some. It's often not possible to update the config automatically (and even if it were, the variety of configuration management systems in use means you wouldn't want that to happen automatically). It's much easier to fix after the upgrade. And to the extent that anything depends on Apache, Apache being completely broken doesn't generally break those packages (unless they try to restart Apache themselves, e.g., Apache modules).

Now, if my local DNS cache failed to start, that would need to be fixed before continuing (since, e.g., even apt-get won't work). The same goes for an LDAP (etc.) server: without it, you may no longer have user accounts. Some things definitely lead to a cascade of failures.

I think in an ideal world, there would be two separate failure states for postinst: one for "failed, but probably safe to continue the upgrade" and one for "failed, and probably going to cause a cascade of failures (or worse)". dpkg (and the various frontends) would tell you about fail-but-continue errors after finishing, and maybe before starting, but would still continue to work.

At least for the "daemon failed to start" case, and with systemd, we can already get pretty close: have the postinst ignore the failed-to-start error (when it's of the safe-to-continue variety), then use `systemctl --failed` to get the list of daemons that failed to start.
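To make that concrete, here is a rough sketch of what such a postinst fragment might look like. It is only an illustration: `restart_or_warn` is a helper name I made up (it is not part of dpkg or debhelper), and the unit name `apache2.service` and the direct `systemctl restart` call are assumptions. Real packages normally get their restart logic generated by debhelper.

```shell
#!/bin/sh
# Sketch only: restart_or_warn is a hypothetical helper, not a dpkg or
# debhelper facility.
set -e

# Run the given restart command. If it fails, print a warning to stderr
# and return success anyway, so the package configuration step completes.
restart_or_warn() {
    if "$@"; then
        return 0
    fi
    echo "warning: '$*' failed; continuing, fix the service after the upgrade" >&2
    return 0
}

# Only attempt the restart during configuration on a system booted with
# systemd. The unit name is an assumption made for this example.
if [ "$1" = "configure" ] && [ -d /run/systemd/system ]; then
    restart_or_warn systemctl restart apache2.service
fi
```

Once the upgrade has finished, running `systemctl --failed` lists the units left in the failed state, which is the list of things to go and fix. A service of the cascade-of-failures kind (the local DNS cache, say) would not use a wrapper like this and would still let the postinst fail.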

