
Re: long daemon outages on upgrade



On Mon, 11 Mar 2002, Adrian Bridgett wrote:

> My apologies if this has been raised and settled before.
>
> Today I upgraded my firewall (62MBs worth - it's still serving as a backup
> box ATM).  All went well except that for about 10mins I lost most of my
> internet access.

Yeah, I hate that too.  It's a real downside to Debian.

> Why?  Well both bind and squid were stopped as part of the upgrade, then
> the upgrade continued on its merry way for a while, finally restarting the
> daemons 10 mins later.  Is there any need for this? Can't we just issue a
> "restart" once everything is done to minimize the service outage to a few
> seconds?

start-stop-daemon checks that the binary on disk is the same as the binary
currently running.  This check fails once the binary has been upgraded, so
the daemon has to be stopped before the new files are unpacked.
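
Roughly, the check amounts to something like the sketch below.  This is
Python for illustration only, not the actual start-stop-daemon source, and
it assumes a Linux-style /proc; the function name same_executable and the
injectable stat parameter are my own inventions.

```python
import os


def same_executable(pid, exec_path, stat=os.stat):
    """Return True if process `pid` appears to run the file now at `exec_path`.

    Hypothetical helper: compares device and inode of /proc/<pid>/exe with
    those of the file on disk. `stat` is injectable so the logic can be
    exercised without a real /proc.
    """
    try:
        running = stat(f"/proc/{pid}/exe")  # inode of the binary the process started from
        on_disk = stat(exec_path)           # inode of whatever is installed now
    except OSError:
        # Process gone, file missing, or no permission: treat as "no match".
        return False
    return (running.st_dev, running.st_ino) == (on_disk.st_dev, on_disk.st_ino)
```

Once the package has been unpacked, exec_path points at a new inode, the
comparison comes out False, and a later "stop" can no longer find the old
daemon this way.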

> I suspect that part of the issue is that this is how debhelper does things
> by default (that explains bind9 at least). dh_installinit can take a -r
> flag to say "don't restart on an upgrade (just start if it's a new install)".
> Would it be worthwhile changing the default to "restart on an upgrade" and
> maybe adding a "stop in prerm, start in postinst" option which does the
> current behaviour?   The only problem I can foresee is daemons which watch
> their configuration files all the time rather than when they are sigHUPd or
> started.

bind does not use start-stop-daemon, so yes, bind should just do the restart
thing.

> Whilst it'd be nice to have a minimum outage for most daemons, we should do
> it for the most useful ones - bind, squid, email, gpm.  Note that gpm
> already follows this idea (no long outage on upgrades).

The real problem, imho, is that apt leaves things in the unpacked and
unconfigured state too long.  I see no problem with it doing partial
configures.

I.e., during an upgrade, whenever there are X unconfigured packages, go and
do a configure run.  However, the apt author doesn't want to do this, and
has given no good reason as to why.
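
To make the idea concrete, here is a minimal sketch in Python.  It is not
how apt is written; the names upgrade, unpack, configure and max_pending are
made up for illustration, and it ignores dependency ordering, which a real
implementation would have to respect before configuring a partial set.

```python
def upgrade(packages, unpack, configure, max_pending=5):
    """Unpack packages in order, configuring in batches of `max_pending`.

    `unpack` takes one package; `configure` takes a list of packages.
    Both are hypothetical callbacks standing in for the real work.
    """
    pending = []  # unpacked but not yet configured
    for pkg in packages:
        unpack(pkg)
        pending.append(pkg)
        if len(pending) >= max_pending:
            # Partial configure run: daemons in this batch come back up now
            # instead of waiting for the whole upgrade to finish.
            configure(list(pending))
            pending.clear()
    if pending:
        configure(list(pending))  # final run for whatever is left over
```

With max_pending set to the total number of packages this degenerates to
the current behaviour: one configure run at the very end.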


