"Petter Reinholdtsen" <pere@hungry.com> wrote in message [🔎] 2flwsqqzmg4.fsf@klodrik.uio.no">news:[🔎] 2flwsqqzmg4.fsf@klodrik.uio.no...
[Roger Leigh]
On a heavily loaded or slow system, I suspect it would be highly likely some would get SIGKILL before they could shut down properly. I can't say I'm a big fan of the proposal for this reason.

I do not understand this objection. The only way I can get it to make sense is by assuming that you believe my proposal is to remove most packages' init.d scripts from the shutdown runlevels, even the ones that need special care when taking the service down. And that is not what I am proposing. I am claiming that there are daemons around that _do not_ need any special care when the service is taken down, and that these daemons do not need a script in runlevels 0 and 6 to take them down, as it is faster to let the sendsigs script kill them.
Indeed it seems redundant for the initscripts to send signals that will be sent by a later script anyway.
Obviously removing those scripts should have no impact on the other initscripts.
However, I think the concern is that more processes would end up getting SIGKILL'ed, because an initscript that does more or less what sendsigs does (SIGTERM, 5 seconds, SIGKILL) would be more likely to result in the program shutting down cleanly. Why? Well, odds are that there are plenty of idle cycles, so the program basically has the full 5 seconds to shut down. However, if that process has to share those 5 seconds with several other processes being SIGTERM'ed, it is somewhat less likely to reach the end of the clean shutdown cycle than if it were the only process being shut down. We are of course discussing processes where being SIGKILL'ed is not a really big deal, but it is still preferable to have as few SIGKILL'ed processes as reasonably feasible during shutdown.
Btw, if the 5 second wait isn't long enough for sendsigs, we can extend it. There is code there to make sure sendsigs terminates as soon as the last process it tries to kill is dead, so we could increase the timeout without affecting the normal shutdown times. It will wait from 0 to 5 seconds at the moment, depending on how long it takes for the processes to die. It would not be a problem to let it wait from, say, 0 to 10 seconds, or 0 to 30 seconds.
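
To make the "0 to N seconds" behaviour concrete, here is a rough Python sketch of the SIGTERM, wait, SIGKILL pattern with an early exit. This is not the actual sendsigs script; the names term_then_kill and is_alive, the poll interval, and the injectable send/alive parameters are all assumptions made up for illustration.

```python
import os
import signal
import time


def is_alive(pid):
    """Return True if a process with this PID still exists (hypothetical helper)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def term_then_kill(pids, timeout=5.0, poll=0.1, send=os.kill, alive=is_alive):
    """Send SIGTERM, wait at most `timeout` seconds, then SIGKILL the survivors.

    Returns the list of PIDs that had to be SIGKILLed. The wait ends as soon
    as every process is gone, so a larger timeout only costs time when some
    process is actually slow to exit.
    """
    for pid in pids:
        try:
            send(pid, signal.SIGTERM)
        except ProcessLookupError:
            pass  # already gone

    deadline = time.monotonic() + timeout
    remaining = [pid for pid in pids if alive(pid)]
    while remaining and time.monotonic() < deadline:
        time.sleep(poll)
        remaining = [pid for pid in remaining if alive(pid)]

    for pid in remaining:
        try:
            send(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # exited between the last check and the kill
    return remaining
```

Because the loop stops once `remaining` is empty, raising `timeout` from 5 to 30 changes nothing on a system where everything exits promptly.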
That does sound like a reasonable solution to the concern.

How feasible would it be to make the pause time a function of the number of processes sendsigs must reclaim? That seems to make some sense to me. Obviously there should be sane lower and upper bounds (5 seconds and 30 seconds sound fine to me).
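
As a rough illustration of what I mean, the timeout could scale linearly with the process count and be clamped to those bounds. The function name grace_period and the half-second-per-process factor are pure assumptions on my part; only the 5 and 30 second bounds come from the discussion above.

```python
def grace_period(n_procs, per_proc=0.5, lower=5.0, upper=30.0):
    """Return a timeout in seconds that grows with the number of processes.

    per_proc is a made-up scaling factor; lower and upper are the
    5 and 30 second bounds suggested in this thread.
    """
    return max(lower, min(upper, n_procs * per_proc))
```

With these assumed numbers, up to 10 processes would get the current 5 seconds, 20 processes would get 10 seconds, and anything from 60 processes up would hit the 30 second cap.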