Bug#778913: openssh-server: init (at least systemd) doesn't notice when sshd fails to start and reports success

To: Russ Allbery <rra@debian.org>, Colin Watson <cjwatson@debian.org>
Cc: pkg-systemd-maintainers@lists.alioth.debian.org, Christoph Anton Mitterer <calestyo@scientia.net>, 778913@bugs.debian.org
Subject: Bug#778913: openssh-server: init (at least systemd) doesn't notice when sshd fails to start and reports success
From: Michael Biebl <biebl@debian.org>
Date: Mon, 30 Mar 2015 01:17:59 +0200
Message-id: <55188827.3000708@debian.org>
Reply-to: Michael Biebl <biebl@debian.org>, 778913@bugs.debian.org
In-reply-to: <87k2z9rdih.fsf@hope.eyrie.org>
References: <20150221180856.12090.48131.reportbug@heisenberg.scientia.net> <20150222114626.GO3020@riva.ucam.org> <1424624968.7188.2.camel@scientia.net> <20150222175306.GP3020@riva.ucam.org> <87k2z9rdih.fsf@hope.eyrie.org>

On 22.02.2015 at 19:45, Russ Allbery wrote:
> That's the problem with forking services that don't have status
> notification.  The default is Type=simple, which per systemd.service(5):
> 
>     If set to simple (the default value if neither Type= nor BusName=
>     are specified), it is expected that the process configured with
>     ExecStart= is the main process of the service. In this mode, if the
>     process offers functionality to other processes on the system, its
>     communication channels should be installed before the daemon is
>     started up (e.g. sockets set up by systemd, via socket activation),
>     as systemd will immediately proceed starting follow-up units.
> 
> That last clause is exactly the problem that you're running into.  A
> Type=simple service says to run the command and immediately assume
> success.
> 
> Type=forking plus PIDFile should be a minor improvement, assuming sshd
> does all of its checking before it forks.  The best systemd behavior,
> though, would come from adding sd_notify support so that sshd can
> affirmatively tell systemd whether it succeeded in startup or not, and
> then using Type=notify.  Then sshd startup won't be considered complete
> until the sshd daemon calls sd_notify, and the correct status will be
> reported if it exits for some reason before doing so.
> 

Russ' explanation is excellent and describes exactly the problem you are
running into.

If you change the service file to Type=forking, as Russ advised, this
would be a slight improvement, since systemd would now detect that the
daemon failed to start. There is a gotcha, though: since you use
Restart=on-failure, systemd would keep restarting the ssh daemon on
config errors until it hits the restart limit. It would then look like
this:

# systemctl start ssh.service
Job for ssh.service failed. See 'systemctl status ssh.service' and 'journalctl -xn' for details.
# systemctl status ssh.service
● ssh.service - OpenBSD Secure Shell server
   Loaded: loaded (/etc/systemd/system/ssh.service; enabled)
   Active: failed (Result: start-limit) since Mo 2015-03-30 00:59:07 CEST; 2s ago
  Process: 11646 ExecStart=/usr/sbin/sshd $SSHD_OPTS (code=exited, status=255)
 Main PID: 3849 (code=exited, status=255)

Mär 30 00:59:06 pluto systemd[1]: Failed to start OpenBSD Secure Shell server.
Mär 30 00:59:06 pluto systemd[1]: Unit ssh.service entered failed state.
Mär 30 00:59:07 pluto systemd[1]: ssh.service start request repeated too quickly, refusing to start.
Mär 30 00:59:07 pluto systemd[1]: Failed to start OpenBSD Secure Shell server.
Mär 30 00:59:07 pluto systemd[1]: Unit ssh.service entered failed state.


Not really what we want either.

As Russ pointed out, using sd_notify would be the best option. It's too
late for jessie, but it is something to consider for stretch.
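
For illustration only, here is roughly what the notification side of
Type=notify involves. This is a minimal sketch in Python, not a patch
for sshd (which is written in C and would use sd_notify(3) or an
equivalent). It assumes the documented protocol: the service manager
passes a datagram socket path in the NOTIFY_SOCKET environment variable,
and the daemon sends "READY=1" to it once startup has really succeeded.
The function name just mirrors the C API.

```python
import os
import socket


def sd_notify(state: str) -> bool:
    """Send a state string such as "READY=1" to the service manager.

    Returns False when NOTIFY_SOCKET is unset, i.e. the process was not
    started by a manager that expects notifications.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.connect(path)
        sock.sendall(state.encode())
    return True
```

A daemon would call sd_notify("READY=1") only after its configuration
has been parsed and its listening sockets are bound, so a config error
makes the process exit before any readiness message is sent.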

So I suggest using the Type=forking option but also setting
RestartPreventExitStatus=255 [1], since 255 seems to be the return code
on config errors and I don't think it makes sense to restart in that case.

The resulting ssh.service would look like this:

[Unit]
Description=OpenBSD Secure Shell server
After=network.target auditd.service
ConditionPathExists=!/etc/ssh/sshd_not_to_be_run

[Service]
EnvironmentFile=-/etc/default/ssh
ExecStart=/usr/sbin/sshd $SSHD_OPTS
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
Type=forking
PIDFile=/var/run/sshd.pid
RestartPreventExitStatus=255

[Install]
WantedBy=multi-user.target
Alias=sshd.service


With those changes, ssh.service seems to behave "as expected" on failures.
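
The restart decision these settings aim for can be modeled roughly as
follows. This is a deliberately simplified sketch, not systemd's actual
logic: it only looks at plain exit statuses and ignores signals,
timeouts and the start limit. The function name and its parameters are
made up for illustration.

```python
def should_restart(exit_status: int,
                   prevent_statuses: frozenset = frozenset({255})) -> bool:
    """Toy model of Restart=on-failure plus RestartPreventExitStatus=."""
    if exit_status == 0:
        # Clean exit: on-failure does not restart the service.
        return False
    if exit_status in prevent_statuses:
        # Listed status (255 here, sshd's apparent config-error code):
        # the automatic restart is suppressed.
        return False
    # Any other non-zero exit status counts as a failure and is restarted.
    return True
```

With the default argument, should_restart(255) is False, so a broken
sshd_config fails once instead of looping into the start limit, while
other non-zero exit statuses are still restarted.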


Michael

[1] http://www.freedesktop.org/software/systemd/man/systemd.service.html#RestartPreventExitStatus=
-- 
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?
