Bug#727708: socket activation

To: 727708@bugs.debian.org, Steve Langasek <vorlon@debian.org>
Subject: Bug#727708: socket activation
From: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Date: Mon, 30 Dec 2013 06:57:06 +0100
Message-id: <20131230055705.GM13454@in.waw.pl>
Reply-to: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>, 727708@bugs.debian.org
In-reply-to: <20131229091055.GB32043@virgil.dodds.net>

Steve Langasek <vorlon@debian.org> wrote:
> However, I think this gets to the heart of why upstart upstream has avoided
> ever recommending the use of socket-based activation.  There are some fairly
> fundamental problems that basically halted development of socket-based
> activation in upstart, and a look at systemd usage on
> Fedora led me to believe that systemd had not overcome these problems at
> all.
Your e-mail raises serious doubts about socket activation, so I'll
respond to try to clear up some issues even though most or all points
have been rebutted in other replies in this thread.

Before I do that though, let me say that your mail doesn't show the
level of diligence that I'd expect from a seasoned Debian developer
and a member of the Technical Committee.

> If I'm not mistaken (no references to hand - sorry), systemd upstream has
> claimed in the course of discussions on debian-devel that lazy activation is
> not the purpose of socket-based activation, and that using socket-based
> activation does not require you to pay the service startup penalty at the
> time of first connection.
You get the *possibility* of lazy activation; it is one of the
reasons. [1, 2, 3] list many good reasons (rootless access to ports < 1024,
simplification of daemons, no explicit dependencies between services,
upgrades and restarts while keeping the socket open, and also lazy activation).
They are all non-conflicting. [3] shows great things that can be done
with lazy activation.

[1] http://0pointer.de/blog/projects/socket-activation.html
[2] http://0pointer.de/blog/projects/systemd.html
[3] http://0pointer.de/blog/projects/socket-activated-containers.html

There are also plans (at the implemented-but-not-merged stage)
for further functionality: spawning multiple workers using SO_REUSEPORT [4].

[4] http://thread.gmane.org/gmane.comp.sysutils.systemd.devel/15364/focus=15598
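
For readers unfamiliar with SO_REUSEPORT, here is a minimal sketch in
Python of the kernel feature itself: several listening sockets bound to
the same port, one per worker. It says nothing about how the unmerged
systemd patch in [4] exposes this; the function name and the loopback
address are made up for the illustration, and it assumes a platform
that defines SO_REUSEPORT (Linux 3.9 or later, the BSDs).

```python
import socket


def workers_on_same_port(count, host="127.0.0.1"):
    """Return `count` listening TCP sockets that all share one port."""
    sockets = []
    port = 0  # 0 lets the kernel pick a free port for the first socket
    for _ in range(count):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # Every socket sharing the port must set the option before bind().
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        sock.bind((host, port))
        sock.listen(16)
        port = sock.getsockname()[1]  # later sockets reuse this port
        sockets.append(sock)
    return sockets
```

Each socket could then be handed to a separate worker process, and the
kernel spreads incoming connections between them.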

> However, this is not borne out by my experiments
> with systemd on Fedora (which I would presume to be the go-to source for
> best practices of systemd service activation).
Fedora has something like 800 unit files. I'm sure that quite a few
could be improved. I don't think that proves anything.

> On Fedora 20, after enabling the sshd and rsync service+socket units (both
> installed but disabled by default on Fedora per their policies on running
> services out-of-the-box) and rebooting, I find that both port 22 and port
> 873 are bound by pid 1.  Only upon connecting to the socket does systemd
> actually spawn the server (in the case of sshd, it spawns it as 'sshd
> -i', so has to start up the server anew on each connection;
You can switch to non-lazy single-instance mode by 'systemctl disable
sshd.socket; systemctl enable sshd.service', and back by 'systemctl
disable sshd.service; systemctl enable sshd.socket'.

It'd have been much better to simply ask "How do I do (lazy
activation|inetd-style activation| non-lazy activation)?" without
drawing all those premature conclusions. I added some more
documentation to systemd.socket(5) [5]. I'm also happy to take any
further documentation RFEs.

[5] http://cgit.freedesktop.org/systemd/systemd/commit/?id=3cf148f

> in the case of
> rsyncd, the .service definition is completely incompatible with socket-based
> activation and any attempt to connect results in the rsyncd.socket unit
> landing in a 'failed' state).
Like Russ, Uoti, and Josselin said, it is a bug. Actually rsyncd.service
is fine, but rsyncd.socket is broken. This has been fixed for F19 [6],
but apparently not for F20. F20 came out a week ago, so probably
nobody has noticed so far.  Bug filed [7].

[6] https://bugzilla.redhat.com/show_bug.cgi?id=1018520
[7] https://bugzilla.redhat.com/show_bug.cgi?id=1047236

> If what one is trying to accomplish here is to provide a replacement for
> inetd, then this is perfectly sufficient.  But if one is trying to use
> socket-based activation for the claimed purpose of ensuring service startup
> ordering without the need to declare explicit dependencies, then you must
> accept the penalty of lazy activation
No, this is a complete misunderstanding.

> - which is almost never acceptable in
> a server context (where the purpose of the machine is to run the services
> that you have configured, and they should therefore not be started lazily),
Complete non sequitur. For many services the initialization time
(usually in milliseconds) is an acceptable delay on the first connection.
Probably it is even unnoticeable compared to network latency. Please remember
that this is not for sysvinit scripts which do sleep in a loop. [8] is a
nice example of lazy socket activation of services on a massive scale.

[8] https://www.getpantheon.com/blog/pantheon-running-over-500000-systemd-units

> and suboptimal even in a client context (since not starting services that
> are on the critical path for boot until the client requests them will
> potentially lead to a significant boot-time slowdown, if all the services
> are doing this).
No. If you have all services started lazily, which could happen on a server,
boot time is great: the system is up in a couple hundred milliseconds.
When a connection comes in, things necessary for this connection are
started and nothing else. In some cases you want that, in others not.
Lazy activation you can turn on and off, so there's really no issue
here.
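
The reason a lazily started service does not lose the first connection
is that the kernel queues connections on a listening socket until
somebody calls accept(). A rough Python sketch of just that mechanism
follows; a thread stands in for the service and the calling code stands
in for the manager, so this is an illustration under those assumptions
and not a description of systemd internals. All names are invented for
the example.

```python
import socket
import threading


def demo_queued_connection():
    # The "manager" binds and listens, but nobody calls accept() yet.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: kernel picks a free port
    listener.listen(16)
    addr = listener.getsockname()

    # A client connects and sends a request before any service exists.
    client = socket.create_connection(addr)
    client.sendall(b"ping\n")

    # Only now is the "service" started; it receives the listening socket
    # and finds the connection already waiting in the backlog.
    def service(sock):
        conn, _ = sock.accept()
        with conn, conn.makefile("rb") as reader:
            line = reader.readline()
            conn.sendall(b"pong:" + line)

    worker = threading.Thread(target=service, args=(listener,))
    worker.start()
    with client.makefile("rb") as reader:
        reply = reader.readline()
    worker.join()
    client.close()
    listener.close()
    return reply
```

The client only sees a delay equal to the service's start-up time; the
request itself is not dropped.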

> As far as I've been able to tell, the only solutions that would allow
> non-lazy socket-based-activation of services in systemd all introduce
> significant boot-time races, whereby it is no longer assured that systemd
> will bind to the socket
Please look at [9]. Sockets will be bound before services are started.
Also, even if it happened that systemd and the service itself, or two
different services for that matter, would try to bind the same socket,
it's hardly the end of the world. You'd get an error in one of the units.
In the normal case automatic ordering requirements added by systemd
will be enough.

[9] http://www.freedesktop.org/software/systemd/man/bootup.html#System%20Manager%20Bootup

> (and passing the socket information via the
> environment) before starting the service.  Indeed, when I looked at this
> problem on an earlier version of Fedora, I found what I believe to be a
> latent security problem in the cups units, because it was nondeterministic
> whether the service would start with sockets passed from systemd, or a
> different set of sockets as defined in the cups config!
I fail to see a "security problem" here. It's not like cups opening its
sockets the way it has done for years suddenly starts being insecure,
even if systemd were to compete with cups here. All that said, cups.socket
has Before=cups.service, so systemd should always bind the socket earlier
or not at all.

> When I mentioned this to Lennart at DebConf this year, his response was that
> "cups was special".  Well, after further investigation, I don't think it's
> true that cups is special. 
cups.service is (a) socket activated (at least sometimes), (b) hardware activated
when usb printers are plugged in, (c) dbus activated when necessary, and/or
(d) started as part of the boot process. It *is* special because most services
don't have this complexity.

And actually doing this all in the init manager makes sense, because if
any two of those things happen at once, there's still no race condition.
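
A toy model of that argument, in Python: when a single process owns the
unit state, start requests from different triggers are serialized, so
concurrent triggers produce one start and the rest become no-ops. This
is only a sketch of the reasoning; the class, its methods, and the
trigger names are hypothetical and do not mirror systemd's actual job
handling.

```python
import threading


class ToyManager:
    """One owner of unit state; all start requests go through one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active = set()
        self.spawned = []  # (unit, trigger) for each start that really ran

    def request_start(self, unit, trigger):
        with self._lock:
            if unit in self.active:
                return False  # already running: the trigger is a no-op
            self.active.add(unit)
            self.spawned.append((unit, trigger))
            return True


def fire_all(manager, unit, triggers):
    """Fire every trigger from its own thread, as if they raced."""
    threads = [
        threading.Thread(target=manager.request_start, args=(unit, trigger))
        for trigger in triggers
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return manager.spawned
```

Which trigger wins is not deterministic, but exactly one start happens.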

> I think systemd socket-based activation is snake
> oil, that does not do what was promised without introducing hidden
> trade-offs which no one has been forced to acknowledge because too few
> developers are making use of this feature today to expose the integration
> problems.
Wow.

> Of course, it's entirely possible that I've misunderstood something here, so
> I welcome your investigations with lbcd.  I'm very interested to see if your
> understanding of systemd socket-based activation best practices matches my
> own, and to have an opportunity to experiment with socket-based activation
> in the more relevant environment of Debian unstable rather than Fedora.
> 
> FWIW, if indeed socket-based activation in systemd can't actually be used
> for anything besides a glorified inetd,
Wow.

> I think that has implications for
> the discussion about daemon readiness protocols.  The argument for systemd
> seems to be "use sd_notify, or if you don't like having a library dependency
> then just use socket-based activation which is better anyway".  But I'm sure
> there will be upstreams who don't want lazy initialization any more than
> they want an external library dependency.
Certainly true (if we ignore the part about lazy activation) for some
upstreams. But it's not an argument for *this* discussion, unless
those fears have basis in truth.

Uoti Urpala <uoti.urpala@pp1.inet.fi> wrote:
> On Sun, 2013-12-29 at 10:37 -0800, Steve Langasek wrote:
> > diligently keeping the two implementations in sync.  Since
> > LISTEN_FDS/LISTEN_PID is the defined API for systemd passing the socket
> > information to the service, for systemd to ever fail to pass this socket
> > information, resulting in the service deciding that it's not *actually*
> > running under systemd and should fall back to a different mode, is
> > potentially a very serious problem.
> 
> If you want to make sure your service never tries to start without
> socket activation, it should have Requires=foo.socket; none of the
> default relations are strong enough to strictly prohibit starting
> without a socket.
Exactly. All this is because systemd doesn't *force* you to use socket
activation. If you have a .socket file with Accept=no and a .service
file, you can have the service enabled alone (so no socket activation),
both enabled (socket activation), or just the socket enabled (lazy
socket activation). Then you could even have a second .socket
file with Accept=yes and a .service template file (with @ in the name),
and have inetd-style per-connection activation. With choice comes
complexity.
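
For reference, the service side of the LISTEN_FDS/LISTEN_PID protocol
mentioned above is small. Below is a minimal Python sketch of what
sd_listen_fds(3) checks: LISTEN_PID must match the current process, and
the passed descriptors start at 3. The function names and the fallback
address are assumptions made for the example; a real daemon would
normally use the library call instead.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first passed descriptor, as in sd_listen_fds(3)


def inherited_fds(environ=None, pid=None):
    """Return the descriptor numbers passed by the manager, or []."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    try:
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []  # the variables were meant for another process
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        return []  # unset or malformed: not socket activated
    if count < 0:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


def listening_socket(fallback_addr, environ=None, pid=None):
    """Use the first inherited socket if present, else bind our own."""
    fds = inherited_fds(environ, pid)
    if fds:
        return socket.socket(fileno=fds[0]), True
    # Service enabled alone, without its .socket unit: bind as usual.
    return socket.create_server(fallback_addr), False
```

The second branch is the "service enabled alone" case from the list
above; whether a daemon keeps such a fallback at all is its own choice.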

Zbyszek