
Bug#727708: init system thoughts



Steve Langasek <vorlon@debian.org> writes:

> The purpose of failsafe.conf is to ensure that services which have not
> been converted to the native format, but instead provide initscripts
> that are called upon reaching runlevel 2, are started at the right time
> - so that they aren't unreliable due to racing the network stack.  This
> is an existing bug on sysvinit systems, but the race is hit much less
> frequently there because sysvinit is slower.

Okay, thanks, that's pretty much what I'd thought.  Yes, in systemd that's
what one should address via network-online.target plus some sort of local
integration that implements whatever "network is up" policy you want to
enforce.
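
As a minimal sketch of the consuming side, assuming a hypothetical local
daemon (the unit name and the ExecStart path below are made up for
illustration), a unit that must not start until the network is considered
up would pull in network-online.target and order itself after it:

```ini
# /etc/systemd/system/example-daemon.service
# Hypothetical unit; the name and the ExecStart path are assumptions.
[Unit]
Description=Example daemon that needs a configured network
# Pull network-online.target into the boot transaction...
Wants=network-online.target
# ...and do not start until that target has been reached.
After=network-online.target

[Service]
ExecStart=/usr/local/sbin/example-daemon

[Install]
WantedBy=multi-user.target
```

What "reached" means here is exactly the local policy question: the target
is only as meaningful as whatever is ordered before it.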

Given that ifupdown is still by far the best way to manage networks on
servers, and most of these init issues are most likely to happen on
servers, I think we should add some sort of ifupdown integration with the
network-online.target in the Debian systemd package that matches Debian's
current definition of the LSB $network target.

systemd's upstream is entirely correct that $network is rather
underspecified from an LSB perspective, but Debian *does* have a
definition, and the principle of least surprise says that we should
duplicate that definition in a new init system.  I assume that's what
failsafe.conf is effectively doing for upstart.
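
One possible shape for that integration, purely as a sketch: a oneshot
service ordered before network-online.target that blocks until ifupdown
has brought the interfaces up.  The unit name and the wait-for-ifupdown
helper are hypothetical (the helper is not an existing program), and what
the helper checks is the part that would have to encode Debian's
definition of $network.

```ini
# ifupdown-online.service (hypothetical unit name)
[Unit]
Description=Wait until ifupdown has configured the network (sketch)
# Hold back network-online.target until this unit has finished.
Before=network-online.target

[Service]
Type=oneshot
# Hypothetical helper that returns once the "network is up" condition
# holds; what it checks is the policy decision.
ExecStart=/usr/local/sbin/wait-for-ifupdown
RemainAfterExit=yes

[Install]
# Only runs when something actually wants network-online.target.
WantedBy=network-online.target
```

With this layout, systems where nothing asks for network-online.target pay
no boot-time cost for the wait.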

> I am left with the concern that I seem to be the first person to ask
> this question, in discussions with the TC, six months after AIUI the
> systemd maintainers considered systemd ready to be made the default.

Well, first, that's why we have these discussions.  More eyes on things like
this are going to find issues that we need to deal with.  And your
expertise in the sorts of issues Ubuntu encountered is very helpful.

And, second, we're talking about problems that will happen with badly
written local init scripts and are less likely to happen with packages in
the archive (which are more likely to be well-written).  I'm not
particularly surprised that systemd early adopters don't have a lot of
badly-written local init scripts that they continue to use.

> So I fear that switching to systemd by default is going to result in
> easier package maintenance for early adopters, but a much buggier
> experience for our users.  If we decide for systemd, what do you think
> is the right way to mitigate such problems for jessie?  Some of these
> issues are only going to be seen once people start making use of systemd
> in anger with their existing server configs (e.g., an ec2 VM with a
> simple disk and network config is not going to expose these problems),
> and I don't really think we want to do this by way of switching the
> default in unstable and waiting for the bugs to roll in.

I think there are multiple tiers of answers to this question.

Changing init systems is going to be disruptive.  There's simply no way
around that.  It was disruptive when we switched to dependency-based boot,
which also surfaced tons of problems with local init scripts and caused a
lot of users to complain.  It's going to be disruptive when we switch to
any other new init system.  That's just the nature of the beast.

This is one of the reasons why I think we should support booting jessie
with sysvinit.  This parallels the migration path that we took for
dependency-based boot.  We make it clear what the new default is, but if
people run into trouble, they can always fall back to sysvinit to get
their stuff working again.  It gives people a release cycle of leeway
before they *have* to make sure their systems work with the new init
system since, indeed, problems with local hacks are unlikely to start
showing up until we release the new init system in stable.

I therefore think we should use a very similar approach to what we did
with dependency-based boot.  We're already in the first stage of that:
systemd is available as an option in unstable.  A bunch of people are
using it, and have been using it for a while, and are reporting problems.
The next step will be to start pushing for broader adoption, and possibly,
if we can figure out a good way to do it so that people can switch back,
have dist-upgrade switch systems to systemd.  (Of course, we would do this
after we've hammered out the Policy work.)

Then, when we release, there will obviously need to be a discussion of
this in the release notes, as well as instructions on how to fall back to
sysvinit, and possibly additional notes about common problems based on
what we uncover from early upgrade reports.

So, in other words, I do think a large component of the solution is to,
indeed, switch in unstable and let the bugs roll in, which is how Debian
tests everything.  We can stage things somewhat more (for example, I think
we should actively encourage Debian developers to switch to the new
default in advance and report problems), but at the end of the day that's
going to be a large part of the testing process, just like it was with
dependency-based boot.

Now, you are entirely correct that integration with upstart would probably
be easier and moderately less disruptive than integration with systemd
simply because Ubuntu has already done a large portion of that work.  Red
Hat and SuSE are also doing systemd conversions, and we will be able to
share experiences with them and do some amount of mutual testing, but that
isn't the same as doing a conversion based on the packages that we share
with Ubuntu.  upstart certainly has a head start there.

However, that said, I believe the integration of systemd will actually be
easier in the long run because upstart is rather... weird.  Once we get to
the point where we're not just trying to get legacy init scripts working
the same way and Debian developers are writing native configurations, I
think systemd will do much better.  As discussed at some length in the
other branch about upstart's event model, I think upstart's way of
handling dependencies is very strange and difficult to wrap one's mind
around.  There is a range of weird gotchas that are non-obvious if you've
not used it extensively and if you've not wrapped your mind around the way
it's supposed to work.

See, for example, the other thread about how one should declare a
dependency in an upstart configuration such that the service won't be
started (including by manual administrator action) unless the dependency
has been made available.  This is straightforward in systemd and appears
to be surprisingly confusing in upstart.
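
On the systemd side, a minimal sketch of such a declaration, assuming two
hypothetical units named frontend.service and backend.service (neither
name comes from the thread), combines Requires= with After=:

```ini
# frontend.service (hypothetical); backend.service is also hypothetical.
[Unit]
Description=Service that must not run without its backend
# Starting this unit, at boot or via "systemctl start", also starts
# backend.service.
Requires=backend.service
# Wait for backend.service to finish starting; combined with Requires=,
# a failed start of backend.service means this unit is not started.
After=backend.service

[Service]
ExecStart=/usr/local/sbin/frontend
```

The same two lines apply whether the start request comes from the boot
sequence or from an administrator, which is the property under discussion.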

-- 
Russ Allbery (rra@debian.org)               <http://www.eyrie.org/~eagle/>

