Bug#727708: systemd (security) bugs (was: init system question) [and 1 more messages]
Uoti Urpala writes ("Bug#727708: systemd (security) bugs (was: init system question)"):
> [Ian Jackson:]
> > Here are a couple of exciting snippets:
> > https://bugzilla.redhat.com/show_bug.cgi?id=708537
> > Problems with runlevel emulation doing mad things. It isn't clear
> > to me whether this bug is a symptom of a fundamental problem with
> > systemd's state-based dependency model, or whether it's simply a
> I think it's completely obvious that there is no "fundamental problem".
> I wonder what could make you consider it a possible symptom of one.
> > missing feature or perhaps even wrong default configuration. But
> > the bug has been open for some time.
> My guess is that most people do not consider that "exciting" or really
> care - thinking of system states in terms of "runlevels" is mostly
> obsolete, and the flaws do not bother many people in the cases where
> backwards compatibility is still needed.
Statements like this are part of what makes me think this might be a
fundamental problem. When a systemd proponent tells me that a
particular use pattern is unimportant or wrongheaded, I tentatively
infer that systemd cannot support it properly.
It seems to me that the difficulties with the runlevel emulation are
likely to affect other similar use patterns too. The problems don't
seem specific to the nature of runlevels. Perhaps they are specific
to the way runlevels are emulated by systemd but in that case that
emulation should surely be fixed.
Michael Stapelberg writes ("Bug#727708: systemd (security) bugs (was: init system question)"):
> [Ian Jackson:]
> > There are a couple of filesystem races (CVE-2012-1174, CVE-2013-4392)
> > which I think a concurrent init system author ought really to be
> > competent to avoid. (And the system should be designed, so far as
> > possible, to reduce the opportunity for such races.)
> “a concurrent init system author” sounds strange on multiple counts:
> systemd was not written by one author. It is also not concurrent (in
> fact it is single-threaded and only links to pthreads to call sync
> asynchronously on shutdown), but event-based.
systemd, like upstart, is concurrent in that it does more than one
thing at once. In particular, it does concurrent startup of services.
One of the main potential advantages of a new approach to service
management is that it is less racy. This advantage depends on the
authors of the replacement having thought about the problems in the
right way, and not written racy code.
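To make "racy" concrete, here is a minimal sketch of a check-then-use
filesystem race and an atomic alternative. It is a generic illustration
of the bug class, not the code behind either CVE mentioned above; the
function names are hypothetical.

```python
import os
import tempfile


def create_racy(path):
    # Check-then-use: another process can create `path` (for example as a
    # symlink to some other file) between the lexists() check and open().
    if os.path.lexists(path):
        raise FileExistsError(path)
    return open(path, "w")


def create_atomic(path):
    # O_CREAT | O_EXCL asks the kernel to check and create in one step.
    # With O_EXCL, an existing symlink at `path` is not followed; the call
    # fails with EEXIST instead.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    return os.fdopen(fd, "w")
```

The second form leaves no window between the check and the creation,
which is the sense in which a design can reduce the opportunity for
such races.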
> As for competency, I am sure that everybody involved has learned
> their lesson and will avoid such issues in the future.
My point was that someone who is writing an init system for concurrent
startup and dynamic service management needs to have a good
understanding of concurrent system design, and in particular of race
hazards. I wouldn't expect a person or people who had such an
understanding to make many mistakes of the kind seen here.
> > The "systemctl status" resource usage DoS (CVE-2012-1101) is an
> > understandable resource leak, given systemd's design. But for me it
> > raises this question: why is the system designed in such a way that
> > the critical pid 1 is required to implement functionality (and
> > unprivileged-facing interfaces) in which such a resource leak is (a) a
> > likely programming error and then (b) exposed so as to be exploitable.
> a) I think “a likely programming error” is an exaggeration. Do you have
> data on how often there were resource leaks in systemd?
AIUI from reading the advisories (I haven't read the code) this was a
simple memory leak bug.
> b) I am unclear on how exactly this was exploitable, and the bugs lack
> explanation unfortunately.
AIUI the exploit works as follows: the attacker runs "systemctl status
<blah>" where <blah> is a random invented unit name. They then repeat
this millions of times. Each time systemd allocates memory to record
this phantom unit; this memory is never freed. Eventually some kind
of resource is exhausted (e.g. RAM, address space, ulimit) and systemd
stops working properly.
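The pattern described above can be sketched with a toy model. This is
not systemd code: ToyManager, status() and flood() are hypothetical
names, and the `collect` flag merely stands in for whether a cleanup
step runs after each lookup.

```python
class ToyManager:
    """Toy stand-in for a service manager's unit table (not systemd code)."""

    def __init__(self, configured):
        self.configured = set(configured)  # units that really exist
        self.loaded = {}                   # name -> in-memory record

    def status(self, name, collect=True):
        # Looking up a unit creates a record, even if the name is invented.
        self.loaded.setdefault(name, {"name": name})
        found = name in self.configured
        if collect and not found:
            # Cleanup step: drop the record for a unit nobody needs.
            del self.loaded[name]
        return "loaded" if found else "not-found"


def flood(manager, n, collect):
    # Query n distinct invented unit names, as the attacker would.
    for i in range(n):
        manager.status(f"phantom-{i}.service", collect=collect)
    return len(manager.loaded)
```

In this model, flood(ToyManager(["ssh.service"]), 1000, collect=False)
leaves 1000 records behind, while the same call with collect=True leaves
none.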
> Furthermore, I think Lennart’s explanation of why arbitrary units must
> be able to be created is fair:
How was CVE-2012-1101 fixed? That bug doesn't show the patch.
I went and looked and found this:
It looks like the appropriate gc function was simply not called. As I
thought, an understandable programming error, but one which exists
only because of (perhaps essential) complexity in the code.
More importantly it is one which is exploitable only as a consequence
of the questionable design decision to expose pid 1 to ordinary users.
> > AIUI the journald integer overflow (CVE-2013-4391) is a remotely
> > exploitable bug, if you have configured journald to allow remote
> > logging. (I assume this isn't the default configuration but haven't
> > checked.)
> journald does not provide remote logging. See
Err, so I don't understand then why the CVE reports this vulnerability as
| Integer overflow in the valid_user_field function in
| journal/journald-native.c in systemd allows remote attackers to
| cause a denial of service (crash) and possibly execute arbitrary
| code via a large journal data field, which triggers a heap-based
| buffer overflow.
If journald doesn't support remote logging, how is this vulnerability
remotely exploitable?
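For reference, the general bug class named in that advisory text (an
integer overflow in a size calculation leading to an undersized heap
buffer) can be sketched as follows. This is a generic illustration
under assumed numbers, not the journald code: the function names and
the 16-byte header are hypothetical, and Python's unbounded integers
are masked to imitate 32-bit unsigned arithmetic.

```python
MASK32 = 0xFFFFFFFF  # imitate a 32-bit unsigned size type


def alloc_size_wrapping(field_len, header_len=16):
    # Unchecked addition: a huge field_len wraps around to a tiny size,
    # so the buffer allocated is far smaller than the data copied into it.
    return (field_len + header_len) & MASK32


def alloc_size_checked(field_len, header_len=16):
    # Reject lengths that would wrap before doing the addition.
    if field_len > MASK32 - header_len:
        raise OverflowError("field too large")
    return field_len + header_len
```

Here alloc_size_wrapping(0xFFFFFFF8) yields 8, whereas the checked
version refuses the same input.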
> Personally, I don’t know about every little detail in fd passing
> either. If I read you correctly, you seem to be saying one needs to be
> an expert in a given area before being allowed to write code in it. I
> think it works the other way around: by writing code in that area, you
> become an expert in it.
What a startling statement. This is not some desktop toy we are
talking about; this is critical core system infrastructure.
I would prefer my pid 1 to have been written by experts. It appears
that you are saying that systemd wasn't and that this isn't important!
> Instead of focusing on the actual security issues, what I’d much rather
> look at is the process with which such bugs are fixed. I.e. are security
> problems acknowledged as such, are they fixed clearly and in a timely
> manner? Are there enough eyes looking at the project to uncover, report
> and collaborate in fixing the issues?
I don't think having a functioning security response process is a
substitute for good system design and high initial code quality.
And I don't think that "many eyes" is as helpful as you apparently
think. Security code review is bloody hard work.
> Also, and I think that should go without saying, if this branch of the
> discussion is considered as reasoning against systemd in the decision
> process, I’d like to see similar data on the other init systems :).
You are of course welcome to go and find that information.