
Re: engineering management practices and systemd (Re: Installing an Alternative Init?)



I have been informed off-list that some might misinterpret something I
wrote here, so I will attempt to clarify a few things.

On Fri, Nov 14, 2014 at 8:59 AM, Joel Rees <joel.rees@gmail.com> wrote:
> On Thu, Nov 13, 2014 at 11:04 PM, Tanstaafl <tanstaafl@libertytrek.org> wrote:
>> On 11/12/2014 5:18 PM, Andrei POPESCU <andreimpopescu@gmail.com> wrote:
>>> On Mi, 12 nov 14, 15:43:09, Tanstaafl wrote:
>>>>
>>>> Sounds good to me, but in reality, since the default *and only* init
>>>> system for the last very many years was Sysvinit (this extremely salient
>>>> point seems to be completely and totally lost on the systemd
>>>> proponents), I think only systemd and sysvinit need to be there... but
>>>> allowing for additions once required bugs implementing them are resolved
>>>> as fixed.
>>>
>>> You're forgetting about:
>>
>> It doesn't matter Andrei...
>>
>> 1. The *default* is what we are discussing.
>>
>> The *default* for Debian has been sysvinit since - forever?
>>
>> 2. The systemd proponents pushed to make systemd the *new* default - a
>> massively major change from *all* previous releases since... forever?
>>
>> 3. A bug was opened to allow for the ability to allow a clean install to
>> be performed with systemd on wheezy, while sysvinit was still the default.
>>
>> It should have been made mandatory that the systemd folks get this bug
>> fully resolved and functional *on wheezy*, *and* commit to maintaining
>> this ability in jessie, as a pre-condition to even getting the question
>> of a change of the default init system for jessi on the ballot.
>>
>> Anything else, as I said, makes no sense.
>
> To explain to the systemd advocates who refuse to understand the
> engineering questions, this is the real engineering mistake in
> systemd.
>
> The engineering question keeps getting sidetracked by people who
> assert that we are talking about technical details, and then proceed
> to question (foolishly) the necessity of modularity, or (rightly) the
> meaning of modularity, etc. That all was and is still relevant, but if
> proper engineering principles had been followed in bringing systemd
> in, the open development practices our larger community claims as its
> reason for existence would have taken care of the technical details.
>
> Maybe it would help if I said, "engineering management", instead of
> just "engineering", although you really can't separate management from
> engineering.

The person who wrote to me off-list says that I have misrepresented
the Fedora community's reaction in my description of events.

This is not an attempt at a linear history of systemd adoption in
Fedora. It is simply a few of my observations, from there when I was a
user and from here in the two years since I left.

> It was clear much longer than four years ago how deeply the changes
> would affect the infrastructure which defines the system, and on which
> the stability of the system depends. Every daemon package would be
> affected, even if the systemd project had restrained themselves to
> working only on the init part of the infrastructure. Every daemon
> package needed to be fixed to the new interface, and tested under it.
> (Many still need that.)

This is not disparaging, it is acknowledging reality. If I were to
develop an alternative init, add full daemon/service management, tie
it to device management, login management, error logging, etc., the
result would impose the same level of re-implementation and testing
burden across the OS.

I wouldn't do it that way, of course, but that's the level of
engineering cost the approach they took incurred.

> They didn't, of course they didn't,

... restrain themselves, that is.

> they've admitted many times that
> the init system was not their ultimate target.

Links to Poettering's blog have been posted. It's hard to assume that
he was intending to speak in the absurd, or that he was
misrepresenting the goals of the project he leads.

> Therefore, every package that uses or provides authentication got
> entangled in the changes and needed both careful editing and extensive
> testing. The testing is still to be completed, because we are not
> talking about context-free grammar simplicity here in any of the
> parts.

I know that the systemd proponents want to claim that testing is
almost finished, but, hey, we all know how it is when we tell them
that the project is 90% complete. It's 90% of what we can see, and
more than half the time we aren't seeing anything close to the real
extent of what remains. Top-down was supposed to fix that, objects
were supposed to fix that, declarative programming was supposed to fix
that, but programming projects tend to be like cave systems. The more
we get done, the deeper we dig, the more we discover has to be done
before we are finished.

This is one of the very reasons for the existence of open source
software, that we can decide, when it is our own project, this is
where we stop for now. But just because we stop doesn't mean we are
finished.

I know the systemd proponents really want the job to be mostly
complete, and most of what they see is mostly complete. It's what they
don't see that is the problem, and I and many others here think we see
a lot more than they are admitting to seeing.

> Then every tool, package, application, etc., that used the
> system-supplied copy/paste buffer got entangled, and, while they were
> at it, they decided to try to absorb pretty much the entirety of
> inter-process communication.
>
> Careful re-write, extensive testing. The testing won't be complete yet
> by the very nature of where they are changing things.

This is the reason. Defining an API for things that are already being
done touches basically everything.

> This all would have been okay for them, if they had followed proper
> engineering (management) principles. As long as they were an
> independent maverick, they could do what they want. That was correct,
> that was good.

I want to repeat that. As long as they kept their work out of the
mainstream, it was no problem. They could refine their API as they
went and the repercussions were limited to their own source tree. That
means they could redefine the API as necessary without interfering
with the day-to-day operations of thousands, or even hundreds of
thousands of users.

The more users you have, the harder it is to fix an API error.

> For Fedora, where it was first brought into a major distribution, the
> proper way to bring it in would have been to break policy and set up a
> parallel fork.

They did not make a parallel fork. Policy had long been against such
behavior. They sort-of considered it, but it was against policy, and
the systemd project people were optimistic.

Optimism is not always a bad thing, of course. But it would have been
better to ...

> Keep the damage that necessarily occurs with this kind
> of thing restricted to a sub-community willing _and_ _able_ to deal
> with the damage by cooperating in the separate bug tracking, triage,
> etc. Keep the questions of direction somewhat independent so that the
> systemd side and the "legacy" side don't have to be in lock-step on
> every tiny detail. Allow separation of source so that regressions and
> merges can be safely scheduled and safely carried out. Etc.

They did not do that. It was assumed to be "too expensive" and against
policy, anyway.

My assertion is that, by the time this story ends, the costs of this
approach will have well outweighed the costs of the parallel fork that
they rejected.

> If they had done it right from the start, just about now, they would
> be ready for beginning the integration process in earnest,

What I'm saying here is that they were about four years too early in
starting the integration process.

> which would
> mean that about the beginning of this year, when the question came
> formally before the committee here, Fedora would have been
> implementing their own version of an installer that would allow
> choosing the new init system on install.

One reason that did not happen is that they refused to even consider
making room for any alternative init.

Up until systemd, there was no API for inits. Swapping them in and out
was not that hard because the problems were mostly not in package
dependencies. Sure, modifying the init scripts for the daemons you had
running was a bit of a pain, and a bit of lost and repeated work, but
you didn't really have to touch the upstream source in most cases.

Well, except for the init scripts themselves.

With systemd there is now an API. That goes deeper into upstream
source, and that's why systemd had to push so hard in the community,
to convince all the upstream authors to make the changes.
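
As a rough illustration of what "deeper into upstream source" can mean,
consider readiness notification. A daemon that wants to tell systemd it
has finished starting sends a datagram to the socket named in the
NOTIFY_SOCKET environment variable. The sketch below is only that, a
sketch: the function name notify_ready is hypothetical, and it assumes
the documented sd_notify datagram protocol rather than showing any
particular daemon's code.

```python
import os
import socket


def notify_ready(state=b"READY=1"):
    """Send a state string to the service manager, if one is listening.

    Hypothetical helper. Returns False when NOTIFY_SOCKET is unset,
    which is the case under sysvinit or any other init.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading "@" denotes a Linux abstract-namespace socket.
    if path.startswith("@"):
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.connect(path)
        sock.send(state)
    return True
```

The point is not the dozen lines. The point is that they live in the
daemon's own source, not in an init script beside it, so the daemon's
upstream author has to carry them and test them.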

I think that was a tactical error and an engineering (management)
error, and the reasons why are beginning to become apparent: to users
now, to upstream developers shortly.

systemd proponents will disagree with me.

We'll see, but we already see a tendency to define the hard problems
as "irrelevant", "too small an audience", "they're just a bunch of
idealist luddites and flat-earthers", ...

http://free-is-not-free.blogspot.jp/2013/06/the-world-is-flat.html

... and "if you aren't giving us bug reports, it's your fault!"

It's my fault, even if I don't approve of the design, the API, the
implementation, or the project management.

The systemd project is way too ambitious.

Maybe I'm just gun-shy, but I've watched a few death-march projects
start with just this kind of over-reach. If you don't know what a
death-march project is, there is currently a description on wikipedia:

http://en.wikipedia.org/wiki/Death_march_%28project_management%29

> The systemd folks were too impatient for whatever reason. They pointed
> out that Linux itself was not done that way, but their version of
> history is most politely described as colored by their desires for
> quick success for their project.
>
> "Throw it against the wall and see what sticks!" engineering is only
> appropriate for maverick projects.

I assume, from what some have said off-list, that there are some
systemd proponents who don't want to believe that systemd was
developed this way. Sorry, but I've watched the code change, the APIs
change, the design expand as things seem to be working. I'm calling it
as I see it.

> (And it is very appropriate for
> maverick projects.) Fedora may be testing for Red Hat, but it is still
> mainstream in terms of the number of users and the broad spectrum of
> the user base.
>
> So Fedora is not, itself, really ready yet, except for two groups, a
> certain group of workstation users who want and are willing to use
> fairly new, relatively high-end hardware, including enough RAM and
> processors to use VMs for certain things, and a certain group of
> server-farm users who want and have budget for similarly recent,
> relatively high-end hardware and lots of RAM and processors for lots
> of VMs.

This part seems to be causing serious angst.

I am not saying that it doesn't work for anyone not in these two
groups, any more than I am saying that anyone in these two groups is
now safe. The odds improve significantly if you are in the groups that
the systemd project has been giving attention to, currently.

The problem is that, as the API changes, they have to revisit every
case they have visited. The most optimistic thing I can say is that
their regression testing infrastructure has been good enough for it so
far.

> The rest of the Fedora users jumped ship.

I guess that sounds more dramatic than is comfortable for some people.

I don't know how to test my measurements, and I won't claim more than
ten percent jumped ship, but I was not alone.

> Now, you who complain that Fedora and Red Hat are off-topic here,
> remember that Debian is inheriting the results of Red Hat's work. Work
> that did not allow a choice of inits on install, as one example of
> where their work is incomplete. That choice was something we still
> haven't got quite right yet, after how many months?

If systemd had its own time-table and so forth, the integration
issues would be mostly matters of diffs and some macro-preprocessing.
Instead, it bites into the dependency tree, and we are struggling with
getting a natural install that allows other inits.

Even if we didn't want to support init choice, Jessie has no way to
test certain paths of regression. That's bad engineering management,
because Wheezy is a different system.

> Debian set up kfreebsd to deal with these kinds of issues, relative to
> replacing the linux kernel with the freebsd kernel. Setting up a
> debian-sysd would not have been as extensive a project as setting up
> kfreebsd, but would have been similar, because we are basically
> pulling in a new layer between the kernel and the rest of the system.
>
> The systemd folks claimed it wouldn't be necessary. If we had looked
> at the situation with an unbiased eye, we would have known they were
> being overly optimistic. We still turn a blind eye to the problems,
> claiming that the only problems are a bunch of recalcitrant
> noisemakers like yours-truly.
>
>> It is *the systemd proponents* that wanted this change, so it should be
>> *on them* to do the work. Period.

I apologize to anyone who feels cheated for having that "wall of text"
expanded and re-posted, but if I am misunderstood, I wanted to try
once more to clear things up.

I may be too pessimistic. If so, I'm still not going to apologize,
because the next time someone comes up with some similar "great idea",
I'd like the community to be at least a little more conversant in the
reasons for separating the "great idea" from the mainstream project.

-- 
Joel Rees

Be careful when you look at conspiracy.
Look first in your own heart,
and ask yourself if you are not your own worst enemy.
Arm yourself with knowledge of yourself, as well.

