
Re: engineering management practices and systemd (Re: Installing an Alternative Init?)



On Sun, Nov 16, 2014 at 09:43:23PM +0900, Joel Rees wrote:
> I have been informed off-list that some might misinterpret something I
> wrote here, so I will attempt to clarify a few things.
> 
> On Fri, Nov 14, 2014 at 8:59 AM, Joel Rees <joel.rees@gmail.com> wrote:
> > On Thu, Nov 13, 2014 at 11:04 PM, Tanstaafl <tanstaafl@libertytrek.org> wrote:
> >> On 11/12/2014 5:18 PM, Andrei POPESCU <andreimpopescu@gmail.com> wrote:
> >>> On Mi, 12 nov 14, 15:43:09, Tanstaafl wrote:
> >>>>
> >>>> Sounds good to me, but in reality, since the default *and only* init
> >>>> system for the last very many years was Sysvinit (this extremely salient
> >>>> point seems to be completely and totally lost on the systemd
> >>>> proponents), I think only systemd and sysvinit need to be there... but
> >>>> allowing for additions once required bugs implementing them are resolved
> >>>> as fixed.
> >>>
> >>> You're forgetting about:
> >>
> >> It doesn't matter Andrei...
> >>
> >> 1. The *default* is what we are discussing.
> >>
> >> The *default* for Debian has been sysvinit since - forever?
> >>
> >> 2. The systemd proponents pushed to make systemd the *new* default - a
> >> massively major change from *all* previous releases since... forever?
> >>
> >> 3. A bug was opened to allow for the ability to allow a clean install to
> >> be performed with systemd on wheezy, while sysvinit was still the default.
> >>
> >> It should have been made mandatory that the systemd folks get this bug
> >> fully resolved and functional *on wheezy*, *and* commit to maintaining
> >> this ability in jessie, as a pre-condition to even getting the question
> >> of a change of the default init system for jessie on the ballot.
> >>
> >> Anything else, as I said, makes no sense.
> >
> > To explain to the systemd advocates who refuse to understand the
> > engineering questions, this is the real engineering mistake in
> > systemd.
> >
> > The engineering question keeps getting sidetracked by people who
> > assert that we are talking about technical details, and then proceed
> > to question (foolishly) the necessity of modularity, or (rightly) the
> > meaning of modularity, etc. That all was and is still relevant, but if
> > proper engineering principles had been followed in bringing systemd
> > in, the open development practices our larger community claims as its
> > reason for existence would have taken care of the technical details.
> >
> > Maybe it would help if I said, "engineering management", instead of
> > just "engineering", although you really can't separate management from
> > engineering.
> 
> This person says that I have misrepresented the Fedora community's
> reaction in my description of events.

And you still do. Proofreading and giving links is not so hard, but it
is much harder if it means discovering that your ideas may rest on wrong
premises.
 
> This is not an attempt to be a linear history of systemd adoption in
> Fedora. It is simply intended as a few of my observations there when I
> was a user, and from here in the two years since I left.
> 
> > It was clear much longer than four years ago how deeply the changes
> > would affect the infrastructure which defines the system, and on which
> > the stability of the system depends. Every daemon package would be
> > affected, even if the systemd project had restrained themselves to
> > working only on the init part of the infrastructure. Every daemon
> > package needed to be fixed to the new interface, and tested under it.
> > (Many still need that.)
> 
> This is not disparaging, it is acknowledging reality. If I were to
> develop an alternative init, add full daemon/service management, tie
> it to device management, login management, error logging, etc., the
> result would impose the same level of re-implementation and testing
> burden across the OS.
> 
> I wouldn't do it that way, of course, but that's the level of
> engineering cost the approach they take incurred.

You say that every daemon needs to be fixed for the new interface. But
then either things are broken, and you should be able to show bug reports
(from Mageia, from Arch, from openSUSE), or they are not broken, and
you have nothing to show.

It was an explicit goal of systemd to keep supporting regular init
scripts, and there is no flood of Debian bug reports saying that this is
not working.

 
> > They didn't, of course they didn't,
> 
> ... restrain themselves, that is.
> 
> > they've admitted many times that
> > the init system was not their ultimate target.
> 
> Links to Poettering's blog have been posted. It's hard to assume that
> he was intending to speak in the absurd, or that he was
> misrepresenting the goals of the project he leads.
> 
> > Therefore, every package that uses or provides authentication got
> > entangled in the changes and needed both careful editing and extensive
> > testing. The testing is still to be completed, because we are not
> > talking about context-free grammar simplicity here in any of the
> > parts.
> 
> I know that the systemd proponents want to claim that testing is
> almost finished,

Give links, because no software is ever finished, so no testing can be
complete unless you stop changing the software.

Since you are already clarifying this email, I will assume that this may
also be a part that requires clarification.

> but, hey, we all know how it is when we tell them
> that the project is 90% complete. It's 90% of what we can see, and
> more than half the time we aren't seeing anything close to the real
> extent of what remains. Top-down was supposed to fix that, objects
> were supposed to fix that, declarative programming was supposed to fix
> that, but programming projects tend to be like cave systems. The more
> we get done, the deeper we dig, the more we discover has to be done
> before we are finished.
> 
> This is one of the very reasons for the existence of open source
> software, that we can decide, when it is our own project, this is
> where we stop for now. But just because we stop doesn't mean we are
> finished.
> 
> I know the systemd proponents really want the job to be mostly
> complete, and most of what they see is mostly complete. It's what they
> don't see that is the problem, and I and many others here think we see
> a lot more than they are admitting to seeing.
> 
> > Then every tool, package, application, etc., that used the
> > system-supplied copy/paste buffer got entangled, and, while they were
> > at it, they decided to try to absorb pretty much the entirety of
> > inter-process communication.
> >
> > Careful re-write, extensive testing. The testing won't be complete yet
> > by the very nature of where they are changing things.
> 
> This is the reason. Defining an API for things that are already being
> done touches basically everything.
> 
> > This all would have been okay for them, if they had followed proper
> > engineering (management) principles. As long as they were an
> > independent maverick, they could do what they want. That was correct,
> > that was good.
> 
> I want to repeat that. As long as they kept their work out of the
> mainstream, it was no problem. 

Your definition of mainstream is strange. So far, I haven't seen systemd
running on anything other than Linux, and GNU/Linux is not mainstream
(Android is, but systemd is not part of Android).

So they did keep it out of the mainstream, unless you define mainstream
as "being used by users", in which case I would love to see how you get
user feedback without having users in the first place.

> They could refine their API as they
> went and the repercussions were limited to their own source tree. That
> means they could redefine the API as necessary without interfering
> with the day-to-day operations of thousands, or even hundreds of
> thousands of users.
> 
> The more users you have, the harder it is to fix an API error.

Yeah, and that's why there is a table:
http://www.freedesktop.org/wiki/Software/systemd/InterfacePortabilityAndStabilityChart/

Now, the Linux kernel does not have such a table, which discourages
anyone from writing an out-of-tree module, despite requests. This didn't
prevent such modules from being written (like the out-of-tree nvidia
driver), and it didn't prevent people from trying, or even from creating
companies around their patches (OpenVZ and grsec, for example).

Nor does it prevent companies like Samsung or Google from forking,
even after their patches got rejected.

If we really cared about ABI in the Linux world, we would all follow the
LSB. In practice, no one seems to care, so your concerns seem a bit
disingenuous.

> > For Fedora, where it was first brought into a major distribution, the
> > proper way to bring it in would have been to break policy and set up a
> > parallel fork.
> 
> They did not make a parallel fork. Policy was against such behavior
> from a long time ago. They sort-of considered it, but it was against
> policy, and the systemd project people were optimistic.

This is misleading.

Nothing prevents anyone from forking Fedora; the only requirement is
that you have to give it a different name to avoid confusion. Like
Mozilla, like Debian, like GNOME.

And there was one release (Fedora 14) where Fedora offered systemd as an
option in order to get enough testing, and it then became the default in
F15. See the write-up on this from LWN: http://lwn.net/Articles/401856/

Now, no one cared about maintaining alternatives, so that didn't happen.
And given the high number of people not doing anything in this thread,
the same will indeed happen here, unless people start to contribute
(instead of asking others to do stuff).

> Optimism is not always a bad thing, of course. But it would have been
> better to ...
> 
> > Keep the damage that necessarily occurs with this kind
> > of thing restricted to a sub-community willing _and_ _able_ to deal
> > with the damage by cooperating in the separate bug tracking, triage,
> > etc. Keep the questions of direction somewhat independent so that the
> > systemd side and the "legacy" side don't have to be in lock-step on
> > every tiny detail. Allow separate of source so that regressions and
> > merges can be safely scheduled and safely carried out. Etc.
> 
> They did not do that. It was assumed to be "too expensive" and against
> policy, anyway.

Again, show the said policy, since that's a rather strong word
(and different from a decision by FESCo or whoever decides
on such questions).

And if that's not too expensive, why is no one doing it for Debian
anyway? If the people who do the work think this is too expensive and
the people who do not do the work think it is not, I tend to trust the
ones who do the work, because they know, not the ones rambling without
doing.

As Linus Torvalds said, "Talk is cheap. Show me the code."

> My assertion is that, by the time this story ends, the costs of this
> approach will have well outweighed the costs of the parallel fork that
> they rejected.

No, because maintaining two init systems wouldn't remove any of the work
of maintaining one. You can only add work.

Now, if you want to say that spending less time on systemd and more time
on sysvinit would have been better than focusing on systemd alone, say it
clearly. In the end, you would get a different result, so that's hardly
comparable.

In fact, the only comparable outcome would be :
- systemd is dropped and we go back to sysvinit
- systemd is dropped and we go on a 3rd system

The first outcome is not going to happen IMHO. No distribution has gone
back, AFAIK. For the second outcome, keeping sysvinit would only have
helped if moving away from sysvinit were easier than moving away from
systemd, and I do not think that would be the case.

So the choice to drop sysvinit would have mattered only in the first
outcome, and I think the consensus was that it wouldn't happen. The
choice between reducing resources on systemd and keeping two init systems
was therefore more "do we want two half-baked solutions, or one complete
one". It was decided to have one thing that works rather than two without
the resources to make them work. And since no one stepped up to do any
work later, this seems to have been the right decision.
 
> > If they had done it right from the start, just about now, they would
> > be ready for beginning the integration process in earnest,
> 
> What I'm saying here is that they were about four years too early in
> starting the integration process.
>
> > which would
> > mean that about the beginning of this year, when the question came
> > formally before the committee here, Fedora would have been
> > implementing their own version of an installer that would allow
> > choosing the new init system on install.
> 
> One reason is that they refused to even consider making room for any
> alternative init.
> 
> Up until systemd, there was no API for inits. Swapping them in and out
> was not that hard because the problems were mostly not in package
> dependencies.

Upstart has a D-Bus API. http://upstart.ubuntu.com/wiki/DBusInterface

Now, you do not seem to be perfectly clear about what you call an "API",
and I suspect you are speaking about the readiness protocol.
Upstart also uses a SIGSTOP protocol to signal readiness, as discussed at
great length by the tech-ctte. A summary can be found at
http://spootnik.org/entries/2014/11/09_pid-tracking-in-modern-init-systems.html
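
To make the SIGSTOP readiness idea concrete, here is a minimal sketch
in Python. It is not Upstart's code: the function names are made up for
illustration, and the "supervisor" is just a parent process standing in
for an init that waits for the stop (what Upstart, as far as I know,
selects with its "expect stop" stanza).

```python
import os
import signal


def child_daemon():
    # ... real daemon setup (sockets, config) would happen here ...
    # Announce readiness by stopping ourselves; the supervisor sees the stop.
    os.kill(os.getpid(), signal.SIGSTOP)
    # Execution resumes here once the supervisor sends SIGCONT.
    os._exit(0)


def supervise():
    """Fork a child, wait for its SIGSTOP, resume it, return (ready, exit_code)."""
    pid = os.fork()
    if pid == 0:
        child_daemon()
    # WUNTRACED makes waitpid() return when the child stops, not only on exit.
    _, status = os.waitpid(pid, os.WUNTRACED)
    if not os.WIFSTOPPED(status):
        # The child exited without ever signalling readiness.
        return False, os.WEXITSTATUS(status)
    os.kill(pid, signal.SIGCONT)
    _, status = os.waitpid(pid, 0)
    return True, os.WEXITSTATUS(status)
```

The point is that the daemon side is a single kill() call on itself, so
this "protocol" also reaches into upstream source, just very lightly.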

> Sure, modifying the init scripts for the daemons you had
> running was a bit of a pain, and a bit of lost and repeated work, but
> you didn't really have to touch the upstream source in most cases.
> 
> Well, except for the init scripts themselves.

And bugs. Like when upstream didn't really go into the background
correctly, forgetting a chdir("/"), or when upstream didn't relinquish
privileges properly because they forgot initgroups(). Of course, this
integration has been done and redone and pushed upstream, so it was
almost transparent to most users. That doesn't mean it didn't exist.
And of course, let's not speak about porting the software to the BSDs,
which sometimes use different primitives (kqueue vs epoll).
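
For readers who have not written this kind of code, a rough sketch of
the steps in question, in Python. The function name and the "ops"
parameter are my own, introduced only so the order of calls can be
checked without being root; chdir, initgroups, setgid and setuid are the
standard calls exposed by Python's os module.

```python
import os


def drop_privileges(username, uid, gid, ops=os):
    """Give up root, in an order where each later step is still permitted."""
    # Leave the directory the daemon was started from, so it does not
    # keep a mount point busy for its whole lifetime.
    ops.chdir("/")
    # Replace root's supplementary groups with the target user's groups.
    # Forgetting this leaves the process in root's groups after setuid().
    ops.initgroups(username, gid)
    # Change the group first: after setuid() we may no longer be allowed to.
    ops.setgid(gid)
    ops.setuid(uid)
```

In real use you would look the uid and gid up (for example with
pwd.getpwnam) and call this as root with the default ops=os. Each of
these lines is easy to forget, which is exactly the kind of bug I mean.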

 
> With systemd there is now an API. That goes deeper into upstream
> source, and that's why systemd had to push so hard in the community,
> to convince all the upstream authors to make the changes.

You can work perfectly well without any API. Most daemons do not
integrate with systemd and they still start, so that's not a problem.
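
To illustrate how optional the integration is, here is a small sketch of
readiness notification in Python. It assumes the documented sd_notify
convention (a datagram "READY=1" sent to the socket named in the
NOTIFY_SOCKET environment variable); the function name notify_ready is
made up, and this is not the libsystemd implementation.

```python
import os
import socket


def notify_ready():
    """Send READY=1 if a supervisor asked for it; otherwise do nothing."""
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        # Started by sysvinit, by hand, or by anything else: nothing to do.
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket.
        addr = b"\0" + path[1:].encode()
    else:
        addr = path
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(b"READY=1", addr)
    return True
```

A daemon calling this starts the same way under any init: when the
variable is not set, the call is a no-op.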

> I think that was a tactical error and an engineering (management)
> error, the reasons why are beginning to become apparent. Users now,
> upstream developers shortly.
> 
> systemd proponents will disagree with me.

Sure, because you are technically incorrect. If you want to convince
people, it would be much better to be correct; otherwise, people
will just continue to think that the anti-systemd crowd is clueless.
 
> We'll see, but we already see a tendency to define the hard problems
> as "irrelevant", "too small an audience", "they're just a bunch of
> idealist luddites and flat-earthers", ...
> 
> http://free-is-not-free.blogspot.jp/2013/06/the-world-is-flat.html
> 
> ... and "if you aren't giving us bug reports, it's your fault!"

It is. That's the way Debian and any free software works.
People are already offering you a complete operating system, and you do
not even contribute any useful bug reports to the project?

If you really cared about sysvinit integration, you would test it
instead of rambling, especially when this has been explained several
times as being the way to go. If you have time to write long emails, you
have time to file bug reports (even if you do not seem to have time to
get your facts right).

If you do nothing, then you cannot complain that nothing
happens.
 
> It's my fault, even if I don't approve of the design, the API, the
> implementation, or the project management.
> 
> The systemd project is way too ambitious.
> 
> Maybe I'm just gun-shy, but I've watched a few death-march projects
> start with just this kind of over-reach. If you don't know what a
> death-march project is, there is currently a description on wikipedia:
> 
> http://en.wikipedia.org/wiki/Death_march_%28project_management%29

The difference is that:
- you are not a project member (neither of systemd nor of Debian),
so the definition "the members feel it is destined to fail" doesn't apply
- after 4 years, this has produced something (ask the people at CoreOS
how they feel about systemd, having built their whole OS around it and
having received funding for that, or ask the people at Pantheon, who run
their hosting on it, or the people at Jolla, who use it for their
phone, or the people at Intel, who pay coders to work on it)

So that's not really a death march project, unless you deny reality.

> > The systemd folks were too impatient for whatever reason. They pointed
> > out that Linux itself was not done that way, but their version of
> > history is most politely described as colored by their desires for
> > quick success for their project.
> >
> > "Throw it against the wall and see what sticks!" engineering is only
> > appropriate for maverick projects.
> 
> I assume, from what some have said off-list, that there are some
> systemd proponents who don't want to believe that systemd was
> developed this way. Sorry, but I've watched the code change, the APIs
> change, the design expand as things seem to be working. I'm calling it
> as I see it.

That's a free software project, which means release early, release often.
Of course, for someone coming from a proprietary background, or not
knowing how things are done in free software, things always look in flux.

But in free software, where loosely coupled groups all evolve in the
open, this is the way it works. There is no big architect holding
meetings and no top-down management saying "do that". There are
independent groups discussing and making changes where all can see.

So you see changes, and that's because of transparency, discussion and
feedback. You seem to paint all of that as a bad thing, but that's how
things evolve.

The same goes on in proprietary companies, except that you do not see it.
But that doesn't mean it doesn't happen.
 
> > (And it is very appropriate for
> > maverick projects.) Fedora may be testing for Red Hat, but it is still
> > mainstream in terms of the number of users and the broad spectrum of
> > the user base.
> >
> > So Fedora is not, itself, really ready yet, except for two groups, a
> > certain group of workstation users who want and are willing to use
> > fairly new, relatively high-end hardware, including enough RAM and
> > processors to use VMs for certain things, and a certain group of
> > server-farm users who want and have budget for similarly recent,
> > relatively high-end hardware and lots of RAM and processors for lots
> > of VMs.
> 
> This part seems to be causing serious angst.
> 
> I am not saying that it doesn't work for anyone not in these two
> groups, anymore than I am saying that anyone in these two groups is
> now safe. The odds improve significantly if you are in the groups that
> the systemd project has been giving attention to, currently.
> 
> The problem is that, as the API changes, they have to revisit every
> case they have visited. Their regression testing infrastructure has
> been good for it so far, is the most optimistic thing I can say.
> 
> > The rest of the Fedora users jumped ship.
> 
> I guess that sounds more dramatic than is comfortable for some people.
> 
> I don't know how to test my measurements, and I won't claim more than
> ten percent jumped ship, but I was not alone.

Well, Fedora estimates its users at around a million. Unless you claim
that 1) you spoke with people and they said "I moved because of systemd"
and 2) the number of such people was more than 100 000,
your estimate of 10% is just a fantasy.

There could be any number of reasons for people to leave or to come,
and given how the measurement is done, any change at the network level
(like having a local cache or a private mirror for a set of
10 000 workstations) could have a huge impact on the numbers.

So since you can hardly have surveyed that many users, I think you
should stick to factual statements and not pull numbers out of nowhere.

> > Now, you who complain that Fedora and Red Hat are off-topic here,
> > remember that Debian is inheriting the results of Red Hat's work. Work
> > that did not allow a choice of inits on install, as one example of
> > where their work is incomplete. That choice was something we still
> > haven't got quite right yet, after how many months?
> 
> If systemd had its own time-table and so forth, the integration
> issues would be mostly matters of diffs and some macro-preprocessing.
> Instead, it bites into the dependency tree, and we are struggling with
> getting a natural install that allows other inits.
> 
> Even if we didn't want to support init choice, Jessie has no way to
> test certain paths of regression. That's bad engineering management,
> because Wheezy is a different system.

Again, if no one does the work, no work appears. Writing essays about it
does not make work appear, unless the essay explains why you need people
to do the work.
 
> > Debian set up kfreebsd to deal with these kinds of issues, relative to
> > replacing the linux kernel with the freebsd kernel. Setting up a
> > debian-sysd would not have been as extensive a project as setting up
> > kfreebsd, but would have been similar, because we are basically
> > pulling in a new layer between the kernel and the rest of the system.
> >
> > The systemd folks claimed it wouldn't be necessary. If we had looked
> > at the situation with an unbiased eye, we would have known they were
> > being overly optimistic. We still turn a blind eye to the problems,
> > claiming that the only problems are a bunch of recalcitrant
> > noisemakers like yours-truly.
> >
> >> It is *the systemd proponents* that wanted this change, so it should be
> >> *on them* to do the work. Period.
> 
> I apologize to anyone who feels cheated for having that "wall of text"
> expanded and re-posted, but if I am misunderstood, I wanted to try
> once more to clear things up.

I guess you will likely need a third try, because being factually
incorrect does not help anyone understand what you mean. For example,
you didn't give any links, and you use terms like "API" without saying
what exactly you mean by them (it seems to be "protocol", from what
I understand and from what would make sense after reading twice).
 
> I may be too pessimistic. If so, I'm still not going to apologize,
> because the next time someone comes up with some similar "great idea",
> I'd like the community to be at least a little more conversant in the
> reasons for separating the "great idea" from the mainstream project.

If by community you mean "the bystanders doing nothing", they are
not going to be listened to any more after your text than before, because
the social dynamics of free software, which you do not seem to care
about, will not have changed. Free software, for good or bad, is
still shaped by the people who do the work. Those who do not do the
work have limited power, and that will remain so as long as the
community stays free of other influences.

It is in fact quite ironic that old-school commercial entities listen
more to their users, because they depend on the users' money to survive,
which in turn empowers users a lot more than non-commercial entities do.
Those give power to the people devoting time to them, which leads to
different dynamics.

-- 
l.

