
Re: default init on non-Linux platforms



Dear Adrian,

John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> writes:

> On 02/21/2014 01:00 PM, heroxbd@gentoo.org wrote:
>>> So, OpenRC actually also relies on files - like System V Init - to
>>> track the state of a service? Isn't that approach somewhat unreliable
>>> and hacky?
>> 
>> I bet you are going to tell me the only reliable and non-hacky way to
>> track the state of a service is not forking/writing to files but
>> starting it foreground by a long-living daemon. I agree with you.
>
> Well, I was thinking about something like CGroups. I don't like the
> idea of having to rely on files for an init system to be able to
> track the processes it has started.
>
> I agree and understand that this was the way to go back in the old
> days, but we shouldn't be using that on current setups.
>
> At my department, we stumbled right over this design when the /var
> filesystem was full and System V Init could no longer create PID
> files to be able to track services, yet it started services without
> complaining.
>
> Since we had to restart our dhcpd several times on a particular day,
> System V Init was unable to track whether the daemon was already
> running and we ended up with several dozen instances.
>
> Sure, it's probably a bug in the init script used as it didn't
> check for enough disk space and wasn't failing when the disk is full.
> But again, this is a core component depending on external scripts
> being bug free which is not the correct approach when you are
> aiming for something robust.

Thank you for sharing your real-life story with us. I can resonate
with it: having a dhcpd malfunctioning and hundreds of people
complaining about the resulting unstable network is no fun at all.

How to cope with this is a matter of personal taste. You might
think a robust framework will make it fool-proof, while I might think
that running a central DHCP server alongside something else that could
fill up /var is questionable in itself. I appreciate both views.

>> OpenRC leverages cgroups when available. We are also working on a plugin
>> framework for external supervisors such as djbtools, runit and s6 (maybe
>> launchd, smf, systemd, ... as well if they're hackable) to do this
>> foreground status tracking while integrated with OpenRC: We are not
>> there yet though.
>
> Can external supervisors like djbtools or runit actually reliably track
> processes and if, yes, how? From my understanding, it's impossible
> to be able to really track a process once it has started when
> you don't have the possibility to use something like CGroups as
> services could always just double-fork. The tracking has to be
> done through a mechanism provided by the kernel, doesn't it?

I meant "foreground", the opposite of double-fork, which has been
discussed on the list, for example:

    http://article.gmane.org/gmane.linux.debian.devel.general/152538

This does not require a special tracking mechanism in the kernel.

Double-fork is not a reliable way to track processes. People invented it
as a hack long ago (in the BSD community, if I remember right).
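
To make "foreground" concrete, here is a minimal sketch of the idea, not
the code of runit, s6 or OpenRC. The supervisor starts the service as its
own direct child and waits for it, so it always knows the PID and learns
of the exit from the kernel, with no PID file involved. The name
supervise() and the restart policy (give up after max_restarts, stop on a
clean exit) are assumptions made only for illustration.

```python
import subprocess
import sys


def supervise(argv, max_restarts=3):
    """Run argv as a foreground child, restarting it when it fails.

    Returns the list of exit codes observed, one per run.
    """
    exit_codes = []
    for _ in range(max_restarts):
        # The service does not daemonize, so it stays our direct child
        # and child.pid is always the real PID of the service.
        child = subprocess.Popen(argv)
        # wait() blocks until the kernel reports the child's exit to us.
        code = child.wait()
        exit_codes.append(code)
        if code == 0:
            break  # hypothetical policy: a clean exit is not restarted
    return exit_codes


if __name__ == "__main__":
    # A stand-in "service" that fails immediately with status 3.
    failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(supervise(failing, max_restarts=2))
```

This only works as long as the service itself stays in the foreground; if
it double-forks, the supervisor sees the first process exit and loses the
real daemon again.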

> And grepping through the output of "ps" or similar is not what
> I would consider reliable and robust either.

Nod. Grepping `ps` is what we should avoid at all costs.

>> These advanced features are optional. We can still use the unreliable
>> and hacky way of tracking files when the task is trivial, like a
>> personal VPS or laptop which do not care much about running sshd/httpd
>> for 3 years non-stop.
>
> Sure, I fully agree. But there are actually many enterprises who
> need something with 99% service availability. Our department
> runs a webserver, a login node for 1200 users and a large compute
> clusters with over 200 nodes and an SGI UV1000 (1024 CPUs, 2 TiB),
> so we need something which is able to control resources and track
> processes. Many enterprises and websites run Debian.

Yes. Though I am a casual user, I have actually used systemd and monit to
supervise httpd on one of our mirror servers, and I myself am even using
a computing cluster running RHEL5 (for stability and paid customer
support). So I largely share your view. Different people in
different situations have different needs: using bad old PID-file
tracking, or no tracking at all, is like wearing jeans at home, or even
going naked. It happily coexists with wearing a suit to give a
public speech.

--- super light-hearted, just for fun, don't take it seriously ---
modern activists: c'mon, just use us, or you'll not be supported.
(c'mon, wear suits, or you're out)
old nerds: fine, we will support ourselves.
(fine, we will find somewhere comfortable to be naked)
--- end ---

Hope this explains why I am devoting myself to something alternative,
even something counter-advertised as suboptimal. Let's coexist and have
fun. This universe is ultimately a friendly place to live in after all.


Coming back to our starting point: a service relying on file tracking is
hackish, and for serious business it should be avoided in
preference to a better supervising solution where one is available. It is
still fun and useful, though, because not all computers are for serious
business.
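
For comparison, a PID-file check could look roughly like the sketch below.
It is an illustration only, not the code of System V Init or OpenRC, and
the function name is hypothetical. It shows the two weak spots discussed
above: if the file could not be written (for example because /var was
full), a running daemon is reported as stopped; and if the recorded PID
has since been reused by another process, a dead daemon is reported as
running.

```python
import os
import tempfile


def pidfile_says_running(path):
    """Return True if the PID file names a process that currently exists."""
    try:
        with open(path) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        # Missing, unreadable or empty file: the real state is unknown,
        # but a naive init script treats it as "not running".
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks that the PID exists
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the PID exists but belongs to another user
    # The PID exists, but nothing proves it is still our daemon.
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        pidfile = os.path.join(d, "demo.pid")
        print(pidfile_says_running(pidfile))  # no file written yet
        with open(pidfile, "w") as f:
            f.write(str(os.getpid()))
        print(pidfile_says_running(pidfile))  # file names this process
```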

Cheers,
Benda

