
Bug#677424: marked as done (lsb-base: status_of_proc returns 4 (unknown) when pid file is specified and does not exist)



Your message dated Mon, 18 Jun 2012 22:47:44 +0200
with message-id <201206182247.44581.odyx@debian.org>
and subject line Re: Bug#677424: lsb-base: status_of_proc returns 4 (unknown) when pid file is specified and does not exist
has caused the Debian Bug report #677424,
regarding lsb-base: status_of_proc returns 4 (unknown) when pid file is specified and does not exist
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
677424: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=677424
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: lsb-base
Version: 3.2-23.2squeeze1
Severity: normal
Tags: patch


The specific problem I'm experiencing is with /etc/init.d/portmap, which
returns 4 when portmap isn't running. I can't find sufficient documentation on
correct behavior to be sure if init-functions is incorrect, or if portmap is
using it incorrectly, but I think it's the former.

This actually causes serious problems at least in managing portmap with
pacemaker, which requires init scripts to comply strictly with the LSB
specification [2]. Pacemaker will call the "status" action periodically to
monitor the service, and if the response is "unknown", the monitor action is
considered to have failed, which might get the node ejected from the cluster,
or at least prevent other things from running as they should.
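
For illustration, here is a minimal sketch of how a caller might interpret
the exit status of the "status" action, using the status codes listed in the
LSB init script specification. The function name describe_lsb_status is
hypothetical, and this is not Pacemaker's actual code:

```shell
# Hypothetical helper: map an LSB "status" exit code to a description.
# Codes 0-4 are the ones defined for the "status" action by the LSB
# specification; other values are reserved or distribution/application
# specific.
describe_lsb_status() {
    case "$1" in
        0) echo "running" ;;
        1) echo "dead, but pid file exists" ;;
        2) echo "dead, but lock file exists" ;;
        3) echo "not running" ;;
        4) echo "unknown" ;;
        *) echo "reserved or non-LSB code" ;;
    esac
}
```

With the behavior described above, running "/etc/init.d/portmap status"
followed by "describe_lsb_status $?" would report "unknown" for a stopped
portmap, where "not running" would be expected.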

I think the crux of the issue is the implementation of pidofproc. The LSB
specification [1] says about pidofproc:

    "If the -p pidfile option is specified and the named pidfile does not
    exist, the functions shall assume that the daemon is not running."

At the end of pidofproc is this:

    if [ -x /bin/pidof -a ! "$specified" ]; then
        status="0"
        /bin/pidof -o %PPID -x $1 || status="$?"
        if [ "$status" = 1 ]; then
            return 3 # program is not running
        fi
        return 0
    fi
    return 4 # Unable to determine status

The way I read this, pidofproc can't return 3 if a pidfile is specified, which
I think is wrong according to the LSB specification. As I read the spec [1], if
the pidfile doesn't exist, and it was explicitly specified, then pidofproc
should return 3. No process table grepping or anything else allowed. In that
spirit, I propose this patch, which at least solves my problem:

--- init-functions.orig 2012-06-13 16:55:02.000000000 -0400
+++ init-functions      2012-06-13 17:02:58.000000000 -0400
@@ -77,6 +77,9 @@
         pidfile="/var/run/$base.pid"
     fi

+    if [ "$specified" -a -n "${pidfile:-}" -a ! -e "$pidfile" ]; then
+        return 3 # explicitly specified pidfile does not exist; must assume not running
+    fi
     if [ -n "${pidfile:-}" -a -r "$pidfile" ]; then
         read pid < "$pidfile"
         if [ -n "${pid:-}" ]; then
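
To make the intent of the added check easier to try in isolation, here is a
stand-alone sketch of the same test. The function name
check_specified_pidfile and its argument convention are hypothetical; in
init-functions the check sits inline in pidofproc:

```shell
# Hypothetical stand-alone version of the check added by the patch.
# $1: non-empty if the caller passed -p (the "specified" flag)
# $2: pid file path
# Returns 3 when an explicitly specified pid file does not exist,
# otherwise 0, meaning "carry on with the remaining checks".
check_specified_pidfile() {
    if [ -n "$1" ] && [ -n "$2" ] && [ ! -e "$2" ]; then
        return 3    # specified pid file is missing: assume not running
    fi
    return 0
}
```

The sketch uses separate [ ] tests joined with && instead of -a, which is
only a style choice here and does not change the outcome of the check.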

-- System Information:
Debian Release: 6.0.5
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.32-5-amd64 (SMP w/16 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages lsb-base depends on:
ii  ncurses-bin               5.7+20100313-5 terminal-related programs and man 
ii  sed                       4.2.1-7        The GNU sed stream editor

lsb-base recommends no packages.

lsb-base suggests no packages.

-- no debconf information

[1] http://refspecs.linuxbase.org/LSB_3.1.1/LSB-Core-generic/LSB-Core-generic/iniscrptfunc.html
[2] http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ap-lsb.html




--- End Message ---
--- Begin Message ---
Version: 3.2-25

Hi Phil, and thanks for your feedback,

On Monday, 18 June 2012 19.26:00, Phil Frost wrote:
> On 06/18/2012 12:21 PM, Didier 'OdyX' Raboud wrote:
> > You reported this bug against the Debian stable release, and lsb-base
> > has seen many updates since then. I suspect that your bug has been
> > fixed by the resolution of bug #597628 in lsb 3.2-25. Can you verify
> > whether any of the following solves your issue:
> > 
> > a) the attached init-functions-a (to be put as /lib/lsb/init-functions),
> > a file from stable + patches up to 3.2-25;
> > b) the attached init-functions-b (to be put as /lib/lsb/init-functions),
> > a file from stable + all patches concerning pidofproc in the current
> > unstable;
> > c) the lsb-base package from the current testing or unstable.
> 
> It looks like both init-functions-a and init-functions-b solve my issue.

Hereby marking this bug as fixed in the 3.2-25 version (scenario a above).

> I would say as a minor point that the change introduced in init-functions-a
> is correct in my particular case, but maybe not in others. For example, what
> if the pid file exists, but is not readable? The service could very well be
> running, but init-functions-a will return "not running", if I'm reading it
> correctly. I don't see anything in the specification that says what to do in
> this case, but "unknown" seems like a better answer than "not running" to 
> me. I suppose a case could also be made to attempt a guess as if no pidfile
> were available, though personally I'd regard that as DWIM and avoid it.
> 
> The deeply nested logic in init-functions-b is hard to read and
> understand, but seems to partially address this. As I read it, it will
> return "unknown" if the pid file exists but is not readable.

I admit init-functions-b is quite tough to read, but I think it does the right
thing; here it is rewritten as pseudo-code:

if pidfile name is non-empty        # if [ -n "${pidfile:-}" ]; then
  if pidfile exists                 # if [ -e "$pidfile" ]; then
    if pidfile is readable          # if [ -r "$pidfile" ]; then
      DO: read it and return 0 or 1 depending on the state of its content
    else
      DO: return 4 as pidfile name is non-empty, exists but is not readable,
          hence status is unknown.
    fi
  else
    # pidfile doesn't exist, try to find the pid nevertheless using pidof.
    # If impossible to do, return 3 as without pidfile it's safe to assume
    # it's probably stopped.
  fi
fi
return 4 as we were unable to determine status
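
As a rough illustration, the pseudo-code above could be sketched in POSIX
shell as follows. This is a simplified sketch, not the actual
init-functions-b code: the function name pidfile_status is hypothetical, the
pidof fallback is left out, and liveness is checked with kill -0 only.

```shell
# Hypothetical, simplified sketch of the decision tree; not the real
# lsb-base implementation.
# $1: pid file path. The exit status follows the LSB status codes.
pidfile_status() {
    pidfile="$1"
    if [ -n "$pidfile" ]; then
        if [ -e "$pidfile" ]; then
            if [ -r "$pidfile" ]; then
                pid=""
                read pid < "$pidfile" || true
                if [ -n "$pid" ]; then
                    # Note: kill -0 also fails for a live process owned
                    # by another user; this sketch ignores that case.
                    if kill -0 "$pid" 2>/dev/null; then
                        return 0    # process is alive
                    fi
                    return 1        # dead, but pid file exists
                fi
            else
                return 4            # exists but unreadable: unknown
            fi
        else
            # Simplification: the pidof fallback is omitted here.
            return 3                # no pid file: assume not running
        fi
    fi
    return 4                        # unable to determine status
}
```

For example, "pidfile_status /var/run/portmap.pid; echo $?" would print 3 in
this sketch when that file does not exist.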

> However, I wonder under what conditions the last conditional could be true:
> 
> if [ "$specified" ]; then
> return 3 # almost certain it's not running
> fi

Indeed, it's probably never-used code, but I'm not fluent enough in shell
scripting to be bold enough to remove it.

> I think the answer is only "weird edge cases". Like, if $pidfile='', or
> if the pidfile didn't contain a PID. Under these conditions, pidofproc
> returns "unknown" if the pidfile is unspecified; I don't know if
> explicitly specifying an invalid pidfile makes things any less unknown.
> From my perspective (cluster management) "unknown" is the only correct
> answer here, since in either case something is seriously wrong, and the
> right thing to do is fence the troublesome node. Of course, if you
> pretend nothing is wrong, then the cluster manager can't do that.

Yeah, probably that.

> The other possibility (saying the service isn't running when actually
> you aren't sure) leads to a situation where two mutually
> exclusive resources could be started by the cluster manager. For
> example, the same filesystem on a SAN could be mounted twice, or two
> DRBD nodes could be made primary.
> 
> Again, these are minor points. Since I don't actually use LSB init
> scripts for mounting my filesystems or promoting my DRBD nodes, the
> horrible data corruption scenarios I give can't actually exist. Probably
> this is true of most clusters in practice, but still, the possibility
> exists that someone will be surprised by this overly-optimistic behavior
> someday. If you wanted to close this bug I wouldn't object, and I'll
> open another one when I can think of a real-world use case.

That's done, but with version-tracking: the bug is still marked as open in the 
current stable release but marked as done from version 3.2-25 on.

Feel free to add more input either way, but I think that from lsb-base's point
of view, what could be done has been done.

Cheers,

OdyX


--- End Message ---
