
Bug#677424: lsb-base: status_of_proc returns 4 (unknown) when pid file is specified and does not exist



On 06/18/2012 12:21 PM, Didier 'OdyX' Raboud wrote:
As you reported this bug against the Debian stable release, lsb-base has seen
many updates since then and I suspect that your bug above has been fixed by
the resolution of bug #597628 in lsb 3.2-25. Can you verify that any of

a) the attached init-functions-a (to be put as /lib/lsb/init-functions) file
solves your bug (it's a file from stable + patches up to 3.2-25);
b) the attached init-functions-b (to be put as /lib/lsb/init-functions) file
solves your bug (it's a file from stable + all patches concerning pidofproc in
the current unstable);
c) the lsb-base package from the current testing or unstable do so;

... solves your issue.

It looks like both init-functions-a and init-functions-b solve my issue. I did not test the package from testing or unstable.

I would say as a minor point that the change introduced in init-functions-a is correct in my particular case, but maybe not in others. For example, what if the pid file exists, but is not readable? The service could very well be running, but init-functions-a will return "not running", if I'm reading it correctly. I don't see anything in the specification that says what to do in this case, but "unknown" seems like a better answer than "not running" to me. I suppose a case could also be made to attempt a guess as if no pidfile were available, though personally I'd regard that as DWIM and avoid it.

The deeply nested logic in init-functions-b is hard to read and understand, but seems to partially address this. As I read it, it will return "unknown" if the pid file exists but is not readable. However, I wonder under what conditions the last conditional could be true:

    if [ "$specified" ]; then
        return 3 # almost certain it's not running
    fi

I think the answer is only "weird edge cases": for example, if $pidfile is the empty string, or if the pidfile doesn't contain a PID. Under these conditions, pidofproc returns "unknown" if the pidfile is unspecified; I don't know that explicitly specifying an invalid pidfile makes things any less unknown. From my perspective (cluster management), "unknown" is the only correct answer here, since in either case something is seriously wrong, and the right thing to do is fence the troublesome node. Of course, if you pretend nothing is wrong, then the cluster manager can't do that.
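A minimal sketch of the conservative behaviour argued for above, using the LSB status codes (0 running, 1 dead but pid file exists, 3 not running, 4 unknown). The function name pidfile_status is hypothetical, and this is not the code from init-functions-a or init-functions-b; it only illustrates returning "unknown" whenever the pid file cannot be trusted.

```shell
#!/bin/sh
# Hypothetical helper, NOT the lsb-base implementation.
# Returns LSB-style status codes for an explicitly specified pid file:
#   0 running, 1 dead but pid file exists, 3 not running, 4 unknown.
pidfile_status() {
    pidfile=$1

    if [ -z "$pidfile" ]; then
        return 4    # empty pid file name: nothing to check against
    fi
    if [ ! -e "$pidfile" ]; then
        return 3    # pid file absent: service presumed not running
    fi
    if [ ! -r "$pidfile" ]; then
        return 4    # exists but unreadable: the service may well be running
    fi

    # Read the first line; an empty file leaves pid empty.
    read -r pid < "$pidfile" || true
    case $pid in
        ''|*[!0-9]*) return 4 ;;    # no usable PID in the file
    esac

    # kill -0 fails for processes owned by another user, so fall back to ps.
    if kill -0 "$pid" 2>/dev/null || ps -p "$pid" >/dev/null 2>&1; then
        return 0
    fi
    return 1        # pid file names a process that no longer exists
}
```

The only path that reports "not running" is the one where the pid file is genuinely absent; every case where the file exists but cannot be interpreted is reported as "unknown".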

The other possibility (saying the service isn't running when you actually aren't sure) leads to a situation where two mutually exclusive resources could be started by the cluster manager. For example, the same filesystem on a SAN could be mounted twice, or two DRBD nodes could be made primary.
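To illustrate why the distinction matters to a cluster manager, here is a rough sketch of how a status code might be turned into an action. The function decide_action and its action names are hypothetical and do not correspond to any particular cluster manager; treating every code other than 0 and 3 as grounds for fencing is a deliberately conservative assumption.

```shell
#!/bin/sh
# Hypothetical decision table: maps an LSB status code to a cluster action.
decide_action() {
    case $1 in
        0) echo "leave-running" ;;   # service is up on this node
        3) echo "start-allowed" ;;   # confirmed stopped: safe to start it elsewhere
        *) echo "fence" ;;           # unknown or inconsistent: isolate the node first
    esac
}
```

If status_of_proc reports 3 where 4 would be accurate, this table takes the "start-allowed" branch and the resource can end up active on two nodes at once.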

Again, these are minor points. Since I don't actually use LSB init scripts for mounting my filesystems or promoting my DRBD nodes, the horrible data corruption scenarios I describe can't actually happen on my systems. This is probably true of most clusters in practice, but the possibility remains that someone will be surprised by this overly optimistic behavior someday. If you wanted to close this bug I wouldn't object, and I'll open another one when I can think of a real-world use case.



