Re: watchdog
Thanks for the reply.
> Why? Sorry, I'm not sure I actually understand what you're saying.
> wd_keepalive is started to still have basic watchdog functionality without
> the additional checks performed by the watchdog daemon.
Does it actually perform any checks? What I got from the documentation is
that it only writes to /dev/watchdog periodically, regardless of what
happens. Thus "basic watchdog functionality" would only mean checking that
the userspace process is still running.
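
To make concrete what I mean by "only writes periodically", here is a minimal
sketch of such a loop. It is not the actual wd_keepalive code; the function
name, its parameters and the bounded iteration count are my own assumptions
for illustration. It relies only on the documented behaviour of the Linux
watchdog device that any write counts as a ping.

```python
import time

def keepalive(device_path, interval_s, iterations):
    """Ping the watchdog device unconditionally; return the number of writes.

    Hypothetical helper: a real keepalive daemon would loop forever on
    /dev/watchdog. The iteration limit only keeps the sketch testable.
    """
    writes = 0
    # Unbuffered binary mode so that every write reaches the device at once.
    with open(device_path, "wb", buffering=0) as dev:
        for _ in range(iterations):
            dev.write(b"\0")  # no health checks, just the ping
            writes += 1
            time.sleep(interval_s)
    return writes
```

As long as this process gets scheduled, the hardware timer is reset, whatever
else is wrong with the system.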
> No, only if the kernel does not actually hang. In the case you talk about
> the kernel does not hang enough to not execute wd_keepalive anymore, so
> there is simply no way to figure out that the system needs a reset. If the
> kernel really hangs and stops working having started wd_keepalive
> guarantees a reboot if you have a hardware watchdog.
You are right. I did not actually mean that the kernel hangs but that there is
a deadlock like in the other bug report: the kernel waits for the NFS server
to reply, but the watchdog does not trigger because by that time the watchdog
daemon has already been stopped and wd_keepalive started. Therefore the event
that was monitored (timestamp of a periodically touched file) did not trigger
a reboot.
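
The kind of check that is lost once the daemon stops can be sketched roughly
as follows. This is only an illustration of a timestamp test, not the watchdog
daemon's implementation; the function name and the parameters are
hypothetical.

```python
import os
import time

def file_is_stale(path, max_age_s, now=None):
    """Return True if the file was last modified more than max_age_s ago.

    Hypothetical helper; 'now' can be passed in to make the check testable.
    """
    if now is None:
        now = time.time()
    return now - os.stat(path).st_mtime > max_age_s
```

A plain keepalive loop never evaluates anything like this, so a file that is
no longer being touched goes unnoticed.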
> watchdog has to be stopped before the server it monitors get stopped or else
> it would trigger some sort of action. wd_keepalive then is started to make
> sure the system itself stays under supervision.
That's what I assumed: prevent an accidental reboot in rc6 or rc0 (and of
course when watchdog is stopped by some other means).
Regards,
Bastian
- References:
- watchdog
- From: Bastian Blywis <blywis@zedat.fu-berlin.de>
- Re: watchdog
- From: Michael Meskes <meskes@debian.org>