Re: watchdog

To: Debian Developers <debian-devel@lists.debian.org>
Subject: Re: watchdog
From: Bastian Blywis <blywis@zedat.fu-berlin.de>
Date: Thu, 14 Apr 2011 16:31:06 +0200
Message-id: <201104141631.06427.blywis@zedat.fu-berlin.de>
Reply-to: blywis@inf.fu-berlin.de
In-reply-to: <20110414140535.GA4970@feivel.credativ.lan>
References: <201104141447.17091.blywis@zedat.fu-berlin.de> <20110414140535.GA4970@feivel.credativ.lan>

Thanks for the reply.

> Why? Sorry, I'm not sure I actually understand what you're saying.
> wd_keepalive is started to still have basic watchdog functionality without
> the additional checks performed by the watchdog daemon.

Does it actually perform any checks? What I got from the documentation is
that it only writes to /dev/watchdog periodically, regardless of what
happens. Thus "basic watchdog functionality" would only mean checking that
the userspace process itself is still running.
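
As an illustration of that reading, here is a minimal sketch of a
keepalive-only loop. It is an assumption about the general behaviour, not
the actual wd_keepalive source; the function name keepalive and its
parameters are hypothetical.

```python
import time


def keepalive(device, interval, ticks, sleep=time.sleep):
    """Write one byte to `device` every `interval` seconds, `ticks` times.

    Hypothetical sketch: no health checks are made, so the write happens
    regardless of the state of the rest of the system.
    """
    written = 0
    for _ in range(ticks):
        device.write(b"\0")  # any write resets the hardware timer
        device.flush()
        written += 1
        sleep(interval)
    return written


# Intended use (do NOT run casually: opening /dev/watchdog arms the timer):
#   with open("/dev/watchdog", "wb", buffering=0) as dev:
#       keepalive(dev, interval=10, ticks=6)
```

As long as this process gets scheduled, the timer is reset, so the only
condition that leads to a reboot is that the loop itself stops running.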

> No, only if the kernel does not actually hang. In the case you talk about
> the kernel does not hang enough to not execute wd_keepalive anymore, so
> there is simply no way to figure out that the system needs a reset. If the
> kernel really hangs and stops working having started wd_keepalive
> guarantees a reboot if you have a hardware watchdog.

You are right. I did not actually mean that the kernel hangs but that there
is a deadlock like in the other bug report: the kernel waits for the NFS
server to reply, but the watchdog does not trigger because by that time the
watchdog daemon has already been stopped and wd_keepalive started. Therefore
the event that was monitored (the timestamp of a periodically touched file)
did not trigger a reboot.
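
The kind of check that is lost once the daemon is stopped can be sketched
as follows. This is only an illustration of a timestamp test, not the
watchdog daemon's implementation; file_is_stale, max_age and now are
hypothetical names.

```python
import os
import time


def file_is_stale(path, max_age, now=None):
    """Return True if `path` was last modified more than `max_age` seconds ago.

    Hypothetical helper: `now` can be passed in to make the check testable.
    """
    if now is None:
        now = time.time()
    return now - os.stat(path).st_mtime > max_age
```

A daemon running such a test would stop feeding /dev/watchdog when the file
goes stale. A pure keepalive process never evaluates it, so a stalled writer
goes unnoticed.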

> watchdog has to be stopped before the server it monitors get stopped or else
> it would trigger some sort of action. wd_keepalive then is started to make
> sure the system itself stays under supervision.

That's what I assumed: prevent an accidental reboot in rc6 or rc0 (and of 
course when watchdog is stopped by some other means).


Regards,

Bastian
