Re: Netra T1 200 watchdog timeouts
On Sat, Sep 22, 2012 at 12:26:26PM +0100, Richard Mortimer wrote:
> On 19/09/2012 13:10, Mark Morgan Lloyd wrote:
> >Richard Mortimer wrote:
> >>On 18/09/2012 18:49, Mark Morgan Lloyd wrote:
> >>>Richard Mortimer wrote:
> ... snip ...
> >>>>>This affects both Squeeze and Wheezy but does not affect Lenny, i.e. it
> >>>>>appears to be a regression. Since this happens in between the OBP boot
> >>>>>command and SILO's boot prompt, I presume that it is a SILO problem or
> >>>>>that the installer is doing something odd to the disklabel.
> >>>>>Lenny: 1.4.13
> >>>>>Squeeze: 1.4.14
> >>>>>Wheezy: 1.4.14
> >>>>I don't see how the LOM firmware would affect this. OBP maybe but if
> >>>>it is a processor watchdog then I doubt it's LOM. SILO would be my
> >>>>first suspect.
> >>>SILO is also my suspect (after a lot of fiddling trying to disable lom
> >>>watchdog from OBP etc.) and those are SILO version numbers :-/
> >>Brain wasn't turned on enough to realise that!
> >> From memory I don't think the LOM watchdog is ever enabled in OBP on
> >>the T1 200. It only ever gets enabled by the device drivers once
> >>Solaris is running (if the packages you mention below are installed of
> >OK but at the same time the README from Solaris patch 110208-21
> >explicitly says
> >5043823 Patch 110208-18 changes watchdog behavior and causes watchdog
> >resets when probed
> >4412177 lomlite2 watchdog is not always disabled on "reboot" - 110208-07
> >both of which read as though there could be spurious watchdog events
> >even without Solaris's intervention. However I note your point about the
> >LOM log not showing anything.
> I'm still pretty convinced that the problem you are seeing is
> nothing to do with LOM. I think that both of those are Solaris
> device driver issues too.
> >Should I be raising this as a bug, or can I assume that the people who
> >need to know about it are already aware of the issue?
> Given that this affects Wheezy then a Debian bug is certainly in order.
> I haven't had time to track the development of Wheezy closely but I
> think that it is pretty much using upstream SILO. I vaguely remember
> a few changes upstream recently for both ext2/4 support and for cpu
> detection. One of those could be causing your problem on the Wheezy
Well, Mark mentioned that the same issue is encountered in both Wheezy
and Squeeze SILO versions, which predate the recent ext2/4 changes.
And yes, there haven't been any Debian-specific changes to upstream
SILO since version 1.4.14+git20100228-1, uploaded in February 2010.
Before that we had some Debian-specific patches included.
Mark, if you can try different SILO versions and find out which one
introduced the regression, that would be great. As far as I can tell,
releases shipped with the following versions:
Lenny : 1.4.13a+git20070930-3
Assuming that the failure was introduced between 1.4.13a+git20070930-3
(Lenny version) and 1.4.14+git20100228-1+b1 (Squeeze version), you
just have one intermediate version (1.4.14+git20100207-1) to test.
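
To spell out the search, here is a minimal Python sketch of narrowing down
the first failing version. It is only an illustration: the is_bad callback
is hypothetical and stands for installing that SILO version by hand and
checking whether the watchdog reset appears at boot. The version list is the
one from this mail, assumed to be ordered oldest to newest.

```python
def first_bad(versions, is_bad):
    """Return the earliest version for which is_bad() is true.

    Assumes versions are ordered oldest to newest, the first entry is
    known good and the last entry is known bad.
    """
    lo, hi = 0, len(versions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):  # hypothetical manual boot test
            hi = mid
        else:
            lo = mid
    return versions[hi]


SILO_VERSIONS = [
    "1.4.13a+git20070930-3",    # Lenny, assumed good
    "1.4.14+git20100207-1",     # intermediate upload, untested
    "1.4.14+git20100228-1+b1",  # Squeeze, assumed bad
]
```

With only three entries this comes down to a single boot test of the
intermediate version, but the same loop applies if more uploads turn up.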
> Given the nature of the problem I think it would be useful to have a
> good description of your installation in the bug. In particular
> filesystem layout (partition table), type (ext2/3/4) etc. may be
> relevant. A copy of the console session would be good to attach too.
Yep, the bug would be useful. Given that it's the first report like
this that I have seen and that a simple enough workaround exists, I
don't think it qualifies as RC.
Jurij Smakov firstname.lastname@example.org
Key: http://www.wooyd.org/pgpkey/ KeyID: C99E03CC