
Re: Netra T1 200 watchdog timeouts



Richard Mortimer wrote:
Hi Mark,

On 18/09/2012 14:36, Mark Morgan Lloyd wrote:
If I install either the current Wheezy/testing or Lenny on a Netra T1
200 with LOM (lomlite2) at 3.10, the first time OBP issues a boot
command I get

[string of hex]
Watchdog Reset
Externally Initiated Reset

I have a feeling that this is not a LOM watchdog reset but more a SPARC processor watchdog reset (the processor running out of trap levels in memory fault/interrupt processing).

You should be able to verify if it is a LOM watchdog reset by running the "loghistory" command at the lom prompt.

No watchdog events shown, only power on/off and reset events (plus a 'LOM booted' near the start).

If I subsequently issue a second boot the system runs as expected.
If I'm correct then this is probably due to something like retained memory (not cleared during a soft reset/reboot, only during a power cycle). That would explain why the second boot after the Watchdog/XIR works fine.

But this also happens after a (soft) power-on, irrespective of whether power has been physically removed (i.e. IEC connector pulled out of back and left for a few minutes).

This affects both Lenny and Wheezy but does not affect Squeeze, i.e. it
appears to be a regression. Since this happens in between the OBP boot
command and SILO's boot prompt, I presume that it is a SILO problem or
that the installer is doing something odd to the disklabel.

Lenny:   1.4.13
Squeeze: 1.4.14
Wheezy:  1.4.14

I don't see how the LOM firmware would affect this. OBP maybe, but if it is a processor watchdog then I doubt it's the LOM. SILO would be my first suspect.

SILO is also my suspect (after a lot of fiddling trying to disable the LOM watchdog from OBP etc.) and those are SILO version numbers :-/

Regards

Richard


The correct way of fixing this is probably to upgrade the LOM firmware
to 3.14. However this requires Solaris, and before the patch can be
installed it requires that the appropriate packages be installed:

"To use LOM commands you must install the Lights Out Management 2.0
packages (SUNWlomu, SUNWlomr and SUNWlomm) from the Solaris
Supplementary CD."
http://docs.oracle.com/cd/E19102-01/n1280.srvr/819-1269-11/poweron.html
The problem is that I don't believe that the supplementary CD is freely
available, which in practice means that this course is not available to
most Linux users.

I'm hoping there's enough detail in there that it shows up on Google; it might save people work in the future.

--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]

