System failures: LEDs 1, 2 & 6


We have managed to get our "cluster" of 15 or so 715/75s running, but
have had 3 unexplained system failures.  The machines run for a while
(they are on continuously) and then die of a hardware failure.  As
indicated in the Subject line, LEDs 1, 2 and 6 come on when they stop.

Checking what little documentation we have, it appears that these
3 lights reflect something called a "PCX-T FRU", which, as near as we
can tell, is the CPU.

We thought at first that overheating was the cause, but the three stacks
of machines (2 are 6 units high, 1 is 3 units high) are about 30 cm (12")
or more from the wall, and the most recently deceased was at the top,
where it had the most fresh air of all.  Furthermore, the units are
separated from each other (vertically) by 4 small pieces of wood about
1+ cm in thickness.  Horizontally, they are separated by about 3-4 cm
or more.

When we received them, they had been in continuous service for about 6-8
years, and I was assured that there had never been any hardware problems
(i.e. none had failed as these 3 have).

Frankly, I was always under the impression that HP equipment was
"bulletproof", and I was very surprised when this started to happen.
I hate to think that we have passed the "Best Before" date.  Our "cluster"
is certainly not "state of the art", and will never make the "top 5000"
list, let alone the "top 500" list, but it should be quite useful for
experimentation.  While the equipment is old - up to 10 years or so -
I firmly believe that except for the mechanical components, it should
run for some considerable time to come.

Do any readers of this list have any idea why these systems might
be failing?  Has anyone had a similar experience?

Regards from Calgary,

				Dean Provins 
		dprovins@ucalgary.ca,  provinsd@telusplanet.net
Linux is a stimulating and productive alternative to other PC operating systems.