Re: .d.o machines which are down (Re: Questions for the DPL candidates)

Ben Collins <bcollins@debian.org> writes:

> Ok, I can guarantee that it never dies. The hardrives are raid 5
> configuration, and the power supplies are redundant, and if any of the
> three cpu/mem boards goes bad, I can just remove it and let the other two
> (4x cpu's and 4gigs ram) run. Then there's also two 10/100mbit ethernet
> adapters.

So why isn't auric running now?  It's down on a "RAID failure" or
something like that, right?

If a cpu/mem board goes bad, is "just remove it" necessary for the
machine to keep working?  What worries me is not the high-reliability
enterprise hardware doing its job, but your "day or two" delay in
getting things back.  The point of the N+1 rule, as I understand it,
is to give a different kind of redundancy, so that we don't have to
wait a day or two.


