Re: .d.o machines which are down (Re: Questions for the DPL candidates)
Ben Collins <email@example.com> writes:
> Ok, I can guarantee that it never dies. The hardrives are raid 5
> configuration, and the power supplies are redundant, and if any of the
> three cpu/mem boards goes bad, I can just remove it and let the other two
> (4x cpu's and 4gigs ram) run. Then there's also two 10/100mbit ethernet
So why isn't auric running now? It's down on a "RAID failure" or
something like that, right?
If a cpu/mem board goes bad, is "just remove it" necessary for the
machine to keep working? What worries me is not the high-reliability
enterprise hardware doing its job, but your "day or two" delay in
getting things back. The point of the N+1 rule, as I understand it,
is to give a different kind of redundancy, so that we don't have to
wait a day or two.