
Re: Debian hotswap and 5 9's



On Tue, Dec 10, 2002 at 10:25:54PM -0800, nate wrote:
> Hanasaki JiJi said:
> 5 9s on any hardware..You need redundant motherboards, power supplies,
Yes. 
> cpus, ram, disks, network, and of course it all needs to be hot swappable.

No. 

Think RAID. 

In RAID it is acceptable for any one hard drive to go completely
bad. 

So, if you have a computer that allows its network card to fail, 
sends you an Email requesting a new network card, and lets you
change the network card while the computer is still completely
functional, then that's great. That is one way to solve the problem. 

However, if your system remains "up" and you get an Email from the
redundant computer that took over, then that's acceptable as well, as
long as you reach the stated goal (99.999% uptime of the SYSTEM!!). 

If you have staff onsite who can diagnose a broken switch in less than
a minute and they have the spare parts to replace it within two minutes, 
you can tolerate one or two of these single-point-of-failure failures
a year. 
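
To put rough numbers on that, here is a small sketch in Python. The
function name and the 365-day year are assumptions made for
illustration; the outage length is the worst case of the figures
above (one minute to diagnose plus two minutes to replace).

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; assumes a non-leap year


def downtime_budget_minutes(availability):
    """Allowed downtime per year, in minutes, for an availability target."""
    return (1.0 - availability) * MINUTES_PER_YEAR


# Five nines leaves a little over five minutes of downtime per year.
budget = downtime_budget_minutes(0.99999)

# Worst case per incident: 1 minute to diagnose + 2 minutes to replace.
worst_case_outage = 1 + 2

# Number of worst-case incidents that fit completely inside the budget.
incidents_that_fit = int(budget // worst_case_outage)

print(f"budget: {budget:.3f} minutes/year")
print(f"worst-case incidents that fit: {incidents_that_fit}")
```

At the full three minutes only one such failure fits in a year; a
second one fits only if the staff beat the worst case by a bit, which
is why "one or two" is about the limit.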

But this solution is more expensive than devising a fail-over system
(you do this ONCE!) that allows the switch to fail without bringing 
the system down, and then buying two switches (you had to have one
spare in the case with the clever staff as well!). 

Also note that if, for example, your failover can lose one transaction,
you can consider this a partial failure of your system. If it takes a 
week to recover the transaction (humans on the phone?) and this is
1 millionth of your transactions in that week, then this can be considered
0.0001% downtime during that week, or 0.000002% on a yearly
scale. You can have this happen twice a week and still easily achieve
five nines aggregate uptime. 
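
The arithmetic can be checked with a few lines of Python. This is only
a sketch: the variable names are made up for illustration, a year is
taken as 52 weeks, and "twice a week" is modelled as two lost
transactions outstanding in every week of the year.

```python
WEEKS_PER_YEAR = 52  # assumption: a year is treated as 52 weeks

# One lost transaction out of a million handled in that week.
weekly_fraction = 1 / 1_000_000            # 1e-6, i.e. 0.0001%

# The same single incident spread over a whole year.
yearly_fraction = weekly_fraction / WEEKS_PER_YEAR   # about 0.000002%

# Five nines allows a fraction of 1 - 0.99999 to be unavailable.
five_nines_budget = 1 - 0.99999            # about 1e-5, i.e. 0.001%

# Two such losses every week, all year long.
twice_weekly_fraction = 2 * weekly_fraction          # 2e-6, i.e. 0.0002%

print(f"one incident, that week: {weekly_fraction * 100:.4f}%")
print(f"one incident, per year:  {yearly_fraction * 100:.6f}%")
print(f"twice a week, all year:  {twice_weekly_fraction * 100:.4f}%")
print(f"within five nines: {twice_weekly_fraction < five_nines_budget}")
```

Under those assumptions, two losses a week use up about a fifth of the
five-nines budget.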

The "bad reputation" that you'd get from losing a transaction may however
be valued more, so that you'd have to weight this type of failure. Fine.
Multiply by ten, and with the presumed volume, you're still plenty clear
of "five nines". 

		Roger. 

-- 
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* The Worlds Ecosystem is a stable system. Stable systems may experience *
* excursions from the stable situation. We are currently in such an      * 
* excursion: The stable situation does not include humans. ***************