
Re: What is the point of RAID?



On Wednesday 12 November 2008, lee <lee@yun.yagibdah.de> wrote about 'Re: 
What is the point of RAID?':
>On Wed, 12 Nov 2008 09:59:09 -0600
>
>"Boyd Stephen Smith Jr." <bss03@volumehost.net> wrote:
>> >So what is the optimal number of disks in a raid 5 and a raid 1?
>>
>> If by optimal, you mean, least chance of failure:
>
>Not exactly; I was wondering if there is a breaking point, as in
>"adding more drives only increases the chances of the whole array to
>fail" beyond that point, and "adding more drives reduces the chances of
>the whole array failing" before that point.

Um, that's exactly what I mean by least chance of failure.

Under the assumption that the "hard disk n fails" events are, 
mathematically, "independent events":

RAID-1 always grows in redundancy, and doesn't fail until all the drives 
fail.  At n drives, the chance of failure of the array is p^n, where p is 
the chance of a single drive failing.  For 0 < p < 1, p^(n+1) < p^n for 
any n > 0, so adding a drive always reduces the chance of failure.  
[aleph-sub-naught is the first countable infinity.]

RAID-5 always grows in storage, not redundancy, and fails as soon as any 
two (or more) drives fail.  At n drives the chance of failure of the array 
is ... something I'd have to look up ... but (for n drives) < (for 
n+1 drives) for any n >= 2.  So 3 drives (the minimum in RAID-5) provide 
the least chance of failure.
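
A minimal sketch of the two cases above, assuming every drive fails 
independently with the same probability p.  The function names and the 
value p = 0.05 are only illustrative assumptions; the RAID-5 figure is the 
standard "two or more out of n" binomial result, computed as one minus the 
chance that zero or exactly one drive fails.

```python
def raid1_failure(p, n):
    """Chance that all n mirrored drives fail (independent, probability p each)."""
    return p ** n


def raid5_failure(p, n):
    """Chance that two or more of n drives fail, which loses a single-parity array."""
    none_fail = (1 - p) ** n
    exactly_one = n * p * (1 - p) ** (n - 1)
    return 1 - none_fail - exactly_one


if __name__ == "__main__":
    p = 0.05  # assumed per-drive failure probability, purely illustrative
    for n in range(2, 9):
        print(f"{n} drives: RAID-1 {raid1_failure(p, n):.2e}  "
              f"RAID-5 {raid5_failure(p, n):.2e}")
```

Under these assumptions the RAID-1 column shrinks with every added drive 
while the RAID-5 column grows, which matches the argument above.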

>Wouldn't it be useful if that breaking point was known for all kinds of
>raid setups?

Under the "independent events" assumption, it is.  I'm not sure anyone just 
has a table you can look it up in, but basic combinatorial maths will 
get you the answer.

>The calculation would have to consider the chances of 
>several drives failing at (about) the same time.

You could do that, and the failure probabilities would be different but in 
the same relative order, until you varied too much from the "independent 
events" assumption, for example: making two of the drives completely 
dependent on one another; when one fails so does the other and vice-versa.
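
As a rough illustration of that extreme case, the sketch below treats two 
of the n drives as a single failure event with probability p (they always 
fail together) and keeps the remaining drives independent.  The function 
names and the numbers are hypothetical, chosen only to show the effect; 
n is assumed to be at least 3 and p below 1.

```python
def at_most_one_fails(p, n):
    """Chance that zero or one of n independent drives fail."""
    return (1 - p) ** n + n * p * (1 - p) ** (n - 1)


def raid5_failure_independent(p, n):
    """RAID-5 loss with n fully independent drives."""
    return 1 - at_most_one_fails(p, n)


def raid5_failure_coupled_pair(p, n):
    """RAID-5 loss when two of the n drives always fail together."""
    # The array survives only if the coupled pair survives AND
    # at most one of the other n - 2 independent drives fails.
    return 1 - (1 - p) * at_most_one_fails(p, n - 2)


def raid1_failure_coupled_pair(p, n):
    """RAID-1 loss when two of the n drives always fail together."""
    # The coupled pair is one event, so n drives give only n - 1 events.
    return p ** (n - 1)


if __name__ == "__main__":
    p = 0.05  # assumed per-drive failure probability, purely illustrative
    print("3 drives, coupled pair:", raid5_failure_coupled_pair(p, 3))
    print("5 drives, independent: ", raid5_failure_independent(p, 5))
```

With a coupled pair, a 3-drive RAID-5 is lost whenever the pair fails, so 
its chance of failure is simply p, which here is worse than a 5-drive 
RAID-5 built from independent drives.  That is the kind of reordering 
meant above.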

For something that takes into account that recovery/rebuild takes some 
time, during which redundancy is decreased or lost entirely, well, I'm 
pretty sure the maths exists; I just haven't studied it at all.

The problem is that redundancy isn't the whole story.  Sure, 5 drives in a 
RAID-1 is very safe compared to 5 drives in a RAID-5, but the RAID-5 has 4 
times the storage for the same cost.  You have to consider the array's 
cost, usable storage, chance of failure, and performance.  Small 
trade-offs in one of those values can change the others a lot.

RAID-5 might not be the fastest or least risky way to store data across 
5x(1 TB) drives, but it wins because it gives a lot of space (4 TB) at an 
acceptable level of redundancy -- I challenge anyone to come up with a 
scheme that gives the same performance and at least 3 1/3 TB of space with 
ANY redundancy.

RAID-6 might not perform as well as RAID-5/0 or even RAID-5, but it stays 
up no matter which two drives go down and doesn't require an even number 
of drives.
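
To put rough numbers on the storage/redundancy trade-off for the 5x(1 TB) 
example, here is a small sketch under the same "independent events" 
assumption.  The layout table, the function names and p = 0.05 are 
assumptions made for illustration; cost and performance are not modelled.

```python
from math import comb


def array_failure(p, n, tolerated):
    """Chance that more than `tolerated` of n independent drives fail."""
    survive = sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
                  for k in range(tolerated + 1))
    return 1 - survive


# name -> (usable drives out of n, drive failures tolerated out of n)
LAYOUTS = {
    "RAID-1": (lambda n: 1, lambda n: n - 1),
    "RAID-5": (lambda n: n - 1, lambda n: 1),
    "RAID-6": (lambda n: n - 2, lambda n: 2),
}


def compare(n, drive_tb, p):
    """Return (name, usable TB, chance of array loss) for each layout."""
    return [(name, usable(n) * drive_tb, array_failure(p, n, tolerated(n)))
            for name, (usable, tolerated) in LAYOUTS.items()]


if __name__ == "__main__":
    # 5 drives of 1 TB each; p = 0.05 is an assumed, illustrative value.
    for name, tb, risk in compare(5, 1, 0.05):
        print(f"{name}: {tb} TB usable, chance of loss {risk:.2e}")
```

For five drives this gives 1, 4 and 3 TB usable for RAID-1, RAID-5 and 
RAID-6 respectively, with the chance of loss rising as redundancy is 
traded for space.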

RAID is "old school"/"old guard", but it works very well.  I'm not a big 
believer in ZFS, because I believe a separation between the filesystem and 
block-device management makes the whole system more flexible and useful.
-- 
Boyd Stephen Smith Jr.                     ,= ,-_-. =. 
bss03@volumehost.net                      ((_/)o o(\_))
ICQ: 514984 YM/AIM: DaTwinkDaddy           `-'(. .)`-' 
http://iguanasuicide.org/                      \_/     
