
Re: Software RAID10 - which two disks can fail?



On 08/04/2014 05:54, Gary Dale wrote:
On 07/04/14 03:48 PM, Rafał Radecki wrote:
Hi All.

I have a server which uses RAID10 made of 4 partitions for / and boots
from it. It looks like so:

mdadm -D /dev/md1
/dev/md1:
Version : 00.90
Creation Time : Mon Apr 27 09:25:05 2009
Raid Level : raid10
Array Size : 973827968 (928.71 GiB 997.20 GB)
Used Dev Size : 486913984 (464.36 GiB 498.60 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 1
Persistence : Superblock is persistent

Update Time : Mon Apr 7 21:26:29 2014
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0

Layout : near=2, far=1
Chunk Size : 64K

UUID : 1403e5aa:3152b3f8:086582aa:c95c4fc7
Events : 0.38695092

Number Major Minor RaidDevice State
0 8 6 0 active sync /dev/sda6
1 8 22 1 active sync /dev/sdb6
2 8 54 2 active sync /dev/sdd6
3 8 38 3 active sync /dev/sdc6

As far as I know raid10 is ~ "a raid0 built on top of two raid1"
(http://en.wikipedia.org/wiki/Nested_RAID_levels#RAID_1.2B0 - raid10).
So I think that by default in my case:

/dev/sda6 and /dev/sdb6 form the first "raid1"
/dev/sdd6 and /dev/sdc6 form the second "raid1"

So is it the case that if I fail/remove, for example:
- /dev/sdb6 and /dev/sdc6 (different "raid1's") - the raid10 will be
usable/data will be ok?
- /dev/sda6 and /dev/sdb6 (the same "raid1") - the raid10 will not be
usable/data will be lost?

I read in context of raid10 about replicas of data (2 by default) and
the data layout (near/far/offset). I see in the output of mdadm -D the
line "Layout : near=2, far=1" and am not sure which layout is exactly
used and how it influences data layout/distribution in my case :|

I would really appreciate a definite answer which partitions I can
remove and which I cannot remove at the same time because I need to
perform some disk maintenance tasks on this raid10 array. Thanks for
all help!

BR,
Rafal.

IMHO this is a bad setup. Because you are reading and writing to
multiple disks at a time, you may have some speed-up, but you only
get half the total space. And you are vulnerable to some two-disk
failures. If you went to RAID-6, you'd get the same basic performance
and space but would be immune to two-disk failures.

In RAID 6 you could fail any two drives and still have a running array.
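
The space claim can be checked with a little arithmetic. The sketch below is only an illustration: the helper name usable_kib is made up here, and it assumes equal-sized members, 2 copies for RAID 10 and two members' worth of parity for RAID 6. The per-device size is the "Used Dev Size" from the mdadm -D output quoted above.

```python
USED_DEV_KIB = 486913984  # "Used Dev Size" from the mdadm -D output, in KiB


def usable_kib(level, disks, dev_kib, copies=2):
    """Usable capacity in KiB for equal-sized members (hypothetical helper)."""
    if level == "raid10":
        # Every chunk is stored `copies` times.
        return disks * dev_kib // copies
    if level == "raid6":
        # Two members' worth of space goes to parity.
        return (disks - 2) * dev_kib
    raise ValueError("unsupported level: " + level)


raid10_4 = usable_kib("raid10", 4, USED_DEV_KIB)
raid6_4 = usable_kib("raid6", 4, USED_DEV_KIB)
raid10_6 = usable_kib("raid10", 6, USED_DEV_KIB)
raid6_6 = usable_kib("raid6", 6, USED_DEV_KIB)

print("4 disks: raid10 =", raid10_4, "KiB, raid6 =", raid6_4, "KiB")
print("6 disks: raid10 =", raid10_6, "KiB, raid6 =", raid6_6, "KiB")
```

With 4 members both levels come out at 973827968 KiB, which matches the "Array Size" reported above. With 6 members the two figures diverge, RAID 6 giving four members' worth of space and RAID 10 three.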

In short, I don't like RAID 10.

However, with respect to your question, it would depend on how you
created the initial array. I can't tell from the information provided.
However, you can remove any one disk and have it continue to work. If
you remove the wrong two, it won't.
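
For what it is worth, the mdadm -D output does report "Layout : near=2, far=1" and a RaidDevice order. The sketch below enumerates two-disk failures under a simplified model of the near layout, and that model is an assumption on my part: copy c of chunk k sits at position k * 2 + c, and positions wrap around the devices in RaidDevice order. The function names are made up for this example. Please check the result against the md documentation before pulling anything.

```python
from itertools import combinations

# RaidDevice order 0..3 as listed by "mdadm -D /dev/md1" in this thread.
DEVICES = ["sda6", "sdb6", "sdd6", "sdc6"]
NEAR_COPIES = 2  # from "Layout : near=2, far=1"


def mirror_sets(raid_disks, near_copies):
    """Groups of RaidDevice numbers that hold the copies of one chunk.

    Assumed model of the near layout: copy c of chunk k sits at
    position k * near_copies + c, and positions wrap around the devices.
    """
    sets = set()
    for chunk in range(raid_disks):  # the pattern repeats within raid_disks chunks
        first = chunk * near_copies
        sets.add(frozenset((first + c) % raid_disks for c in range(near_copies)))
    return sets


def survives(failed, sets):
    """True if every chunk still has at least one copy on a working device."""
    return all(not group <= failed for group in sets)


sets = mirror_sets(len(DEVICES), NEAR_COPIES)
safe_pairs, fatal_pairs = [], []
for pair in combinations(range(len(DEVICES)), 2):
    names = tuple(DEVICES[i] for i in pair)
    (safe_pairs if survives(frozenset(pair), sets) else fatal_pairs).append(names)

print("can fail together:     ", safe_pairs)
print("must not fail together:", fatal_pairs)
```

Under this model the copies pair up as sda6/sdb6 and sdd6/sdc6, so losing sdb6 and sdc6 together would leave the array running, while losing sda6 and sdb6 together (or sdd6 and sdc6) would not.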

So depending on your preference, you can try removing one at a time or
simply pull two at random. If the array restarts, you pulled the right
two. If it doesn't, put one back in and try another.

Warning: any time you are running a degraded RAID array, you are
vulnerable.



My strongest advice would be not to use disks from the same series, as they tend to break at the same time. (Last summer I had to replace 4 out of 5 on one of my servers within one month.)

Also, in my experience the performance is not quite the same: RAID 5 and 6 become sluggish when a disk is lost, and they take a long time to rebuild. IMHO, if you are really concerned about losing 2 disks, add 2 more disks and go for RAID 10.

My 2 cents

Bruno

