Re: LVM RAID5 with missing disk?

Gary Dale wrote:
> Mart van de Wege wrote:
> > The problem is not that RAID5 does not provide resilience against a
> > single disk failure. The problem is that with modern disk capacities,
> > the chances of *another* disk failing while the array is rebuilding have
> > significantly risen.
> >
> > Especially when all the disks came out of the same batch, they tend
> > to fail at similar times. I know Best Practice is to mix disks in RAID
> > arrays, but who actually practices that, instead of just taking the risk
> > of failure and covering it with a higher RAID level, like RAID6 in this
> > case?
> The chances of two disks failing within hours of one another are very small
> even for disks from the same batch.

Statistics are a wonderful thing.  Odds of one in a thousand sound
very unlikely.  But at those odds, out of the 2,500+ subscribers to
this mailing list we should expect about 2.5 of us to have experienced
this failure personally.  For any one individual it is unlikely to
happen.  Across the population the statistics all but guarantee that
it will happen to someone.
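
To put rough numbers on that, here is a minimal sketch in Python.  It
assumes every subscriber independently has the same one-in-a-thousand
chance, which is a simplification for illustration and not a measured
failure rate.  The function names are made up for this example.

```python
# Illustrative only: assumes each subscriber independently has the
# same probability p of having seen a double disk failure.

def expected_cases(p, n):
    """Expected number of people affected out of n."""
    return p * n

def prob_at_least_one(p, n):
    """Probability that at least one of n people is affected."""
    return 1.0 - (1.0 - p) ** n

p = 1 / 1000   # assumed odds for any one individual
n = 2500       # approximate subscriber count from the paragraph above

expected = expected_cases(p, n)
at_least_one = prob_at_least_one(p, n)

print(f"expected number affected: {expected:.1f}")
print(f"chance at least one is affected: {at_least_one:.1%}")
```

Under those assumptions the expected count is 2.5, and the chance that
at least one subscriber has been hit is a little over 90 percent.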

Twice now, over twenty years and many systems, I have had a double
disk failure where both drives in a RAID 1 mirror failed within hours
of each other.  Both times the drives were from the same vendor batch,
had been purchased together for RAID, and had been running an
identical number of hours.  "Within hours" in one case meant seven
days between the two drives.  In the other case it meant within 36
hours of each other.  Spinning devices don't last forever.

The moral of this story?  Because of this experience I always mix
disks in RAID arrays to try to decouple the age-related failure modes
of the drives.  When I see a single-disk RAID failure I jump on
getting the drive replaced and the degraded RAID sync'd again as
quickly as practical.  I ensure that there are good backups.  I think
everyone can agree that these are safe recommendations and good Best
Practices.  I favor RAID6's extra redundancy for more safety but I
still use RAID1 too.
