
Re: raid recommendation



On 12/7/2012 5:48 PM, Aaron Toponce wrote:

> A RAID-1 will outperform
> a parity-based RAID using the same disks every time, due to calculating the
> parity. 

This hasn't been true for almost a decade.  Even the slowest modern
single core x86 CPU has plenty of excess horsepower for soft RAID
parity, as do the ASICs on hardware RAID solutions.  There are two big
problems today with parity arrays.

The first is the IO load and latency of the read-modify-write cycles
that occur when partial stripes are updated.  Most write IO is small
and random, typically comprising over 95% of writes for typical
workloads such as mail, file, LDAP, and SQL servers.  Streaming
applications such as video are an exception, as they write full
stripes.  Thus for every random write you must, at a minimum, read two
disks (the data chunk/strip and the parity chunk/strip) and then write
both back with the new data chunk and new parity chunk.  And that is
with an optimized algorithm.
algorithm.  The md/RAID driver has some optimizations to cut down on RMW
penalties.  Many hardware RAID solutions read then write the entire
stripe for scrubbing purposes (i.e. write all disks frequently so media
errors are caught sooner rather than later).  This is a data integrity
feature of higher end controllers.  This implementation is much slower
due to all the extra IO and head movement, but more reliable.
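
A quick back-of-envelope sketch (Python, nothing authoritative, and
assuming the optimized single-chunk update path rather than the
full-stripe scrubbing writes) of what that RMW penalty costs per
application write:

def ios_per_random_write(level):
    """Return (reads, writes) issued to member disks for one small random write."""
    if level == "raid10":
        return (0, 2)   # just write the new data to both halves of the mirror pair
    if level == "raid5":
        return (2, 2)   # read old data + old parity, write new data + new parity
    if level == "raid6":
        return (3, 3)   # read old data + P + Q, write new data + P + Q
    raise ValueError(level)

for level in ("raid10", "raid5", "raid6"):
    r, w = ios_per_random_write(level)
    print("%s: %d reads + %d writes = %d disk IOs per application write"
          % (level, r, w, r + w))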

The second is that failed drive rebuilds take FOREVER, as all disks
are read in parallel and parity is calculated for every stripe, just
to rebuild one disk.  Even a RAID6 array with a small number of 2TB
drives can take 12-24 hours to rebuild.  The recommended maximum drive
counts for RAID5 and RAID6 arrays are 4 and 8 drives respectively, and
one of the reasons for this BCP is rebuild time.  With RAID10, rebuild
time is constant, as you're simply copying all the sectors from one
drive to another.  A 60x2TB drive RAID10 rebuild will take about 5
hours with low normal workload IO hitting the array.
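
Rough math behind that 5 hour figure (the ~110 MB/s sustained copy
rate is my assumption for a 7.2k RPM 2TB drive under light competing
IO; your drives and controller will vary):

drive_size_bytes   = 2 * 10**12   # 2TB drive
sustained_rate_bps = 110 * 10**6  # ~110 MB/s average copy rate (assumed)

hours = drive_size_bytes / float(sustained_rate_bps) / 3600
print("RAID10 mirror rebuild: ~%.1f hours" % hours)  # ~5 hours, regardless of array size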

> Further, striping across two mirrors will give increased
> performance that parity-based RAID cannot achieve. 

A parity array actually has superior read speed vs a RAID10 array of the
same total spindle count because there are more data spindles.  An 8
drive RAID6 has 6 data spindles, whereas an 8 drive RAID10 only has 4.
Write performance, however, as I mentioned, is an order of magnitude
slower due to RMW.
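
To put rough numbers on the spindle comparison (a sketch only, under
the simplifying assumption that streaming read throughput scales with
the number of data spindles):

def data_spindles(n_drives, level):
    if level == "raid5":
        return n_drives - 1
    if level == "raid6":
        return n_drives - 2
    if level == "raid10":
        return n_drives // 2   # one data spindle's worth per mirror pair
    raise ValueError(level)

for level in ("raid6", "raid10"):
    print("8-drive %s: %d data spindles" % (level, data_spindles(8, level)))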

> Lastly, you can suffer
> any sort of disk failures, provided all mirrors in the stripe remains in
> tact.

You mean any "number" not "sort".  Yes, with RAID10 you can lose half
the drives in the array as long as no two are in the same mirror pair.
I wouldn't bank on this though.  Many drive dropouts are not due to
problems with the drives, but with the backplanes and cabling.  When
that happens, if you've not staggered your mirrors across HBAs, cables,
and cabinets (which isn't possible with RAID HBAs), you may very well
lose two drives in the same mirror.
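
For what it's worth, even under the optimistic assumption that drive
failures are independent and random (which the backplane/cabling point
above argues against), the odds of a second failure landing in the
surviving half of the same mirror pair aren't negligible on small
arrays.  A trivial sketch:

def p_second_failure_kills_array(n_drives):
    # the second failed drive is one of the remaining n_drives - 1, and
    # exactly one of those is the first drive's surviving mirror partner
    return 1.0 / (n_drives - 1)

for n in (4, 8, 60):
    print("%d-drive RAID10: %.1f%% chance the second failure hits the same pair"
          % (n, 100 * p_second_failure_kills_array(n)))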

>     1: http://zfsonlinux.org

> Just my $.02.

And that sums up the value of your ZFS on Linux recommendation quite
well.  Being a fanboy is fine.  Run it yourself.  But please don't
recommend unsupported, out-of-tree software that is difficult for the
average Debian user to install as a general purpose storage solution.

Good hardware RAID is relatively cheap, Linux has md/RAID which isn't
horrible for most workloads, and there are plenty of high quality
Linux filesystems to meet most needs: EXT4 for casual stuff, JFS and
XFS for heavy duty.  Of those two, XFS is the much better choice for
many reasons, the big one being that it's actively developed, whereas
JFS is mostly in maintenance-only mode.

-- 
Stan

