
Re: RAID 6 mdadm



On 4/10/2013 9:15 AM, Muhammad Yousuf Khan wrote:

> actually what I need is 4Gb/s LAN throughput with teaming (802.3ad)
> to back up VMs to data storage, and similar heavy data manipulation
> will be done, so I'm just confused about whether it's going to work
> or not.

Addressing the bonded ethernet issue:

802.3ad LACP, configured at the switch and within the Linux bonding
driver, load balances TRANSMIT TCP streams across the links, but it
does nothing to address RECEIVE load balancing.  Your application is a
~100% receive workload, so LACP is out.  To achieve any level of
receive load balancing under Linux you need balance-alb mode, which
does not require bonding support in the switch, but only works with
IPv4.

See:  https://www.kernel.org/doc/Documentation/networking/bonding.txt
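
As a rough illustration, balance-alb on Debian looks something like
the following in /etc/network/interfaces (with the ifenslave package
installed).  The NIC names, count, and addressing below are
assumptions; adjust to suit:

    # Hypothetical 4x GbE bond in adaptive load balancing mode
    auto bond0
    iface bond0 inet static
        address 192.168.1.10
        netmask 255.255.255.0
        bond-slaves eth0 eth1 eth2 eth3
        bond-mode balance-alb
        bond-miimon 100

No switch-side configuration is required, because balance-alb does its
receive balancing by rewriting ARP replies, which is also why it is
IPv4 only.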


Addressing the storage issue:

RAID6 only has good write performance in a single scenario: large,
single streaming writes, such as video recording, large file backup,
etc.  If your writes are smaller than the stripe width, you incur
read-modify-write cycles.  These can triple (or more) the number of
head seeks per drive required to complete the write, and you suffer
the extra rotational latency in between the seeks.  Thus if you're
performing multiple partial stripe width writes in parallel, your
performance, even with expensive hardware, will drop into the teens
of MB/s.  And if you're doing multiple streams, RAID6 isn't suitable
simply due to the extra head seeks and rotational latency of the
double rotating parity writes.
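
To put rough numbers on it (the chunk size here is an assumption, for
arithmetic's sake): with 8 drives in RAID6 you have 6 data spindles,
so a 64KB chunk yields a stripe width of 6 x 64KB = 384KB.  A 16KB
write into that stripe forces md to read back enough of the stripe to
recompute parity (either the untouched data chunks, or the old data
plus the old P and Q), then write the new data chunk and both parity
chunks.  That's several reads plus three writes, with rotational waits
in between, where a non-parity array would issue a single write.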

So if you're serious about achieving an aggregate 400MB/s of write
throughput to the array, with this workload, you're very likely not
going to get there with 8x 7.2K SATA drives in RAID6.  Depending on how
many concurrent backup streams you're running, and what filesystem
you're using, you might get close to 400MB/s using 10 drives in RAID10.
The more concurrent jobs, the more spindles you need to mitigate the
extra seeks.  This will also greatly depend on your backup job write
pattern, and thus the chunk/stripe width you select for the array.  The
RAID parameters must match the application write pattern or performance
may be horrible.
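
If you do go the RAID10 route, creating the array is the easy part.  A
sketch, assuming ten drives sdb through sdk and a 64KB chunk (again,
choose the chunk against your actual job write size):

    # Hypothetical 10-drive near-2 layout RAID10, 64KB chunk
    mdadm --create /dev/md0 --level=10 --layout=n2 --chunk=64 \
          --raid-devices=10 /dev/sd[b-k]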

And lastly, select the right filesystem for the job.  EXT3/4 are not
suitable for large parallel streaming writes.  You will likely need XFS
for this workload for optimal performance, and you'll need to align the
filesystem to the md/RAID stripe parameters during mkfs.  If you're
directly formatting the md/RAID device, mkfs.xfs will query the kernel
and configure proper alignment automatically.  If you use LVM between
md/RAID and XFS then you'll need to do manual alignment during mkfs.xfs.
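
For the LVM case, the manual alignment is two mkfs.xfs parameters.  A
sketch, assuming the hypothetical 10-drive RAID10 above (64KB chunk, 5
data spindles in the near-2 layout) and an LV named /dev/vg0/backup:

    # su = chunk size, sw = data spindles (10 drives / 2 copies = 5)
    mkfs.xfs -d su=64k,sw=5 /dev/vg0/backup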

Parting thought:  It's very easy to get 400MB/s write throughput from an
8 SATA drive md/RAID6 array when doing single streaming writes from 'dd'
or other simple one-dimensional tests.  However, achieving 400MB/s with
real workloads is a completely different matter and much more demanding.
Most of the md/RAID throughput data you'll find via Google comes from
these simplistic, one-dimensional tests.  Don't trust this information.
Look for specific workload tests or synthetic benchmark data from
bonnie, iozone, etc.  While these won't precisely mimic your workload,
they give a much better picture of potential real world performance.
And the numbers will always be much, much smaller than 'dd' numbers.
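
For a quick sanity check that's at least closer to your workload than
a single stream, run several writers in parallel and watch the
aggregate.  A sketch; the stream count, sizes, and mount point are
assumptions:

    # Four concurrent 4GB sequential writers, bypassing the page cache
    for i in 1 2 3 4; do
        dd if=/dev/zero of=/mnt/backup/stream$i bs=1M count=4096 \
           oflag=direct &
    done
    wait

Even that is kinder than real backup traffic, but the drop from the
single-stream number is usually instructive.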

-- 
Stan

