Hardware/Software RAID (nearly a religious war...)
I apologize to the list as I didn't mean to hijack the thread.
--On August 30, 2007 7:56:52 AM +0200 martin f krafft <madduck@debian.org> wrote:
>> MDRAID is also very difficult to administer, offering only
>> (depending on your version) mdadm or raid* tools.  mdadm is rather
>> arcane.  simple operations are not well documented, like, how do
>> i replace a failed drive?  or start a rebuild?
> Have you actually bothered to look into /usr/share/doc/mdadm?
You do have me there; I haven't looked since 3.1, when it contained just a 
few sparse notes. I had incorrectly assumed it hadn't changed as much as it 
clearly has.  Typically I make sure to double-check something before I say 
anything about it, and this is one of those times where I didn't.
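For the record, the replacement sequence in question is short once you know 
it.  A sketch, assuming the array is /dev/md0 and the failed member is 
/dev/sdb1 (device names are illustrative):

```shell
# Mark the member as failed (skip if the kernel already did),
# then remove it from the array.
mdadm --manage /dev/md0 --fail /dev/sdb1
mdadm --manage /dev/md0 --remove /dev/sdb1

# After physically swapping the disk and partitioning it to match,
# add the new member; the rebuild starts automatically.
mdadm --manage /dev/md0 --add /dev/sdb1

# Watch the rebuild progress.
cat /proc/mdstat
```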
>> there's no 'rebuild drive' it's completely NON automated either.
>> meaning it always takes user intervention to recover from any
>> failure.
> Not if you're using spares. But even then, yes, to pull a disk out
> and insert a new one, you need to shut down the machine, unless you
> have hotplugging drives. Same story for hardware RAID.
OK, granted.  Spare replacement is fully automated, as one would expect.  I 
think hotplug (at least on the drive side) is more or less mandatory in 
SATA, unless the drives don't support the SATA power connector.
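For completeness, attaching a hot spare to an existing array is a one-liner. 
A sketch, again with illustrative device names:

```shell
# Adding a device beyond the array's active member count makes it a
# spare; md promotes it and starts rebuilding automatically when an
# active member fails.
mdadm --manage /dev/md0 --add /dev/sdc1

# Verify it shows up marked (S) in the status line.
cat /proc/mdstat

# Spares can also be declared at creation time:
# mdadm --create /dev/md0 --level=1 --raid-devices=2 \
#       --spare-devices=1 /dev/sda1 /dev/sdb1 /dev/sdc1
```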
>> a single I/O error causes MDRAID to mark the element as failed.
>> it does not even bother to retry.
> And that's a feature. I've seen disk corruption where a block would
> return wrong data only in 1/10 reads. On retry, it would work, and
> the RAID would hide the problem from me. I'd much rather have
> a failed drive.
Would a patch be accepted that allowed a configurable retry?  The place 
where this has caused the most pain is on IDE/PATA and SATA drives that 
will occasionally fault a sector, then read it correctly on the next try. 
Granted, the condition needs to be reported, and the MD driver shouldn't 
persist in 'banging its head'.  Some form of automatic recovery would be 
nice; in the mirror case, something like read (fault), read (from the 
partner), write (back to the faulted area).  I know it gets really 
complicated, especially trying to avoid deadlocks.
>> MDRAID is also incapable of performing background patrolling
>> reads, something i think even 3Ware does.
> Wrong. It does this only once a month by default (on Debian; the
> mdadm sunday), but you could make it do that every hour.
Background patrolling reads are executed at low priority during I/O quiet 
times, often starting a new pass as soon as the previous one finishes.  EMC 
implements it a bit differently, by default doing patrolling reads on 
partitions only after an error is detected, though the arrays can be 
configured to patrol more often.  ICP GDT controllers behave similarly; I 
*think* a firmware update adds it to some ICP models as well, since some of 
the older GDTs didn't support patrolling reads.  The operation is 
essentially the same as a consistency check, just run more often.  It takes 
the large hardware arrays onsite about a week to finish a pass, and it has 
saved us from undetected failures a few times.
I was, and am, very glad to see the periodic consistency checks, though. 
That addresses one of our bigger complaints, the lack of patrolling; we 
haven't had much experience with how it behaves quite yet.  I presume it 
uses the same or a similar throttling mechanism that rebuilds have used for 
a while now.  That works really well; most of our machines don't notice 
rebuild I/O traffic at all.  The few where I have seen it be an issue have 
controllers with DMA disabled by default due to issues with the controller 
hardware, which is not MD's problem at all.
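For anyone curious, both the check and its throttle are exposed through 
sysfs and sysctl.  A sketch, assuming /dev/md0 (on Debian the monthly run 
is just a cron job invoking /usr/share/mdadm/checkarray):

```shell
# Kick off a consistency check by hand (what the monthly cron job does).
echo check > /sys/block/md0/md/sync_action

# Progress appears in /proc/mdstat; mismatches found so far are
# counted here.
cat /sys/block/md0/md/mismatch_cnt

# The same throttle rebuilds use: guaranteed and maximum KB/s per
# device for resync/check/rebuild activity.
sysctl dev.raid.speed_limit_min
sysctl dev.raid.speed_limit_max
```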
>> MDRAID RAID5 sets are non-bootable.
> grub2 can boot them.
Is anyone shipping it yet, though?  Etch still has GRUB 0.97.
> In many years of software RAID management and in two years as mdadm
> maintainer, I have never heard of a single case where md failed to
> correctly identify a failed drive.
I should have been clearer: my problem is that MD is over-zealous about 
marking drives as failed.  Which, as you've noted (and I knew, but still 
don't necessarily agree with, and clearly did a poor job of acknowledging), 
is intentional.
<...>
I in no way meant to disrespect you or your work.  I only meant to raise 
awareness of some of the issues one can experience with software-based 
RAID.  MD has improved, and continues to improve.  You have your camp, I 
have mine, and we disagree on what's best.  In some situations software 
RAID is best, in others hardware RAID is.  It's up to each operator to 
determine 'best' for themselves; in the end, I feel that's what Free and 
Open Source Software and Open Source OSes are about.