Hardware/Software RAID (nearly a religious war...)
I apologize to the list as I didn't mean to hijack the thread.
--On August 30, 2007 7:56:52 AM +0200 martin f krafft <madduck@debian.org>
wrote:
MDRAID is also very difficult to administer, offering only
(depending on your version) mdadm or raid* tools. mdadm is rather
arcane. simple operations are not well documented, like, how do
i replace a failed drive? or start a rebuild?
Have you actually bothered to look into /usr/share/doc/mdadm?
You do have me there; I haven't looked since 3.1, when it contained just a
few sparse notes. I had wrongly assumed that it hadn't changed as much as it
clearly has. I usually double-check something before I say anything about
it, and this is one of those times when I didn't.
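For anyone else wondering about the "replace a failed drive" question, the usual sequence can be sketched roughly as below. This is only an illustration, not a quote from the mdadm documentation: the device names are made up, and the helper `replace_drive_commands` is hypothetical. It builds the command lines from mdadm's manage-mode options (`--fail`, `--remove`, `--add`) and deliberately does not execute anything.

```python
def replace_drive_commands(array, failed_dev, new_dev):
    """Return the mdadm invocations for swapping out a failed member.

    Nothing is executed here; the caller decides whether to run them
    (as root) or just print them for review.
    """
    return [
        # Mark the member as faulty (md may already have done this itself).
        ["mdadm", "--manage", array, "--fail", failed_dev],
        # Detach the faulty member from the array.
        ["mdadm", "--manage", array, "--remove", failed_dev],
        # Add the replacement; on a degraded array this starts the rebuild.
        ["mdadm", "--manage", array, "--add", new_dev],
    ]


if __name__ == "__main__":
    # Hypothetical device names, for illustration only.
    for argv in replace_drive_commands("/dev/md0", "/dev/sdb1", "/dev/sdc1"):
        print(" ".join(argv))
```

Printing the commands instead of running them keeps the sketch safe to try on a machine that has no array at all.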
there's no 'rebuild drive' it's completely NON automated either.
meaning it always takes user intervention to recover from any
failure.
Not if you're using spares. But even then, yes, to pull a disk out
and insert a new one, you need to shut down the machine, unless you
have hotplugging drives. Same story for hardware RAID.
Ok, given. Spare replacement is fully automated, as one would expect. I
think hotplug (at least on the drive side) is more or less mandatory in
SATA, unless the drives don't support the SATA power connector.
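To see which members md currently treats as spares or as failed, /proc/mdstat marks them with (S) and (F) after the device name. A minimal sketch of reading those markers follows; the sample text is hand-written rather than captured from a real machine, and `member_states` is a hypothetical helper name.

```python
import re

# Hand-written sample in the style of /proc/mdstat (not real output).
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 sdc1[2](S) sdb1[1] sda1[0]
      976630464 blocks [2/2] [UU]

md1 : active raid1 sdb2[1](F) sda2[0]
      104320 blocks [2/1] [U_]

unused devices: <none>
"""

# Matches e.g. "sdb1[1]" or "sdb1[1](F)"; the flag letter is optional.
MEMBER_RE = re.compile(r"(\w+)\[(\d+)\](?:\((\w)\))?")
FLAGS = {"F": "failed", "S": "spare"}


def member_states(mdstat_text):
    """Map array name -> {device: 'active' | 'failed' | 'spare'}.

    Any flag other than F or S is reported as 'active' in this sketch.
    """
    arrays = {}
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+) : (.*)$", line)
        if not header:
            continue
        name, rest = header.groups()
        arrays[name] = {
            dev: FLAGS.get(flag, "active")
            for dev, _slot, flag in MEMBER_RE.findall(rest)
        }
    return arrays


if __name__ == "__main__":
    for array, members in member_states(SAMPLE_MDSTAT).items():
        print(array, members)
```

On a real system the same function could be fed the contents of /proc/mdstat instead of the sample string.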
a single I/O error causes MDRAID to mark the element as failed.
it does not even bother to retry.
And that's a feature. I've seen disk corruption where a block would
return wrong data only in 1/10 reads. On retry, it would work, and
the RAID would hide the problem from me. I'd much rather have
a failed drive
Would a patch be accepted that allowed a controllable retry? The place
where this has caused the most pain is on IDE (PATA) and SATA drives that
will occasionally fault a sector, then read it correctly on the next try.
Granted, this condition needs to be reported, and the MD driver shouldn't
persist in 'banging its head'. Some form of automatic recovery would be
nice, though; for mirrors, something like: read (fault), read (partner),
write (to the faulted area). I know it gets really complicated, especially
trying to avoid deadlocks.
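To make the read (fault), read (partner), write (to faulted area) idea concrete, here is a toy model in Python. It is not MD code and says nothing about how the driver would actually do this; `ToyDisk`, `ReadError` and `mirror_read` are hypothetical names, the "disks" are in-memory lists, and locking and deadlock avoidance are left out entirely.

```python
class ReadError(Exception):
    """Raised by the toy disk when a sector read faults."""


class ToyDisk:
    def __init__(self, blocks, flaky=()):
        self.blocks = list(blocks)
        self.flaky = set(flaky)  # sectors that fault on their next read only
        self.writes = 0

    def read(self, sector):
        if sector in self.flaky:
            self.flaky.discard(sector)
            raise ReadError(sector)
        return self.blocks[sector]

    def write(self, sector, data):
        self.blocks[sector] = data
        self.writes += 1


def mirror_read(primary, partner, sector, log):
    """Read a sector; on a fault, fetch it from the partner and rewrite it."""
    try:
        return primary.read(sector)
    except ReadError:
        log.append(("read-fault", sector))  # report the fault, don't hide it
    data = partner.read(sector)  # a fault here propagates to the caller
    primary.write(sector, data)  # rewrite the faulted area on the primary
    log.append(("rewritten", sector))
    return data


if __name__ == "__main__":
    a = ToyDisk(["a0", "a1"], flaky={1})
    b = ToyDisk(["a0", "a1"])
    events = []
    print(mirror_read(a, b, 1, events), events)
```

The log stands in for the reporting requirement: the fault is recorded every time, so the recovery never silently masks a drive that is going bad.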
MDRAID is also incapable of performing background patrolling
reads, something i think even 3Ware does.
Wrong. It does this only once a month by default (on Debian; the
mdadm sunday), but you could make it do that every hour.
Background patrolling reads run at low priority during periods of low I/O,
often starting a new pass as soon as the previous one finishes. EMC
implements it a bit differently though, doing patrolling reads on
partitions only when they detect an error by default. However the arrays
can be configured to do patrols more often. ICP GDT controllers do
similarly. I *think* a firmware update for ICP ICP model controllers adds
it as well. Some of the older ICP GDTs didn't support patrolling reads.
The operation is obviously the same as a consistency check, just the system
does it more often. It takes the large hardware arrays onsite about a week
to finish a run. It's saved us from undetected failures a few times.
I was, and am, very glad to see the periodic consistency checks though. That
alleviated one of our bigger complaints, which was the lack of patrolling. We
haven't had much experience with how it behaves yet, though. I presume it
uses the same or a similar throttling mechanism to the one rebuilds have used
for a while now. That works really well; most of the machines don't notice
rebuild I/O traffic at all. The few where I have seen it be an issue have
controllers with DMA disabled by default due to issues with the controller
hardware, which is not MD's problem at all.
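For reference, a consistency check can be requested through sysfs by writing "check" to an array's sync_action file, and the resync speed limits live under /proc/sys/dev/raid. The sketch below assumes those standard paths; the function names are hypothetical, the root directories are parameters so the code can be pointed at a scratch directory, and writing to the real files needs root.

```python
from pathlib import Path


def start_check(md_name, sysfs_root="/sys"):
    """Request a consistency check by writing 'check' to sync_action."""
    path = Path(sysfs_root) / "block" / md_name / "md" / "sync_action"
    path.write_text("check\n")


def mismatch_count(md_name, sysfs_root="/sys"):
    """Return the mismatch counter md reports for the array."""
    path = Path(sysfs_root) / "block" / md_name / "md" / "mismatch_cnt"
    return int(path.read_text())


def speed_limits(proc_root="/proc"):
    """Return the resync speed limits (KiB/s per device) as a dict."""
    base = Path(proc_root) / "sys" / "dev" / "raid"
    return {
        name: int((base / name).read_text())
        for name in ("speed_limit_min", "speed_limit_max")
    }


if __name__ == "__main__":
    # Reading the limits is harmless; starting a check is left to the reader.
    print(speed_limits())
```

Running the check every hour rather than monthly would then just be a matter of calling `start_check` from a more frequent cron job.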
MDRAID RAID5 sets are non-bootable.
grub2 can boot them.
Is anyone shipping it yet though? Etch still has 0.97.
In many years of software RAID management and in two years as mdadm
maintainer, I have never heard of a single case where md failed to
correctly identify a failed drive.
I should have been clearer: my problem is that MD gets overzealous about
marking drives as failed. As you've noted, that is intentional. I knew
that, but I still don't necessarily agree with it, and I clearly did a poor
job of acknowledging it.
<...>
I in no way meant to disrespect you or your work. I only meant to raise
awareness of some of the issues one can experience with software-based
RAID. MD has improved and continues to improve. You have your camp, I have
mine. We disagree on what's best. In some situations software RAID is best,
in some hardware RAID is best. It's up to each op to determine 'best' for
themselves; in the end I feel that's what Free and Open Source Software and
Open Source OSes are about.