
Hardware/Software RAID (nearly a religious war...)

I apologize to the list as I didn't mean to hijack the thread.

--On August 30, 2007 7:56:52 AM +0200 martin f krafft <madduck@debian.org> wrote:

MDRAID is also very difficult to administer, offering only
(depending on your version) mdadm or raid* tools.  mdadm is rather
arcane.  simple operations are not well documented, like, how do
i replace a failed drive?  or start a rebuild?

Have you actually bothered to look into /usr/share/doc/mdadm?

You do have me there: not since 3.1, when it contained just a few sparse notes. I had wrongly assumed it hadn't changed as much as it clearly has. I normally double-check something before I say anything about it, and this is one of the times I didn't.

there's no 'rebuild drive' it's completely NON automated either.
meaning it always takes user intervention to recover from any

Not if you're using spares. But even then, yes, to pull a disk out
and insert a new one, you need to shut down the machine, unless you
have hotplugging drives. Same story for hardware RAID.

OK, granted. Spare replacement is fully automated, as one would expect. I think hotplug (at least on the drive side) is more or less mandatory in SATA, unless the drives don't support the SATA power connector.
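
For the archives, since I asked how to replace a failed drive by hand: as far as I can tell it comes down to three mdadm manage-mode calls (--fail, --remove, --add), and the rebuild starts by itself once the new member is added. A minimal Python sketch that only builds and prints the commands. The device names are placeholders and the two wrapper functions are my own, not part of mdadm:

```python
import subprocess


def replace_member_commands(array, failed, replacement):
    """Return the mdadm invocations, in order, for swapping one array member."""
    return [
        ["mdadm", array, "--fail", failed],      # mark the old member faulty
        ["mdadm", array, "--remove", failed],    # detach it from the array
        ["mdadm", array, "--add", replacement],  # add the new member; md rebuilds onto it
    ]


def run(commands, dry_run=True):
    """Print the commands, or execute them when dry_run is False (needs root)."""
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Placeholder devices; substitute your own array and partitions.
    run(replace_member_commands("/dev/md0", "/dev/sdb1", "/dev/sdc1"))
```

With dry_run left at its default nothing touches the array, so it is safe to run just to see the sequence.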

a single I/O error causes MDRAID to mark the element as failed.
it does not even bother to retry.

And that's a feature. I've seen disk corruption where a block would
return wrong data only in 1/10 reads. On retry, it would work, and
the RAID would hide the problem from me. I'd much rather have
a failed drive

Would a patch that allowed a configurable retry be accepted? Where this has caused the most pain is on IDE (PATA) and SATA drives that occasionally fault a sector, then read it correctly on the next try. Granted, this condition needs to be reported, and the MD driver shouldn't keep 'banging its head'. Some form of automatic recovery would be nice. In the case of mirrors, for example: read (fault), read (partner), write (to the faulted area). I know it gets really complicated, especially trying to avoid deadlocks.
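
To make the mirror case concrete, here is a toy Python model of the logic I have in mind. It is only a sketch of the idea, not how the MD driver is written: FlakyDisk, mirror_read and max_retries are hypothetical names, the "disks" are in-memory lists, and I assume that rewriting a sector clears its fault. Every failed read is logged, so the problem is reported rather than hidden:

```python
class FlakyDisk:
    """Toy in-memory disk; `faults` maps a sector to how many reads will fail."""

    def __init__(self, data, faults=None):
        self.data = list(data)
        self.faults = dict(faults or {})
        self.errors = 0  # running count of read errors, for reporting

    def read(self, sector):
        if self.faults.get(sector, 0) > 0:
            self.faults[sector] -= 1
            self.errors += 1
            raise IOError(f"read error at sector {sector}")
        return self.data[sector]

    def write(self, sector, value):
        self.data[sector] = value
        # Assumption: a rewrite clears the fault (e.g. the drive remaps the sector).
        self.faults.pop(sector, None)


def mirror_read(primary, partner, sector, max_retries=1, log=None):
    """Read with a bounded retry, then fall back to the partner and repair."""
    log = log if log is not None else []
    for attempt in range(1 + max_retries):
        try:
            return primary.read(sector)
        except IOError as exc:
            log.append(f"attempt {attempt + 1}: {exc}")  # always report the fault
    # Retries exhausted: read (partner), then write (to the faulted area).
    # If the partner read fails too, the IOError propagates to the caller.
    value = partner.read(sector)
    primary.write(sector, value)
    log.append(f"sector {sector} rewritten from partner")
    return value
```

Setting max_retries=0 gives the no-retry read with only the partner fallback and repair. The real thing would also have to deal with locking and in-flight writes, which is where the deadlock worries come in.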

MDRAID is also incapable of performing background patrolling
reads, something i think even 3Ware does.

Wrong. It does this only once a month by default (on Debian; the
mdadm sunday), but you could make it do that every hour.

Background patrolling reads are executed at low priority during periods of low I/O, often starting a new pass as soon as the previous one finishes. EMC implements it a bit differently, though: by default it does patrolling reads on partitions only when an error is detected. The arrays can be configured to patrol more often, however. ICP GDT controllers behave similarly. I *think* a firmware update adds it to ICP's ICP-model controllers as well. Some of the older ICP GDTs didn't support patrolling reads. The operation is obviously the same as a consistency check; the system just does it more often. It takes the large hardware arrays onsite about a week to finish a run. It has saved us from undetected failures a few times.

I was, and am, very glad to see the periodic consistency checks, though. They address one of our bigger complaints, the lack of patrolling, although we haven't had much experience with how they behave yet. I presume they use the same or a similar throttling mechanism to the one rebuilds have used for a while now. That works really well: most of the machines don't notice rebuild I/O traffic at all. The few where I have seen it be an issue have controllers with DMA disabled by default because of problems with the controller hardware, which is not MD's problem at all.
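
For anyone who wants checks more often than the monthly default: as I understand it, a check is requested by writing "check" to the array's sync_action attribute in sysfs, and the rebuild throttle lives in speed_limit_min and speed_limit_max under /proc/sys/dev/raid. A minimal Python sketch under those assumptions. The helper names and the overridable root paths are my own, added so it can be tried against a scratch directory instead of a live array:

```python
from pathlib import Path


def request_check(md_name, sysfs_root="/sys"):
    """Ask md to start a read-only consistency check of one array (needs root)."""
    action = Path(sysfs_root) / "block" / md_name / "md" / "sync_action"
    action.write_text("check\n")
    return action


def read_speed_limits(proc_root="/proc"):
    """Return the resync throttle limits, in KiB/s per device."""
    raid = Path(proc_root) / "sys" / "dev" / "raid"
    return {
        name: int((raid / name).read_text())
        for name in ("speed_limit_min", "speed_limit_max")
    }
```

Calling request_check("md0") from cron every hour would be the "every hour" case mentioned above; whether that is sensible depends on how long a pass takes on your arrays.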

MDRAID RAID5 sets are non-bootable.

grub2 can boot them.

Is anyone shipping it yet though?  Etch still has 0.97.

In many years of software RAID management and in two years as mdadm
maintainer, I have never heard of a single case where md failed to
correctly identify a failed drive.

I should have been clearer: my problem is that MD gets over-zealous about marking drives as failed, which, as you've noted, is intentional. (I knew that, but still don't necessarily agree with it, and clearly did a poor job of acknowledging it.)


I in no way meant to disrespect you or your work. I only meant to raise awareness of some of the issues one can experience with software-based RAID. MD has improved and continues to improve. You have your camp, I have mine, and we disagree on what's best. In some situations software RAID is best; in others hardware RAID is. It's up to each op to determine 'best' for themselves. In the end, I feel that's what Free and Open Source Software and open source OSes are about.
