
Re: [OT] 19"/2U Cases



My only purpose in bothering to reply to this message is to set
straight all the things the original poster, Michael, misunderstood
or falsely claimed. I am not interested in a discussion unless
arguments about how md sucks are actually backed up.

also sprach Michael Loftis <mloftis@modwest.com> [2007.08.29.2245 +0200]:
> MDRAID certainly isn't reliable in a huge number of failure cases.
> It causes the machine to OOPS/lock up.  Or even lose data.

I've never seen a bug report from you.

> MDRAID is also very difficult to administer, offering only
> (depending on your version) mdadm or raid* tools.  mdadm is rather
> arcane.  simple operations are not well documented, like, how do
> i replace a failed drive?  or start a rebuild?  

Have you actually bothered to look into /usr/share/doc/mdadm?
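
For the record, replacing a failed drive and starting the rebuild is
a handful of mdadm calls; the device names below are examples only,
substitute your own:

  # mark the broken member as failed and pull it out of the array
  mdadm /dev/md0 --fail /dev/sdb1
  mdadm /dev/md0 --remove /dev/sdb1

  # after swapping and partitioning the new disk, add it back;
  # md starts the rebuild on its own
  mdadm /dev/md0 --add /dev/sdb1

  # watch the resync progress
  cat /proc/mdstat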

> there's no 'rebuild drive' it's completely NON automated either.
> meaning it always takes user intervention to recover from any
> failure.

Not if you're using spares. But even then, yes: to pull a disk out
and insert a new one, you need to shut down the machine unless you
have hot-pluggable drives. The same is true for hardware RAID.
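
Setting up a spare, by the way, is a one-liner (again, example
device names):

  # a device added to a healthy array becomes a spare; when a member
  # fails later, md rebuilds onto it without any intervention
  mdadm /dev/md0 --add /dev/sdc1

  # the spare shows up marked "(S)" in /proc/mdstat
  cat /proc/mdstat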

> a single I/O error causes MDRAID to mark the element as failed.
> it does not even bother to retry.

And that's a feature. I've seen disk corruption where a block would
return wrong data in only one out of ten reads. On a retry it would
work, and the RAID would hide the problem from me. I'd much rather
have a failed drive than silently corrupted data.

> MDRAID is also incapable of performing background patrolling 
> reads, something i think even 3Ware does. 

Wrong. md does patrol reads just fine. On Debian, the checkarray
cron job runs one by default on the first Sunday of every month, but
you could make it run every hour if you wanted.
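
If the Debian schedule doesn't suit you, something along these lines
triggers a check by hand (md0 is an example; the checkarray path is
the one the Debian package ships, other distributions may differ):

  # kick off a background patrol read ("check") of one array
  echo check > /sys/block/md0/md/sync_action

  # or use the helper from the Debian mdadm package for all arrays
  /usr/share/mdadm/checkarray --all

  # mismatches found during the check are counted here
  cat /sys/block/md0/md/mismatch_cnt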

> MDRAID RAID5 sets are non-bootable.

grub2 can boot them.

> I can never recommend any software RAID for anything other than
> simple mirrors, and then, always, with the caveat that it will be
> a bitch to fix if things go wrong, you probably won't lose data,
> but getting a software raid running again is often arcane,
> especially with MDRAID and its frequent inability to correctly
> identify a failed drive (sometimes the fault of the SATA
> controller mind you).

In many years of software RAID management and in two years as mdadm
maintainer, I have never heard of a single case where md failed to
correctly identify a failed drive.

> And god forbid you lose your boot drive and have forgotten to keep
> all the boot blocks on your spare properly updated.  you also have
> to manually intervene and reorder drives in that case, something
> hardware raid, any hardware raid, will transparently cover.

You can easily boot off RAID1 even with grub1, in such a way that
it's impossible for the boot blocks to get out of sync: /boot sits
on the RAID1, so every member carries an identical copy of it.

also sprach Mike Bird <mgb-debian@yosemite.net> [2007.08.29.2354 +0200]:
> With the current Grub one should grub-install each mirror.

Which is a one-time operation and has been fixed in grub2 to the
best of my knowledge.
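
The one-time operation amounts to something like this, assuming the
two RAID1 members are sda and sdb:

  # put a boot block on each member so the box still boots when
  # either disk dies
  grub-install /dev/sda
  grub-install /dev/sdb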



also sprach Michael Loftis <mloftis@modwest.com> [2007.08.30.0037 +0200]:
> On any hardware raid (at least with a hotswap chassis) you can
> remove, and insert a new drive, live, no intervention, and the
> RAID takes care of starting the rebuild/readding the drive.

You can make software RAID do exactly the same with udev hooks.
It's just that I won't automate that for you, because this is
Debian: we don't want magic things happening behind our backs.
Instead, we want to stay in control of our systems. If you don't
like it, find another distribution; there are plenty out there that
would accept patches to auto-re-add hot-swapped drives to RAIDs.
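
As a rough sketch of such a hook (the rule, the script name and the
hard-coded md0 are made up for illustration; a real version would
want to check which array is actually degraded):

  # /etc/udev/rules.d/90-md-autoadd.rules
  ACTION=="add", SUBSYSTEM=="block", KERNEL=="sd?[0-9]", RUN+="/usr/local/sbin/md-autoadd %k"

  # /usr/local/sbin/md-autoadd would then be roughly:
  #!/bin/sh
  # add the freshly plugged partition to md0; md re-adds it or keeps
  # it as a spare depending on the array's state
  exec mdadm /dev/md0 --add "/dev/$1"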

> Now I might be wrong but Linux AFAIK does not support SATA hotswap
> on most controllers.

Linux does; the el-cheapo controllers don't.
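
And on controllers that don't announce a freshly plugged disk by
themselves, you can usually poke the kernel into rescanning the port
(host0 is just an example):

  # rescan a SCSI/libata host for new devices
  echo "- - -" > /sys/class/scsi_host/host0/scan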

> I've seen it mostly work on SCSI systems (you have to manually
> rescan the scsi bus usually to get the kernel to update its list
> of drives).  But on FibreChannel when a loop has an issue, the
> kernel will tend to mark the loop down, and no amount of coaxing
> short of a reboot will get that loop back into the up state.  Just
> as recently as this week or last week on a  2.6.18 kernel MD RAID
> flipped on a mirror and marked both drives bad, when neither had
> any detectable issue.  This caused the machine to OOPS/panic and
> stop.  Neither drive was faulty.

It's understandable that md will fail in such a case. That the
machine panicked is a Linux issue. The situation sounds akin to
damaging the cables between your beloved hardware RAID controller
and the disks; if the array holds, e.g., swap, the system might
lock up even though none of the disks has failed.

> Many hardware raid controllers support partitioning like this, but
> it's an advanced option found only on higher end cards.  I haven't
> seen any consumer level RAIDs support this, so you have that one
> for sure.  As far as hardware raid cards failing, in the hundreds
> of installations, I've seen it once.

Yeah, and with a couple of thousand people running my mdadm
package, I've never heard of failure cases as arcane as the ones
you claim md produces.

> I'd love to see any documentation on any of this.

Then start compiling it. This is the free software world: you don't
bitch about something not working, you try to rectify the situation.
Unless you don't get it, or like making a fool of yourself. You'll
have my full support.

> installations are of about four major 'flavors'.  RedHat9, FC3,
> Debian 3.0 and 3.1, and Debian 4.0.  And none are immune to the
> issues.  Debian 3 was pretty bad sometimes not making it past the
> initrd when a drive failed. The MD setups were all done using the
> normal TUI (anaconda, debian's system installer) tools during
> installation.

Debian 3.0 didn't let you set up RAID in the normal TUI.

> And grub-installing on both drives isn't as simple as it sounds,
> because it only works right if their geometry matches.  the grub
> installer isn't smart enough to figure out if things don't match.
> typing grub-install to install a boot block on hd1 (sdb, hdb,
> whatever it really is) won't necessarily give you a bootable hd1,
> because if your grub config references hd0 partitions, 

True. grub isn't flawless, and it does require some admin
brainpower to get working. If that annoys you, maybe you want to put
some time into grub2 and make sure it will be better?
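
The usual workaround with grub legacy is to install to the second
disk while mapping it to hd0, so the boot block it gets matches the
configuration it will see once it really is the only disk left
(device and partition names are examples):

  # in the grub legacy shell:
  grub> device (hd0) /dev/sdb
  grub> root (hd0,0)
  grub> setup (hd0)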

> The other issue is that no bios i know of will handle if the boot
> drive fails in some way that doesn't leave it simply not showing
> up.  and most of the time they tend to fail in ways that leave
> them showing up to the bios, but are actually unusable.
> a hardware raid solves this.

I have seen systems that fell over on this, but not recently. My
two RAID test systems here take a bit longer to boot, but after
about 30 seconds of failing to talk to hd0, the BIOS falls back to
hd1. And it's the same for all machines I use in production.

> A related issue to that is the fact that most PC BIOS' have
> a pretty sad serial console support.  This means that failures
> will more often require onsite visits if a reboot (for whatever
> reason) happens after a boot drive failure but before you can get
> a tech on site.

So don't get a PC. If your hardware RAID card has an issue, you'll
have to leave the house too.

> Software RAID has caveats, it's not perfect.  Hardware RAID has
> caveats, it's not perfect.  Having seen far more issues in the
> real world with software RAIDs than with hardware RAIDs puts me
> pretty squarely in the hardware RAID camp.

My main argument against hardware RAID comes from experience: we
had a fileserver running on hardware RAID (Adaptec) for *years*,
until the controller died. We did buy *two* spare controllers, but
neither worked for more than a few months; in the end, all three
were dead. The controller could no longer be bought, and the next
model up couldn't read our RAID, so all data had to be restored from
backup. That puts me "pretty squarely" in the software RAID camp.

> I am not FUDing as you put it,

You are. Maybe your experience has led you to dislike software RAID
and prefer hardware RAID, but the way you communicate it very much
reminds me of FUD, as you're simply making claims without any facts
to back them up.

I won't reply to this thread again unless you actually provide hard
facts and examples. Those would be interesting to me and the other
readers of this list: they would help us (a) learn from your
troubles, (b) improve the products involved, and (c) avoid problems
arising from clearly understandable situations we can map to real
life, instead of arguing over subjective claims.

And maybe consider keeping your mails a bit shorter too.

-- 
 .''`.   martin f. krafft <madduck@debian.org>
: :'  :  proud Debian developer, author, administrator, and user
`. `'`   http://people.debian.org/~madduck - http://debiansystem.info
  `-  Debian - when you have better things to do than fixing systems
 
"a human being should be able to change a diaper, plan an invasion,
 butcher a hog, conn a ship, design a building, write a sonnet,
 balance accounts, build a wall, set a bone, comfort the dying, take
 orders, give orders, cooperate, act alone, solve equations, analyze
 a new problem, pitch manure, program a computer, cook a tasty meal,
 fight efficiently, die gallantly. specialization is for insects."
                                                  -- robert heinlein


