Re: Temporary 'lock-up' under heavy write, MegaRAID RAID-5

On Thu, Nov 10, 2005 at 08:11:17AM +0000, Dave Ewart wrote:
> On Wednesday, 09.11.2005 at 22:07 -0800, Andrew Sharp wrote:
> > This whole thread may be OT for this list, but you guys running 4-way
> > SMP Opterons with three whole drives in a RAID 5: throw away those
> > RAID cards and just use software RAID.
> That's a perfectly sensible suggestion and I'm guessing it would work
> reasonably well.
> However, the system in question is a live system and it's not easy to
> simply change the disk arrangement on the fly: we are unfortunate in
> that the problem of "some apps locking-up during heavy write conditions"
> only manifested itself once the system was deployed and came into heavy
> use.

Ah, a production system.  Drat.

> However, I am planning on flashing the BIOS on the RAID controller
> during our next maintenance window.  If that doesn't fix it, then we can
> perhaps try a backup-n-restore and use software RAID.

Hopefully that will fix it.

> Are you implying that the RAID card (or the megaraid driver) is faulty?
> Would you suggest using a different controller in this case?

Well, no, not necessarily faulty, but some default parameters might not
agree so well with other default parameters: stripe size, disk cache
size, fs type, etc.  Hopefully the RAID card's default stripe size doesn't
disagree with its own cache size.  And all this assumes there isn't
something specific that needs to be done to this card/setup to pull the
cork out.  Have you googled?
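
To make the "defaults not agreeing" point a bit more concrete, here is a
rough sketch of one such check, assuming the filesystem is ext2/ext3 (I
don't know that it is).  mke2fs accepts stride and stripe_width extended
options so the filesystem can line up with the array geometry.  The 64 KiB
per-disk chunk and 4 KiB block size below are made-up illustration values,
not what your MegaRAID actually uses, and ext_stride_params is just a
hypothetical helper name.

```python
def ext_stride_params(chunk_kib, block_bytes, total_disks, parity_disks=1):
    """Return (stride, stripe_width) in filesystem blocks.

    stride       = blocks written to one disk before moving to the next
    stripe_width = blocks in one full stripe across the data disks
    """
    chunk_bytes = chunk_kib * 1024
    if chunk_bytes % block_bytes != 0:
        raise ValueError("chunk size must be a multiple of the block size")

    # RAID 5 spends one disk's worth of capacity per stripe on parity.
    data_disks = total_disks - parity_disks
    if data_disks < 1:
        raise ValueError("need at least one data disk")

    stride = chunk_bytes // block_bytes
    stripe_width = stride * data_disks
    return stride, stripe_width


# Assumed values: 3-disk RAID 5, 64 KiB chunk per disk, 4 KiB fs blocks.
stride, width = ext_stride_params(chunk_kib=64, block_bytes=4096, total_disks=3)
print(f"-E stride={stride},stripe_width={width}")
```

If the numbers the filesystem was created with don't match what the
controller reports, that is one candidate for the mismatch, though by no
means the only one.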


