
Re: RAID1 problem - server freezes on md data-check



On Mon, 04 Jan 2010 20:34:09 +0800, Thomas Goirand <thomas@goirand.fr> wrote:
> Ross Halliday wrote:
> > Aside from any bugs that checkarray
> > function is definitely a pain on a production system.

I have this same problem with the Lenny kernels on certain machines. I
have not yet been able to identify anything specific that the affected
machines have in common. Essentially, on these systems, the monthly raid
check ends in a reboot: the drive subsystem becomes so blocked that the
load goes over 500 and the raid resync never completes. I can wait for
days and it won't finish.
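
To tell a check that is merely slow from one that has stalled, the progress
line in /proc/mdstat can be polled. The following is a rough sketch only,
assuming the usual /proc/mdstat layout (an "mdX : ..." line followed by a
progress line such as "check =  5.5%"). The function name parse_mdstat is
hypothetical, made up for illustration.

```python
import re

# Matches the start of an array stanza, e.g. "md0 : active raid1 ..."
ARRAY_RE = re.compile(r"^(md\S+)\s*:")
# Matches a progress line, e.g. "[=>....]  check =  5.5% (...)"
PROGRESS_RE = re.compile(r"(check|resync|recovery|reshape)\s*=\s*([\d.]+)%")


def parse_mdstat(text):
    """Return {array: (action, percent)} for arrays with a running sync action.

    Arrays that are idle (no progress line) are left out of the result.
    """
    progress = {}
    current = None
    for line in text.splitlines():
        array = ARRAY_RE.match(line)
        if array:
            current = array.group(1)
            continue
        found = PROGRESS_RE.search(line)
        if found and current:
            progress[current] = (found.group(1), float(found.group(2)))
    return progress


if __name__ == "__main__":
    # Reading the real file only works on a Linux host with md loaded.
    with open("/proc/mdstat") as handle:
        print(parse_mdstat(handle.read()))
```

Calling this twice a few minutes apart and comparing the percentages would
show whether the check is still moving at all.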

If I reboot the system and sync the raid arrays before anything starts
to use that particular partition, then everything works fine.

On these systems I disable the monthly raid check. It's not the right
solution, obviously, but it sucks to wake up on Sunday morning to find
multiple outages due to this scheduled raid check.
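
A less drastic option than disabling the check outright might be to cancel a
run once it is clearly stuck. This is a minimal sketch, assuming the standard
md sysfs interface, where writing "idle" to /sys/block/<dev>/md/sync_action
stops a running check. The function name cancel_check and the sysfs_root
parameter are hypothetical, added for illustration, and whether the write
gets through on a machine that is already wedged is not something this
sketch can promise.

```python
from pathlib import Path


def cancel_check(md, sysfs_root="/sys"):
    """Stop a running data-check on the array `md` (e.g. "md0").

    Returns True if a check was running and "idle" was written,
    False if the array was doing something else. Needs root on a
    real system; sysfs_root is only there so it can be tried
    against a scratch directory.
    """
    action_file = Path(sysfs_root) / "block" / md / "md" / "sync_action"
    if action_file.read_text().strip() != "check":
        # Leave resync/recovery alone: those repair the mirror.
        return False
    action_file.write_text("idle\n")
    return True


if __name__ == "__main__":
    import sys

    name = sys.argv[1] if len(sys.argv) > 1 else "md0"
    print("cancelled" if cancel_check(name) else "no check running")
```

The function deliberately refuses to touch anything other than a "check", so
a genuine resync after a drive failure is never interrupted.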

> Well, it's even more a pain to have no monthly check at all, and have
> your drive silently die without a warning. Also, my findings is that
> most of the time, such lock-up happens only on certain kind of
> controllers, or with defective (half working) HDD.

I agree silent drive death is bad, but in a raid mirror setup, if one of
the drives dies, won't you be fine?

I am pretty certain it's not a particular type of controller, because I
have a number of machines with identical hardware; some have this
problem, some do not. The 'half working' HDD was my theory as well, but
SMART tests and badblocks don't seem to turn up anything.

m


