Re: RAID1 problem - server freezes on md data-check

To: Thomas Goirand <thomas@goirand.fr>, debian-isp@lists.debian.org
Subject: Re: RAID1 problem - server freezes on md data-check
From: micah anderson <micah@riseup.net>
Date: Tue, 05 Jan 2010 09:59:25 -0500
Message-id: <87fx6k61iq.fsf@lillypad.riseup.net>
In-reply-to: <4B41E041.8080003@goirand.fr>
References: <1262591415.27174.27.camel@gchelidze.magti.ge> <151BC03492E46E4CB8D479E42CEF7890D1F5EE@exchange.wtc.local> <4B41E041.8080003@goirand.fr>

On Mon, 04 Jan 2010 20:34:09 +0800, Thomas Goirand <thomas@goirand.fr> wrote:
> Ross Halliday wrote:
> > Aside from any bugs, that checkarray
> > function is definitely a pain on a production system.

I have this same problem with the Lenny kernels on certain machines. I
have not yet been able to identify anything the affected machines have
in common. Essentially, on these systems, the monthly raid check
requires a reboot, because the drive subsystem becomes so blocked that
the load goes over 500 and the raid resync never completes. I can wait
for days and it won't finish.
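
For reference, a minimal sketch of how a running data-check could be
inspected and cancelled through the md sysfs interface instead of
rebooting. It assumes the array exposes /sys/block/<name>/md/sync_action
and that writing "idle" there stops a check; the function names and the
sysfs_root parameter are hypothetical, added only so the logic can be
tried against a scratch directory. Whether the kernel still honours the
request once I/O is already blocked as described above is not something
this sketch can promise.

```python
from pathlib import Path


def sync_action(md_name, sysfs_root="/sys/block"):
    """Return the current md sync action, e.g. 'idle', 'check' or 'resync'."""
    path = Path(sysfs_root) / md_name / "md" / "sync_action"
    return path.read_text().strip()


def cancel_check(md_name, sysfs_root="/sys/block"):
    """Ask md to stop a running data-check.

    Returns True if a cancel request was written, False if the array was
    not running a check (idle, or doing a real resync/recovery that
    should not be interrupted).
    """
    path = Path(sysfs_root) / md_name / "md" / "sync_action"
    if path.read_text().strip() != "check":
        return False
    path.write_text("idle\n")  # needs root on a real system
    return True
```

On a real machine this would be called as cancel_check("md0"), where md0
stands in for whichever array is being checked.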

If I reboot the system and sync the raid arrays before anything starts
to use that particular partition, then everything works fine.

On these systems I disable the monthly raid check. It's obviously not
the right solution, but it sucks to wake up on Sunday morning to find
multiple outages due to this scheduled raid check.
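
A possible middle ground between running the check at full speed and
disabling it is to cap the md check/resync rate before the check starts.
The sketch below assumes the usual system-wide tunable
/proc/sys/dev/raid/speed_limit_max (in KiB/s per device); the function
name, the proc_root parameter and the example value are hypothetical.
It is untested against the lock-up described here, so it may or may not
keep the load down on the affected machines.

```python
from pathlib import Path


def cap_check_speed(max_kib_per_sec, proc_root="/proc/sys/dev/raid"):
    """Lower the md check/resync speed ceiling and return the old value.

    The returned value can be written back after the check finishes.
    """
    if max_kib_per_sec <= 0:
        raise ValueError("speed limit must be a positive number of KiB/s")
    limit = Path(proc_root) / "speed_limit_max"
    previous = int(limit.read_text().strip())
    limit.write_text(f"{max_kib_per_sec}\n")  # needs root on a real system
    return previous
```

For example, old = cap_check_speed(5000) before the scheduled check and
cap_check_speed(old) afterwards; 5000 is an arbitrary illustration, not
a recommended setting.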

> Well, it's even more a pain to have no monthly check at all, and have
> your drive silently die without a warning. Also, my finding is that
> most of the time, such lock-ups happen only on certain kinds of
> controllers, or with defective (half working) HDDs.

I agree silent drive death is bad, but in a raid mirror setup, if one of
the drives dies, won't you be fine?

I am pretty certain it's not a particular type of controller, because I
have a number of machines with duplicate hardware; some have this
problem, some do not. The 'half working' HDD was my theory as well, but
SMART tests and badblocks don't seem to turn anything up.

m
