Fwd: hard drive failure under RAID-1

To: debian-isp@lists.debian.org
Subject: Fwd: hard drive failure under RAID-1
From: Russell Coker <russell@coker.com.au>
Date: Tue, 25 Sep 2001 12:36:04 +0200
Message-id: <20010925115656.69AD0355091@lyta.coker.com.au>
Reply-to: Russell Coker <russell@coker.com.au>

The following is something to consider when setting up RAID arrays.  At the
moment AFAIK every RAID solution suffers from this problem.  :(


I have a Linux software RAID-1 array consisting of two IBM IDE hard drives.
The latest kernel works the same way as the 2.4.2 kernel I am using on that
machine.

I have just had them both fail at the same time!  Both had quite a number
of bad sectors, but no sector was bad on both disks!

The result I would have liked to see is that when a bad sector is
encountered during a read from disk 0, the same sector is read from disk 1.
If the data can be read from disk 1 then it should be written back to disk 0.
If disk 0 can be read after that (the likely result, given sector sparing in
hardware) then the kernel should log lots of loud printk() errors and keep
running.
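
As a rough illustration of the read path wished for above, here is a minimal
user-space sketch in Python.  It is not how the Linux md driver works; the
FakeDisk class, the mirrored_read() function and the way a write "heals" a
bad sector (standing in for hardware sector sparing) are all assumptions made
up for the example.  What to do when the re-read still fails is not spelled
out above, so the sketch simply reports the disk as unhealthy.

```python
import logging

log = logging.getLogger("raid1-sketch")


class FakeDisk:
    """In-memory stand-in for a drive (hypothetical, for illustration only)."""

    def __init__(self, sectors, bad=()):
        self.sectors = list(sectors)
        self.bad = set(bad)  # sector numbers that fail on read

    def read(self, n):
        if n in self.bad:
            raise OSError(f"bad sector {n}")
        return self.sectors[n]

    def write(self, n, data):
        # Assumption: writing lets the drive remap the sector (sector sparing),
        # so the sector becomes readable again.
        self.sectors[n] = data
        self.bad.discard(n)


def mirrored_read(primary, secondary, n):
    """Read sector n, repairing the primary from the mirror if needed.

    Returns (data, primary_healthy).  Raises OSError only if the sector is
    unreadable on both disks.
    """
    try:
        return primary.read(n), True
    except OSError:
        pass

    data = secondary.read(n)   # propagates OSError if both copies are bad
    primary.write(n, data)     # give the drive a chance to remap the sector
    try:
        primary.read(n)
    except OSError:
        # Assumption: only now would the disk be treated as failed.
        log.error("sector %d still unreadable after rewrite", n)
        return data, False

    log.error("sector %d was bad, rewritten from mirror; REPLACE THIS DISK", n)
    return data, True
```

With one disk bad at sector 1 and the other bad at sector 2, every sector
stays readable through mirrored_read() and neither disk has to be dropped
from the array.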

The result I saw was that disk 0 was marked as failed, then when a different
sector failed on disk 1 the ext2 file system saw errors, the system stopped
functioning correctly and needed a hard reset.  Then it panicked on boot
because it couldn't add either disk to the RAID-1.  Since then I have been
trying to recover it.  I wrote a program to read both disks and take data
from disk 1, but take it from disk 0 when disk 1 returned a bad sector.  But
this didn't work well because disk 1 had run for some time without disk 0.
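
The recovery program is not shown here, but the approach described above can
be sketched along these lines in Python.  The salvage() function, the reader
callables and the 512-byte sector size are assumptions for illustration, not
the actual program.  The sketch also records which sectors came from the
fallback disk, since those are exactly the ones that may be stale when one
disk has kept running without the other.

```python
SECTOR_SIZE = 512  # assumed sector size for the sketch


def salvage(read_preferred, read_fallback, sector_count, out):
    """Build one image from two partly unreadable mirrors.

    read_preferred and read_fallback are callables taking a sector number and
    returning its bytes, or raising OSError on a bad sector.  out is any
    object with a write() method.

    Returns (from_fallback, unrecovered): sector numbers taken from the
    fallback disk (possibly stale) and sector numbers lost on both disks.
    """
    from_fallback = []
    unrecovered = []
    for n in range(sector_count):
        try:
            data = read_preferred(n)
        except OSError:
            try:
                data = read_fallback(n)
                from_fallback.append(n)
            except OSError:
                # Bad on both disks: pad with zeros so offsets stay aligned.
                data = b"\0" * SECTOR_SIZE
                unrecovered.append(n)
        out.write(data)
    return from_fallback, unrecovered
```

Any sector listed in from_fallback predates whatever was written after that
disk was dropped from the array, so a file system check on the merged image
is still needed.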

In summary a situation which could have been salvaged by an emergency visit
to a computer store turned into a catastrophe.  :(

-- 
http://www.coker.com.au/bonnie++/     Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/       Postal SMTP/POP benchmark
http://www.coker.com.au/projects.html Projects I am working on
http://www.coker.com.au/~russell/     My home page
