Re: Bad Blocks in IDE software Raid 1

To: "I. Forbes" <iforbes@zsd.co.za>, debian-isp@lists.debian.org
Subject: Re: Bad Blocks in IDE software Raid 1
From: Russell Coker <russell@coker.com.au>
Date: Tue, 15 Apr 2003 20:21:14 +1000
Message-id: <200304152021.14584.russell@coker.com.au>
Reply-to: Russell Coker <russell@coker.com.au>
In-reply-to: <3E9BF0C4.20070.960FDF@localhost>
References: <3E9BF0C4.20070.960FDF@localhost>

On Tue, 15 Apr 2003 19:45, I. Forbes wrote:
> As far as I know, with modern IDE drives the formatted drive includes
> spare blocks and the drive firmware will automatically re-map the drive
> to replace bad blocks with ones from the spare space. This all
> happens transparently without any feedback to the system log files.

True.  The drive does that wherever possible.

If a write fails then the drive will re-map the sector and store the data in
the spare space.  I don't know how many (if any) drives do "read after write"
verification.  If they don't then it's likely that an error will only be
discovered some time later when you want to read the data (and this can
happen even if the write was verified, as a sector can go bad after it was
written).

Then the drive will return a read error.  If you then write to the bad block 
the drive will usually perform a re-mapping and after that things will be 
fine.
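
The sequence above (read error first, re-mapping on the next write) can be
sketched as a toy model.  This is only an illustration and not how any
particular firmware works; the ToyDrive class, its method names and the spare
pool size are all made up for the example.

```python
class ToyDrive:
    """Toy model of firmware sector re-mapping (hypothetical, not real firmware)."""

    def __init__(self, spare_sectors=1000):
        self.spare_sectors = spare_sectors  # assumed size of the spare pool
        self.pending = set()    # sectors that currently fail when read
        self.remapped = set()   # sectors that now live in the spare space
        self.data = {}

    def mark_bad(self, lba):
        """Simulate a sector going bad some time after it was written."""
        self.pending.add(lba)

    def read(self, lba):
        # A bad sector is only noticed when something tries to read it.
        if lba in self.pending:
            raise IOError(f"read error at sector {lba}")
        return self.data.get(lba, b"")

    def write(self, lba, payload):
        # Writing to a known-bad sector is what triggers the re-mapping.
        if lba in self.pending:
            if len(self.remapped) >= self.spare_sectors:
                raise IOError(f"write error at sector {lba}: no spare sectors left")
            self.pending.discard(lba)
            self.remapped.add(lba)
        self.data[lba] = payload
```

In this model a read of a bad sector keeps failing until something writes to
it, and writes only start failing once the spare pool is used up.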

If using software RAID then a raidhotadd operation will usually trigger a 
re-mapping on the sector that caused the disk in question to be removed from 
the array, because the resync writes to every sector of the re-added disk.
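
To find out which disk was removed from the array before adding it back, you
can look for members flagged (F) in /proc/mdstat.  A minimal sketch, assuming
the usual "md0 : active raid1 hdc1[1](F) hda1[0]" line layout; the function
name failed_members is made up for the example.

```python
import re


def failed_members(mdstat_text):
    """Return {array: [failed devices]} from /proc/mdstat-style text."""
    failed = {}
    for line in mdstat_text.splitlines():
        # Array lines look like "md0 : active raid1 hdc1[1](F) hda1[0]".
        m = re.match(r"^(md\d+)\s*:\s*(.*)$", line)
        if not m:
            continue
        array, rest = m.groups()
        # A member marked faulty carries an "(F)" suffix after its slot number.
        bad = re.findall(r"(\S+?)\[\d+\]\(F\)", rest)
        if bad:
            failed[array] = bad
    return failed
```

On a live system you would pass it open("/proc/mdstat").read().  An array with
no failed members simply does not appear in the result.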

> This would imply that bad blocks on one drive in an array are mapped
> out by the firmware, until a point is reached where there are no spare
> blocks on that drive. Further bad blocks would result in disk errors and
> the drive would be "failed" out of the array.

That should not happen for a long time.  You can use SMART to determine how 
many re-mapping events have occurred.  Expect to be able to remap at least 
1000 blocks before running out.
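
As a rough sketch of reading that count, the following parses the attribute
table printed by "smartctl -A" from smartmontools and returns the raw value of
attribute 5 (usually named Reallocated_Sector_Ct).  It assumes the standard
ten-column table; the function name and the sample values are made up, and
some drives report this attribute differently or not at all.

```python
# Made-up sample in the layout printed by "smartctl -A"; not real drive data.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       12
"""


def reallocated_sectors(smartctl_output):
    """Return the raw value of SMART attribute 5, or None if it is not listed."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have ten columns; the first is the attribute ID.
        if len(fields) >= 10 and fields[0] == "5":
            return int(fields[9])
    return None
```

Calling reallocated_sectors(SAMPLE) gives 12 here.  For a real drive you would
feed it the output of smartctl -A for that device and watch whether the number
keeps growing.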

> The ext2 file system also handles mapping out of bad blocks. These
> can be detected during the initial formatting of the drive, or during
> subsequent fsck runs.

True, although I've never detected bad blocks during fsck and I don't recall 
the last time I detected them during format (I haven't even done mkfs -c for 
years).

> Can ext2 file systems actively map out bad blocks during normal
> operation?

I don't think so, and I don't think it's desirable with modern IDE and SCSI 
drives.

> Finally, if an ext2 filesystem is mounted on a Linux software raid1
> device, and a file system error occurs, will a portion of that device be
> mapped out as a bad block, or will one of the drives be "failed" out of
> the array?

One of the drives will be removed from the array and the file system drivers 
won't know the difference.

> If ext2 maps out a bad block, I assume the same block on both the
> good and bad drives gets mapped out.

True.

> If one of the drives is "failed" it would explain why the failure rate on
> raid drives seems higher than that in single drive machines. i.e. Raid
> fails the drive, while in a single drive machine ext2 carries on, hiding
> the problem from the end user who is not watching the log files.

It won't be hidden.  It may even result in a kernel panic.  But you are 
correct that there are situations where software RAID will make errors more 
obvious.  This is a good thing IMHO.

-- 
http://www.coker.com.au/selinux/   My NSA Security Enhanced Linux packages
http://www.coker.com.au/bonnie++/  Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/    Postal SMTP/POP benchmark
http://www.coker.com.au/~russell/  My home page
