
Re: Testing Spare Drives in Software RAID



On Wed, Aug 22, 2007 at 11:05:17PM -0400, Hal Vaughan wrote:
> I know mdadm offers a "--test" command, but it seems quite useless:
> 
> -t, --test
>               Generate a TestMessage alert for every array found at 
> startup.  This alert gets mailed and passed to the alert program.  This 
> can be  used for testing that alert message to get through 
> successfully.
> 
> That helps with knowing that alerts are working, but I'd like to be able 
> to test the spare drive in an array.  Is there any way to test a spare 
> device in a RAID to make sure it's in good shape and not likely to fail 
> once it's needed?
> 

I don't know of a way to test the actual functioning of a spare drive
while it sits idle in an array.  With no activity on a spare,
S.M.A.R.T. may not have any data to go on.

You could keep an extra spare drive (a spare-spare).  It should be at
least as large as the largest partition used in your arrays.  You could
test this drive off-line with, for example, badblocks, writing and
reading every block of the drive so that S.M.A.R.T. has something to go
by.  Then you could partition this drive as needed to create a temporary
spare partition to put into an array, which lets you pull that array's
regular spare drive out for the same testing.
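
As a rough sketch of that off-line test, the commands could be put
together like this.  The helper name and /dev/sdX are placeholders I
made up, and the badblocks write test destroys everything on the drive,
so this only prints the commands rather than running them:

```python
import shlex


def offline_test_commands(device):
    """Build the commands for a destructive off-line test of `device`.

    `device` is a placeholder such as /dev/sdX; nothing is executed here.
    """
    return [
        # -w: destructive write-mode test (wipes the drive),
        # -s: show progress, -v: verbose
        ["badblocks", "-wsv", device],
        # Start a long S.M.A.R.T. self-test on the drive.
        ["smartctl", "-t", "long", device],
        # Print the S.M.A.R.T. report; only meaningful once the
        # self-test above has had time to finish.
        ["smartctl", "-a", device],
    ]


if __name__ == "__main__":
    for cmd in offline_test_commands("/dev/sdX"):
        print(shlex.join(cmd))
```

Run the printed commands by hand, as root, and only against a drive
that is not part of any array.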

If the array is raid1, you could promote the spare to an active member
so that there are 3 mirror images.  Once it has synced, S.M.A.R.T.
should have some good data.  You can then remove it and turn it back
into a spare.  I don't know how to do this with raid levels other
than 1.
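
A sketch of how the raid1 case might look with mdadm, assuming a
two-way mirror that already has one spare attached.  The function name,
/dev/md0 and /dev/sdc1 are made-up examples, and I have not run this
sequence myself; it just prints the commands:

```python
import shlex


def three_way_mirror_commands(array, spare, mirrors=2):
    """Build mdadm commands that exercise a raid1 spare as a real mirror.

    Assumes `array` is raid1 with `mirrors` active members and `spare`
    already attached as a spare.  Nothing is executed here.
    """
    return [
        # Raise the number of active devices; md should pull in the spare.
        ["mdadm", "--grow", array, f"--raid-devices={mirrors + 1}"],
        # Block until the resync finishes (check /proc/mdstat as well).
        ["mdadm", "--wait", array],
        # Take the tested device back out of its active slot...
        ["mdadm", array, "--fail", spare],
        ["mdadm", array, "--remove", spare],
        # ...shrink the array back to its original mirror count...
        ["mdadm", "--grow", array, f"--raid-devices={mirrors}"],
        # ...and re-add the device, which now comes back as a spare.
        ["mdadm", array, "--add", spare],
    ]


if __name__ == "__main__":
    for cmd in three_way_mirror_commands("/dev/md0", "/dev/sdc1"):
        print(shlex.join(cmd))
```

Check the drive with smartctl after the sync completes and before
taking it back out.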

Another method, which doesn't require a spare-spare but carries some
risk, would be to force-fail one of the active drives so that the spare
comes into play, then add the failed drive back in as the new spare.
The risk is that if another drive in the array actually dies while the
spare is being synced, the array has two failed drives and, unless the
raid level tolerates two failures, the array itself fails.
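
Here is a sketch of the force-fail approach, with a check that the
array is fully healthy before anything is touched.  The function names,
device names and the sample /proc/mdstat text are made up for
illustration, and the parsing assumes the usual "[UU]" status line:

```python
import re
import shlex


def array_is_healthy(mdstat_text, array):
    """Return True if `array` (e.g. "md0") shows no missing members.

    Looks for the "[UU]"-style marker on the lines after the array's
    entry in /proc/mdstat; an underscore marks a missing member.
    """
    lines = mdstat_text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith(array + " :"):
            for follow in lines[i + 1:i + 3]:
                match = re.search(r"\[([U_]+)\]", follow)
                if match:
                    return "_" not in match.group(1)
    return False  # array not found or no status marker: treat as unsafe


def rotate_spare_commands(array_dev, active_member):
    """Build mdadm commands that swap an active member with the spare."""
    return [
        # Mark a good drive as failed; md starts rebuilding onto the spare.
        ["mdadm", array_dev, "--fail", active_member],
        ["mdadm", array_dev, "--remove", active_member],
        # Wait for the rebuild (also watch /proc/mdstat to be sure).
        ["mdadm", "--wait", array_dev],
        # The old member returns as the new spare.
        ["mdadm", array_dev, "--add", active_member],
    ]


if __name__ == "__main__":
    # Illustrative sample, not output from a real machine.
    sample = (
        "md0 : active raid1 sdc1[2](S) sdb1[1] sda1[0]\n"
        "      104320 blocks [2/2] [UU]\n"
    )
    if array_is_healthy(sample, "md0"):
        for cmd in rotate_spare_commands("/dev/md0", "/dev/sda1"):
            print(shlex.join(cmd))
```

The check doesn't remove the risk described above; it only avoids
starting the swap on an array that is already degraded.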

Doug.


