Re: Why didn't software RAID detect a faulty drive?
Seth Mattinen wrote:
The system was horribly unresponsive; I never did try adding the drive
back in because it was a live server and I didn't want to risk it. I
would have expected any RAID to fault an unresponsive drive even if it
was a quirk. I just replaced it.
Two things I learned recently, the hard way, when I had a RAID drive fail:
1. Drives can fail in ways that stay masked for a long time. In
particular, an increasing number of disk reads or writes may eventually
succeed, but only after lots of retries. The symptom is that things slow
down to a crawl. I'm not sure why the md software doesn't simply fail
drives that exhibit long delays, but it doesn't seem to (ideas, anyone?).
2. If all of your drives are the same age, it would be a very good idea
to replace the OTHER drives in your RAID array before they start
failing. In my case, I had a server with four drives (two RAID1 sets).
As I was recovering from one drive failure, two of the others failed in
rapid succession. Not very pretty at all.
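
Since md doesn't seem to flag a slow-but-still-answering drive on its own
(point 1), one way to spot it from userspace is to compare the per-device
counters in /proc/diskstats over an interval. What follows is only a rough
sketch, assuming a Linux box; the function names, the member list, and the
100 ms threshold are made up for illustration and are not anything md or
mdadm provides.

```python
# Sketch: flag array members whose average time per completed I/O is high.
# Assumes the Linux /proc/diskstats layout: major, minor, name, then
# reads completed, reads merged, sectors read, ms reading,
# writes completed, writes merged, sectors written, ms writing, ...

def parse_diskstats(text):
    """Map device name -> (completed I/Os, ms spent on those I/Os)."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 11:
            continue  # skip blank or truncated lines
        name = fields[2]
        ios = int(fields[3]) + int(fields[7])   # reads + writes completed
        ms = int(fields[6]) + int(fields[10])   # ms reading + ms writing
        stats[name] = (ios, ms)
    return stats


def slow_devices(before, after, members, threshold_ms=100.0):
    """Return (name, avg ms per I/O) for members above threshold_ms.

    threshold_ms is an arbitrary illustrative cutoff, not a recommended value.
    """
    slow = []
    for name in members:
        if name not in before or name not in after:
            continue
        ios = after[name][0] - before[name][0]
        ms = after[name][1] - before[name][1]
        if ios > 0 and ms / ios > threshold_ms:
            slow.append((name, ms / ios))
    return slow
```

To try it, read /proc/diskstats twice, some seconds apart, run each snapshot
through parse_diskstats(), and pass both results to slow_devices() along with
the names of your array members (for example "sda" and "sdb"). The times the
kernel reports include time spent queued, so a busy but healthy drive can
look slow too; a member that sits far above its mirror partner under the same
load is the one to be suspicious of.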
Miles Fidelman
--
In theory, there is no difference between theory and practice.
In practice, there is. .... Yogi Berra