
Re: Problem Replacing LVM on RAID1 Disk


Four failures seems really high to me, too. This might be a silly
question, but have you checked or replaced the controller and cables yet?

I had a machine with four disks, and every few weeks one of them, picked
seemingly at random, was reported as bad (all of them were connected to
the motherboard). I ruled out the cables and the HDDs, so I put a PCI
controller card in the machine, and since then I have had no errors. (Of
course, the best fix would have been to replace the motherboard, but that
was not an option at that time.)

Are you sure your disks are actually bad? Have you run badblocks on them
("badblocks -vws" for the destructive read-write test; check the man page
before running it)?
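For what it's worth, this is roughly how I test a suspect drive. The
device name below is only an example, and the -w variant overwrites the
entire disk, so only run it on a drive you have already pulled from the
array:

```shell
# Non-destructive read-only scan first (safe on a disk still in use):
badblocks -vs /dev/sdb

# Destructive read-write test: writes test patterns over the WHOLE
# disk, so only use it on a drive with no data you care about:
badblocks -vws /dev/sdb
```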


On Sat, 17 Jul 2010 15:19:30 -0500, Stan Hoeppner <stan@hardwarefreak.com>
wrote:
> Matthew Glubb put forth on 7/17/2010 3:11 AM:
>> Normally in the past when a disk has failed, I have dropped the
>> offending disk from the array, replaced the disk, booted, rebuilt the
>> filesystem on the new disk and re-synced the array. I've done this
>> about four times with this method.
> Once you fix your immediate problem you really need to address the
> larger issue, which is: why are you suffering so many disk failures,
> apparently on a single host? The probability of one OP/host suffering 4
> disk failures, even over a period such as 10 years, is astronomically
> low. If you manage a server farm of a few dozen or more hosts and had
> one disk failure on each of four of them, the odds are a bit higher.
> However, in your case we're not talking about a farm situation, are we?
>
> Are these disks really failing, or is the software RAID flagging disks
> that aren't really going bad? What make/model disk drives are these
> that are apparently failing? Do you have sufficient airflow in the case
> to cool the drives? Is the host in an environment with a constant
> temperature over 80 degrees Fahrenheit?
> -- 
> Stan
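P.S. For reference, the replacement cycle Matthew describes is usually
done with mdadm along these lines. The array and partition names are
examples only; substitute your own, and double-check which disk is which
before copying partition tables:

```shell
# Mark the failing member faulty and remove it from the RAID1 array:
mdadm /dev/md0 --fail /dev/sdb1
mdadm /dev/md0 --remove /dev/sdb1

# Power down, swap the physical disk, and boot. Then copy the partition
# table from the surviving disk to the new one (sfdisk, MBR layouts):
sfdisk -d /dev/sda | sfdisk /dev/sdb

# Add the new partition back; md re-syncs the mirror automatically:
mdadm /dev/md0 --add /dev/sdb1

# Watch the rebuild progress:
cat /proc/mdstat
```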
