Re: Multiple hardware and RAID failures
On Tue, Mar 14, 2006 at 09:43:55AM +0100, Frank Hart wrote:
> My Athlon 3400MHz server was running flawlessly until a couple of weeks
> ago. The system is fitted with 2 SATA 200GB Maxtor disks in a RAID 1
> configuration with mdadm and a separate IDE disk. The motherboard is an
> Asus MB K8V-MX AMD S754.
> All of a sudden the system froze. I did a hard reset and after an hour
> it stopped again. The logs reported DMA errors on the separate IDE
> disk and RAID failures. Now, the server also refused to boot. After I
> disconnected one SATA disk it would start again. Thinking it was a disk
> failure, I replaced the disk the following day only to find out that
> after installing the new disk, the system wouldn't boot at all. After
> disconnecting all cables (except to power :P) and removing the memory
> the system beeped a couple of times. I replaced the motherboard with a
> new K8V-MX and all seemed fine again. But one day later the other SATA
> disk got thrown out of the mirror and the system again didn't want to
> boot. I replaced the SATA disk and the IDE disk with brand new ones.
> Now, I was pretty convinced all this horror happened because of some
> power surge so after replacing the disks I installed an APC with power
> overload protection. But after just a couple of days the SATA disk I
> replaced first started to give errors again. Mdadm reports errors on all
> mirror sets and after an upgrade to kernel 2.6.15 I can't get the disk
> out of faulty status.
> Can someone enlighten me what the hell is going on here? I replaced all
> components except the power supply. Could this be the problem? The only
> one who is benefitting from all this is the local PC store ;)
Perhaps your power supply is defective and not providing steady power,
or isn't happy with the load that is on it. I have seen quite a few
systems that were unstable and had disk issues and crashed, where the
problem disappeared once a quality power supply was installed.
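
While you swap the power supply, it may help to keep an eye on which
mirrors are actually degraded. Below is a minimal sketch in Python that
scans text in the /proc/mdstat format and reports arrays with a missing
or failed member. The function name degraded_arrays and the SAMPLE text
(including its device names and block counts) are made up for
illustration and are not taken from your system.

```python
import re


def degraded_arrays(mdstat_text):
    """Return {array_name: member_map} for arrays with a missing member."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        # Array header lines look like "md0 : active raid1 sdb1[1] sda1[0]".
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status line ends like "[2/1] [U_]"; "_" marks a missing member.
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if current and status:
            if "_" in status.group(3):
                degraded[current] = status.group(3)
            current = None
    return degraded


# Hypothetical sample in the /proc/mdstat layout, for illustration only.
SAMPLE = """\
Personalities : [raid1]
md1 : active raid1 sdb2[1] sda2[0]
      4883648 blocks [2/2] [UU]

md0 : active raid1 sdb1[1](F) sda1[0]
      190426432 blocks [2/1] [U_]

unused devices: <none>
"""

print(degraded_arrays(SAMPLE))
```

On a live system you would pass in the contents of /proc/mdstat instead
of SAMPLE. If members keep dropping out after the power supply has been
replaced, that would point back towards cabling or the controller.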