
Re: Multiple hardware and RAID failures

Sounds like the new setup is drawing more power than the PS can supply, especially if removing a disk makes it work again. I had a similar problem with a system with lots of disks. I put in a beefier PS and all was good again.
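
A rough way to sanity-check this is to add up the estimated peak draw of each component and compare it with what the PS can realistically deliver. The sketch below is only an illustration: the per-component wattages, the 300 W rating, and the 80% derating factor are placeholder assumptions, not figures from this system. Read the real numbers off the PS label and the drive datasheets.

```python
def power_headroom(psu_watts, loads, derate=0.8):
    """Return spare watts after derating the PSU's rated output.

    derate is an assumed safety factor: older or cheaper supplies often
    cannot sustain their full label rating.
    """
    usable = psu_watts * derate
    return usable - sum(loads.values())


# Hypothetical peak draws in watts; replace with datasheet values.
loads = {
    "cpu": 90,
    "motherboard": 40,
    "sata_disk_1": 25,  # disks draw the most while spinning up
    "sata_disk_2": 25,
    "ide_disk": 25,
    "fans_misc": 15,
}

headroom = power_headroom(300, loads)
print(f"headroom: {headroom:.0f} W")
if headroom < 0:
    print("PS is likely overloaded")
```

If the headroom is small or negative, unplugging one disk may be enough to bring the load back under the limit, which would match the symptom described.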

-Steve W.

Frank Hart wrote:
My Athlon 3400MHz server was running flawlessly until a couple of weeks
ago. The system is fitted with 2 SATA 200GB Maxtor disks in a RAID 1
configuration with mdadm and a separate IDE disk. The motherboard is an
Asus MB K8V-MX AMD S754.

All of a sudden the system froze. I did a hard reset and after an hour
it stopped again. The logs reported DMA errors on the separate IDE
disk and RAID failures. Now, the server also refused to boot. After I
disconnected one SATA disk it would start again. Thinking it was a disk
failure, I replaced the disk the following day only to find out that
after installing the new disk, the system wouldn't boot at all. After
disconnecting all cables (except to power :P) and removing the memory
the system beeped a couple of times. I replaced the motherboard with a
new K8V-MX and all seemed fine again. But one day later the other SATA
disk got thrown out of the mirror and the system again didn't want to
boot. I replaced the SATA disk and the IDE disk with brand new ones.

Now, I was pretty convinced all this horror happened because of some
power surge so after replacing the disks I installed an APC with power
overload protection. But after just a couple of days the SATA disk I
replaced first started to give errors again. Mdadm reports errors on all
mirror sets and after an upgrade to kernel 2.6.15 I can't get the disk
out of faulty status.

Can someone enlighten me as to what the hell is going on here? I replaced all components except the power supply. Could this be the problem? The only one benefiting from all this is the local PC store ;)

