Re: Multiple hardware and RAID failures
Sounds like the new setup is drawing more power than the PS can supply,
especially if removing a disk makes it work again. I had a similar
problem with a system with lots of disks. I put in a beefier PS and
all was good again.
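
For a rough idea of what such a power budget check looks like, here is
a minimal sketch in Python. Every wattage below is a placeholder
assumption for illustration, not a measured value for the hardware in
this thread, and the names total_draw and has_headroom are made up for
this example.

```python
# Rough power-budget check: does the estimated peak draw fit within the
# supply's rating, leaving a safety margin?
# All wattages are illustrative placeholders, not measurements.

def total_draw(components):
    """Sum the estimated peak draw (in watts) of all components."""
    return sum(components.values())

def has_headroom(components, psu_watts, margin=0.2):
    """Return True if the draw stays within the PSU rating minus a margin."""
    return total_draw(components) <= psu_watts * (1.0 - margin)

# Hypothetical component estimates (watts).
example = {
    "cpu": 90,
    "motherboard": 40,
    "sata_disk_1": 25,
    "sata_disk_2": 25,
    "ide_disk": 25,
}

print(total_draw(example))
print(has_headroom(example, psu_watts=250))
print(has_headroom(example, psu_watts=400))
```

With these made-up numbers, dropping one disk brings the total back
under the margin for the smaller supply, which matches the "works again
with a disk removed" symptom. A real check would also have to look at
the per-rail limits printed on the PS label, since disks pull most of
their current from the 12 V rail while spinning up.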
Frank Hart wrote:
My Athlon 3400MHz server was running flawlessly until a couple of weeks
ago. The system is fitted with 2 SATA 200GB Maxtor disks in a RAID 1
configuration with mdadm and a separate IDE disk. The motherboard is an
Asus MB K8V-MX AMD S754.
All of a sudden the system froze. I did a hard reset and after an hour
it stopped again. The logs reported DMA errors on the separate IDE
disk and RAID failures. Now, the server also refused to boot. After I
disconnected one SATA disk it would start again. Thinking it was a disk
failure, I replaced the disk the following day only to find out that
after installing the new disk, the system wouldn't boot at all. After
disconnecting all cables (except to power :P) and removing the memory
the system beeped a couple of times. I replaced the motherboard with a
new K8V-MX and all seemed fine again. But one day later the other SATA
disk got thrown out of the mirror and the system again didn't want to
boot. I replaced the SATA disk and the IDE disk with brand new ones.
Now, I was pretty convinced all this horror happened because of some
power surge, so after replacing the disks I installed an APC with power
overload protection. But after just a couple of days the SATA disk I
replaced first started to give errors again. Mdadm reports errors on all
mirror sets and after an upgrade to kernel 2.6.15 I can't get the disk
out of faulty status.
Can someone enlighten me as to what the hell is going on here? I replaced all
components except the power supply. Could this be the problem? The only
one who is benefitting from all this is the local PC store ;)