Re: Multiple hardware and RAID failures
On Tue, Mar 14, 2006 at 08:54:58AM -0500, Lennart Sorensen wrote:
> On Tue, Mar 14, 2006 at 09:43:55AM +0100, Frank Hart wrote:
> > My Athlon 3400MHz server was running flawlessly until a couple of weeks
> > ago. The system is fitted with 2 SATA 200GB Maxtor disks in a RAID 1
> > configuration with mdadm and a separate IDE disk. The motherboard is an
> > Asus MB K8V-MX AMD S754.
> > All of a sudden the system froze. I did a hard reset and after an hour
> > it stopped again. The logs reported DMA errors on the separate IDE
> > disk and RAID failures. Now, the server also refused to boot. After I
> > disconnected one SATA disk it would start again. Thinking it was a disk
> > failure, I replaced the disk the following day only to find out that
> > after installing the new disk, the system wouldn't boot at all. After
> > disconnecting all cables (except to power :P) and removing the memory
> > the system beeped a couple of times. I replaced the motherboard with a
> > new K8V-MX and all seemed fine again. But one day later the other SATA
> > disk got thrown out of the mirror and the system again didn't want to
> > boot. I replaced the SATA disk and the IDE disk with brand new ones.
> > Now, I was pretty convinced all this horror happened because of some
> > power surge so after replacing the disks I installed an APC with power
> > overload protection. But after just a couple of days the SATA disk I
> > replaced first started to give errors again. Mdadm reports errors on all
> > mirror sets and after an upgrade to kernel 2.6.15 I can't get the disk
> > out of faulty status.
> > Can someone enlighten me what the hell is going on here? I replaced all
> > components except the power supply. Could this be the problem? The only
> > one who is benefitting from all this is the local PC store ;)
> Perhaps your power supply is defective and not providing steady power,
> or isn't happy with the load that is on it. I have seen quite a few
> systems that were unstable and had disk issues and crashed, where the
> problem disappeared once a quality power supply was installed.
True enough. If you had a power spike, anything connected to power
could now be suspect, from the power supply on down.
I didn't see where you mentioned checking your disk cables. I've had more
than a couple of cables of various sorts suddenly go bad for no reason.
Good luck with it.