
SATA software RAID recovery problem



Hello,

I'm trying to add SATA software RAID-5 to a system that boots and runs
off an internal IDE HD (the storage space provided by the RAID is
intended for backups and archives). It's a Pentium4 2.4GHz Dell desktop
with 1GB RAM. The SATA adapter is a Promise SATAII-150 TX4. Debian
unstable, kernel 2.6.14-2, mdadm 1.12.

The RAID is created by
# mdadm -C /dev/md0 -l5 -n4 -x0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
and seems to work fine (create ext3 on md0, mount, rsync ~100G of files).

I then simulate a disk failure:
# mdadm /dev/md0 -f /dev/sdb1
# mdadm /dev/md0 -r /dev/sdb1

The disk is properly marked as faulty, then removed, and the array
continues to function well. However, when I try to add a "replacement"
disk (after running --zero-superblock on it), i.e.:
# mdadm /dev/md0 -a /dev/sdb1
the system encounters

ataX: status=0x51 { DriveReady SeekComplete Error }
ataX: error=0x40 { UncorrectableError }

at 2.8% of array reconstruction. The sdX1 corresponding to ataX is
eventually (after a large number of the above errors) marked as faulty
and /dev/md0 is stopped. The error is not on the disk which is being
added. This looks like a hard disk problem, but after switching disks
between SATA channels and using just 3 disks (instead of 4), I still get
the same error on a random disk. Also, even though I use the -x0 (no
spare drives) option, RAID-5 is still always created with a spare.
Activating the spare always succeeds, and it looks like essentially the
same process as adding a replacement disk.
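
To pin down where the rebuild fails, the progress can be logged from
/proc/mdstat while reconstruction runs. A minimal sketch, assuming the
usual "recovery = N%" progress line; the helper name recovery_pct is
made up for illustration:

```shell
# Hypothetical helper: read /proc/mdstat-style text on stdin and print
# the recovery percentage (e.g. "2.8"). Prints nothing if no rebuild
# is in progress.
recovery_pct() {
    sed -n 's/.*recovery *= *\([0-9.]*\)%.*/\1/p'
}

# Possible usage while the array is rebuilding:
#   recovery_pct < /proc/mdstat
```

Logging this value every few seconds alongside the kernel messages
would show whether the ataX errors always start at the same point.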

I have tried this with sata_promise.ko and ulsata2.ko (the open source
driver from Promise), under Debian Sid, Ubuntu Breezy and openSUSE 10,
and also by moving the RAID to another identical Dell system, all with
the same results.

Has anybody experienced something similar? Why would the problem surface
on array reconstruction only? Any ideas on what else to test?
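
One further test that might separate md from the disks and adapter is a
plain sequential read of each member disk outside the array. This is
only a sketch: it assumes GNU dd, and the function name read_test is
hypothetical.

```shell
# Hypothetical helper: read a whole device (or file) sequentially and
# discard the data. A non-zero exit status means dd hit a read error.
read_test() {
    dd if="$1" of=/dev/null bs=1M 2>/dev/null
}

# Possible usage on the four array members (run as root):
#   for d in /dev/sda /dev/sdb /dev/sdc /dev/sdd; do
#       read_test "$d" || echo "read error on $d"
#   done
```

If the same ataX errors appear here, the problem is below the md layer.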

I would suspect the SATA adapter if the array were not working well
until reconstruction is attempted. BTW, the Promise eSupport site only
gives a permanent "Please wait..."; has anybody had better luck there?

I will most likely be looking for a different SATA adapter. Any
recommendations here?

Thanks in advance,

Sarunas Burdulis
Systems Administrator
Department of Mathematics
Dartmouth College


