RAID problem: data lost on RAID 6...



Hi,

My Debian 3.1 (x86_64) system has suffered a very nasty mishap.

First, the IOMMU code ran out of space to map I/O to the SATA drives.
To the md code this looked like a faulty drive, so one by one drives
were marked 'failed' until three of the seven components had failed,
at which point the array no longer functioned.
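
For anyone following along, the arithmetic can be sketched as below.
This is only an illustration, assuming a standard RAID 6 layout with
two devices' worth of redundancy; the function names are made up for
this example and are not part of md or mdadm.

```python
# Illustration only: these names are hypothetical, not md or mdadm code.
RAID6_REDUNDANT_DEVICES = 2  # RAID 6 survives the loss of any two components


def min_devices_to_run(total_devices):
    """Smallest number of working components a RAID 6 array can run with."""
    return total_devices - RAID6_REDUNDANT_DEVICES


def can_run(total_devices, failed_devices):
    """True if the array can still run, possibly in degraded mode."""
    working = total_devices - failed_devices
    return working >= min_devices_to_run(total_devices)


if __name__ == "__main__":
    # Walk through zero to three failures on a seven-drive array.
    for failed in range(4):
        state = "runs" if can_run(7, failed) else "cannot run"
        print(f"7-drive RAID 6 with {failed} failed: {state}")
```

So a seven-drive array needs at least five working components, which
is why the third 'failure' stopped it.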

After rebooting, I got five drives back into the array - enough for it
to 'run' and be fscked. Almost recovered!

Then came a genuine drive failure, with lots of entries like this in
syslog:

***
end_request: I/O error, dev sde, sector 4057289
Buffer I/O error on device sde2, logical block 962111
ATA: abnormal status 0xD8 on port 0xFFFFC20000010287
ATA: abnormal status 0xD8 on port 0xFFFFC20000010287
ATA: abnormal status 0xD8 on port 0xFFFFC20000010287
ata7: command 0x25 timeout, stat 0xd8 host_stat 0x1
ata7: translated ATA stat/err 0xd8/00 to SCSI SK/ASC/ASCQ 0xb/47/00
ata7: status=0xd8 { Busy }
sd 6:0:0:0: SCSI error: return code = 0x8000002
sde: Current: sense key=0xb
    ASC=0x47 ASCQ=0x0
***
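
As a rough aid to reading the status byte in that log, here is a
minimal sketch that splits 0xd8 into the standard ATA status register
bits. The table and the function name are my own illustration, not
taken from the libata source. Note that while BSY is set the other
bits are not considered valid, which is presumably why the kernel
prints only "{ Busy }". Sense key 0xb is ABORTED COMMAND.

```python
# Standard ATA status register bits, from bit 7 down to bit 0.
# The helper below is a hypothetical illustration, not kernel code.
ATA_STATUS_BITS = [
    (0x80, "BSY"),   # device busy
    (0x40, "DRDY"),  # device ready
    (0x20, "DF"),    # device fault
    (0x10, "DSC"),   # seek complete
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data
    (0x02, "IDX"),   # index
    (0x01, "ERR"),   # error
]


def decode_ata_status(status):
    """Return the names of the bits set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


if __name__ == "__main__":
    print(decode_ata_status(0xD8))
```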

Of course, with two drives already (wrongly) marked 'failed', there's
nothing to rebuild with...

Is there a way I can 'unfail' another of the two drives and rebuild
from that? Trying to 'assemble' the array just results in the other two
being marked as 'spare' components, and then I'm told that 4 drives and
1 spare aren't enough to start a 7-drive array.


James.


