Re: ATA abnormal status



On Wed, Aug 23, 2006 at 10:16:30PM +0200, Francesco Pietra wrote:
> While computing with mpqc 2.3.1 (Debian etch amd64; dual Opterons; 8GB ECC RAM;
> RAID 1; ext3 filesystem; grub on its own partition):
> 
> The HD LED stayed permanently lit.
> 
> Messages on screen:
> 
> ATA: abnormal status 0x58 on port 0x1C5F
> ata3: command 0x35 timeout, stat 0x50 host_stat 0x24
> ata4: same as above for ata3
> 
> Trying:
> $ df -h
> 
> sd 3:0:0:0 SCSI error return code 0x8000002
> Additional sense : SCSI parity error
> end request: I/O error, dev sda, sector 47748992

Hmm, SATA drive, and probably an old kernel in which the sense code is
not yet mapped to the ATA sense codes. A SCSI parity error means a cable
problem; IIRC this is mapped to the ATA "CRC error", which also means a
bad cable.
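
For reference, the status bytes in the quoted messages can be decoded by
hand. Below is a minimal sketch in Python, assuming the classic ATA status
register bit layout (BSY, DRDY, DF, DSC, DRQ, CORR, IDX, ERR); the table
and the function name decode_ata_status are only illustrative and are not
taken from the kernel source.

```python
# Classic ATA status register bits, most significant first.
# Assumption: the "abnormal status" and "stat" values in the log are raw
# status register bytes using this layout.
ATA_STATUS_BITS = [
    (0x80, "BSY"),   # device busy
    (0x40, "DRDY"),  # device ready
    (0x20, "DF"),    # device fault
    (0x10, "DSC"),   # seek complete
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data (obsolete)
    (0x02, "IDX"),   # index (obsolete)
    (0x01, "ERR"),   # error
]


def decode_ata_status(status):
    """Return the names of the bits set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


# The two status values reported in the original message.
for value in (0x58, 0x50):
    print(f"0x{value:02x}: {' '.join(decode_ata_status(value))}")
```

Under that assumption, 0x58 decodes to DRDY, DSC and DRQ, and 0x50 to DRDY
and DSC. Neither has ERR set, so the drive itself did not flag an error in
the status byte; the command simply timed out. Command 0x35 is WRITE DMA
EXT.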

> Later:
> raid1 Disk failure on sd6, disabling device
> raid1 :sdb3: redirecting another mirror
> 
> RAID1 conf printout
> --- wd:1 rd:2
> disk1, wo:0, o:1, dev: sdb8
> _______
> 
> I could only switch the power off because it did not respond to shutdown commands.
> _______
> 
> Rebooting, the $ prompt was obtained without warnings.
> 
> Then I looked at
>  /etc/fstab
> 
> and issued:
> 
> #fdisk /dev/sda
> # p
> #fdisk /dev/sdb
> # p
> #df -h
> 
> there was nothing wrong: both disks identical to before.
> ______________________________
> 
> A similar hang already occurred on 3 August (the filesystem was already ext3)
> during a similar computation with mpqc. There was nothing wrong after rebooting
> and there was no anomaly until now. I checked the disks and RAM.
> 
> Before that, when using the reiserfs 3.6 filesystem, I had many problems with Debian
> while carrying out mpqc computations. Therefore, I changed to ext3.
> 
> Threaded mpqc computations running for many days without interruption put a lot
> of stress on the system (mostly on memory, because mpqc writes sparingly to the HD).
> ___________--
> 
> Any guess at what that means? I naively assume it was a failure of the OS,
> not a failure of the hardware.

My guess is hardware failure. Replace the cables and see if that fixes
your problem. It would be nice to know some more details: kernel version
and hardware (which SATA controller, which drives).


Erik

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands


