
Re: SATA disk errors



Tony van der Hoff (tony@vanderhoff.org on 2011-12-31 18:21 +0000):
> 199 UDMA_CRC_Error_Count    0x003e   200   199   000    Old_age   Always       -       455

This is your problem (well, symptom). The disk isn't failing
hardware-wise, but it is seeing a lot of transmission errors. The ATA
bus errors in dmesg seem to agree with that.

I'd say you have a problem on the SATA bus. That can be a faulty
controller, faulty wiring or a faulty SATA chip on the disk. If you
can, try swapping the disk positions or connectors. If the errors stay
on ata3 after the swap, the problem is in the controller (or that
port); if they follow the disk, suspect the disk itself. Swapping the
cables in the same way helps you rule out faulty cables.
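
The CRC counter is cumulative, so the absolute value of 455 says less
than whether it keeps growing after a swap. A minimal sketch of that
comparison, assuming you save the output of `smartctl -A` for the disk
before and after the test and that the attribute rows use the usual
ten-column layout quoted above (the function names are made up for
illustration):

```python
from typing import Optional


def crc_error_count(smartctl_output: str) -> Optional[int]:
    """Return the raw value of attribute 199 (UDMA_CRC_Error_Count), or None."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    return None


def crc_errors_grew(before: str, after: str) -> bool:
    """True if the CRC error counter increased between the two snapshots."""
    old, new = crc_error_count(before), crc_error_count(after)
    if old is None or new is None:
        raise ValueError("attribute 199 not found in one of the snapshots")
    return new > old
```

If the counter stops growing once the disk sits on a different port or
cable, the part you swapped out is the likely culprit.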


Stan Hoeppner (stan@hardwarefreak.com on 2011-12-31 17:04 -0600):
> On 12/31/2011 12:21 PM, Tony van der Hoff wrote:
> 
> /dev/sda
> >   1 Raw_Read_Error_Rate     243530983
> >   7 Seek_Error_Rate         18363743
> 
> /dev/sdb
> >   1 Raw_Read_Error_Rate     138763088
> >   7 Seek_Error_Rate         1374378
> 
> Interestingly, SMART says these two drives have been in service only
> 2.6 months:
> 
> >   9 Power_On_Hours          1893
> 
> This indicates both drives are failing and should be replaced ASAP.

As Camaleon has said, for Seagate drives this isn't necessarily true.
My WD drives keep the raw value strictly at 0, but the Seagates I use
always report a very high raw read error rate (the same goes for ECC
recovered). They seem to me more like debug flags than actual counters.
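
One commonly reported, unofficial reading of these Seagate raw values
is that they pack two numbers into 48 bits: an error count in the upper
16 bits and an operation count in the lower 32 bits. This is an
assumption, not something taken from a Seagate data sheet. A sketch
that splits the values quoted above under that assumption:

```python
def split_seagate_raw(raw):
    """Split a raw value into (errors, operations).

    ASSUMPTION: upper 16 bits = error count, lower 32 bits = operation
    count. This layout is community lore, not vendor documentation.
    """
    errors = (raw >> 32) & 0xFFFF
    operations = raw & 0xFFFFFFFF
    return errors, operations


# Raw values quoted in this thread for /dev/sda and /dev/sdb.
for raw in (243530983, 18363743, 138763088, 1374378):
    print(raw, split_seagate_raw(raw))
```

All four values are below 2**32, so under this reading the error part
is 0 for each of them.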

I find it more instructive to look at the longevity indicators in the
middle (VALUE, WORST and THRESH), because they help interpret the raw
data without requiring you to read a data sheet.
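
As a rough sketch of that check: smartmontools treats an attribute as
failed once its normalized value is less than or equal to its
threshold. The code below assumes the ten-column `smartctl -A` row
layout seen in this thread; the names `Attr`, `parse_attributes` and
`at_or_below_threshold` are made up for illustration.

```python
from typing import List, NamedTuple


class Attr(NamedTuple):
    ident: int
    name: str
    value: int
    worst: int
    thresh: int
    raw: str


def parse_attributes(text: str) -> List[Attr]:
    """Parse attribute rows; header and other lines are skipped."""
    attrs = []
    for line in text.splitlines():
        f = line.split()
        # Rows: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
        if len(f) < 10 or not f[0].isdigit():
            continue
        attrs.append(Attr(int(f[0]), f[1], int(f[3]), int(f[4]),
                          int(f[5]), " ".join(f[9:])))
    return attrs


def at_or_below_threshold(attrs: List[Attr]) -> List[Attr]:
    """Attributes whose VALUE or WORST has reached a non-zero THRESH."""
    return [a for a in attrs
            if a.thresh > 0 and min(a.value, a.worst) <= a.thresh]


# The row quoted at the top of this message.
SAMPLE = ("199 UDMA_CRC_Error_Count    0x003e   200   199   000"
          "    Old_age   Always       -       455")

for a in parse_attributes(SAMPLE):
    print(a.ident, a.name, "value", a.value, "worst", a.worst,
          "thresh", a.thresh)
print("failing:", at_or_below_threshold(parse_attributes(SAMPLE)))
```

For the quoted row the list of failing attributes is empty: VALUE 200
and WORST 199 are nowhere near the threshold, despite the raw count of
455.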


Regards,
Arno

