Re: SATA disk errors
Tony van der Hoff (email@example.com on 2011-12-31 18:21 +0000):
> 199 UDMA_CRC_Error_Count 0x003e 200 199 000 Old_age Always - 455
This is your problem (well, symptom). The disk's media isn't failing,
but the disk is seeing a lot of transmission errors on the link. The
ATA bus errors in dmesg seem to agree with that.
I'd say you have a problem on the SATA bus. That can be a faulty
controller, a faulty cable or a faulty SATA chip on the disk. If you
can, swap the disks between ports. If the errors stay on ata3, the
problem is in the controller; if they follow the disk, suspect the
disk. Swapping cables the same way helps you rule out a faulty cable.
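To compare the situation before and after such a swap, it can help to
count kernel messages per ATA port. The following is only a rough
sketch in Python: it assumes the messages carry a port prefix such as
"ata3:" or "ata3.00:", and both function names are made up for
illustration. It counts every line that names a port, including
harmless ones, so filter the log down to error lines first.

```python
import re
from collections import Counter

# Port prefix of a libata-style kernel message, e.g. "ata3:" or "ata3.00:".
# The exact message layout is an assumption; adjust the pattern to your log.
ATA_PORT = re.compile(r"\bata(\d+)(?:\.\d+)?:")


def count_ata_messages(log_text):
    """Return a Counter mapping ATA port number -> number of log lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ATA_PORT.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def interpret_swap(noisy_port_before, noisy_port_after, disk_port_after):
    """Hypothetical helper: read the outcome of moving the suspect disk."""
    if noisy_port_after == noisy_port_before:
        # The disk moved away but the errors did not.
        return "errors stayed on the port: suspect the controller"
    if noisy_port_after == disk_port_after:
        # The errors moved together with the disk.
        return "errors followed the disk: suspect the disk"
    return "inconclusive"
```

Feed it saved dmesg output from before and after the swap, take the
busiest port from each count, and pass both to interpret_swap together
with the port the suspect disk now sits on. If the cable moved with
the disk, "the disk" here also covers that cable.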
Stan Hoeppner (firstname.lastname@example.org on 2011-12-31 17:04 -0600):
> On 12/31/2011 12:21 PM, Tony van der Hoff wrote:
> > 1 Raw_Read_Error_Rate 243530983
> > 7 Seek_Error_Rate 18363743
> > 1 Raw_Read_Error_Rate 138763088
> > 7 Seek_Error_Rate 1374378
> Interestingly, SMART says these two drives have been in service only
> 2.6 months:
> > 9 Power_On_Hours 1893
> This indicates both drives are failing and should be replaced ASAP.
As Camaleon has said, this isn't necessarily true for Seagate drives.
My WD drives keep the raw value strictly at 0, but the Seagates I use
always report a very high raw read error rate (the same goes for ECC
recovered). Those raw values seem to me more like debug flags than
actual counters.
I find it more instructive to look at the longevity indicators in the
middle (VALUE, WORST and THRESH), because they help interpret the raw
data without requiring you to read a data sheet.
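
As a small illustration of reading those middle columns, here is a
sketch in Python that parses one row of the attribute table and
compares VALUE and WORST against THRESH. It assumes the usual
"smartctl -A" column order (ID, name, flag, VALUE, WORST, THRESH,
type, updated, when failed, raw value). The helper names are made up,
and the sample row is the UDMA_CRC_Error_Count line quoted at the top
of this mail, joined onto one line.

```python
def parse_attribute(line):
    """Parse one row of the SMART attribute table into a dict."""
    fields = line.split()
    # Assumed column order:
    # ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "value": int(fields[3]),
        "worst": int(fields[4]),
        "thresh": int(fields[5]),
        # The raw value may contain spaces, so keep everything that is left.
        "raw": " ".join(fields[9:]),
    }


def is_failing(attr):
    """An attribute counts as failed when VALUE has dropped to THRESH."""
    return attr["value"] <= attr["thresh"]


def ever_failed(attr):
    """True if the worst normalized value ever seen reached THRESH."""
    return attr["worst"] <= attr["thresh"]


row = "199 UDMA_CRC_Error_Count 0x003e 200 199 000 Old_age Always - 455"
attr = parse_attribute(row)
print(attr["name"], "failing:", is_failing(attr), "raw:", attr["raw"])
```

For that row VALUE is 200 and THRESH is 0, so the attribute is far from
its threshold even though the raw count of 455 shows the link is
unreliable.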