HDD problems that do not follow SMART results
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi,
I keep getting freezes caused by HDD problems. During these freezes,
which generally last until I shut down the computer, I get messages
such as:
==
smartctl 5.40 2010-07-12 r3124 [i686-pc-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Model Family: Maxtor DiamondMax Plus 9 family
Device Model: Maxtor 6Y160M0
Serial Number: Y44NQSTE
Firmware Version: YAR51HW0
User Capacity: 163,928,604,672 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 7
ATA Standard is: ATA/ATAPI-7 T13 1532D revision 0
Local Time is: Tue Aug 28 16:09:09 2012 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
[...]
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
Aug 28 10:21:39 merciadriluca-station kernel: [ 2160.000030] ata6.00: exception Emask 0x10 SAct 0x0 SErr 0x400100 action 0x6 frozen
Aug 28 10:21:39 merciadriluca-station kernel: [ 2160.000035] ata6: SError: { UnrecovData Handshk }
Aug 28 10:21:39 merciadriluca-station kernel: [ 2160.000038] ata6.00: failed command: WRITE DMA EXT
Aug 28 10:21:39 merciadriluca-station kernel: [ 2160.000044] ata6.00: cmd 35/00:80:00:4f:f5/00:01:12:00:00/e0 tag 0 dma 196608 out
Aug 28 10:21:39 merciadriluca-station kernel: [ 2160.000046] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x14 (ATA bus error)
Aug 28 10:21:39 merciadriluca-station kernel: [ 2160.000049] ata6.00: status: { DRDY }
Aug 28 10:21:39 merciadriluca-station kernel: [ 2160.000056] ata6: hard resetting link
Aug 28 10:21:39 merciadriluca-station kernel: [ 2160.476042] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 28 10:21:40 merciadriluca-station kernel: [ 2160.597999] ata6.00: configured for UDMA/133
Aug 28 10:21:40 merciadriluca-station kernel: [ 2160.598003] ata6.00: device reported invalid CHS sector 0
Aug 28 10:21:40 merciadriluca-station kernel: [ 2160.598008] ata6: EH complete
Aug 28 10:22:10 merciadriluca-station kernel: [ 2190.965242] ata6.00: exception Emask 0x10 SAct 0x0 SErr 0x400100 action 0x6 frozen
Aug 28 10:22:10 merciadriluca-station kernel: [ 2190.965247] ata6: SError: { UnrecovData Handshk }
Aug 28 10:22:10 merciadriluca-station kernel: [ 2190.965251] ata6.00: failed command: WRITE DMA EXT
Aug 28 10:22:10 merciadriluca-station kernel: [ 2190.965257] ata6.00: cmd 35/00:80:00:4f:f5/00:01:12:00:00/e0 tag 0 dma 196608 out
Aug 28 10:22:10 merciadriluca-station kernel: [ 2190.965258] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x14 (ATA bus error)
Aug 28 10:22:10 merciadriluca-station kernel: [ 2190.965261] ata6.00: status: { DRDY }
Aug 28 10:22:10 merciadriluca-station kernel: [ 2190.965269] ata6: hard resetting link
Aug 28 10:22:10 merciadriluca-station kernel: [ 2191.440043] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 28 10:22:11 merciadriluca-station kernel: [ 2191.546566] ata6.00: configured for UDMA/133
Aug 28 10:22:11 merciadriluca-station kernel: [ 2191.546571] ata6.00: device reported invalid CHS sector 0
Aug 28 10:22:11 merciadriluca-station kernel: [ 2191.546578] ata6: EH complete
==
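As a minimal sketch of how the SErr values in these lines can be read:
the table below assumes the SError bit positions and short names used by
the Linux libata error handler, and decode_serror is a hypothetical
helper name, not part of any tool.

```python
# Bit positions of the SATA SError register, paired with the short names
# that libata prints inside "SError: { ... }" (assumed from libata).
SERR_BITS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serror(value):
    """Return the names of the SError bits set in value, lowest bit first."""
    return [name for bit, name in sorted(SERR_BITS.items())
            if value & (1 << bit)]


if __name__ == "__main__":
    # The two SErr values that appear in the kernel log excerpts.
    for raw in (0x400100, 0x400101):
        print(hex(raw), decode_serror(raw))
```

Both values set bits 8 and 22, which are link-level conditions, and that
agrees with the "ATA bus error" text on the res lines.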
After restarting, I got messages such as:
==
Aug 28 11:01:35 merciadriluca-station kernel: [ 233.816026] ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x400100 action 0x6 frozen
Aug 28 11:01:35 merciadriluca-station kernel: [ 233.816031] ata4: SError: { UnrecovData Handshk }
Aug 28 11:01:35 merciadriluca-station kernel: [ 233.816035] ata4.00: failed command: WRITE DMA
Aug 28 11:01:35 merciadriluca-station kernel: [ 233.816040] ata4.00: cmd ca/00:90:08:71:05/00:00:00:00:00/e0 tag 0 dma 73728 out
Aug 28 11:01:35 merciadriluca-station kernel: [ 233.816042] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x14 (ATA bus error)
Aug 28 11:01:35 merciadriluca-station kernel: [ 233.816045] ata4.00: status: { DRDY }
Aug 28 11:01:35 merciadriluca-station kernel: [ 233.816053] ata4: hard resetting link
Aug 28 11:01:35 merciadriluca-station kernel: [ 234.292041] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Aug 28 11:01:35 merciadriluca-station kernel: [ 234.411821] ata4.00: configured for UDMA/133
Aug 28 11:01:35 merciadriluca-station kernel: [ 234.411826] ata4.00: device reported invalid CHS sector 0
Aug 28 11:01:35 merciadriluca-station kernel: [ 234.411831] ata4: EH complete
Aug 28 11:02:14 merciadriluca-station kernel: [ 272.780026] ata4: limiting SATA link speed to 1.5 Gbps
Aug 28 11:02:14 merciadriluca-station kernel: [ 272.780030] ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x400100 action 0x6 frozen
Aug 28 11:02:14 merciadriluca-station kernel: [ 272.780034] ata4: SError: { UnrecovData Handshk }
Aug 28 11:02:14 merciadriluca-station kernel: [ 272.780038] ata4.00: failed command: WRITE DMA EXT
Aug 28 11:02:14 merciadriluca-station kernel: [ 272.780044] ata4.00: cmd 35/00:90:00:83:05/00:03:00:00:00/e0 tag 0 dma 466944 out
Aug 28 11:02:14 merciadriluca-station kernel: [ 272.780045] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x14 (ATA bus error)
Aug 28 11:02:14 merciadriluca-station kernel: [ 272.780048] ata4.00: status: { DRDY }
Aug 28 11:02:14 merciadriluca-station kernel: [ 272.780056] ata4: hard resetting link
Aug 28 11:02:14 merciadriluca-station kernel: [ 273.256538] ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Aug 28 11:02:14 merciadriluca-station kernel: [ 273.382089] ata4.00: configured for UDMA/133
Aug 28 11:02:14 merciadriluca-station kernel: [ 273.382093] ata4.00: device reported invalid CHS sector 0
Aug 28 11:02:14 merciadriluca-station kernel: [ 273.382098] ata4: EH complete
Aug 28 11:02:44 merciadriluca-station kernel: [ 303.380023] ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x400101 action 0x6 frozen
Aug 28 11:02:44 merciadriluca-station kernel: [ 303.380028] ata4: SError: { RecovData UnrecovData Handshk }
Aug 28 11:02:44 merciadriluca-station kernel: [ 303.380032] ata4.00: failed command: WRITE DMA EXT
Aug 28 11:02:44 merciadriluca-station kernel: [ 303.380038] ata4.00: cmd 35/00:90:00:83:05/00:03:00:00:00/e0 tag 0 dma 466944 out
Aug 28 11:02:44 merciadriluca-station kernel: [ 303.380039] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x14 (ATA bus error)
Aug 28 11:02:44 merciadriluca-station kernel: [ 303.380042] ata4.00: status: { DRDY }
==
and also
==
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572574] sd 3:0:0:0: [sdc] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572578] sd 3:0:0:0: [sdc] Sense Key : Aborted Command [current] [descriptor]
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572582] Descriptor sense data with sense descriptors (in hex):
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572584] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572592] 00 00 00 00
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572596] sd 3:0:0:0: [sdc] Add. Sense: No additional sense information
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572600] sd 3:0:0:0: [sdc] CDB: Write(10): 2a 00 00 05 83 00 00 03 90 00
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572608] end_request: I/O error, dev sdc, sector 361216
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572613] Buffer I/O error on device sdc5, logical block 43136
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572615] lost page write due to I/O error on sdc5
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572622] Buffer I/O error on device sdc5, logical block 43137
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572625] lost page write due to I/O error on sdc5
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572629] Buffer I/O error on device sdc5, logical block 43138
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572631] lost page write due to I/O error on sdc5
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572636] Buffer I/O error on device sdc5, logical block 43139
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572638] lost page write due to I/O error on sdc5
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572642] Buffer I/O error on device sdc5, logical block 43140
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572644] lost page write due to I/O error on sdc5
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572648] Buffer I/O error on device sdc5, logical block 43141
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572651] lost page write due to I/O error on sdc5
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572655] Buffer I/O error on device sdc5, logical block 43142
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572657] lost page write due to I/O error on sdc5
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572661] Buffer I/O error on device sdc5, logical block 43143
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572663] lost page write due to I/O error on sdc5
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572667] Buffer I/O error on device sdc5, logical block 43144
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572669] lost page write due to I/O error on sdc5
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572674] Buffer I/O error on device sdc5, logical block 43145
Aug 28 11:04:49 merciadriluca-station kernel: [ 427.572676] lost page write due to I/O error on sdc5
==
It looks like the HDD associated with sdc is encountering some
issues. But is sdc linked to ata4 or ata6? Are these two problems
(before and after restarting) the same one or not?
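One way to approach the ata4/ata6 question is to decode the "cmd" field
of the failed commands and compare it with the sdc error. The sketch
below assumes the layout libata uses when printing that field
(command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:
hob_lbam:hob_lbah/device); decode_taskfile is a hypothetical helper and
only knows the four opcodes relevant here.

```python
LBA48_OPCODES = {0x25, 0x35}   # READ DMA EXT, WRITE DMA EXT
LBA28_OPCODES = {0xC8, 0xCA}   # READ DMA, WRITE DMA


def decode_taskfile(cmd):
    """Return (lba, sector_count) for a libata 'cmd' string.

    Assumed field order: command / low-order bytes / high-order (HOB)
    bytes / device register, all printed in hex.
    """
    command, low, high, device = cmd.split("/")
    _feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))
    _hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in high.split(":"))
    opcode = int(command, 16)
    if opcode in LBA48_OPCODES:
        lba = ((hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
               | (lbah << 16) | (lbam << 8) | lbal)
        # A count of 0 means 65536 sectors for 48-bit commands.
        count = ((hob_nsect << 8) | nsect) or 65536
    elif opcode in LBA28_OPCODES:
        # The top four LBA bits live in the low nibble of the device register.
        lba = (((int(device, 16) & 0x0F) << 24)
               | (lbah << 16) | (lbam << 8) | lbal)
        # A count of 0 means 256 sectors for 28-bit commands.
        count = nsect or 256
    else:
        raise ValueError("opcode not handled by this sketch: %s" % command)
    return lba, count


if __name__ == "__main__":
    lba, count = decode_taskfile("35/00:90:00:83:05/00:03:00:00:00/e0")
    print(lba, count, count * 512)
```

For the ata4 WRITE DMA EXT after the restart this gives LBA 361216 and
912 sectors (466944 bytes, the size in the "dma ... out" field). The
same LBA appears in "end_request: I/O error, dev sdc, sector 361216",
and the Write(10) CDB carries the same LBA and length (00 05 83 00 and
03 90). That would suggest sdc was attached to ata4 during that boot.
The ata6 errors from before the restart cannot be tied to a particular
disk this way, since no matching sd line is quoted for them.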
After running several short and long S.M.A.R.T. self-tests on each of
my 3 HDDs, I got these results:
1) The HDD associated with /dev/sda appears to be in some pre-failure
state:
==
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
3 Spin_Up_Time 0x0027 203 202 063 Pre-fail Always - 19440
4 Start_Stop_Count 0x0032 252 252 000 Old_age Always - 3294
5 Reallocated_Sector_Ct 0x0033 252 252 063 Pre-fail Always - 17
6 Read_Channel_Margin 0x0001 253 253 100 Pre-fail Offline - 0
7 Seek_Error_Rate 0x000a 253 252 000 Old_age Always - 0
8 Seek_Time_Performance 0x0027 252 237 187 Pre-fail Always - 46578
9 Power_On_Minutes 0x0032 172 172 000 Old_age Always - 1007h+24m
10 Spin_Retry_Count 0x002b 253 252 157 Pre-fail Always - 0
11 Calibration_Retry_Count 0x002b 253 252 223 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 245 245 000 Old_age Always - 3314
192 Power-Off_Retract_Count 0x0032 253 253 000 Old_age Always - 0
193 Load_Cycle_Count 0x0032 253 253 000 Old_age Always - 0
194 Temperature_Celsius 0x0032 253 253 000 Old_age Always - 56
195 Hardware_ECC_Recovered 0x000a 253 252 000 Old_age Always - 8324
196 Reallocated_Event_Count 0x0008 238 238 000 Old_age Offline - 15
197 Current_Pending_Sector 0x0008 252 252 000 Old_age Offline - 15
198 Offline_Uncorrectable 0x0008 237 001 000 Old_age Offline - 16
199 UDMA_CRC_Error_Count 0x0008 195 194 000 Old_age Offline - 5
200 Multi_Zone_Error_Rate 0x000a 253 252 000 Old_age Always - 0
201 Soft_Read_Error_Rate 0x000a 253 252 000 Old_age Always - 0
202 Data_Address_Mark_Errs 0x000a 253 226 000 Old_age Always - 0
203 Run_Out_Cancel 0x000b 253 252 180 Pre-fail Always - 8
204 Soft_ECC_Correction 0x000a 253 251 000 Old_age Always - 0
205 Thermal_Asperity_Rate 0x000a 253 252 000 Old_age Always - 0
207 Spin_High_Current 0x002a 253 252 000 Old_age Always - 0
208 Spin_Buzz 0x002a 253 252 000 Old_age Always - 0
209 Offline_Seek_Performnce 0x0024 194 189 000 Old_age Offline - 0
99 Unknown_Attribute 0x0004 253 253 000 Old_age Offline - 0
100 Unknown_Attribute 0x0004 253 253 000 Old_age Offline - 0
101 Unknown_Attribute 0x0004 253 253 000 Old_age Offline - 0
SMART Error Log Version: 1
Warning: ATA error count 454 inconsistent with error log pointer 5
ATA Error Count: 454 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 454 occurred at disk power-on lifetime: 14837 hours (618 days + 5 hours)
When the command that caused the error occurred, the device was in an unknown state.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 24 81 02 32 e0 Error: UNC 36 sectors at LBA = 0x00320281 = 3277441
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 d0 00 81 02 32 e0 00 02:36:40.624 READ DMA EXT
25 d0 d2 af 01 32 e0 00 02:36:40.624 READ DMA EXT
25 d0 2e 81 e0 31 e0 00 02:36:40.624 READ DMA EXT
25 d0 00 81 df 31 e0 00 02:36:40.608 READ DMA EXT
25 d0 d2 af de 31 e0 00 02:36:40.608 READ DMA EXT
Error 453 occurred at disk power-on lifetime: 12776 hours (532 days + 8 hours)
When the command that caused the error occurred, the device was in an unknown state.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 01 52 27 0f e0 Error: UNC at LBA = 0x000f2752 = 993106
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
42 d0 01 52 27 0f e0 00 03:46:51.472 READ VERIFY SECTOR(S) EXT
25 d0 01 00 00 00 e0 00 03:46:51.472 READ DMA EXT
42 d0 01 51 27 0f e0 00 03:46:50.464 READ VERIFY SECTOR(S) EXT
25 d0 01 00 00 00 e0 00 03:46:50.448 READ DMA EXT
42 d0 02 51 27 0f e0 00 03:46:49.440 READ VERIFY SECTOR(S) EXT
Error 452 occurred at disk power-on lifetime: 12776 hours (532 days + 8 hours)
When the command that caused the error occurred, the device was in an unknown state.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 01 51 27 0f e0 Error: UNC at LBA = 0x000f2751 = 993105
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
42 d0 01 51 27 0f e0 00 03:46:50.464 READ VERIFY SECTOR(S) EXT
25 d0 01 00 00 00 e0 00 03:46:50.448 READ DMA EXT
42 d0 02 51 27 0f e0 00 03:46:49.440 READ VERIFY SECTOR(S) EXT
42 d0 02 4f 27 0f e0 00 03:46:49.440 READ VERIFY SECTOR(S) EXT
42 d0 04 53 27 0f e0 00 03:46:48.640 READ VERIFY SECTOR(S) EXT
Error 451 occurred at disk power-on lifetime: 12776 hours (532 days + 8 hours)
When the command that caused the error occurred, the device was in an unknown state.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 02 51 27 0f e0 Error: UNC at LBA = 0x000f2751 = 993105
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
42 d0 02 51 27 0f e0 00 03:46:49.440 READ VERIFY SECTOR(S) EXT
42 d0 02 4f 27 0f e0 00 03:46:49.440 READ VERIFY SECTOR(S) EXT
42 d0 04 53 27 0f e0 00 03:46:48.640 READ VERIFY SECTOR(S) EXT
25 d0 01 00 00 00 e0 00 03:46:48.624 READ DMA EXT
42 d0 04 4f 27 0f e0 00 03:46:47.616 READ VERIFY SECTOR(S) EXT
Error 450 occurred at disk power-on lifetime: 12776 hours (532 days + 8 hours)
When the command that caused the error occurred, the device was in an unknown state.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 02 4f 27 0f e0 Error: UNC at LBA = 0x000f274f = 993103
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
42 d0 04 4f 27 0f e0 00 03:46:47.616 READ VERIFY SECTOR(S) EXT
25 d0 01 00 00 00 e0 00 03:46:47.616 READ DMA EXT
42 d0 08 57 27 0f e0 00 03:46:47.600 READ VERIFY SECTOR(S) EXT
25 d0 01 00 00 00 e0 00 03:46:47.600 READ DMA EXT
42 d0 08 4f 27 0f e0 00 03:46:46.576 READ VERIFY SECTOR(S) EXT
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 10% 26543 319759751
# 2 Short offline Completed: read failure 60% 26542 319759751
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
==
The short offline test stops at 40% completed and the extended offline
test stops at 90% completed; the LBA of the first error is 319759751 in
both cases.
2) The HDD associated with /dev/sdb gives:
==
smartctl 5.40 2010-07-12 r3124 [i686-pc-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.10 family
Device Model: ST3320620AS
Serial Number: 9QFAYRCP
Firmware Version: 3.AAG
User Capacity: 320,072,933,376 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 7
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Tue Aug 28 16:11:54 2012 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
[...]
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 100 253 006 Pre-fail Always - 0
3 Spin_Up_Time 0x0003 096 095 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 099 099 020 Old_age Always - 1753
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 085 060 030 Pre-fail Always - 355938474
9 Power_On_Hours 0x0032 083 083 000 Old_age Always - 15739
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 099 099 020 Old_age Always - 1745
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 053 048 045 Old_age Always - 47 (Lifetime Min/Max 47/48)
194 Temperature_Celsius 0x0022 047 052 000 Old_age Always - 47 (0 20 0 0)
195 Hardware_ECC_Recovered 0x001a 065 055 000 Old_age Always - 1306602
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0000 100 253 000 Old_age Offline - 0
202 Data_Address_Mark_Errs 0x0032 100 253 000 Old_age Always - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
==
(This is actually the one that looks the healthiest.)
3) The HDD associated with /dev/sdc, which should be broken in some way
(given the messages from /var/log/syslog quoted above), does not look
broken according to SMART:
==
smartctl 5.40 2010-07-12 r3124 [i686-pc-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Model Family: Seagate Maxtor DiamondMax 21
Device Model: MAXTOR STM3320820AS
Serial Number: 5QF2T6W6
Firmware Version: 3.AAE
User Capacity: 320,072,933,376 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 7
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Tue Aug 28 16:12:32 2012 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
[...]
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 092 085 006 Pre-fail Always - 63613073
3 Spin_Up_Time 0x0003 096 095 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 098 098 020 Old_age Always - 2362
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 087 060 030 Pre-fail Always - 574383816
9 Power_On_Hours 0x0032 079 079 000 Old_age Always - 18552
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 098 098 020 Old_age Always - 2386
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 054 046 045 Old_age Always - 46 (Lifetime Min/Max 45/47)
194 Temperature_Celsius 0x0022 046 054 000 Old_age Always - 46 (0 12 0 0)
195 Hardware_ECC_Recovered 0x001a 065 052 000 Old_age Always - 222324542
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 2
200 Multi_Zone_Error_Rate 0x0000 100 253 000 Old_age Offline - 0
202 Data_Address_Mark_Errs 0x0032 100 253 000 Old_age Always - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 18551 -
# 2 Extended offline Completed without error 00% 18493 -
# 3 Short offline Completed without error 00% 18492 -
# 4 Short offline Completed without error 00% 13106 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
==
What can I deduce from this? It looks like /dev/sdc is broken, but SMART
suggests that /dev/sda is more likely to be on the verge of failing than
/dev/sdc.
Note that I tried exchanging SATA cables, to no avail.
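A rough sketch of how the attribute tables above can be compared,
assuming the ten-column layout printed by smartctl and the usual meaning
of attributes 5, 197, 198 (sector problems on the media) and 199
(transfer errors on the link); vendors may interpret them differently.
The function names are hypothetical. As far as I understand, the overall
"PASSED" only says that no normalized VALUE has dropped to its THRESH,
so the raw counts are checked separately.

```python
MEDIA_IDS = {5, 197, 198}   # reallocated / pending / uncorrectable sectors
LINK_IDS = {199}            # UDMA_CRC_Error_Count


def parse_attribute(line):
    """Split one attribute row into the fields used below."""
    fields = line.split(None, 9)
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "value": int(fields[3]),
        "thresh": int(fields[5]),
        "raw": fields[9].strip(),
    }


def review(lines):
    """Return (attribute name, remark) pairs worth a closer look."""
    findings = []
    for line in lines:
        attr = parse_attribute(line)
        first = attr["raw"].split()[0]
        raw_number = int(first) if first.isdigit() else 0
        if attr["thresh"] > 0 and attr["value"] <= attr["thresh"]:
            findings.append((attr["name"], "at or below threshold"))
        elif raw_number and attr["id"] in MEDIA_IDS:
            findings.append((attr["name"], "media: raw %d" % raw_number))
        elif raw_number and attr["id"] in LINK_IDS:
            findings.append((attr["name"], "link: raw %d" % raw_number))
    return findings


if __name__ == "__main__":
    # Rows copied from the /dev/sda and /dev/sdc tables above.
    sda = [
        "5 Reallocated_Sector_Ct 0x0033 252 252 063 Pre-fail Always - 17",
        "197 Current_Pending_Sector 0x0008 252 252 000 Old_age Offline - 15",
        "198 Offline_Uncorrectable 0x0008 237 001 000 Old_age Offline - 16",
        "199 UDMA_CRC_Error_Count 0x0008 195 194 000 Old_age Offline - 5",
    ]
    sdc = [
        "5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0",
        "197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0",
        "199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 2",
    ]
    print("sda:", review(sda))
    print("sdc:", review(sdc))
```

Read this way, /dev/sda reports sector-level problems on the media,
while the only counter of this kind that is nonzero on /dev/sdc is the
CRC error count, which fits a transfer problem between controller and
disk rather than damaged sectors.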
All the best,
- --
Merciadri Luca
See http://www.student.montefiore.ulg.ac.be/~merciadri/
- --
It's the early bird that gets the worm.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.8 <http://mailcrypt.sourceforge.net/>
iEYEARECAAYFAlA80oQACgkQM0LLzLt8MhwUGgCbB9WOOBb3vHlorBnymavWCvmY
aBkAnRbCcc2WZK+AXQTcwqKTGyt0ph/b
=OzHm
-----END PGP SIGNATURE-----