Re: Failing Hard Drive, or False Alarms?
On Wed, Sep 10, 2025 at 01:30:47PM -0400, Bruce Halco wrote:
> A couple of weeks ago I upgraded to trixie. Since then I've gotten a number
> of messages like
>
> Device: /dev/sda [SAT], 8 Currently unreadable (pending) sectors
>
> and
>
> Device: /dev/sda [SAT], 30 Offline uncorrectable sectors
>
> These seem to come within a day or so of a reboot, but it hasn't been long
> enough to know if that's a red herring.
I would take this as a serious warning. It is time to get the disk replaced,
in a nice, calm, unrushed manner. The disk is *likely* to hard fail within a
year, but do NOT think "I can leave it 9 months": that 1 year is not an
accurate prediction.
Questions:
• How much is your data worth?
• What will it cost in your time to suddenly have to rebuild the machine
(probably at the most awkward moment)?
Compare these costs to the price of a new disk.
> I ran "smartctl -t offline /dev/sda", and the eventual result of "smartctl
> -a /dev/sda" shows
>
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
>   2 Throughput_Performance  0x0005   140   140   054    Pre-fail  Offline      -       68
>   3 Spin_Up_Time            0x0007   178   178   024    Pre-fail  Always       -       351 (Average 293)
>   4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       120
>   5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
>   7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
>   8 Seek_Time_Performance   0x0005   124   124   020    Pre-fail  Offline      -       33
>   9 Power_On_Hours          0x0012   088   088   000    Old_age   Always       -       85831
>  10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
>  12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       120
> 192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       1028
> 193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       1028
> 194 Temperature_Celsius     0x0002   171   171   000    Old_age   Always       -       35 (Min/Max 21/43)
> 196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
> 197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
> 198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
> 199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0
>
> SMART Error Log Version: 1
> No Errors Logged
>
>
> which seems to indicate no problems.
>
> I admit I'm not a smartctl wizard, but to me it seems that smartctl is
> contradicting itself.
>
> Can anyone help me out?
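If you want to keep an eye on the sector counters yourself until the new disk
arrives, something like the following may help. It is only a sketch: it
assumes the attribute table has the same column layout as in your output, and
the function names and the choice of attribute IDs (5, 196, 197, 198) are
mine, not anything provided by smartmontools.

```python
import re

# Attribute IDs to watch on a disk with suspect sectors.
# This selection is an assumption made for illustration, not an official list.
WATCHED_IDS = {5, 196, 197, 198}

# One attribute row: ID, name, flag, value, worst, thresh, type,
# updated, when_failed, then the raw value (only its leading integer is kept).
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+\d+\s+\d+\s+\d+\s+"
    r"\S+\s+\S+\s+\S+\s+(\d+)"
)


def watched_raw_values(smartctl_output):
    """Return {attribute_name: raw_value} for the watched attribute IDs."""
    found = {}
    for line in smartctl_output.splitlines():
        match = ROW.match(line)
        if match and int(match.group(1)) in WATCHED_IDS:
            found[match.group(2)] = int(match.group(3))
    return found


def nonzero_counters(smartctl_output):
    """Return only the watched attributes whose raw value is above zero."""
    watched = watched_raw_values(smartctl_output)
    return {name: raw for name, raw in watched.items() if raw > 0}
```

Feed either function the text printed by "smartctl -A /dev/sda" (without the
"> " quoting). Any non-zero counter that comes back deserves a closer look.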
--
Alain Williams
Linux/GNU Consultant - Mail systems, Web sites, Networking, Programmer, IT Lecturer.
+44 (0) 787 668 0256 https://www.phcomp.co.uk/
Parliament Hill Computers. Registration Information: https://www.phcomp.co.uk/Contact.html
#include <std_disclaimer.h>