
Re: Failing Hard Drive, or False Alarms?



On Wed, Sep 10, 2025 at 01:30:47PM -0400, Bruce Halco wrote:
> A couple of weeks ago I upgraded to trixie.  Since then I've gotten a number
> of messages like
>
>    Device: /dev/sda [SAT], 8 Currently unreadable (pending) sectors
>
> and
>
>    Device: /dev/sda [SAT], 30 Offline uncorrectable sectors
>
> These seem to come within a day or so of a reboot, but it hasn't been long
> enough to know if that's a red herring.

I would take this as a serious warning. Time to get the disk replaced in a
nice, calm, unrushed manner. The disk is *likely* to hard fail within a year.
Do NOT think "I can leave it 9 months": that 1 year figure is a rough guess,
not an accurate prediction.

Questions:

• how much is your data worth?

• what will it cost in your time to suddenly have to rebuild the machine
(probably at the most awkward time)?

Compare these costs to the price of a new disk.

> I ran "smartctl -t offline /dev/sda", and the eventual result of "smartctl
> -a /dev/sda" shows
> 
>    SMART Attributes Data Structure revision number: 16
>    Vendor Specific SMART Attributes with Thresholds:
>    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>      1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
>      2 Throughput_Performance  0x0005   140   140   054    Pre-fail  Offline      -       68
>      3 Spin_Up_Time            0x0007   178   178   024    Pre-fail  Always       -       351 (Average 293)
>      4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       120
>      5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
>      7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
>      8 Seek_Time_Performance   0x0005   124   124   020    Pre-fail  Offline      -       33
>      9 Power_On_Hours          0x0012   088   088   000    Old_age   Always       -       85831
>     10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
>     12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       120
>    192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       1028
>    193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       1028
>    194 Temperature_Celsius     0x0002   171   171   000    Old_age   Always       -       35 (Min/Max 21/43)
>    196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
>    197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
>    198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
>    199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0
> 
>    SMART Error Log Version: 1
>    No Errors Logged
> 
> 
> which seems to indicate no problems.
> 
> I admit I'm not a smartctl wizard, but to me it seems that smartctl is
> contradicting itself.
> 
> Can anyone help me out?
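
For what it's worth, the attribute table you pasted can be checked
mechanically. Below is a minimal Python sketch, not part of smartmontools:
the names ROW, WATCHED, parse_attributes, sector_warnings and power_on_years
are made up for illustration, and it assumes each attribute row sits on a
single line, as "smartctl -A" prints it. It pulls out the raw values of
attributes 5, 197 and 198 (the sector-related counters) and converts
Power_On_Hours to years. The SAMPLE rows are copied from your output.

```python
import re

# Sector-related attribute IDs, named as they appear in the pasted table.
WATCHED = {
    5: "Reallocated_Sector_Ct",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}

# One attribute row: ID, name, flag, value, worst, thresh, type,
# updated, when_failed, then the leading integer of RAW_VALUE.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]{4}\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)


def parse_attributes(text):
    """Return {id: (name, raw_value)} for every attribute row found in text."""
    attrs = {}
    for line in text.splitlines():
        # Tolerate mail quoting ("> ") in front of a row.
        m = ROW.match(line.lstrip("> "))
        if m:
            attrs[int(m.group(1))] = (m.group(2), int(m.group(9)))
    return attrs


def sector_warnings(attrs):
    """List the watched sector counters whose raw value is non-zero."""
    return [
        f"{name} = {raw}"
        for attr_id, (name, raw) in sorted(attrs.items())
        if attr_id in WATCHED and raw > 0
    ]


def power_on_years(attrs):
    """Convert the raw Power_On_Hours value (attribute 9) to years."""
    hours = attrs[9][1]
    return hours / (24 * 365.25)


# Rows copied from the output quoted above, one attribute per line.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0012   088   088   000    Old_age   Always       -       85831
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
"""

if __name__ == "__main__":
    attrs = parse_attributes(SAMPLE)
    print("sector warnings:", sector_warnings(attrs) or "none")
    print(f"power-on time: {power_on_years(attrs):.1f} years")
```

On your figures all three sector counters read 0 when the table was taken,
whereas the earlier messages reported 8 pending and 30 uncorrectable
sectors, and 85831 power-on hours works out to a little under ten years of
running.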

-- 
Alain Williams
Linux/GNU Consultant - Mail systems, Web sites, Networking, Programmer, IT Lecturer.
+44 (0) 787 668 0256  https://www.phcomp.co.uk/
Parliament Hill Computers. Registration Information: https://www.phcomp.co.uk/Contact.html
#include <std_disclaimer.h>

