
Re: Failing Hard Drive, or False Alarms?



And that is my inclination. I've already bought a new drive.

But it really bugs me that the results of running smartctl from the command line conflict with the automated messages, which I never got before upgrading to trixie.


On 9/10/25 2:11 PM, alain williams wrote:
On Wed, Sep 10, 2025 at 01:30:47PM -0400, Bruce Halco wrote:
A couple of weeks ago I upgraded to trixie. Since then I've gotten a number
of messages like

    Device: /dev/sda [SAT], 8 Currently unreadable (pending) sectors

and

    Device: /dev/sda [SAT], 30 Offline uncorrectable sectors

These seem to come within a day or so of a reboot, but it hasn't been long
enough to know if that's a red herring.
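
A minimal sketch of how warnings in this format could be pulled out of a log to see whether the counts change over time. The regular expression is an assumption based only on the two sample messages quoted above, and `parse_smartd_warnings` is a hypothetical helper name, not part of smartmontools.

```python
import re

# Pattern modelled on the two warning messages quoted above; other smartd
# message types are not covered (assumption of this sketch).
SMARTD_WARNING = re.compile(
    r"Device: (?P<device>\S+) \[(?P<type>[^\]]+)\], "
    r"(?P<count>\d+) "
    r"(?P<kind>Currently unreadable \(pending\)|Offline uncorrectable) sectors"
)


def parse_smartd_warnings(text):
    """Return a (device, kind, count) tuple for each warning found in text."""
    return [
        (m.group("device"), m.group("kind"), int(m.group("count")))
        for m in SMARTD_WARNING.finditer(text)
    ]


sample = (
    "Device: /dev/sda [SAT], 8 Currently unreadable (pending) sectors\n"
    "Device: /dev/sda [SAT], 30 Offline uncorrectable sectors\n"
)
warnings = parse_smartd_warnings(sample)
for device, kind, count in warnings:
    print(device, kind, count)
```

Collecting these tuples together with the time each message arrived would show whether the warnings really cluster after reboots.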
I would take this as a serious warning. Time to get the disk replaced in a
nice, calm, unrushed manner. The disk is *likely* to hard fail within a year.
Do NOT think "I can leave it 9 months"; that 1 year is not an accurate
prediction.

Questions:

• how much is your data worth ?

• what will it cost in your time to have to suddenly rebuild the machine
(probably at the most awkward time) ?

Compare these costs to the price of a new disk.

I ran "smartctl -t offline /dev/sda", and the eventual result of "smartctl
-a /dev/sda" shows

    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
      2 Throughput_Performance  0x0005   140   140   054    Pre-fail  Offline      -       68
      3 Spin_Up_Time            0x0007   178   178   024    Pre-fail  Always       -       351 (Average 293)
      4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       120
      5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
      7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
      8 Seek_Time_Performance   0x0005   124   124   020    Pre-fail  Offline      -       33
      9 Power_On_Hours          0x0012   088   088   000    Old_age   Always       -       85831
     10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
     12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       120
    192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       1028
    193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       1028
    194 Temperature_Celsius     0x0002   171   171   000    Old_age   Always       -       35 (Min/Max 21/43)
    196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
    197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
    198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
    199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

    SMART Error Log Version: 1
    No Errors Logged


which seems to indicate no problems.
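
A rough sketch of how the sector-related rows of such a table could be checked mechanically. It assumes the attribute table is available as text with one attribute per line and ten whitespace-separated columns, as in the header above. The choice of attributes 5, 197 and 198 as the ones to watch, and the helper names, are assumptions of this sketch rather than anything smartctl defines.

```python
# Attribute IDs watched here (an assumption of this sketch): reallocated,
# pending, and offline-uncorrectable sector counts.
WATCHED_IDS = (5, 197, 198)


def parse_attributes(text):
    """Map attribute ID -> (name, raw value string) for each table row."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the raw value may contain spaces, e.g. "351 (Average 293)".
        if len(fields) >= 10 and fields[0].isdigit():
            rows[int(fields[0])] = (fields[1], " ".join(fields[9:]))
    return rows


def nonzero_sector_counts(rows):
    """Return the watched attributes whose raw count is not zero."""
    found = {}
    for attr_id in WATCHED_IDS:
        if attr_id in rows:
            name, raw = rows[attr_id]
            count = int(raw.split()[0])
            if count != 0:
                found[name] = count
    return found


# Rows copied from the report above.
sample = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always   -   0
  9 Power_On_Hours          0x0012   088   088   000    Old_age   Always   -   85831
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always   -   0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline  -   0
"""
attrs = parse_attributes(sample)
problems = nonzero_sector_counts(attrs)
print(problems if problems else "no nonzero sector counts in the watched attributes")
```

For the rows above, all three watched raw values are 0, which is the mismatch with the emailed counts of 8 and 30.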

I admit I'm not a smartctl wizard, but to me it seems that smartctl is
contradicting itself.

Can anyone help me out?

