Re: Failing Hard Drive, or False Alarms?

To: debian-user@lists.debian.org
Subject: Re: Failing Hard Drive, or False Alarms?
From: Bruce Halco <bruce@halcomp.com>
Date: Wed, 10 Sep 2025 14:24:07 -0400
Message-id: <ab4e87d7-b811-4704-8152-b34ad2100107@halcomp.com>
In-reply-to: <aMG_P-sRPULXrCkF@phcomp.co.uk>
References: <04224488cfd2061574802637317a9603@workdoc.in> <20250910022001.GE2547@wooledge.org> <3e846c810fbf9b081bf67c5aaadc7a2d@workdoc.in> <109rnti$1n5o3$1@dont-email.me> <aMFqk39_FwRg7XmF@phare.normalesup.org> <109rsop$1oon6$1@dont-email.me> <d03b36d0-f499-4c58-9132-3cacf54dceed@halcomp.com> <aMG_P-sRPULXrCkF@phcomp.co.uk>

And that is my inclination. I've already bought a new drive.

But it really bugs me that running smartctl from the command line conflicts with the automated messages, which I never got before upgrading to trixie.



On 9/10/25 2:11 PM, alain williams wrote:

On Wed, Sep 10, 2025 at 01:30:47PM -0400, Bruce Halco wrote:

A couple of weeks ago I upgraded to trixie. Since then I've gotten a number
of messages like

    Device: /dev/sda [SAT], 8 Currently unreadable (pending) sectors

and

    Device: /dev/sda [SAT], 30 Offline uncorrectable sectors

These seem to come within a day or so of a reboot, but it hasn't been long
enough to know if that's a red herring.

I would take this as a serious warning. Time to get the disk replaced in a
nice, calm, unrushed manner. The disk is *likely* to hard fail within a year.
Do NOT think "I can leave it 9 months"; that 1 year is not an accurate
prediction.

Questions:

• how much is your data worth ?

• what will it cost your time to have to suddenly rebuild the machine (prolly
at the most awkward time) ?

Compare these costs to the price of a new disk.

I ran "smartctl -t offline /dev/sda", and the eventual result of "smartctl
-a /dev/sda" shows

    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
      2 Throughput_Performance  0x0005   140   140   054    Pre-fail  Offline      -       68
      3 Spin_Up_Time            0x0007   178   178   024    Pre-fail  Always       -       351 (Average 293)
      4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       120
      5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
      7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
      8 Seek_Time_Performance   0x0005   124   124   020    Pre-fail  Offline      -       33
      9 Power_On_Hours          0x0012   088   088   000    Old_age   Always       -       85831
     10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
     12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       120
    192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       1028
    193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       1028
    194 Temperature_Celsius     0x0002   171   171   000    Old_age   Always       -       35 (Min/Max 21/43)
    196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
    197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
    198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
    199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

    SMART Error Log Version: 1
    No Errors Logged


which seems to indicate no problems.

I admit I'm not a smartctl wizard, but to me it seems that smartctl is
contradicting itself.
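
To make the mismatch concrete, here is a rough Python sketch that puts the two sources side by side. It is not something I actually ran against the drive: it only parses text copied from this message (the two warning lines and the rows for attributes 197 and 198, joined onto single lines). The function names are made up for illustration, and the mapping of "Currently unreadable (pending)" to attribute 197 and "Offline uncorrectable" to attribute 198 is my understanding of how smartd reports these, so treat it as an assumption.

```python
import re

# Rows for attributes 197 and 198 copied from the smartctl -a output in this
# message, each joined onto a single line.
SMARTCTL_ROWS = """\
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
"""

# The two automated warnings quoted in this message.
SMARTD_WARNINGS = """\
Device: /dev/sda [SAT], 8 Currently unreadable (pending) sectors
Device: /dev/sda [SAT], 30 Offline uncorrectable sectors
"""

# Matches only the two warning shapes quoted above (an assumption, not a
# complete description of every message smartd can emit).
WARNING_RE = re.compile(
    r"Device: (\S+) \[SAT\], (\d+) "
    r"(Currently unreadable \(pending\)|Offline uncorrectable) sectors"
)


def parse_attribute_rows(text):
    """Return {attribute_id: raw_value} for each attribute row in text."""
    raw = {}
    for line in text.splitlines():
        fields = line.split()
        # An attribute row has at least 10 columns and starts with a numeric ID.
        if len(fields) >= 10 and fields[0].isdigit():
            # RAW_VALUE is the 10th column.
            raw[int(fields[0])] = int(fields[9])
    return raw


def parse_warnings(text):
    """Return {attribute_id: sector_count} taken from the warning lines."""
    counts = {}
    for _device, count, kind in WARNING_RE.findall(text):
        # Assumed mapping: pending sectors -> 197, offline uncorrectable -> 198.
        attr_id = 197 if kind.startswith("Currently") else 198
        counts[attr_id] = int(count)
    return counts


def compare(rows_text, warnings_text):
    """Return {attribute_id: (count_in_warning, raw_value_in_table)}."""
    raw = parse_attribute_rows(rows_text)
    warned = parse_warnings(warnings_text)
    return {attr_id: (warned[attr_id], raw.get(attr_id)) for attr_id in sorted(warned)}


if __name__ == "__main__":
    for attr_id, (warned, current) in compare(SMARTCTL_ROWS, SMARTD_WARNINGS).items():
        print(f"attribute {attr_id}: warning said {warned}, smartctl -a shows {current}")
```

With the text above, the warnings give 8 and 30 while the table gives 0 for both attributes, which is exactly the disagreement I'm asking about.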

Can anyone help me out?
