Re: filesystem damage
On Mon, 3 Mar 2025 at 10:03, Dan Purgert <dan@djph.net> wrote:
> On Mar 02, 2025, Eben King wrote:
> > [...]
> > ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
> > 1 Raw_Read_Error_Rate 0x000f 082 064 006 Pre-fail Always - 146369262
>
> 146 million read-errors.
>
> > 7 Seek_Error_Rate 0x000f 084 060 045 Pre-fail Always - 232382570
>
> 230 million seek errors
> > 9 Power_On_Hours 0x0032 093 093 000 Old_age Always - 6346h+20m+46.297s
>
> ~9 years on-time
>
> > 195 Hardware_ECC_Recovered 0x001a 082 064 000 Old_age Always - 146369262
>
> Well, at least those read errors were all corrected ;)
>
> None of the first three bits are absolute proof that the drive is going,
> but they're certainly cause for suspicion.

I see no cause for concern in that data.

The Wikipedia page [1] regarding "1 Raw_Read_Error_Rate" says:

    The raw value has different structure for different vendors and is often
    not meaningful as a decimal number. For some drives, this number
    may increase during normal operation without necessarily signifying errors.

The Wikipedia page [1] regarding "7 Seek_Error_Rate" says:

    The raw value has different structure for different vendors and is often
    not meaningful as a decimal number. For some drives, this number
    may increase during normal operation without necessarily signifying errors.

Why do you write that the "9 Power_On_Hours" data represents
"~9 years on-time"? It looks to me like it says 6346 hours.
There are 365 * 24 = 8760 hours in a year, so 6346 hours is
about 0.72 years (roughly 264 days), which is less than one year.
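
The conversion can be checked with a minimal Python sketch. It assumes the
raw value always has the h+m+s form shown in the quoted output. The variable
names are illustrative only and not part of any S.M.A.R.T. tool.

```python
# Convert the Power_On_Hours raw value quoted above into days and years.
import re

raw_value = "6346h+20m+46.297s"  # raw value as posted for attribute 9

# Pull hours, minutes and seconds out of the h+m+s form (assumed format).
match = re.fullmatch(r"(\d+)h\+(\d+)m\+([\d.]+)s", raw_value)
hours, minutes, seconds = int(match[1]), int(match[2]), float(match[3])

total_hours = hours + minutes / 60 + seconds / 3600
hours_per_year = 365 * 24  # 8760, ignoring leap years

days = total_hours / 24
years = total_hours / hours_per_year
print(f"{total_hours:.1f} hours = {days:.0f} days = {years:.2f} years")
# prints: 6346.3 hours = 264 days = 0.72 years
```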

The S.M.A.R.T. attributes that are generally considered to
be cause for concern are [2]:

    SMART 5: Reallocated_Sector_Count.
    SMART 187: Reported_Uncorrectable_Errors.
    SMART 188: Command_Timeout.
    SMART 197: Current_Pending_Sector_Count.
    SMART 198: Offline_Uncorrectable.

The data previously posted for this drive shows:

  5 Reallocated_Sector_Ct   0x0033 100 100 010 Pre-fail Always  - 0
187 Reported_Uncorrect      0x0032 100 100 000 Old_age  Always  - 0
188 Command_Timeout         0x0032 100 100 000 Old_age  Always  - 0 0 0
197 Current_Pending_Sector  0x0012 100 100 000 Old_age  Always  - 0
198 Offline_Uncorrectable   0x0010 100 100 000 Old_age  Offline - 0

The raw values are all zero. The normalised values are all 100,
which is a typical initial value. I see no cause for concern in that data.
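
That check can be automated with a rough sketch along the following lines.
It assumes the ten-column row layout from the header quoted earlier
(ID#, name, flag, value, worst, thresh, type, updated, when_failed, raw).
It treats a raw value as clean only if every space-separated part is "0",
which is a simplification, since raw formats vary by vendor. The function
name and constants are made up for this illustration.

```python
# Attributes of concern per the Backblaze article [2].
WATCHED_IDS = {5, 187, 188, 197, 198}

# Attribute rows as posted for this drive.
SMART_ROWS = """\
  5 Reallocated_Sector_Ct   0x0033 100 100 010 Pre-fail Always  - 0
187 Reported_Uncorrect      0x0032 100 100 000 Old_age  Always  - 0
188 Command_Timeout         0x0032 100 100 000 Old_age  Always  - 0 0 0
197 Current_Pending_Sector  0x0012 100 100 000 Old_age  Always  - 0
198 Offline_Uncorrectable   0x0010 100 100 000 Old_age  Offline - 0
"""


def nonzero_watched(rows: str) -> list[str]:
    """Return names of watched attributes whose raw value is not all zeros."""
    flagged = []
    for line in rows.splitlines():
        # Ten columns; the last one (RAW_VALUE) may itself contain spaces.
        fields = line.split(None, 9)
        if len(fields) < 10 or not fields[0].isdigit():
            continue  # skip header lines, blank lines and anything malformed
        attr_id, name, raw = int(fields[0]), fields[1], fields[9]
        if attr_id in WATCHED_IDS and any(part != "0" for part in raw.split()):
            flagged.append(name)
    return flagged


print(nonzero_watched(SMART_ROWS) or "no watched attribute has a non-zero raw value")
```

For the rows above the list is empty, so the fallback message is printed.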

S.M.A.R.T. data is a topic where it is hard to find reliable information.
I am no expert in this. I speak from the experience of being a casual
user of S.M.A.R.T. to manage my personal machines for more than
a decade, including some drive failures followed by many years of recovery.

It is no problem for me to run all my drives to failure, because I am
careful with backups. I do not monitor checksums, I just check the
attributes of concern from time to time and that approach plus backups
has served me well enough.

[1] https://en.wikipedia.org/wiki/Self-Monitoring,_Analysis_and_Reporting_Technology
[2] https://www.backblaze.com/blog/hard-drive-smart-stats/