Re: failing HDD, ddrescue says remaning time is 7104d

To: debian-user@lists.debian.org
Subject: Re: failing HDD, ddrescue says remaning time is 7104d
From: David Christensen <dpchrist@holgerdanske.com>
Date: Wed, 31 Aug 2022 14:02:19 -0700
Message-id: <b24808ef-2654-4ce6-37ca-cf38924dceac@holgerdanske.com>
In-reply-to: <1773972c6d9db6e432a8f66a090af3ab@zaclys.net>
References: <1773972c6d9db6e432a8f66a090af3ab@zaclys.net>

On 8/31/22 06:25, ppr wrote:

I would appreciate advice from the community about a failing hard drive.
When booting up, the computer complained about /dev/sdb, which is an ext4 HDD with data (not the computer's main disk). dmesg shows `AE_NOT_FOUND` and `failed command: READ FPDMA QUEUED` messages (full dmesg log at https://hastebin.com/raw/jebelileru).
It has finally booted after trying unsuccessfully to start /dev/sdb.

I launched smartctl which shows hard drive failure.

---
# smartctl -H -i /dev/sdb
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-21-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Toshiba 3.5" DT01ACA... Desktop HDD
Device Model:     TOSHIBA DT01ACA100
Serial Number:    663X1XGNS
LU WWN Device Id: 5 000039 fe9dad918
Firmware Version: MS2OA750
User Capacity:    1 000 204 886 016 bytes [1,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Aug 31 13:56:34 2022 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
Drive failure expected in less than 24 hours. SAVE ALL DATA.
Failed Attributes:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  2 Throughput_Performance  0x0005   037   037   054    Pre-fail  Offline  FAILING_NOW 3774
  5 Reallocated_Sector_Ct   0x0033   001   001   005    Pre-fail  Always   FAILING_NOW 2004
---
I did not try to mount the HDD. I plugged in an external HDD (ext4) and launched ddrescue. After two days it has recovered 33 GB of 1 TB, but the speed is now so slow that it will take 7104 days to complete.
# ddrescue -n /dev/sdb /media/sara/2274a2da-1f02-4afd-a5c5-e8dcb1c02195/recup_HDD_sara/image_HDD1.img /media/sara/2274a2da-1f02-4afd-a5c5-e8dcb1c02195/recup_HDD_sara/recup.log
GNU ddrescue 1.23
Press Ctrl-C to interrupt
      ipos:   33992 MB, non-trimmed:        0 B,  current rate:     636 B/s
      opos:   33992 MB, non-scraped:        0 B,  average rate:    188 kB/s
non-tried:  966212 MB,  bad-sector:        0 B,    error rate:       0 B/s
   rescued:   33992 MB,   bad areas:        0,        run time:  2d  2h 6m
pct rescued:    3.39%, read errors:        0,  remaining time:   7104d 20h
                               time since last successful read: 0s
Copying non-tried blocks... Pass 1 (forwards)^C
Should I wait, hoping it speeds up? Should I pass different options to ddrescue or use another tool?

Unless you have enterprise-grade equipment designed for a 100% duty cycle for 48 hours, I would kill the ddrescue job before your hardware is destroyed.

Both the failed drive and the destination drive will be in heavy use while you attempt to recover sectors. At 100 MB/s, transferring 1 TB will take nearly 3 hours (!). Make sure everything has good power supplies and good cooling. Use the best drive you have for the destination; an SSD will expedite this process and steps that follow.
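
As a rough check of that estimate, the arithmetic can be sketched in plain shell. The helper name `eta` is made up for illustration; the first byte count is the drive's User Capacity from the smartctl output, and the second call uses the non-tried size and average rate that ddrescue printed.

```shell
#!/bin/sh
# eta BYTES RATE: print how long BYTES take at RATE bytes per second.
# Integer arithmetic only, so the result is rounded down to whole minutes.
eta() {
    secs=$(( $1 / $2 ))
    echo "$(( secs / 3600 ))h $(( secs % 3600 / 60 ))m"
}

eta 1000204886016 100000000   # whole drive at 100 MB/s
eta 966212000000 188000       # non-tried area at the reported 188 kB/s average
```

The first call prints 2h 46m, in line with the "nearly 3 hours" figure for a healthy drive. The second prints 1427h 37m (about two months), and that holds only if the drive kept up its average rate, which the current rate of 636 B/s suggests it will not.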



Ensure that the destination contains zeros for sectors not recovered.

Comment out the /etc/crypttab and/or /etc/fstab entries for the failed drive. When you mount the drive, mount it read-only.
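
A minimal sketch of the commenting-out step, assuming the failing drive is listed in /etc/fstab by device name. The function name `comment_out_fstab`, the partition `/dev/sdb1`, the mount point, and the sample entries are all hypothetical, and the demonstration edits a scratch file rather than the real /etc/fstab.

```shell
#!/bin/sh
# comment_out_fstab FILE SPEC: prefix '#' to every entry in FILE whose
# first field is SPEC. A backup of the original is kept as FILE.bak.
comment_out_fstab() {
    file=$1
    spec=$2
    sed -i.bak "\|^[[:space:]]*${spec}[[:space:]]| s|^|#|" "$file"
}

# Demonstration on a scratch copy with made-up entries.
scratch=$(mktemp)
printf '%s\n' \
    '/dev/sda1 /     ext4 errors=remount-ro 0 1' \
    '/dev/sdb1 /data ext4 defaults          0 2' > "$scratch"
comment_out_fstab "$scratch" /dev/sdb1
cat "$scratch"

# A later read-only mount might look like this (needs root; the
# partition and mount point are assumptions):
#   mount -o ro,noload /dev/sdb1 /mnt
```

On ext4, a plain `ro` mount may still replay the journal and so write to the disk; the `noload` option skips that, which seems the safer choice for a drive in this state.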

The challenge is figuring out the right options and strategies for using ddrescue(1) to get as many good sectors as you can off the failing drive before it dies completely. Fortunately or unfortunately, I have not needed ddrescue(1) in many years; so, I would RTFM carefully and then STFW for articles about using ddrescue(1) effectively. Consider doing the work in chunks. You should already have sectors 0-33 GB. Skip 33 GB and/or 34 GB. Do 35-100 GB. Then, 100-200 GB, 200-300 GB, 300-400 GB, etc. Get the good sectors first. Do the problem sectors last.
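
The chunked approach could be sketched as below. The script only prints ddrescue commands for review; it does not run them. `-i` (input position) and `-s` (size) are ddrescue options that accept byte counts. The function name `print_chunk_commands`, the chunk boundaries, and the shortened file names are assumptions for illustration; the real image and map file paths are the ones in the original command.

```shell
#!/bin/sh
SRC=/dev/sdb
IMG=image_HDD1.img   # shortened; use the full path from the original command
MAP=recup.log        # reuse the SAME map file for every run

# print_chunk_commands START_GB END_GB STEP_GB
# Print one ddrescue command per chunk between START_GB and END_GB.
# Positions are written as plain byte counts (1 GB = 10^9 bytes).
print_chunk_commands() {
    pos=$1
    end_gb=$2
    step_gb=$3
    while [ "$pos" -lt "$end_gb" ]; do
        size=$step_gb
        if [ $((pos + size)) -gt "$end_gb" ]; then
            size=$((end_gb - pos))
        fi
        echo "ddrescue -n -i $((pos * 1000000000)) -s $((size * 1000000000)) $SRC $IMG $MAP"
        pos=$((pos + size))
    done
}

print_chunk_commands 35 100 65     # skip the slow 33-35 GB region for now
print_chunk_commands 100 400 100   # then 100-200, 200-300, 300-400 GB
```

Because every run shares one map file, ddrescue keeps track of which areas are already rescued. A final run without `-i` and `-s` can then pick up whatever is left, including the skipped region.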

Once you have an image file containing whatever sectors you could recover, make the file read-only and back it up. Better yet, make two backups and put one off-site.

To do the filesystem repair/recovery work, make a copy of the image and work on the copy. If you make a mistake, you can throw away the copy and start over.
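
The protect-then-copy routine might look like the sketch below. The function name `protect_and_copy` and the file names in the usage comment are hypothetical, and a 1 TB image needs as much free space again for each copy.

```shell
#!/bin/sh
# protect_and_copy IMAGE WORKCOPY
# Make IMAGE read-only, copy it to WORKCOPY, make the copy writable for
# its owner, and verify that both files are byte-for-byte identical.
protect_and_copy() {
    img=$1
    work=$2
    chmod a-w "$img" &&
    cp "$img" "$work" &&
    chmod u+w "$work" &&
    cmp "$img" "$work"
}

# Usage (paths are placeholders):
#   protect_and_copy image_HDD1.img work_copy.img
#   ...run the filesystem repair tools on work_copy.img only...
```

If a repair attempt goes wrong, deleting the working copy and calling the function again gives a fresh start from the untouched image.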

I find it very useful to install Debian onto a good quality USB 3.0 flash drive, to use for system administration, maintenance, troubleshooting, etc. I prefer this approach over "live" distributions because I have a full Debian system and can install anything I want or need.

I find it very useful to have a spare computer for maintenance and troubleshooting tasks.

I find it very useful to use a version control system for system configuration files, system administration notes, etc.

I back up, archive, and image compulsively. I keep a supply of spare parts on hand. Do not be afraid to spend money on new and improved parts -- the last time I lost data was when I tried to "get by" with old and inadequate parts.



David


https://toshiba.semicon-storage.com/us/storage/product/internal-specialty/pc/articles/dt01aca-series.html

https://linux.die.net/man/1/ddrescue
