
Re: failing HDD, ddrescue says remaining time is 7104d



On 8/31/22 06:25, ppr wrote:
I would appreciate advice from the community about a failing hard drive.

When booting up, the computer complained about /dev/sdb, which is an ext4 HDD holding data (not the computer's main disk). dmesg shows `AE_NOT_FOUND` and `failed command: READ FPDMA QUEUED` messages (full dmesg log at https://hastebin.com/raw/jebelileru).

It has finally booted after trying unsuccessfully to start /dev/sdb.

I launched smartctl which shows hard drive failure.

---
# smartctl -H -i /dev/sdb
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-21-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Toshiba 3.5" DT01ACA... Desktop HDD
Device Model:     TOSHIBA DT01ACA100
Serial Number:    663X1XGNS
LU WWN Device Id: 5 000039 fe9dad918
Firmware Version: MS2OA750
User Capacity:    1 000 204 886 016 bytes [1,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Aug 31 13:56:34 2022 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
Drive failure expected in less than 24 hours. SAVE ALL DATA.
Failed Attributes:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  2 Throughput_Performance  0x0005   037   037   054    Pre-fail  Offline  FAILING_NOW 3774
  5 Reallocated_Sector_Ct   0x0033   001   001   005    Pre-fail  Always   FAILING_NOW 2004
---

I did not try to mount the HDD. I plugged in an external HDD (ext4) and launched ddrescue. After two days it has recovered 33 GB of 1 TB, but the speed is now so slow that it would take 7104 days to complete.

# ddrescue -n /dev/sdb /media/sara/2274a2da-1f02-4afd-a5c5-e8dcb1c02195/recup_HDD_sara/image_HDD1.img /media/sara/2274a2da-1f02-4afd-a5c5-e8dcb1c02195/recup_HDD_sara/recup.log
GNU ddrescue 1.23
Press Ctrl-C to interrupt
      ipos:   33992 MB, non-trimmed:        0 B,  current rate:     636 B/s
      opos:   33992 MB, non-scraped:        0 B,  average rate:    188 kB/s
non-tried:  966212 MB,  bad-sector:        0 B,    error rate:       0 B/s
   rescued:   33992 MB,   bad areas:        0,        run time:  2d  2h 6m
pct rescued:    3.39%, read errors:        0,  remaining time:   7104d 20h
                               time since last successful read: 0s
Copying non-tried blocks... Pass 1 (forwards)^C

Should I wait, hoping it will speed up? Should I pass different options to ddrescue or use another tool?


Unless you have enterprise-grade equipment designed for a 100% duty cycle for 48 hours, I would kill the ddrescue job before your hardware is destroyed.


Both the failed drive and the destination drive will be in heavy use while you attempt to recover sectors. At 100 MB/s, transferring 1 TB will take nearly 3 hours (!). Make sure everything has good power supplies and good cooling. Use the best drive you have for the destination; an SSD will expedite this process and steps that follow.
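
As a rough check of that arithmetic, here is a minimal sketch. The helper name copy_seconds is made up for illustration; the sizes and rates are the ones shown in the smartctl and ddrescue output above.

```shell
#!/bin/sh
# copy_seconds SIZE_BYTES RATE_BYTES_PER_SECOND
# Prints the whole number of seconds needed to move SIZE_BYTES at a steady rate.
copy_seconds() {
    echo $(( $1 / $2 ))
}

# Whole drive (User Capacity from smartctl) at a healthy 100 MB/s.
secs=$(copy_seconds 1000204886016 100000000)
echo "100 MB/s: ${secs} s = $(( secs / 3600 )) h $(( secs % 3600 / 60 )) min"   # 10002 s = 2 h 46 min

# Remaining 966212 MB at the 188 kB/s average rate ddrescue reported.
secs=$(copy_seconds 966212000000 188000)
echo "188 kB/s: ${secs} s = about $(( secs / 86400 )) days"                     # about 59 days
```

Even the two-day average rate would leave the drives running for weeks, which is why cooling and power matter here.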


Ensure that the destination contains zeros for sectors not recovered.
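
One way to get this, assuming the destination is a newly created regular file (as in the command above) rather than a reused file or partition: regions of a file that were never written read back as zeros. The sketch below shows that behaviour on a small throwaway file; demo.img and count_nonzero_bytes are names invented for the illustration. If the destination is a partition or an old image, it would have to be zeroed first.

```shell
#!/bin/sh
# count_nonzero_bytes FILE: prints how many bytes of FILE are not zero.
count_nonzero_bytes() {
    tr -d '\000' < "$1" | wc -c | tr -d ' '
}

tmp=$(mktemp -d)
img="$tmp/demo.img"

# Create a 1 MiB file without writing any data into it.
truncate -s 1048576 "$img"
count_nonzero_bytes "$img"    # 0: the untouched file reads back as all zeros

# Simulate one "rescued" block: write 12 bytes at offset 8192, leave the rest alone.
printf 'rescued data' | dd of="$img" bs=4096 seek=2 conv=notrunc 2>/dev/null
count_nonzero_bytes "$img"    # 12: only the bytes just written are non-zero

rm -rf "$tmp"
```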


Comment out the /etc/crypttab and/or /etc/fstab entries for the failed drive. When you mount the drive, mount it read only.
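
A minimal sketch of the first step, assuming GNU sed. The helper comment_out_entry is hypothetical, and the block only runs it on a scratch file; the commands for the real system are left as comments with placeholder values.

```shell
#!/bin/sh
# comment_out_entry FILE PATTERN
# Prefixes every not-yet-commented line of FILE that contains PATTERN with "#".
# The original is kept as FILE.bak.
comment_out_entry() {
    sed -i.bak "\|$2|s|^[^#]|#&|" "$1"
}

# Demonstration on a scratch file with made-up entries.
demo=$(mktemp)
printf '%s\n' 'UUID=aaaa /     ext4 defaults 0 1' \
              'UUID=bbbb /data ext4 defaults 0 2' > "$demo"
comment_out_entry "$demo" 'UUID=bbbb'
cat "$demo"
rm -f "$demo" "$demo.bak"

# On the real system (as root; the UUID and /dev/sdb1 are placeholders):
#   comment_out_entry /etc/fstab 'UUID=<uuid-of-failing-filesystem>'
#   mount -o ro,noload /dev/sdb1 /mnt    # noload: do not replay the ext4 journal
```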


The challenge is figuring out the right options and strategies for using ddrescue(1) to get as many good sectors as you can off the failing drive before it dies completely. Fortunately or unfortunately, I have not needed ddrescue(1) in many years; so, I would RTFM carefully and then STFW for articles about using ddrescue(1) effectively. Consider doing the work in chunks. You should already have the first 33 GB. Skip the 33-35 GB region where the reads slowed down. Do 35-100 GB. Then 100-200 GB, 200-300 GB, 300-400 GB, etc. Get the good sectors first. Do the problem sectors last.
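
The chunk plan could be sketched as follows. This is untested against a failing drive and only prints the commands instead of running them. It relies on the ddrescue options -i (--input-position) and -s (--size), plus the same image and mapfile for every run so that each chunk adds to the same rescue. The function name print_chunk_plan and the short file names are illustrative only.

```shell
#!/bin/sh
# print_chunk_plan SRC IMAGE MAPFILE START_GB END_GB STEP_GB
# Prints (does not run) one ddrescue command per chunk. Positions are plain
# byte counts (1 GB = 10^9 bytes), so no size suffixes are relied on.
print_chunk_plan() {
    src=$1 img=$2 map=$3 pos=$4 end=$5 step=$6
    while [ "$pos" -lt "$end" ]; do
        len=$step
        if [ $(( pos + len )) -gt "$end" ]; then
            len=$(( end - pos ))
        fi
        echo "ddrescue -n -i $(( pos * 1000000000 )) -s $(( len * 1000000000 )) $src $img $map"
        pos=$(( pos + len ))
    done
}

# 35-100 GB first, then 100 GB steps up to 1000 GB.
print_chunk_plan /dev/sdb image_HDD1.img recup.log 35 100 65
print_chunk_plan /dev/sdb image_HDD1.img recup.log 100 1000 100
```

A final run without -i and -s, using the same mapfile, would then go back over whatever is still marked as non-tried, including the skipped region.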


Once you have an image file containing whatever sectors you could recover, make the file read-only and back it up. Better yet, make two backups and put one off-site.
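
A minimal sketch of that step, assuming GNU coreutils. The function protect_and_backup and the paths in the usage comment are invented for the example. Recording a checksum alongside the image makes it possible to confirm later that a backup is still intact.

```shell
#!/bin/sh
# protect_and_backup IMAGE BACKUP_DIR
# Makes IMAGE read-only, records its SHA-256 next to it, copies both files
# into BACKUP_DIR (keeping holes sparse), and verifies the copy.
protect_and_backup() {
    img=$1 dest=$2
    dir=$(dirname "$img")
    name=$(basename "$img")
    chmod a-w "$img" || return 1
    ( cd "$dir" && sha256sum "$name" > "$name.sha256" ) || return 1
    cp --sparse=always "$img" "$img.sha256" "$dest"/ || return 1
    ( cd "$dest" && sha256sum -c --quiet "$name.sha256" )
}

# Example (hypothetical paths):
#   protect_and_backup /srv/rescue/image_HDD1.img /mnt/backup1
#   protect_and_backup /srv/rescue/image_HDD1.img /mnt/offsite
```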


To do the filesystem repair/recovery work, make a copy of the image and work on the copy. If you make a mistake, you can throw away the copy and start over.
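
As a sketch of working on a copy (make_work_copy and the file names are hypothetical): the follow-up commands in the comments assume the ext4 filesystem starts at the very beginning of the image. If the disk had a partition table, the partition would first have to be exposed, for example with losetup -fP --show.

```shell
#!/bin/sh
# make_work_copy MASTER WORK
# Copies the read-only master image to WORK and makes only the copy writable.
make_work_copy() {
    cp --sparse=always "$1" "$2" && chmod u+w "$2"
}

# Typical follow-up, run as root and on the copy only:
#   make_work_copy image_HDD1.img work.img
#   e2fsck -fn work.img               # check only, changes nothing
#   e2fsck -fy work.img               # attempt repairs on the copy
#   mount -o loop,ro work.img /mnt    # then copy the files out
# If something goes wrong: rm work.img and run make_work_copy again.
```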


I find it very useful to install Debian onto a good quality USB 3.0 flash drive, to use for system administration, maintenance, troubleshooting, etc. I prefer this approach over "live" distributions because I have a full Debian system and can install anything I want or need.


I find it very useful to have a spare computer for maintenance and troubleshooting tasks.


I find it very useful to use a version control system for system configuration files, system administration notes, etc.


I back up, archive, and image compulsively. I keep a supply of spare parts on hand. Do not be afraid to spend money on new and improved parts -- the last time I lost data was when I tried to "get by" with old and inadequate parts.


David


https://toshiba.semicon-storage.com/us/storage/product/internal-specialty/pc/articles/dt01aca-series.html

https://linux.die.net/man/1/ddrescue

