Re: how to test disk for bad sector

To: debian-user@lists.debian.org
Subject: Re: how to test disk for bad sector
From: Marco Möller <talby@debianlists.mobilxpress.net>
Date: Sat, 29 Aug 2020 14:56:20 +0200
Message-id: <1cbeb09b-814b-73da-b6e3-1dd077825d8c@debianlists.mobilxpress.net>
In-reply-to: <3c495c3a-a7b6-dc88-bcf6-82f9a7b78fc1@gmail.com>
References: <537557110.32145.1598669993940.ref@mail.yahoo.com> <537557110.32145.1598669993940@mail.yahoo.com> <3c495c3a-a7b6-dc88-bcf6-82f9a7b78fc1@gmail.com>

On 29.08.20 10:18, Alexander V. Makartsev wrote:

On 29.08.2020 07:59, Long Wind wrote:
installation of linux to sdb1 fails
i believe hard disk has bad sector
If hard drive has bad sectors or recently encountered them, information about this should be noted to hard drive's SMART table. Alternatively, you can use "badblocks" program from "e2fsprogs" package to scan hard drive for bad blocks. I'd perform tests on wiped clean hard drive with non-destructive read test first, followed by write test. Testing media for bad blocks could be time consuming if hard drive is multiple terabytes in size.
i use e2fsck with -c, i.e. read-only test
it doesn't  report any error


I support this recommendation to use badblocks.

If you first need to rescue data from the disk (although your question sounds as if there is no data worth rescuing anymore), then use ddrescue from the package gddrescue first.

Then, for badblocks, I recommend running it in write mode with option "-n", for the following reason. If I am correctly informed, disks with S.M.A.R.T. usually have a reserve of spare blocks. When the disk itself detects a bad block, its firmware redirects that block to a spare one, without the operating system seeing this. The statistics about these permanent redirection events are found in the S.M.A.R.T. log of the disk, which you can access with the smartctl program. But the disk's internal mechanism will only detect bad blocks when there is an attempt to write to them. Simply trying to read from bad blocks will not trigger S.M.A.R.T. to recognize them as bad, and they would thus not become visible in the S.M.A.R.T. report.

If you later write to the disk (e.g. during the OS installation you mention as the occasion on which you hit the problem), then either S.M.A.R.T. will invisibly protect you by redirecting to spare blocks or, if no more spare blocks are available, it will leave the operating system with the problem. This is what might be happening in your situation right now. The operating system then needs to maintain its own list of bad blocks, namely those no longer taken care of by S.M.A.R.T. Again, simply reading from the disk might not be enough to properly detect these remaining bad blocks. Therefore I recommend letting the operating system search for them by running badblocks in write mode with option "-n" (non-destructive read-write test) or "-w" (destructive write test, which erases the disk); please consult the man page for which better fits your needs.

Actually, I would recommend repeating such a run several times, in order to monitor whether the number of bad blocks is at least constant or is increasing. In the latter case you should definitely replace the disk with a new one. In the former case, if badblocks already finds bad blocks which S.M.A.R.T. could not take care of, I would also seriously consider replacing the disk now, if your financial situation allows it. But if you want to avoid a replacement for now for financial reasons, then at least continue to monitor the situation very frequently, and of course always keep a proper backup of your data on a medium that is still good. Given the need to frequently monitor a disk whose S.M.A.R.T. can no longer buffer problems for you automatically, and considering the time this repeatedly takes, you will have to balance these costs in time and lost trust in the present medium against the cost of a new disk.
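
The repeated-run monitoring described above could be sketched in shell roughly as follows. This is only a sketch under assumptions: /dev/sdX is a placeholder for the suspect disk, the file names rescue.img, rescue.map, run1.txt and run2.txt and the helper badblock_trend are made up for illustration, and the commands that touch the disk are left as comments because they need root and an unmounted disk.

```shell
#!/bin/sh
# Commands that touch the disk (run as root; /dev/sdX is a placeholder):
#   ddrescue /dev/sdX rescue.img rescue.map   # optional: save data first (package gddrescue)
#   smartctl -A /dev/sdX                      # S.M.A.R.T. attributes of the disk
#   badblocks -nsv -o run1.txt /dev/sdX       # non-destructive read-write test, list saved to run1.txt
#   badblocks -nsv -o run2.txt /dev/sdX       # repeat later, list saved to run2.txt

# Hypothetical helper: compare the number of bad blocks found in two runs.
# Assumes the -o output file holds one block number per line,
# so counting lines counts bad blocks.
badblock_trend() {
    prev=$(wc -l < "$1")
    curr=$(wc -l < "$2")
    if [ "$curr" -gt "$prev" ]; then
        echo "increasing: $prev -> $curr"
    elif [ "$curr" -eq 0 ]; then
        echo "no bad blocks found"
    else
        echo "not increasing: $prev -> $curr"
    fi
}
```

Calling badblock_trend run1.txt run2.txt after each new run would then show whether the count is growing, which is the case where the disk should be replaced.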

If the disk turns out not to be the cause of the trouble with your system, then you could check whether all components on the motherboard involved in moving data around are still fine:

- Write a heavy amount of data to the disk: with the command dd, copy a huge amount of data from an external USB-connected drive to the drive which you currently suspect of causing trouble. Maybe the motherboard fails from time to time to handle such a job free of errors; in that case the disk might still be fine and could be reused elsewhere, but the data highway on your motherboard has started to fail. For this check I would not simply use if=/dev/zero, but really read data from another drive, in order to ensure that the respective data highway on the motherboard has to be used, as it is during your OS installation or during future data copy procedures.
- Check your RAM with memtest86+. You will have to find a live pen drive offering it in its GRUB boot menu; I am not perfectly sure right now, but I would expect the Knoppix Linux distribution to offer this.
- Check your CPU with stress-ng.
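
The dd copy check from the list above might look roughly like this. It is a sketch under assumptions, not a tested procedure: copy_and_verify is a hypothetical helper, the paths in the comments are placeholders, GNU stat and cmp are assumed (as on Debian), and the compare step may be served partly from the page cache rather than from the disk itself.

```shell
#!/bin/sh
# Hypothetical helper: copy SRC (a large file on another drive) to DST
# (a file on, or the device node of, the suspect drive) and compare the result.
copy_and_verify() {
    src=$1
    dst=$2
    size=$(stat -c %s "$src") || return 1     # source size in bytes (GNU stat)
    # conv=fsync makes dd flush the written data to the disk before it exits
    dd if="$src" of="$dst" bs=1M conv=fsync || return 1
    # Compare only the first $size bytes, since a target device may be larger
    if cmp -n "$size" "$src" "$dst"; then
        echo "copy verified"
    else
        echo "copy differs from source"
        return 1
    fi
}

# Example (as root, placeholders only; this overwrites /dev/sdX):
#   copy_and_verify /media/usb/big-file.iso /dev/sdX
# CPU check, e.g. all cores for ten minutes with result verification:
#   stress-ng --cpu 0 --verify --timeout 10m
```

Reading from a real file on the USB drive, rather than from /dev/zero, keeps the copy on the same data path as the installation.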

Good luck!
Marco

References:
- how to test disk for bad sector
  - From: Long Wind <longwind2@yahoo.com>
- Re: how to test disk for bad sector
  - From: "Alexander V. Makartsev" <avbetev@gmail.com>