
Re: Signs of hard drive failure?




On 23/10/19 12:50 am, David Wright wrote:
On Tue 22 Oct 2019 at 19:24:00 (+1000), elvis wrote:
Lots cut
On 22/10/19 6:16 pm, Ken Heard wrote:
       0  Not_testing
Selective self-test flags (0x0):
    After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute
delay.

My comments: is that what is supposed to happen?  I have not done
/dev/sdb yet.

"elvis" in an earlier post asked me if I had "tried e2fsck to check
the filesystem first?"  According to 'man e2fsck' I have to run that
command on an unmounted device.  If the devices in question, /dev/sda
and /dev/sdb, comprise the RAID 1, how can I unmount them and still
use the computer to run those commands?
You run e2fsck on the partition, not on the member disks. So if your
RAID is built from sda and sdb, then your RAID device is /dev/md127 or
/dev/kensarray/bigdisk or something like that, and your partition sits
on top of that.

Hopefully you have your partition mounted on /mnt/allmydata for
example and you just umount it and run e2fsck on whatever the block
device is.

If you have boot and root on RAID, I would boot with a recovery USB
like sysrescue and then run the tests offline. It should assemble
your RAID automatically, to save stuffing around as well.

Also make sure you are using the right filesystem checker. I have just
assumed you are using ext4.
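
The umount-then-check sequence described above can be sketched roughly
as follows. This is only an illustration: the function names
checker_for and plan_check are made up here, /dev/md127 and
/mnt/allmydata are the example names from the post, and the mapping
from filesystem type to checker is an assumption to extend as needed.
The script only prints the commands; it does not run them.

```shell
#!/bin/sh
# Sketch only: print the commands for an offline check of one filesystem.
# Nothing is unmounted or checked here; the commands are just echoed.

# Map a filesystem type to a checker command (illustrative mapping).
checker_for() {
    case "$1" in
        ext2|ext3|ext4) echo "e2fsck -f" ;;      # -f: check even if marked clean
        xfs)            echo "xfs_repair -n" ;;  # -n: report only, change nothing
        btrfs)          echo "btrfs check" ;;
        *)              return 1 ;;
    esac
}

# Print the umount + check sequence for a device, mount point and type.
plan_check() {
    dev=$1
    mnt=$2
    fstype=$3
    checker=$(checker_for "$fstype") || {
        echo "no known checker for $fstype" >&2
        return 1
    }
    echo "umount $mnt"
    echo "$checker $dev"
}

# The device and type would normally come from something like:
#   findmnt -n -o SOURCE,FSTYPE /mnt/allmydata
plan_check /dev/md127 /mnt/allmydata ext4
```

Once the printed commands look right for the actual device, they can be
run by hand as root, with the filesystem unmounted.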
Is all this really necessary? I thought one of the benefits of the
extra complication of using initramfs is that the system can check
the root filesystem automatically before it's mounted.

In fact, I've just done it. I typed
# ./check-fs-rebooting-now
which runs   grub-reboot 'fsck>fsck'; reboot
which reboots the machine with   forcefsck   added to the kernel parameters.
After Grub loads the kernel and initrd, the root filesystem gets fsck'd
before any of the normal boot messages appear.
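
To confirm after such a reboot that the kernel really was started with
a forced check, one could look at the kernel command line. A minimal
sketch, assuming a POSIX shell: the function name wants_forced_fsck is
made up for illustration, and fsck.mode=force is the systemd spelling
of the same request, which is not mentioned in the post above.

```shell
#!/bin/sh
# Sketch only: report whether a kernel command line asks for a forced fsck.

# Succeeds if the given command line contains forcefsck or fsck.mode=force
# as a whole word. The padding spaces make the match whole-word only.
wants_forced_fsck() {
    case " $1 " in
        *" forcefsck "*|*" fsck.mode=force "*) return 0 ;;
    esac
    return 1
}

# Check the command line of the running kernel, if it is readable.
if [ -r /proc/cmdline ] && wants_forced_fsck "$(cat /proc/cmdline)"; then
    echo "this boot requested a forced fsck"
else
    echo "no forced fsck requested on this boot"
fi
```

On ext2/3/4, the "Last checked" field in the output of tune2fs -l for
the root device should also show a fresh timestamp after the check.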

I haven't yet run RAID but am told it's really simple. Is this
going to complicate matters?

That's much easier. The reason I suggested sysrescue is that every time I have had to check the root filesystem, it has been so hosed that it gets mounted read-only, and it is too hard to fix from the recovery shell.


RAID is really simple, till something goes wrong :-) I reckon most data loss comes from inexperienced people choosing the wrong options when fixing things, rather than from RAID losing the data.



Cheers,
David.

--
We have enough youth, how about a fountain of SMART?

