Re: Bug with soft raid?
On 2/12/19 11:37 AM, Tom Bachreier wrote:
Feb 12, 2019, 12:08 PM by dlist@bluewin.ch:
The system blocks for about 3 minutes and then I regain control of it.
I have a similar - maybe the same - problem in buster - see the thread
"Software RAID blocks" on this list about a month ago. Unfortunately
still no solution. :-(
I have the advantage that my system hard disk is outside the RAID, on a
separate disk. Therefore I'm still able to send "low level" commands
like smartctl or fdisk to the disks in the array during the block. If
I hit the right disk, the block ends immediately.
In each of my machines, I use a single 16 GB USB 3.0 flash drive, or a
small SSD, for the system drive. I then use btrfs for all file systems.
It is my expectation that if a disk goes bad, the machine will log an
error and/or halt.
Maybe this works for you, too?
You can try:
for i in /dev/sd{b..f}; do echo "DISK: ${i}"; smartctl -l scterc "${i}"; sleep 3; done
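To compare the drives at a glance, the report from that loop can be reduced to just the read timeout. This is a rough sketch, not a tested tool: erc_read_timeout is a hypothetical helper name, and it relies only on the "Read: 70 (7.0 seconds)" line shape quoted further down in this thread. Anything else (disabled, unsupported, different wording) is reported as "none".

```shell
# Hypothetical helper: read `smartctl -l scterc` output on stdin and print
# the read timeout in units of 100 ms, or "none" if no numeric value is found.
# Assumes a line of the form "Read: 70 (7.0 seconds)" as quoted in this thread.
erc_read_timeout() {
    awk '$1 == "Read:" && $2 ~ /^[0-9]+$/ { print $2; found = 1; exit }
         END { if (!found) print "none" }'
}
```

Usage would be something like: smartctl -l scterc /dev/sdb | erc_read_timeout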
Some drives allow you to adjust the Error Recovery Control timeout in
their firmware. You can use this to force the drive to return an error
promptly, rather than spending minutes trying to recover (e.g. blocking
for 3 minutes):
https://en.wikipedia.org/wiki/Error_recovery_control
I had a Linux md RAID1 (mirror) built from two older desktop/SOHO
server drives that supported scterc. So, I put commands like the
following, one per drive, into a script that was run at system startup:
# /usr/sbin/smartctl -l scterc,70,70 /dev/disk/by-id/ata-XXX_YYY
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.9.0-4-amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

SCT Error Recovery Control set to:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)
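
For anyone who wants to try the same thing, a startup script along these lines should do it. This is a minimal sketch, not my actual script: the DRIVES, ERC_DECISECONDS and SMARTCTL variables and the set_erc function are names made up for illustration, and DRIVES is deliberately left empty so that nothing happens until you fill in your own /dev/disk/by-id/ paths.

```shell
#!/bin/sh
# Sketch of a startup script that sets SCT Error Recovery Control on each
# array member. All variable and function names here are illustrative.

# Space-separated list of drives, e.g. /dev/disk/by-id/ata-XXX_YYY (placeholder).
DRIVES="${DRIVES:-}"
# Timeout in units of 100 ms; 70 means 7.0 seconds, as in the output above.
ERC_DECISECONDS="${ERC_DECISECONDS:-70}"
SMARTCTL="${SMARTCTL:-/usr/sbin/smartctl}"

set_erc() {
    # $1 = drive path; sets the read and write timeouts to the same value.
    "$SMARTCTL" -l "scterc,${ERC_DECISECONDS},${ERC_DECISECONDS}" "$1"
}

for d in $DRIVES; do
    # Keep going if one drive refuses the command, but say so.
    set_erc "$d" || echo "WARNING: could not set ERC on $d" >&2
done
```

In my experience the setting did not survive a power cycle on those drives, which is why it went into a startup script; your drives may behave differently.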
David