
Re: MD software raid & multiple disk failure recovery help script



Mike Fedyk wrote:

Then it will create a new array (make sure you have one missing drive so that it doesn't try syncing the disks) with the old disks. What you're trying to do is find the original disk order, and if you fail multiple disks, that ordering info is lost AFAIK.

Here[1] are the combinations that would be tried for a four-drive RAID array. The count is n! for n drives: 24 orderings for 4 drives, 120 for 5 and a whopping 720 for 6. I only have four drives, but even running the commands by hand and keeping track of 24 combinations on paper is plenty.
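
The figures quoted above are simply n! for n drives. A quick way to check the arithmetic in plain shell (the helper name `factorial` is made up for this illustration and is not part of the recovery script):

```shell
#!/bin/sh
# factorial N: print N! using POSIX shell arithmetic.
# Hypothetical helper, only here to check the ordering counts.
factorial() {
    n=$1
    result=1
    while [ "$n" -gt 1 ]; do
        result=$(( result * n ))
        n=$(( n - 1 ))
    done
    echo "$result"
}

# Number of possible disk orderings for 4, 5 and 6 drives.
for drives in 4 5 6; do
    echo "$drives drives: $(factorial "$drives") orderings"
done
```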

I have an update that always puts "missing" in place of one drive, to avoid array reconstruction (which would destroy the data on the array if it were rebuilt in the wrong order).

This version includes code to create the array and to verify the result with tune2fs, mount, and e2fsck. Oh, and there is a hard-coded chunk size of 256K, which matters for recovery: the chunk size has to match the one the array was originally created with.

With all of these combinations, I still haven't found one that passes the tests. Anyone have any ideas?

Thanks,

Mike
#!/bin/ash
set -e
#set -x
#Move the first argument to the end of the list.
rotate() {
    local last_var=$1
    shift
    echo $@ $last_var
}
#Drop the first argument and print the rest.
cut_one() {
    shift
    echo $@
}
#Keep the first $1 items in place and rotate the remainder.
rotate_part() {
    local no_rotate=""
    local r_to_shift=$1
    shift
    while [ $r_to_shift -gt 0 ]; do
        no_rotate="${no_rotate# }$1 "
        shift
        r_to_shift=$(( $r_to_shift - 1 ))
    done
    echo "$no_rotate$(rotate $@)"
}
#Recursively enumerate orderings: the first $shift_factor positions stay
#fixed while the rest are rotated through every arrangement.
do_it() {
    local shift_factor="$1"
    shift
    local my_partitions="$@"
    local d_shift=$(( $num_drives - $shift_factor ))
    if [ $shift_factor -lt $(( $num_drives - 1 )) ]; then
        while [ 0 -lt $d_shift ]; do
            do_it $(( $shift_factor + 1 )) "$my_partitions"
            my_partitions=$(rotate_part $shift_factor $my_partitions)
            d_shift=$(( $d_shift - 1 ))
        done
    else
        echo -n "$my_partitions:"
        #Stop whatever the previous attempt left running; ignore failure
        #so that "set -e" does not abort when no array is active.
        mdadm -S "$array_dev" > /dev/null 2>&1 || true
        #Create the array in this order, then test it read-only.
        mdadm -C "$array_dev" -c 256 -l 5 -n $num_drives $my_partitions --force --run > /dev/null 2>&1 \
        && tune2fs -l "$array_dev" > /dev/null \
        && echo "tune2fs: $my_partitions" >> /tmp/abc.log \
        && mount -t ext3 "$array_dev" /mnt/test -o ro \
        && echo "mount: $my_partitions" >> /tmp/abc.log \
        && umount "$array_dev" \
        && e2fsck -fn "$array_dev" \
        && echo "e2fsck: $my_partitions" >> /tmp/abc.log \
        && sleep 10
        echo
    fi
}

#partitions="missing disc0/part2 disc2/part2 disc3/part2"
array_dev=$1
shift

#The loop at the bottom swaps "missing" in for one partition at a time
#to keep the MD RAID driver from starting a reconstruction thread.
partitions=$@

#Count the partitions given. Each pass replaces one of them with
#"missing", so the number of array members stays the same.
num_drives=0
for i in $partitions; do
    num_drives=$(( $num_drives + 1 ))
done

[ ! -e /mnt/test ] && mkdir /mnt/test
[ -e /tmp/abc.log ] && mv --backup=numbered /tmp/abc.log /tmp/abc.log.bak

for i in $partitions; do
    do_it 0 "missing $(cut_one $partitions)"
    partitions=$(rotate $partitions)
done
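
For anyone who wants to sanity-check the enumeration without touching any disks, here is a minimal dry-run sketch. It is not the algorithm used in the script above, just a simpler recursive version that prints each ordering instead of calling mdadm. The function name `permute` is made up for this example, and the device names are the ones from the commented-out line in the script.

```shell
#!/bin/sh
# permute PREFIX ITEM...: print every ordering of the ITEMs, one per line.
# The function body is a subshell "( ... )", so its variables and
# positional parameters stay local to each recursive call.
permute() (
    prefix=$1
    shift
    if [ $# -eq 0 ]; then
        # Nothing left to place: print the finished ordering.
        echo "${prefix# }"
    else
        count=$#
        while [ "$count" -gt 0 ]; do
            first=$1
            shift
            # Fix $first in the next position and permute the rest.
            permute "$prefix $first" "$@"
            # Rotate: put the item just used at the end of the list.
            set -- "$@" "$first"
            count=$(( count - 1 ))
        done
    fi
)

# Dry run only: lists the 24 orderings, creates nothing.
permute "" missing disc0/part2 disc2/part2 disc3/part2
```

Piping the output through `sort -u | wc -l` should report 24 for four items, which is a cheap way to confirm that no ordering is skipped or repeated.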
