
RAID1 arrays not starting when drive is missing



I have installed LVM-over-RAID1 on a Debian-derived system.
Kernel = 2.6.22
mdadm = 2.6.2
lvm2 = 2.02.26

I have two RAID1 devices -- /dev/md0 which is /boot (and
does NOT use LVM) and is made up of /dev/sda2 and /dev/sdb2,
and /dev/md1 which has LVM (and the rest of the system) over it
and is made up of /dev/sda3 and /dev/sdb3.

I've installed grub on both /dev/sda and /dev/sdb, and as long as
both drives are plugged in, I can boot from either drive
and everything works.

However...

When I try to boot with a drive removed (for testing purposes),
grub comes up fine and the system begins to boot, but it appears
that the arrays will not start, which means there's no root filesystem
available and everything grinds to a halt. This doesn't seem right.
Given the nature of RAID1, the arrays darn well should start up even
with a missing drive.

The boot output when the drive is missing has:
md: md0 stopped
md: bind <sda2>
md: md1 stopped
md: md0 stopped
md: unbind <sda2>
md: export_rdev(sda2)
md: bind <sda2>
md: md1 stopped
md: bind <sda3>
and then everything grinds to a halt.

When both drives are present, this section goes:
md: md0 stopped.
md: unbind<sdb2>
md: export_rdev(sdb2)
md: bind<sdb2>
md: bind<sda2>
raid1: raid set md0 active with 2 out of 2 mirrors
(etc.)

One thing I've discovered is that if, while the system is running
normally with both drives present, I do:
mdadm /dev/md0 --fail /dev/sdb2 --remove /dev/sdb2
mdadm /dev/md1 --fail /dev/sdb3 --remove /dev/sdb3
and then shut down the machine and unplug the drive, then I can
boot from the remaining drive, and I see a
raid1: raid set md0 active with 1 out of 2 mirrors
(etc.)
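
For anyone repeating this test, a small shell sketch along these lines
could summarize which arrays came up degraded. The function name
md_mirror_report is made up for illustration, and it assumes the kernel
log uses exactly the "raid1: raid set ... active with N out of M mirrors"
wording quoted above; other kernel versions may word it differently.

```shell
#!/bin/sh
# Hypothetical helper (not part of mdadm): reads kernel log text on stdin
# and prints one status line per "raid1: raid set ... active" message.
md_mirror_report() {
    # Extract "<array> <active mirrors> <expected mirrors>" from each match.
    sed -n 's/^.*raid1: raid set \(md[0-9][0-9]*\) active with \([0-9][0-9]*\) out of \([0-9][0-9]*\) mirrors.*$/\1 \2 \3/p' |
    while read -r name have want; do
        if [ "$have" -lt "$want" ]; then
            echo "$name degraded ($have/$want)"
        else
            echo "$name ok ($have/$want)"
        fi
    done
}
```

It would be fed the kernel log, e.g. "dmesg | md_mirror_report". An array
that never started (as in the failing boot above) produces no line at all,
which is itself the symptom.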

So if the drive fails while the system is running, it can be removed
from the array and the machine can boot off one drive. That's
nice and all, but it doesn't help in the case where the drive
failure itself causes the machine to crash (say the drive dies
and takes down the bus and the machine locks up).

So what am I missing that would explain why the arrays won't start
when a drive is unplugged (it doesn't matter which drive)?

--
Rich Carreiro


