
Re: [Little OT] grub and udev root device

Hemlock wrote:
> Thanks for the info guys!
> Sounds like too much of a headache to go much further.
> I guess my idea about this came from a different problem I was thinking 
> about. Maybe you have some ideas about it.
> Here it is.
> I have a software raid as / with 4 sata drives. (sda,sdb,sdc,sdd)
> If I lose a drive (say sdc), completely, and the system reboots, the system 
> will see 3 drives from sda-sdc. 
> Since sdc isn't the drive that it's supposed to be, couldn't this totally 
> corrupt the array?
> What should happen is the system should only see sda,sdb and sdd.
> So what if I used udev and created the software raid arrays via mdadm.conf 
> with devices like /dev/sata1 and /dev/sata2 and so on. 
> Since my /boot is /dev/md0 and / is /dev/md1 I wonder if this would work?
> Do you think there may be some other way to avoid this problem?
> Thanks again for all your help! :)
> Cheers,
> Mike


Our setup here is a little different from yours, but it should behave the
same way.  We are doing software RAID1 across two SATA drives.

The controllers that we have (low-end HP hardware) control which drive
appears where in the chain.  The drive plugged into connector 0 is sda
and the drive plugged into connector 1 is sdb.  When we lose sda, sdb
stays sdb.

Additionally, mdadm can tell which drive is which.  I'm not sure if it's
writing some identifying information onto the drive or what, but each
disk seems to know its place in the configuration.
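(As far as I know, what's going on is that mdadm writes a persistent
superblock, including the array UUID, onto each member disk, so arrays can
be assembled by UUID rather than by sdX letter.  A rough sketch, with made-up
device names and UUIDs:

```
# Inspect the superblock on a member (device name is just an example):
mdadm --examine /dev/sda1

# In /etc/mdadm.conf, assembling by UUID makes the array independent of
# which letter a disk lands on after a failure (UUIDs below are fake):
#   ARRAY /dev/md0 UUID=3aaa0122:29827cfa:5331ad66:ca767371
#   ARRAY /dev/md1 UUID=f1e52d2c:9a1b3c4d:00112233:44556677
```

With UUID-based assembly, the sdc-becomes-sdd renumbering Mike worries
about shouldn't corrupt anything, since members are matched by superblock,
not by name.)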

One thing you will want to make sure you do is use the grub console to
place an MBR onto each of the drives in the array, not just the first
one.  Otherwise, if you remove drive sda, the machine will look for an
MBR on sdb, not find one, and not boot.
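From the legacy grub shell, that looks something like this (this assumes
/boot is the first partition on each disk; adjust to your layout):

```
grub> device (hd0) /dev/sdb
grub> root (hd0,0)
grub> setup (hd0)
```

The `device` line temporarily maps sdb to hd0, so the MBR written to sdb
refers to hd0 -- which is what sdb will be called by the BIOS once sda is
gone.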

Our servers do not have hot-swappable drives, so we have something of a
problem when we lose sda.  We have to shut down the box to replace the
drive, but then the replacement sda drive is blank, and the controller
can't boot from it.  Some of the machines have BIOSes that will let you
pick which drive to boot from, others require a grub boot floppy or USB
stick from which you can direct the machine to boot from sdb.  Check
your BIOS before disaster strikes.

Nothing replaces good testing.  Before you place the box into
production, play with it.  Shut it down, pull sda, and boot it up.  See
what happens.  Then you'll know exactly what steps are required in the
event of a failure, and when it happens with real data on the box,
you'll know what to do.  You might even document the required recovery
steps, print them out, and attach them to the machine.  The next person
who has to maintain the box will thank you.
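For what it's worth, once the box is back up on the surviving disk,
re-adding a blank replacement drive goes roughly like this (device and
partition names are only an example; here sdb is the good disk and sda is
the new one):

```
# Copy the partition table from the surviving member to the new disk:
sfdisk -d /dev/sdb | sfdisk /dev/sda

# Hot-add the new partitions into the arrays and watch the resync:
mdadm /dev/md0 --add /dev/sda1
mdadm /dev/md1 --add /dev/sda2
cat /proc/mdstat
```

Then reinstall grub onto the new disk's MBR, or it will bite you again on
the next failure.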
