Re: Grub2 reinstall on raid1 system.
On Sat, 15 Jan 2011 16:57:46 -0700
Bob Proulx <bob@proulx.com> wrote:
> Jack Schneider wrote:
> > Bob Proulx wrote:
> > > Jack Schneider wrote:
> > > > I have a raid1-based workstation running Debian Squeeze, up to
> > > > date until ~7 days ago. There are 4 drives, 2 of which had never
> > > > been used or formatted. I configured a new array using Disk Utility
> > > > from a live Ubuntu CD. That's where I screwed up... The end
> > > > result was that the names of the arrays on the 2 working drives
> > > > were changed, i.e. /dev/md0 became /dev/md126 and /dev/md1 became
> > > > /dev/md127.
> > >
> > > Something else must have happened too, because normally just
> > > adding arrays will not rename the existing arrays. I am not
> > > familiar with the "Disk Utility" that you mention.
> >
> > Hi, Bob
> > Thanks for your encouraging advice...
>
> I believe you should be able to completely recover from the current
> problems. But it may be tedious and not completely trivial. You will
> just have to work through it.
>
> Now that there is more information available, and knowing that you are
> using software raid and lvm, let me guess. You added another physical
> volume (a new /dev/md2 array) to the root volume group? If so
> that is a common problem. I have hit it myself on a number of
> occasions. You need to update the mdadm.conf file and rebuild the
> initrd. I will give more details about it as I go along in this
> message.
>
> > As I mentioned in a prior post, Grub was leaving me at a "grub
> > rescue>" prompt.
> >
> > I followed this procedure:
> > http://www.gnu.org/software/grub/manual/html_node/GRUB-only-offers-a-rescue-shell.html#GRUB-only-offers-a-rescue-shell
>
> That seems reasonable. It talks about how to drive the grub boot
> prompt to manually set up the boot.
>
> But you were talking about using a disk utility from a live cd to
> configure a new array with two new drives and that is where I was
> thinking that you had been modifying the arrays. It sounded like it
> anyway.
>
> Gosh it would be a lot easier if we could just pop in for a quick peek
> at the system in person. But we will just have to make do with the
> correspondence course. :-)
>
> > Booting now leaves me at a BusyBox prompt. However, the Grub menu
> > is correct, with the correct kernels. So it appears that grub is now
> > finding the root/boot partitions and files.
>
> That sounds good. Hopefully not too bad off then.
>
> > > Next time instead you might just use mdadm directly. It really is
> > > quite easy to create new arrays using it. Here is an example that
> > > will create a new device /dev/md9 mirrored from two other devices
> > > /dev/sdy5 and /dev/sdz5.
> > >
> > > mdadm --create /dev/md9 --level=mirror \
> > >   --raid-devices=2 /dev/sdy5 /dev/sdz5
> > >
> > > > Strangely the md2 array which I setup on the added drives
> > > > remains as /dev/md2. My root partition is/was on /dev/md0. The
> > > > result is that Grub2 fails to boot the / array.
>
> > This is how I created /dev/md2.
>
> Then that explains why it didn't change. Probably the HOMEHOST
> parameter is involved on the ones that changed. Using mdadm from the
> command line doesn't set that parameter.
>
> There was a long discussion about this topic just recently.
> You might want to jump into the middle of it here and read what we
> learned about HOMEHOST.
>
> http://lists.debian.org/debian-user/2010/12/msg01105.html
>
> > mdadm --examine on /dev/sda1 and /dev/sda2 gives, I think, a clean
> > result. I have posted the output at: http://pastebin.com/pHpKjgK3
>
> That looks good to me: healthy and normal, at least for that part.
>
> But that is only the first partition. That is just /dev/md0. Do you
> have any information on the other partitions?
>
> You can look at /proc/partitions to get a list of all of the
> partitions that the kernel knows about.
>
> cat /proc/partitions
>
> Then you can poke at the other ones too. But it looks like the
> filesystems are there okay.
>
> > mdadm --detail /dev/md0 --> gives mdadm: md device /dev/md0 does
> > not appear to be active.
> >
> > There is no /proc/mdstat data output.
>
> So it looks like the raid data is there on the disks but that the
> multidevice (md) module is not starting up in the kernel. Because it
> isn't starting, there aren't any /dev/md* devices and there is no
> status output in /proc/mdstat.
>
> > > I would boot a rescue image and then inspect the current
> > > configuration using the above commands. Hopefully that will show
> > > something wrong that can be fixed after you know what it is.
>
> I still think this is the best course of action for you. Boot a
> rescue disk into the system and then go from there. Do you have a
> Debian install disk #1 or Debian netinst or other installation disk?
> Any of those will have a rescue system that should boot your system
> okay. The Debian rescue disk will automatically search for raid
> partitions and automatically start the md modules.
>
> > So it appears that I must rebuild my arrays.
>
> I think your arrays might be fine. More information is needed.
>
> You said your boot partition was /dev/md0. I assume that your root
> partition was /dev/md1? Then you added two new disks as /dev/md2?
>
> /dev/md0 /dev/sda1 /dev/sdc1
>
> Let me guess at the next two:
>
> /dev/md1 /dev/sda2 /dev/sdc2 <-- ?? missing info ??
> /dev/md2 /dev/sdb1 /dev/sdd1 <-- ?? missing info ??
>
> Are those even close to being correct?
>
> > I think I can munge thru the mdadm man pages or Debian Reference to
> > get the tasks.
>
> If you only have the Ubuntu live cd system then you can boot it up
> from the cdrom and then use it to get the arrays started. I still
> think using a rescue disk is better. But with the live cd you
> mentioned using before you can also boot and do system repair. You
> will probably need to manually load the md and dm modules.
>
> $ sudo modprobe md_mod
> $ sudo modprobe dm_mod
>
> And then after those have been loaded that should create a
> /proc/mdstat interface from the kernel. It is only present after the
> driver is loaded.
>
> $ cat /proc/mdstat
>
> I am hoping that it will show some good information about the state of
> things at that point. Since you were able to post the other output
> from mdadm --examine I am hoping that you will be able to post this
> too.
>
> To manually start an array you would assemble the existing components.
>
> mdadm --assemble /dev/md0 /dev/sda1 /dev/sdc1
>
> And if we knew the disk partitions of the other arrays then we could
> assemble them too. Hopefully. I have my fingers crossed for you.
> It would be something similar to the above, but with /dev/md1 and the
> two partitions for that array.
>
> Or if the array is claiming that it is failed (if that has happened)
> then you could --add the partitions back in.
>
> mdadm /dev/md1 --add /dev/sdc2 ## for example only
>
> After being added back in the array will be sync'ing between them.
> This data sync can be monitored by looking at /proc/mdstat.
>
Here's the output from fsarchiver, a program on sysrescuecd-2.0.0.
It may help with the current status question:
http://pastebin.com/csAK3kGw
> cat /proc/mdstat
>
> If you get to that point and things are sync'ing then I would tend to
> let that finish before doing anything else.
>
> Note that there is a file /etc/mdadm/mdadm.conf that contains the
> UUIDs (and in previous releases also the /dev/sda1 type devices) of
> the arrays. If things have changed then that file will need to be
> updated.
>
> That file /etc/mdadm/mdadm.conf is used to build the mdadm.conf file
> in the boot initrd (initial ramdisk). If it changes for the root
> volume or for any volume used by lvm then the initrd will need to be
> rebuilt in order to update that file.
>
> Let me say this again because it is important. If you add a partition
> to the root lvm volume group then you must rebuild the initrd or your
> system will not boot. I have been there and done that myself on more
> than one occasion.
>
> dpkg-reconfigure linux-image-2.6.32-5-i686 # choose kernel package
>
> The easiest way to do this is to boot a Debian rescue cd and boot into
> the system, update the /etc/mdadm/mdadm.conf file and then issue the
> above dpkg-reconfigure command.
>
> You can get the information for the /etc/mdadm/mdadm.conf file by
> using the mdadm command to scan for it.
>
> $ sudo mdadm --detail --scan
root@sysresccd /root % mdadm --detail --scan
ARRAY /dev/md/Speeduke:2 metadata=1.2 name=Speeduke:2 UUID=91ae6046:969bad93:92136016:116577fd
ARRAY /dev/md/126_0 metadata=0.90 UUID=c06c0ea6:5780b170:ea2fd86a:09558bd1
ARRAY /dev/md/125_0 metadata=0.90 UUID=e45b34d8:50614884:1f1d6a6a:d9c6914c
>
> Use that information to update the /etc/mdadm/mdadm.conf file.
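One way to fold that scan output into the config file is sketched below. The helper name replace_array_lines is made up for illustration, and it assumes that every array definition in the file is a single line beginning with ARRAY. It keeps a .bak copy of the old file. Treat it as a sketch to adapt, not a tested procedure for this particular system.

```shell
#!/bin/sh
# replace_array_lines CONF (hypothetical helper):
# replace the ARRAY lines in CONF with the ARRAY lines read from stdin.
replace_array_lines() {
    conf=$1
    tmp=$(mktemp) || return 1
    # Keep everything except the old ARRAY definitions.
    # grep exits non-zero when nothing is left over, which is fine here.
    grep -v '^ARRAY' "$conf" > "$tmp" || true
    # Append the new definitions arriving on standard input.
    cat >> "$tmp"
    # Save a backup, then overwrite the file in place.
    cp "$conf" "$conf.bak" && cat "$tmp" > "$conf"
    rm -f "$tmp"
}
```

Run inside the installed system, it might be used as "mdadm --detail --scan | replace_array_lines /etc/mdadm/mdadm.conf", followed by the dpkg-reconfigure step quoted above so that the initrd picks up the new file.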
>
> > Thanks for the help.. You need all the help you can get at 76yrs..
>
> I am still a few decades behind you but we are all heading that same
> direction. :-) Patience, persistence and tenacity will get you
> through it. You are doing well. Just keep working the problem.
>
> Bob
root@sysresccd /root % cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md125 : active raid1 sda1[0] sdc1[1]
      9767424 blocks [2/2] [UU]
md126 : active raid1 sda5[0] sdc5[1]
      302801024 blocks [2/2] [UU]
md127 : active raid1 sdb[0] sdd[1]
      488385424 blocks super 1.2 [2/2] [UU]
unused devices: <none>
That is the output from cat /proc/mdstat on the sysresccd system,
which is running in memory.
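To compare the running arrays against the original layout, the array-to-member mapping can be pulled out of the mdstat text. The function below is only a sketch. The name md_members is invented here, and it assumes the usual "mdN : state level member[slot] ..." line format.

```shell
#!/bin/sh
# md_members [FILE] (hypothetical helper):
# print each md array with its raid level and member devices.
# FILE defaults to /proc/mdstat; pass a saved copy to inspect it offline.
md_members() {
    awk '
    /^md[0-9]+ :/ {
        level = "-"; members = ""
        for (i = 3; i <= NF; i++) {
            if ($i ~ /\[[0-9]+\]/) {      # member, e.g. sda1[0] or sdc1[1](F)
                dev = $i
                sub(/\[.*/, "", dev)      # drop the [slot] and any flags
                members = members " /dev/" dev
            } else if ($i != "active" && $i != "inactive" && $i !~ /^\(/) {
                level = $i                # e.g. raid1
            }
        }
        print "/dev/" $1, level members
    }' "${1:-/proc/mdstat}"
}
```

Called with no argument it reads the live /proc/mdstat and prints one line per array, with the array device first, then the level, then the members.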
My thinking is that I should rerun mdadm and reassemble the arrays to
the original definitions:
    /dev/md0 from sda1 & sdb1
    /dev/md1 from sda5 & sdb5
Note: sdb2 is a legacy msdos extended partition.
I would not build an md device with msdos extended partitions under LVM2
at this time. Agree?
Is the above doable? If I can figure out the right mdadm commands... 8-)
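A possible shape for those commands, written as a dry run that only prints what it would do. The helper name plan_rename is made up, and the mapping md125 -> md0 and md126 -> md1 is an assumption. The member partitions are taken from the /proc/mdstat output above (sda1/sdc1 and sda5/sdc5). The --update=super-minor option applies only to 0.90 metadata arrays, which is what the scan output reports for these two. The arrays must not be in use (no active LVM volumes on them) when they are stopped.

```shell
#!/bin/sh
# plan_rename OLD NEW MEMBER... (hypothetical helper):
# print, but do not run, the commands that would stop array OLD and
# reassemble it from its members under the name NEW.
plan_rename() {
    old=$1 new=$2
    shift 2
    echo "mdadm --stop /dev/$old"
    # super-minor rewrites the preferred minor stored in a 0.90 superblock
    echo "mdadm --assemble /dev/$new --update=super-minor $*"
}

# Assumed mapping; check it against the real layout before running anything.
plan_rename md125 md0 /dev/sda1 /dev/sdc1
plan_rename md126 md1 /dev/sda5 /dev/sdc5
```

Printing the commands first leaves room to check them against the mdadm man page, and to ask here, before anything touches the disks.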
Thanks again, Bob & Tom
Jack