
Re: Grub2 reinstall on raid1 system.



On Sat, 15 Jan 2011 16:57:46 -0700
Bob Proulx <bob@proulx.com> wrote:

> Jack Schneider wrote:
> > Bob Proulx wrote:
> > > Jack Schneider wrote:
> > > > I have a raid1-based W/S running Debian Squeeze, up to date
> > > > (was, until ~7 days ago).  There are 4 drives, 2 of which had
> > > > never been used or formatted.  I configured a new array using
> > > > Disk Utility from a live Ubuntu CD.  That's where I screwed
> > > > up...  The end result was that the names of the arrays were
> > > > changed on the working 2 drives, i.e. /dev/md0 became
> > > > /dev/md126 and /dev/md1 became /dev/md127.
> > > 
> > > Something else must have happened too.  Because normally just
> > > adding arrays will not rename the existing arrays.  I am not
> > > familiar with the "Disk Utility" that you mention.
> >
> > Hi, Bob 
> > Thanks for your encouraging advice...
> 
> I believe you should be able to completely recover from the current
> problems.  But it may be tedious and not completely trivial.  You will
> just have to work through it.
> 
> Now that there is more information available, and knowing that you are
> using software raid and lvm, let me guess.  You added another physical
> extent (a new /dev/md2 partition) to the root volume group?  If so
> that is a common problem.  I have hit it myself on a number of
> occasions.  You need to update the mdadm.conf file and rebuild the
> initrd.  I will give more details about that as I go through this
> message.
> 
> > As I mentioned in a prior post, Grub was leaving me at a "grub
> > rescue>" prompt.
> > 
> > I followed this procedure:
> > http://www.gnu.org/software/grub/manual/html_node/GRUB-only-offers-a-rescue-shell.html#GRUB-only-offers-a-rescue-shell
> 
> That seems reasonable.  It talks about how to drive the grub boot
> prompt to manually set up the boot.
> 
> But you were talking about using a disk utility from a live cd to
> configure a new array with two new drives and that is where I was
> thinking that you had been modifying the arrays.  It sounded like it
> anyway.
> 
> Gosh it would be a lot easier if we could just pop in for a quick peek
> at the system in person.  But we will just have to make do with the
> correspondence course.  :-)
> 
> > Booting now leaves me at a BusyBox prompt.  However, the Grub menu
> > is correct, with the correct kernels.  So it appears that Grub is
> > now finding the root/boot partitions and files.
> 
> That sounds good.  Hopefully not too bad off then.
> 
> > > Next time instead you might just use mdadm directly.  It really is
> > > quite easy to create new arrays using it.  Here is an example that
> > > will create a new device /dev/md9 mirrored from two other devices
> > > /dev/sdy5 and /dev/sdz5.
> > > 
> > >   mdadm --create /dev/md9 --level=mirror --raid-devices=2 \
> > >     /dev/sdy5 /dev/sdz5
> > > 
> > > > Strangely the md2 array which I setup on the added drives
> > > > remains as /dev/md2. My root partition is/was on /dev/md0. The
> > > > result is that Grub2 fails to boot the / array.
> 
> > This is how I created /dev/md2.
> 
> Then that explains why it didn't change.  Probably the HOMEHOST
> parameter is involved on the ones that changed.  Using mdadm from the
> command line doesn't set that parameter.
> 
> There was a long discussion about this topic just recently.  You
> might want to jump into the middle of it here and read what we
> learned about HOMEHOST:
> 
>   http://lists.debian.org/debian-user/2010/12/msg01105.html
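> 
> For example (just a sketch; adjust the hostname and the member
> partitions to match your system), you can have mdadm record your own
> hostname in the superblocks while assembling, so the arrays are not
> treated as "foreign" and renamed to md126/md127:
> 
>   mdadm --assemble /dev/md0 --update=homehost --homehost=yourhost \
>     /dev/sda1 /dev/sdc1
> 
> A "HOMEHOST <system>" line in /etc/mdadm/mdadm.conf tells mdadm to
> use the machine's own hostname for that check.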
> 
> > mdadm --examine /dev/sda1 & /dev/sda2 gives, I think, a clean
> > result.  I have posted the output at: http://pastebin.com/pHpKjgK3
> 
> That looks good to me for that part.  Healthy and normal.
> 
> But that is only the first partition.  That is just /dev/md0.  Do you
> have any information on the other partitions?
> 
> You can look at /proc/partitions to get a list of all of the
> partitions that the kernel knows about.
> 
>   cat /proc/partitions
> 
> Then you can poke at the other ones too.  But it looks like the
> filesystems are there okay.
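> 
> For example, once you know the partition names (I am only guessing at
> /dev/sda2 and /dev/sdc2 here), you can examine them the same way:
> 
>   mdadm --examine /dev/sda2 /dev/sdc2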
> 
> > mdadm --detail /dev/md0 gives: "mdadm: md device /dev/md0 does
> > not appear to be active."
> > 
> > There is no /proc/mdstat data output.
> 
> So it looks like the raid data is there on the disks but that the
> multidevice (md) module is not starting up in the kernel.  Because it
> isn't starting then there aren't any /dev/md* devices and no status
> output in /proc/mdstat.
> 
> > > I would boot a rescue image and then inspect the current
> > > configuration using the above commands.  Hopefully that will show
> > > something wrong that can be fixed after you know what it is.
> 
> I still think this is the best course of action for you.  Boot a
> rescue disk into the system and then go from there.  Do you have a
> Debian install disk #1 or Debian netinst or other installation disk?
> Any of those will have a rescue system that should boot your system
> okay.  The Debian rescue disk will automatically search for raid
> partitions and automatically start the md modules.
> 
> > So it appears that I must rebuild my arrays.
> 
> I think your arrays might be fine.  More information is needed.
> 
> You said your boot partition was /dev/md0.  I assume that your root
> partition was /dev/md1?  Then you added two new disks as /dev/md2?
> 
>   /dev/md0   /dev/sda1  /dev/sdc1
> 
> Let me guess at the next two:
> 
>   /dev/md1   /dev/sda2  /dev/sdc2  <-- ?? missing info ??
>   /dev/md2   /dev/sdb1  /dev/sdd1  <-- ?? missing info ??
> 
> Are those even close to being correct?
> 
> > I think I can munge thru the mdadm man pages or Debian Reference to
> > get the tasks.
> 
> If you only have the Ubuntu live cd system then you can boot it up
> from the cdrom and then use it to get the arrays started.  I still
> think using a rescue disk is better.  But with the live cd you
> mentioned using before you can also boot and do system repair.  You
> will probably need to manually load the md and dm modules.
> 
>   $ sudo modprobe md_mod
>   $ sudo modprobe dm_mod
> 
> And then after those have been loaded that should create a
> /proc/mdstat interface from the kernel.  It is only present after the
> driver is loaded.
> 
>   $ cat /proc/mdstat
> 
> I am hoping that it will show some good information about the state of
> things at that point.  Since you were able to post the other output
> from mdadm --examine I am hoping that you will be able to post this
> too.
> 
> To manually start an array you would assemble the existing components.
> 
>   mdadm  --assemble /dev/md0   /dev/sda1 /dev/sdc1
> 
> And if we knew the disk partitions of the other arrays then we could
> assemble them too.  Hopefully.  I have my fingers crossed for you.
> But something similar to the above but with /dev/md1 and the two
> partitions for that array.
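> 
> If you are not sure which partitions belong to which array, mdadm can
> also scan the superblocks and assemble everything it finds:
> 
>   mdadm --assemble --scan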
> 
> Or if the array is claiming that it is failed (if that has happened)
> then you could --add the partitions back in.
> 
>   mdadm /dev/md1 --add /dev/sdc2   ## for example only
> 
> After being added back in the array will be sync'ing between them.
> This data sync can be monitored by looking at /proc/mdstat.
> 

Here's output from fsarchiver, a program on sysrescuecd-2.0.0.
It may help answer the current status question:

http://pastebin.com/csAK3kGw


>   cat /proc/mdstat
> 
> If you get to that point and things are sync'ing then I would tend to
> let that finish before doing anything else.
> 
> Note that there is a file /etc/mdadm/mdadm.conf that contains the
> UUIDs (and in previous releases also the /dev/sda1 type devices) of
> the arrays.  If things have changed then that file will need to be
> updated.
> 
> That file /etc/mdadm/mdadm.conf is used to build the mdadm.conf file
> in the boot initrd (initial ramdisk).  If it changes for the root
> volume or for any volume used by lvm then the initrd will need to be
> rebuilt in order to update that copy of the file.
> 
> Let me say this again because it is important.  If you add a partition
> to the root lvm volume group then you must rebuild the initrd or your
> system will not boot.  I have been there and done that myself on more
> than one occasion.
> 
>   dpkg-reconfigure linux-image-2.6.32-5-686   # choose your kernel package
> 
> The easiest way to do this is to boot a Debian rescue cd and boot into
> the system, update the /etc/mdadm/mdadm.conf file and then issue the
> above dpkg-reconfigure command.
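> 
> Rebuilding the initrds for all installed kernels directly also works:
> 
>   update-initramfs -u -k all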
> 
> You can get the information for the /etc/mdadm/mdadm.conf file by
> using the mdadm command to scan for it.
> 
>   $ sudo mdadm --detail --scan
root@sysresccd /root % mdadm --detail --scan
ARRAY /dev/md/Speeduke:2 metadata=1.2 name=Speeduke:2 UUID=91ae6046:969bad93:92136016:116577fd
ARRAY /dev/md/126_0 metadata=0.90 UUID=c06c0ea6:5780b170:ea2fd86a:09558bd1
ARRAY /dev/md/125_0 metadata=0.90 UUID=e45b34d8:50614884:1f1d6a6a:d9c6914c


> 
> Use that information to update the /etc/mdadm/mdadm.conf file.
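> 
> One way to do that, after removing any stale ARRAY lines from the
> file first, is to append the scan output directly:
> 
>   mdadm --detail --scan >> /etc/mdadm/mdadm.conf
> 
> On Debian the /usr/share/mdadm/mkconf script will also generate a
> fresh mdadm.conf for you.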
> 
> > Thanks for the help..  You need all the help you can get at 76yrs..
> 
> I am still a few decades behind you but we are all heading that same
> direction.  :-) Patience, persistence and tenacity will get you
> through it.  You are doing well.  Just keep working the problem.
> 
> Bob
root@sysresccd /root % cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md125 : active raid1 sda1[0] sdc1[1]
      9767424 blocks [2/2] [UU]
      
md126 : active raid1 sda5[0] sdc5[1]
      302801024 blocks [2/2] [UU]
      
md127 : active raid1 sdb[0] sdd[1]
      488385424 blocks super 1.2 [2/2] [UU]
      
unused devices: <none>




That is the output from cat /proc/mdstat on the sysresccd system,
which is running in memory.

My thinking is that I should rerun mdadm and reassemble the arrays to
the original definitions:

  /dev/md0  from sda1 & sdb1
  /dev/md1  from sda5 & sdb5  (note: sdb2 is a legacy msdos extended
                               partition)

I would not build an md device on msdos extended partitions under LVM2
at this time.  Agree?

Is the above doable?  If I can figure out the right mdadm commands... 8-)
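
Maybe something like this?  (Just guessing at the syntax from the man
page, and using the member partitions shown in the mdstat output
above; please correct me if this is wrong.)

  mdadm --stop /dev/md125
  mdadm --assemble /dev/md0 --update=super-minor /dev/sda1 /dev/sdc1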



Thanks again, Bob & Tom

Jack

