Re: Grub2 reinstall on raid1 system.
On Fri, 14 Jan 2011 05:25:45 -0700
Bob Proulx <bob@proulx.com> wrote:
> Jack Schneider wrote:
> > I have a raid1 based W/S running Debian Squeeze, up to date (or it
> > was until ~7 days ago). There are 4 drives, 2 of which had never been
> > used or formatted. I configured a new array using Disk Utility from
> > a live Ubuntu CD. That's where I screwed up... The end result was
> > that the names of the arrays were changed on the working 2 drives,
> > i.e. /dev/md0 became /dev/md126 and /dev/md1 became /dev/md127.
>
> Something else must have happened too. Because normally just adding
> arrays will not rename the existing arrays. I am not familiar with
> the "Disk Utility" that you mention.
>
> Next time instead you might just use mdadm directly. It really is
> quite easy to create new arrays using it. Here is an example that
> will create a new device /dev/md9 mirrored from two other devices
> /dev/sdy5 and /dev/sdz5.
>
> mdadm --create /dev/md9 --level=mirror \
>   --raid-devices=2 /dev/sdy5 /dev/sdz5
This is how I created /dev/md2.
>
> > Strangely the md2 array which I setup on the added drives remains as
> > /dev/md2. My root partition is/was on /dev/md0. The result is that
> > Grub2 fails to boot the / array.
>
> You may have to boot a rescue cd. I recommend booting the Debian
> install disk in rescue mode. Then you can inspect and fix the
> problem. But so far you haven't said enough to let us know what
> the problem might be.
>
> > I have tried three REINSTALLING GRUB procedures from Sysresccd
> > online docs and many others (GNU.org, Ubuntu, etc.).
>
> This isn't encouraging. I can tell that you are grasping at straws.
> You have my sympathy. But unfortunately that doesn't help diagnose
> the problem. Remain calm. And repeat exactly the problem that you
> are seeing and the steps you have taken to correct it.
>
I have not made any changes to any files on the root partition. I have
only used the procedures from SystemRescueCD and then backed out. All
seem to fail with the same "linux_raid_member" error.
> > The errors occur when I try to mount the partition with the /boot
> > directory. It complains about file system type 'linux_raid_member'.
>
> I haven't seen that error before. Maybe someone else will recognize
> it.
>
> I don't understand why you would get an error mounting /boot that
> would prevent the system from coming online. Because by the time the
> system has booted enough to mount /boot it has already practically
> booted completely. The system doesn't actually need /boot mounted to
> boot. Grub reads the files from /boot and sets things in motion and
> then /etc/fstab instructs the system to mount /boot.
I get that when using the live rescue disk.
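A note on that error, offered as a sketch rather than a diagnosis: mount
usually reports 'linux_raid_member' when it is pointed at a raid component
partition (such as /dev/sda1) rather than at the assembled /dev/md* device,
because that is the type blkid reports for md members. The helper name
below is hypothetical and every device name is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a blkid-style line describes an md raid member.
# Input is one line such as: /dev/sda1: UUID="..." TYPE="linux_raid_member"
is_raid_member() {
    case "$1" in
        *'TYPE="linux_raid_member"'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use from a live or rescue shell (run as root, placeholder devices):
#   if is_raid_member "$(blkid /dev/sda1)"; then
#       mdadm --assemble --scan   # assemble arrays from their superblocks
#       cat /proc/mdstat          # see which /dev/md* name the array received
#       mount /dev/md126 /mnt     # mount the array, not the member partition
#   fi
```

If the check succeeds, the thing to mount is whichever /dev/md* device
shows up in /proc/mdstat, not the partition itself.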
>
> Usually when the root device cannot be assembled the error I see is
> that the system is "Waiting for root filesystem" and can eventually
> get to a recovery shell prompt.
>
> > This machine has worked for 3 years flawlessly.. Can anyone help
> > with this? Or point me to a place or link to get this fixed. Google
> > doesn't help... I can't find an article/posting where it ended
> > successfully. I have considered a full reinstall after Squeeze goes
> > stable, since this O/S is a crufty upgrade from sarge over time. But
> > useless now..
>
> The partitions for raid volumes should be 'autodetect' 0xFD. This
> will enable mdadm to assemble them into raid at boot time.
>
> You can inspect the raid partitions with --detail and --examine.
>
> mdadm --examine /dev/sda1
> mdadm --detail /dev/md0
>
> That will list information about the devices. Replace with your own
> series of devices.
>
> I would boot a rescue image and then inspect the current configuration
> using the above commands. Hopefully that will show something wrong
> that can be fixed after you know what it is.
>
> A couple of other hints: If you are not booting a rescue system but
> using something like a live boot then you may need to load the kernel
> modules manually. You may need to load the dm_mod and md_mod modules.
>
> modprobe md_mod
>
> You might get useful information from looking at the /proc/mdstat
> status.
>
> cat /proc/mdstat
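When the arrays have been renamed, a one-line summary per array can be
easier to read than the raw file. A minimal sketch, assuming the usual
"mdN : state level members" layout of /proc/mdstat; the function name is
hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print one line per md array with its state, level
# and member devices. Reads /proc/mdstat by default; a file path can be
# passed instead, which is handy for trying it on a saved copy.
list_md_arrays() {
    awk '$1 ~ /^md/ && $2 == ":" {
        line = $1
        for (i = 3; i <= NF; i++) line = line " " $i
        print line
    }' "${1:-/proc/mdstat}"
}

# Example use:
#   list_md_arrays
```

Each output line starts with the array name as the running kernel sees it,
so a former md0 that now appears as md126 is visible at a glance.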
>
> There is a configuration file /etc/mdadm/mdadm.conf that holds the
> UUIDs of the configured devices. If those have become corrupted then
> mdadm won't be able to assemble the /dev/md* devices. Check that file
> and compare against what you see with the --detail output.
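The comparison described above can be sketched as a small script. It
compares by UUID rather than by device name, since the names are what
changed here. Both function names are hypothetical, and the paths in the
usage comment are placeholders for wherever the installed root is mounted.

```shell
#!/bin/sh
# Hypothetical helper: print the UUID of every ARRAY line on stdin, sorted.
# Works on mdadm.conf and on the output of "mdadm --detail --scan", which
# use the same ARRAY line format.
array_uuids() {
    awk '$1 == "ARRAY" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^UUID=/) print substr($i, 6)
    }' | sort
}

# Hypothetical helper: succeed only if the config file ($1) and a saved
# scan output ($2) list the same set of array UUIDs.
uuids_match() {
    [ "$(array_uuids < "$1")" = "$(array_uuids < "$2")" ]
}

# Example use from a rescue shell (run as root, placeholder paths):
#   mdadm --detail --scan > /tmp/scan.txt
#   uuids_match /mnt/etc/mdadm/mdadm.conf /tmp/scan.txt || echo "UUID mismatch"
```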
>
> The initrd contains a copy of the mdadm.conf file with the components
> needed to assemble the root filesystem. If the UUIDs differ from what
> is recorded in the initrd then the initrd will need to be rebuilt. To
> do that make sure that the /etc/mdadm/mdadm.conf file is correct and
> then reconfigure the kernel with dpkg-reconfigure.
>
> dpkg-reconfigure linux-image-2.6.32-5-i686
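One way to get the config file correct before rebuilding is to keep its
non-ARRAY lines and take the ARRAY lines from what mdadm currently sees.
This is a sketch under assumptions, not a tested recovery procedure: the
function name is hypothetical, the paths are placeholders, and the usage
comment swaps in update-initramfs as an alternative to the
dpkg-reconfigure step above.

```shell
#!/bin/sh
# Hypothetical helper: print a new mdadm.conf made of the non-ARRAY lines
# of the old config ($1) followed by the ARRAY lines of a saved
# "mdadm --detail --scan" output ($2).
rebuild_mdadm_conf() {
    awk '$1 != "ARRAY"' "$1"
    awk '$1 == "ARRAY"' "$2"
}

# Example use, as root inside a chroot of the installed system:
#   cp /etc/mdadm/mdadm.conf /etc/mdadm/mdadm.conf.bak
#   mdadm --detail --scan > /tmp/scan.txt
#   rebuild_mdadm_conf /etc/mdadm/mdadm.conf.bak /tmp/scan.txt \
#       > /etc/mdadm/mdadm.conf
#   update-initramfs -u -k all   # regenerate the initrd with the new config
```

Review the generated file before rebuilding, because the scan reports the
array names as currently assembled (for example md126), which may not be
the names you want recorded.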
>
> Good luck!
>
> Bob
Thanks, Bob
I will do as you suggest shortly. BTW, there is a little more info in my
reply to Tom.
TIA, Jack