
grub-legacy fails with its root on raid1



Please CC me on reply, since I am not subscribed to the list.

Hello,

Just recently I was faced with a non-booting remote box running Debian 6.0 (Squeeze). Lacking access to any kind of console, I was quite clueless as to why it wouldn't come up after what should have been a routine reboot. I do have access to a rescue system, though: when requested from the provider, it reboots the machine into a live system. I now think I have found the problem: grub-legacy is trying to boot off /dev/md0, which is indeed my boot partition:

# Excerpt of grub's menu.lst

  title           Debian GNU/Linux, kernel 2.6.32-5-amd64
  root            (md0) # software raid1 consisting of /dev/sd[ab]1
  kernel          /vmlinuz-2.6.32-5-amd64 root=/dev/md1 ro
  initrd          /initrd.img-2.6.32-5-amd64

Note that grub's root is set to (md0) by update-grub. I am not entirely sure, but I thought that wasn't possible with grub-legacy. I might be wrong, though. However, manually changing menu.lst to use (hd0,0) as grub's root finally got the machine to boot successfully again, after some very frustrating attempts to find the cause of the problem.
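
For reference, the manual change described above can be sketched as a small shell helper. This is only a sketch under assumptions: the function name fix_grub_root and the .bak backup copy are made up for illustration, and it rewrites only uncommented "root" lines that point at (md0), leaving the rest of menu.lst untouched.

```shell
#!/bin/sh
# Hypothetical helper (not part of grub or Debian): point grub-legacy's
# "root" lines at the first partition of the first disk instead of (md0).
fix_grub_root() {
    menu_lst="$1"
    # Keep a backup copy next to the original before rewriting it.
    cp -p "$menu_lst" "$menu_lst.bak" &&
    # Match only lines that begin with optional whitespace and the word
    # "root"; kernel lines containing root=/dev/md1 are not affected.
    sed -e 's/^\([[:space:]]*root[[:space:]][[:space:]]*\)(md0)/\1(hd0,0)/' \
        "$menu_lst.bak" > "$menu_lst"
}
```

With both RAID members mounted separately in the rescue system, it would be called once per copy, e.g. fix_grub_root /mnt/sda1/grub/menu.lst and again for the copy on sdb1 (the mount points here are assumptions).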

Now, I honestly don't know how it worked before, as I don't know whether menu.lst always pointed grub-legacy to (md0). I haven't found a conclusive answer to whether grub-legacy is capable of using raid1 as its root, since the documentation I found online was either about grub2 or did not clearly distinguish between the two grub versions.

So my questions: Is (the Debian Squeeze version of) grub-legacy capable of "root (md0)"? If so, why doesn't it work anymore? And if not, why is update-grub writing a faulty menu.lst?

Having said that, I want to elaborate some more, since I had some reboot problems, and fixed them, not long before this. Maybe I broke something myself and am not aware of it.

Last weekend I wanted to try Xen for virtualization, so I installed linux-image-2.6.32-5-xen-amd64. I checked menu.lst and the first entry was the newly installed kernel. So I rebooted. At first the machine seemed unresponsive, and I was about to reboot it into the rescue system when suddenly my pings were answered and I could log in. The uptime command revealed the system had already been running for 11 minutes, yet for about 10 of those minutes no echo reply came back and no login was possible. I thought I was missing something for a fully functional Xen hypervisor system, and indeed I was, so I installed xen-linux-system-2.6-xen-amd64 and rebooted again.

Now the machine stayed unresponsive for over an hour, and I assumed it would for all eternity, so I booted into the rescue system to manually change the "default" entry in menu.lst. /dev/md0 was not available there, and not knowing how to change that at the time, I mounted /dev/sda1 and /dev/sdb1 directly and changed both menu.lst files. After that the machine booted with the normal non-Xen kernel as expected.

I did some more reading on Xen and figured it might not be worth the trouble to try to get it running on a remote box, if that is possible at all; I still don't know. So I installed qemu-kvm, libvirt-bin and virt-manager, had a go with KVM, and decided to stick with it. Meanwhile I had also installed gparted to see whether it was capable of resizing my root partition /dev/md1 (/dev/sda3, /dev/sdb3), without ever trying it. So the last successful (re)boot was roughly 2 days before I found myself faced with the non-booting machine described above:

# zgrep reboot ~log/syslog*
/var/log/syslog.1:Dec 3 20:01:32 shutdown[2521]: shutting down for system reboot # unsuccessful boot
[...]
/var/log/syslog.3.gz:Dec 1 14:46:13 shutdown[30931]: shutting down for system reboot
/var/log/syslog.3.gz:Dec 1 14:48:17 /usr/sbin/cron[1614]: (CRON) INFO (Running @reboot jobs)
[...]

According to my aptitude log, I installed gparted including its dependencies and updated libxml2, libxml2-dev, libxml2-utils and python-libxml2. I also purged linux-image-2.6-xen-amd64, xen-linux-system-2.6-xen-amd64, xen-qemu-dm-4.0 and linux-image-2.6.32-5-xen-amd64 including their dependencies. After the purge I rebooted, and that's when the machine wouldn't come up anymore. I believe the only reason it failed to boot is grub using (md0) as its root.

So, have I broken something myself, is there something wrong with grub-legacy, or is it something else entirely? Any help will be greatly appreciated, and sorry if this message is a bit long.

Cheers
Marcus

