
Bug#489006: debian-installer: After grub software raid installation, machine fails to boot with first drive removed or blanked.



Package: debian-installer
Version: 20070308etch2
Severity: normal

I preseeded an Etch installation with:

d-i grub-installer/bootdev  string (hd0) (hd1)

but it didn't do what I expected with respect to software RAID and
failed drives.

The current behaviour seems to be:

boot from sda if it is first BIOS drive (0x80)
boot from sdb if it is second BIOS drive (0x81)

If the first hard drive has completely failed, or is missing, the
machine will attempt to boot from the second drive, but fail, because
grub tries to access drive 0x81, whereas the BIOS has now assigned 0x80
to the remaining drive.
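
As a rough illustration of the renumbering described above, here is a
toy model in shell. It assumes the BIOS simply numbers the drives it
can see in order, starting at 0x80; bios_numbers is a hypothetical
helper for illustration, not a real tool, and real BIOSes may differ.

```shell
#!/bin/sh
# Toy model (assumption): the BIOS numbers visible drives in order from 0x80.
# bios_numbers is a hypothetical helper, not part of grub or d-i.
bios_numbers() {
    n=128                       # 128 decimal is 0x80, the first BIOS drive
    for d in "$@"; do
        printf '0x%x %s\n' "$n" "$d"
        n=$((n + 1))
    done
}

bios_numbers sda sdb    # both drives present: 0x80 sda, 0x81 sdb
bios_numbers sdb        # sda missing: sdb becomes 0x80
```

With sda gone, a grub on sdb that was installed expecting to be 0x81
no longer finds itself at that number.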

If I "dd if=/dev/zero of=/dev/sda count=1" (i.e. remove the partition
table and the beginning of the grub code) then the boot still fails, as
the BIOS still assigns the second drive 0x80.

OTOH, if the first hard drive is present but unreadable, AND the BIOS
attempts to boot from the second hard drive while still assigning it
0x81, then boot will succeed with the existing code, but not with my
post-install commands (although you could still get the machine to boot
by disabling or removing the first drive).  However, I haven't observed
a BIOS which behaves like this.

I got better behaviour by executing this as a
post-install step:

echo '(hd0) /dev/sdb' > /target/boot/grub/device.map
in-target /usr/sbin/grub-install hd0
echo '(hd0) /dev/sda' > /target/boot/grub/device.map
in-target /usr/sbin/grub-install hd0
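
The same steps could be written as a loop over the RAID members. This
is only a sketch: DRY_RUN, TARGET and install_grub_as_hd0 are names I
made up for illustration, and it defaults to printing the commands
rather than running them. It assumes the two disks are /dev/sda and
/dev/sdb, as above.

```shell
#!/bin/sh
# Sketch: install grub on each RAID member as if it were the first BIOS drive.
# DRY_RUN, TARGET and install_grub_as_hd0 are hypothetical names.

DRY_RUN="${DRY_RUN:-1}"        # 1 = only print the commands (default)
TARGET="${TARGET:-/target}"    # where the installer mounts the new system

install_grub_as_hd0() {
    disk="$1"
    if [ "$DRY_RUN" = 1 ]; then
        echo "echo '(hd0) $disk' > $TARGET/boot/grub/device.map"
        echo "in-target /usr/sbin/grub-install hd0"
    else
        # Map the disk to (hd0) so grub embeds BIOS drive 0x80 for it.
        echo "(hd0) $disk" > "$TARGET/boot/grub/device.map"
        in-target /usr/sbin/grub-install hd0
    fi
}

# sda goes last so the device.map left behind maps (hd0) to sda.
for disk in /dev/sdb /dev/sda; do
    install_grub_as_hd0 "$disk"
done
```

Setting DRY_RUN=0 inside the installer environment would run the
commands for real.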

Behaviour with the additional post-install steps seems to be:

boot from sda if it is first BIOS drive (0x80)
boot from sdb if it is first BIOS drive (0x80)

It is possible to make sdb the first BIOS drive (0x80) by telling
the BIOS to boot from the second drive, and/or physically removing
sda.  If I "dd if=/dev/zero of=/dev/sda count=1" (i.e. remove the
partition table and the beginning of the grub code) then the boot still
succeeds.

A better fix, which should work in all situations, would be to change
grub so that it attempts to fall back to reading from BIOS drive 0x81
if 0x80 doesn't work (i.e. it is blank, or unwell so that reads fail,
etc.).  Without reviewing the grub code in detail, I don't know if this
second fix would be possible.


-- System Information:
Debian Release: 4.0
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: amd64 (x86_64)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.25-2-amd64
Locale: LANG=en_GB, LC_CTYPE=en_GB (charmap=ISO-8859-1)

-- no debconf information


