
Re: Solved! (Re: Now won't boot (was: Re: Squeeze assembles one RAID array, at boot but not the other



Hendrik Boom wrote:
> The first problem was that I hadn't fixed up mdadm.conf and hadn't 
> rebuilt initrd.
> That fixed, I could find the gpt partitions.

Yay!

> But reconfiguring the kernel package also had the effect of 
> overwriting my boot disk, and the version of the system I 
> was running happened not to have a proper lilo configuration,
> so I ended up with an unusable boot floppy.

I am sure that I don't understand your boot arrangement.  There is the
MBR and then "/boot" and then "/" in the boot sequence.  If you are
booting from floppies then your MBR is on the floppy.  It sounds like
your /boot is on the floppy too?

If I remember correctly (it has been a while since I used lilo), lilo
has to update the MBR with the new boot information itself.  If you are
using lilo then its boot area must be re-written with the new kernel
boot information every time the kernel is updated.

When using grub this is different.  Grub uses a multi-stage boot
loader.  The MBR portion doesn't change.  It reads the configuration
from the filesystem.  So with grub the MBR has no need for updates.

As I recall, the Lenny 5.0 release notes recommended that grub users
update the MBR with the latest bits anyway, since it isn't otherwise
updated and there had been changes in that release cycle.

With both grub and lilo, reconfiguring the kernel package will update
the /boot/initrd.img-X.Y.Z-$ARCH matching the currently running kernel.
But that is what we wanted.

> The lilo.conf was out of date, and referred to a nonexistent 
> root partition.

Out of date because of the recent changes in disks?  That sounds like
the root of the problem.  Good that you were able to debug to that
point.

> Somehow, the boot process failed to notice this and instead
> of the kernel complaining about not having a root partition, 
> it complained about the absence of /sbin/init.

That does seem odd.

> Or maybe it found a stray partition and happened to mount it on
> /root, uselessly.  Hard to say.

I don't think it mounts "stray" partitions.

> When I use the mount command in the initramfs shell all it'll tell
> me is that /dev/root is mounted on /.  Not very informative.

Nope.

> But I managed to boot using an old lilo floppy left over from
> another Linux I had on the machine a while ago.
> It still had the ability to boot into squeeze but with a very 
> old kernel.  I immediately write-protected the floppy, just 
> in case.

Good idea!

> Using that, I could edit the lilo.conf file.  I inserted a variety
> of stanzas, requesting different kernels and specifying the (same)
> root partition in a variety of ways.

Yay!  System mostly rescued.

> root=/dev/mapper/VG1-squeezeroot worked.
> root=/dev/VG1/squeezeroot did not.
> Both the old and new kernels would boot and find the gpt partitions,
> but only the new kernel could access the ext4 file systems on
> them.

Ah, yes, because ext4 is newer than the old kernel.  Makes sense.
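
On the two root= spellings: as far as I know both names refer to the
same logical volume.  /dev/VG1/squeezeroot is a convenience symlink,
while /dev/mapper/VG1-squeezeroot is the device-mapper name itself,
which may be why only the latter was usable that early in the boot.
Here is a small Python sketch of how the mapper name is formed.  The
helper name dm_path is made up for illustration.

```python
def dm_path(vg: str, lv: str) -> str:
    """Return the /dev/mapper path for an LVM logical volume.

    device-mapper joins the VG and LV names with a single hyphen, so a
    hyphen inside either name is doubled to keep the result unambiguous.
    """
    def escape(name: str) -> str:
        return name.replace("-", "--")

    return "/dev/mapper/" + escape(vg) + "-" + escape(lv)


print(dm_path("VG1", "squeezeroot"))  # /dev/mapper/VG1-squeezeroot
```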

> But what still doesn't work is booting with grub2.  I think the BIOS
> is having a hard time figuring out which of my four hard drives to boot.
> I suppose it's time to look into the BIOS parameters and/or switch cables
> connecting the disks to the disk controller cards.

I have debugged one issue lately where the motherboard silkscreen
labels for the SATA ports were different from the order the BIOS
reported.  That is, SATA 0 mapped to BIOS 4, SATA 1 to BIOS 2, and so
forth in a strange order.  I had to determine the correct mapping so
that I could get my boot drive into the BIOS SATA 0 position.

The standard BIOS will try to boot from the lowest numbered SATA port.
For example if disks are in SATA 2 and SATA 3 then it will try to boot
from 2.  If in SATA 0 and SATA 1 then it will try to boot from SATA 0.
That lowest SATA port disk is the disk that should be your desired
boot disk with the MBR on it.  Any other disk order won't work.

This is sometimes confusing.  Adding new disks can cause the "first"
BIOS disk to change to a different disk.  For example, suppose you
previously had disks in 2 & 3 with the system booting fine from number
2, and then added new disks to 0 & 1.  The BIOS would then try to boot
from 0, and unless an MBR had been installed there it would fail.  I
always order the cables so that the BIOS order matches the disk I want
to have as the boot disk.
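
To make that concrete, here is a toy model in Python of the simple
behaviour I am describing.  It is only a sketch: the function name
bios_boot_port and the dict layout are invented for this example, and
a real BIOS with a configurable boot order will behave differently.

```python
from typing import Dict, Optional


def bios_boot_port(ports: Dict[int, bool]) -> Optional[int]:
    """Model of a BIOS that only tries the lowest populated port.

    `ports` maps a BIOS port number to True if the disk on that port
    has a boot loader installed in its MBR.  The result is the port
    that boots, or None if the lowest populated port is not bootable.
    """
    if not ports:
        return None
    first = min(ports)
    return first if ports[first] else None


# Disks on ports 2 and 3, boot loader on 2: boots from 2.
print(bios_boot_port({2: True, 3: False}))
# Blank disks added on ports 0 and 1: port 0 is tried and the boot fails.
print(bios_boot_port({0: False, 1: False, 2: True, 3: False}))
```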

After the kernel has booted, almost all of the modern subsystems such
as mdadm, lvm, and the fstab use UUIDs to identify the arrays, volumes
and filesystems.  So after the kernel has booted, everything using
UUIDs will be okay regardless of the disk cable ordering.
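
If you want to see which device node each filesystem UUID currently
points at after recabling, something like this Python sketch should do.
It assumes a Linux system where udev populates /dev/disk/by-uuid, and
the function name uuid_map is just my own.

```python
import os
from typing import Dict


def uuid_map(by_uuid_dir: str = "/dev/disk/by-uuid") -> Dict[str, str]:
    """Map each filesystem UUID to the device node it points at."""
    result: Dict[str, str] = {}
    if not os.path.isdir(by_uuid_dir):
        return result  # directory absent, e.g. not a udev-based system
    for name in sorted(os.listdir(by_uuid_dir)):
        link = os.path.join(by_uuid_dir, name)
        # Each entry is a symlink; resolve it to the real device path.
        result[name] = os.path.realpath(link)
    return result


for uuid, device in uuid_map().items():
    print(uuid, "->", device)
```

The UUIDs stay the same when the cables move.  Only the device names
on the right-hand side change.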

Bob
