Re: Now won't boot (was: Re: Squeeze assembles one RAID array, at boot but not the other)



On Mon, 07 Jan 2013 11:20:45 -0700, Bob Proulx wrote:

> Hendrik Boom wrote:
>> Now won't boot
>> ...
>> I had to mdadm --assemble /dev/md1 /dev/sdd2 /dev/sdb2 first.
> 
> Oh!  I am glad to hear that you were able to work through it manually
> though and get going.
> 
>> > Did you rebuild the initrd images for the booting kernel after having
>> > done this?
>> > 
>> > Example:
>> > 
>> >   dpkg-reconfigure linux-image-3.2.0-4-amd64
>> 
>> dpkg-reconfigure linux-image-2.6.32-5-amd64
>> 
>> Different kernel version (this is squeeze, after all)
>> but it seemed to work.
> 
> Yep.
> 
>> Next to reboot.
>> 
>> OOPS
>> 
>> Won't boot.  Gets stuck at the initramfs prompt after complaining
>> that it can't run /sbin/init.
> 
> Scary!
> 
> And just noting for the archive readers that in addition to working
> through the initrd busybox prompt it is also possible to use the
> Debian install media as a rescue system in case this happens and you
> can't make other progress.  The Debian installer image has a rescue
> mode that will automatically assemble all of the autoraid partitions.
> And from there you can chroot into the system and fix things.

Now downloading netinstall disk.  I haven't needed an install disk on 
this system for years and years.  If the coffee shop gets tired of my
download I should be able to boot my son's Ubuntu live CD and get 
something running, anyway.
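
For the archive, a rough sketch of the manual equivalent of that rescue
sequence, in case the rescue menu itself does not get as far as a chroot.
The function only prints the commands so they can be reviewed before
anything is run by hand.  The function name, the root device argument and
the /mnt mount point are illustrative assumptions, not taken from this
system.

```shell
#!/bin/sh
# Print (do not run) a manual rescue sequence for review.
# ASSUMPTION: $1 is a placeholder for the device holding the real root
# filesystem; substitute whatever your own system uses.
print_rescue_plan() {
    root_dev="$1"          # device holding the real root filesystem
    target="${2:-/mnt}"    # where to mount it in the rescue system
    cat <<EOF
mdadm --assemble --scan
vgchange -a y
mount $root_dev $target
mount --bind /dev $target/dev
mount --bind /proc $target/proc
mount --bind /sys $target/sys
chroot $target /bin/bash
EOF
}
```

Calling it as "print_rescue_plan /dev/SOMEVG/root" (a made-up volume name)
lists the seven steps in order.  If /boot lives on its own partition it
would also need mounting inside the target before rebuilding any initrd.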

> 
>> I look with ls, and discover that /sbin exists, but is totally
>> empty.  That seems a good reason to be unable to run /sbin/init.
> 
> Does seem to be a good reason.  I can't remember what is in the bin
> directories of an initrd bootstrapping image.  Pretty sure there is at
> least something in /sbin though.  And noting that for the most part I
> think you will be working out of a busybox shell which is used to
> reduce the amount of file system storage space needed in that early
> boot time environment.

Would it really be complaining about missing /sbin/init on the initrd
ramdisk, or about /sbin/init being missing on the real root partition,
which could be caused by not having mounted it yet?
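
One way to tell the two cases apart from the initramfs prompt, sketched
under an assumption: as far as I know Debian's initramfs-tools mounts the
real root filesystem on /root inside the ramdisk before handing over, so
the check looks for init under that mount point rather than under the
ramdisk's own /sbin.  The function name and its defaults are illustrative.

```shell
#!/bin/sh
# Check whether the init program exists on the mounted real root.
# ASSUMPTION: the initramfs mounts the real root on /root; pass another
# mount point as $1 if your setup differs.
check_target_init() {
    rootmnt="${1:-/root}"
    init="${2:-/sbin/init}"
    if [ -x "$rootmnt$init" ] || [ -L "$rootmnt$init" ]; then
        echo "found $init under $rootmnt"
    else
        echo "no $init under $rootmnt (real root probably not mounted)"
        return 1
    fi
}
```

If the check fails, "cat /proc/mdstat" and "cat /proc/cmdline" at the same
prompt show which arrays were assembled and which root= device the kernel
was told to use.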

> 
>> And ls / clearly shows me a root directory which is *not* my
>> system's root directory. Presumably it's at the stage where it
>> hasn't gotten around to mounting the real root partition yet.
> 
> It will be the initrd (initramfs actually) ram filesystem directory.
> 
>> It looks to me that dpkg-reconfigure linux-image-2.6.32-5-amd64
>> has somehow built me an unusable initrd.  Puzzling, because doesn't
>> it do that every time it upgrades the kernel anyway?  And this is
>> Debian stable I'm running.
> 
> Yes.  It does that every time you take a kernel upgrade.  Because the
> initrd is not a packaged file.  It is built specifically for your
> system after installation.
> 
>> I tried booting with an old kernel, but this made no difference.
>> Puzzling, because why would dpkg-reconfigure linux-image-2.6.32-5-amd64
>> mess with the initrds of other kernels?
> 
> It wouldn't.  You can look at the file dates of /boot/initrd.img-* and
> see when they were last touched.

I'll check that when I've booted my rescue.
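
A minimal sketch of that check, written as a small function so it can be
pointed at the mounted copy of /boot from a rescue system.  The function
name and the directory argument are my own additions.

```shell
#!/bin/sh
# List initrd images with their modification times, newest first.
# $1 is the boot directory, defaulting to /boot; from a rescue system it
# might be something like /mnt/boot instead.
list_initrds() {
    boot_dir="${1:-/boot}"
    # -l shows the modification time; -t sorts by it, newest first.
    ls -lt "$boot_dir"/initrd.img-*
}
```

An image whose timestamp matches the time of the dpkg-reconfigure run was
rebuilt by it; the others should carry older dates.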

> 
>> Maybe that's not the problem.
> 
> Sometimes we have two independent problems.  Or sometimes we snag our
> foot on something while trying to fix one and create another.  I have
> done that too often.
> 
>> Nor did it make a difference whether I booted with grub2 or lilo.
>> Annoying, since I maintain these two independent boot methods just
>> in case.
> 
> They will both use the same initrd image.  If that image is a problem
> then both will be affected the same way.
> 
>> Yes, it has recognised both RAIDs.
>> 
>> Once I had to tell initramfs
>> 
>>     vgchange -a y
>> 
>> to make sure it saw the logical volumes inside my RAIDs.
> 
> That step is usually done by the /etc/init.d/lvm2 script called at
> /etc/rcS.d/S??lvm2 time just after /etc/rcS.d/S09mdadm-raid is
> called.  Because your raid did not assemble (for whatever reason) the
> vgchange could not work.  I am sure that once your raid assembles then
> it will once again enable this to work too.
> 
>> Help!
>> ...
>> And let me preapologise for my future slow responses on this mailing list.
>> Without my server I have to go to a local coffee shop to read and post.
> 
> Oh, painful!  But you do have the machine booting now with your manual
> help and so you should have normal email through it at that point, right?

No.  It's not getting past initramfs.

> 
> I would start at the mkinitrd part of things.  The only change to the
> initrd image should have been the embedded mdadm.conf file.  That
> should have only been an update to the /etc/mdadm/mdadm.conf file.
> 
> If you have a system backup (and if not then shame on you) 

Well, agreed, shame on me.  All user data and /etc and /var are
backed up, but not this.

I have an initrd from an earlier kernel, though.  And I should be able
to edit the grub2 configuration to try booting that kernel somehow once I get into
the system with a rescue disk. 

> pull the
> immediately previous initrd /boot/initrd.img-2.6.32-5-amd64 image from
> your backup and restore it.  It booted before.  It should boot now.
> With the same problem you had previously of it not recognizing the
> latest raid array that you just created.
> 
> I would then generate a new initrd image.  Then unpack both the old
> and new images and compare them to see what is different between the
> two images.  They should be very much the same with the exception of
> the embedded mdadm.conf file.  If they are not then the differences
> should provide clues to the problem.  Something else on your system
> must have been changed which is breaking this.
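
A sketch of the unpack-and-compare step, assuming the images are
gzip-compressed cpio archives (the usual default, but worth confirming
with "file" on the image first).  The function names and the scratch
directories are hypothetical.

```shell
#!/bin/sh
# Unpack an initrd image into a scratch directory.
# ASSUMPTION: the image is a gzip-compressed cpio archive.
# $1 must be an absolute path, because the function changes directory.
unpack_initrd() {
    image="$1"
    dest="$2"
    mkdir -p "$dest" || return 1
    # -i extract, -d create leading directories as needed
    ( cd "$dest" && zcat "$image" | cpio -id --quiet )
}

# Report which files differ between two unpacked trees.
# diff exits non-zero when the trees differ.
compare_trees() {
    # -r recurses into subdirectories, -q only names the differing files
    diff -rq "$1" "$2"
}
```

For example, unpack the older image into /tmp/initrd-old and the freshly
built one into /tmp/initrd-new, then run compare_trees on the two
directories.  Ideally only etc/mdadm/mdadm.conf shows up.
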
> 
> I usually like dpkg-reconfigure and recommend it because it simply
> calls the package postinst script which does whatever it does.  A
> standard interface.  Same for every package.  But I should note that
> there is also update-initramfs which is the next more specific tool
> called by postinst to create or update the images.  If something is
> going wrong then you will probably want to call it directly so as to
> see any messages being produced by it without anything else between.
> 
> Bob
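
A sketch of calling update-initramfs directly, as suggested above, with a
copy of the current image kept first since there is no backup of it.  The
run wrapper, the DRY_RUN variable and the .bak suffix are my own
conventions for illustration; the update-initramfs options are -u
(update), -v (verbose) and -k (kernel version).

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN is set to anything.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Save the current initrd for one kernel version, then rebuild it verbosely.
rebuild_initrd() {
    version="$1"
    image="/boot/initrd.img-$version"
    # Keep the existing image so it can be put back if the new one fails.
    run cp -p "$image" "$image.bak" || return 1
    # Verbose output shows what gets copied into the new image.
    run update-initramfs -u -v -k "$version"
}
```

With DRY_RUN=1 set, "rebuild_initrd 2.6.32-5-amd64" only prints the two
commands; without it, the function has to run as root, inside the chroot
when working from a rescue system.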

-- hendrik



