
Re: MD device not found on boot



On Sunday, 04 March 2012 09:25:25 +0100,
tv.debian@googlemail.com wrote:

> >> What steps did you take to configure kernel-package ? Did you read
> >> /usr/share/doc/kernel-package/README.gz ? Just a short citation:
> >>
> >> "With the new kernel-package conventions, you also need the example
> >> scripts in /etc/kernel/postinst.d/ and /etc/kernel/postrm.d, to
> >> create and remove the initramfs. Make sure that these scripts pay
> >> attention to the INITRD env variable to determine whether or not to
> >> take any action.
> >>
> >>
> >>  Let me repeat:
> >>  Since nothing is created automatically. you need to provide a hook
> >>  script for things to happen when you install the kernel image
> >>  package.  The user provides such scripts. For example, to invoke
> >>  mkinitramfs, I did:
> >>
> >> --8<---------------cut here---------------start------------->8---
> >>  cp /usr/share/kernel-package/examples/etc/kernel/postinst.d/initramfs \
> >>     /etc/kernel/postinst.d/
> >>  cp /usr/share/kernel-package/examples/etc/kernel/postrm.d/initramfs \
> >>     /etc/kernel/postrm.d/
> >> --8<---------------cut here---------------end--------------->8---
> >> "
> >>
> >> Maybe that's part of the problem.

> > Here I am not physically in front of the host to be tested, however
> > I installed Debian GNU/Linux 5.0.9 Lenny in a KVM virtual machine in
> > order to replicate the problem.
> > 
> > I was reading the file /usr/share/doc/kernel-package/README.gz on
> > the KVM virtual machine, but I didn't find the quote you mentioned,
> > nor the two files.

> Sorry, my bad, I forgot that kernel-package undertook a major overhaul
> after Lenny, so it's no wonder why you can't find the quote, it doesn't
> exist in your version. With Lenny kernel-package everything is supposed
> to be configured automatically at install time.

No problem :-)

> > However, I tried the kernel compiled with the above steps
> > (without additional steps) and I could not reproduce the problem on my
> > virtual machine, i.e. it boots without problems. One test failed with a
> > message that the initrd was too big, but after increasing the RAM of
> > the VM, it booted without problems.
> > 
> > So now the issue is finding the difference between these two scenarios.
> > 
> > 1) Debian GNU/Linux Lenny 5.0.8:
> > 
> >    * 2.6.32-bpo.5-amd64 boots without problems.
> >    * 2.6.32-layer7-imq-amd64 fails to find the MD devices.
> >    * The environment is the same, so there were no changes in settings
> >      or metadata, which makes me think that the difference is something
> >      on the compiled kernel. But...
> > 
> > 2) Debian GNU/Linux Lenny 5.0.9 (KVM):
> > 
> >    * 2.6.32-bpo.5-amd64 boots without problems.
> >    * 2.6.32-layer7-imq-amd64 compiled using the same procedure as in (1)
> >      boots without problems. This makes me think that then the problem
> >      is not the procedure used to compile.
> > 
> > So is the problem due to a difference between 5.0.8 and 5.0.9? But if
> > so, why does the precompiled kernel work while the manually compiled
> > kernel fails, when it was built with a procedure that caused no
> > problems in the other scenario?

> Looks like you are triggering a corner case in 5.0.8, and maybe one of
> the 5.0.9 updates fixed that? I guess you checked that all
> configuration files (/etc/kernel-img.conf, /etc/initramfs-tools/modules
> etc...) are the same on the two hosts. Have you checked the changelogs
> for mdadm, initramfs-tools and kernel-package between 5.0.8 and 5.0.9?

> The virtual machine hardware could also influence the outcome. If you
> could image the non-working host and test it with the exact same setup,
> you could bisect the origin of the problem more easily. It could be in
> initrd generation, but also simply in the boot loader or device
> addressing (did something change in grub between 5.0.8 and 5.0.9?). The
> problem occurs when the initrd is looking for the root fs on /dev/md2.
> When dropped to a shell, did you check whether /dev/md2 really doesn't
> exist, or is named differently? Can it be assembled in the initrd shell
> and the boot process resumed?
> 
> Anyway, even if you find a bug in Lenny 5.0.9 (and all the more in
> 5.0.8) in initramfs-tools, mdadm or another essential package, it is not
> going to be taken care of since Lenny reached its end of life. Maybe
> it's time for an upgrade?

I know it's time to upgrade, but I see that as a mid-term goal. The
situation is as follows: several hosts running Debian GNU/Linux Lenny
act as provisioning servers for Internet access over cable modem or
PPPoE.

In the short term, in case any of them fails, I am preparing some HP
ML110 G7 servers from a disk image I have with all the software
installed and configured. The problem is that the compiled kernel on
this disk image does not work on this hardware (it goes into a reboot
loop). Since the 2.6.32 backports kernel works fine on this hardware, I
decided to compile a new version with the required patches (IMQ and
Layer7). Then, in the longer term, I will build an image based on Debian
GNU/Linux Squeeze with all the software installed and configured, which
will take much longer to prepare.

Compiling the patched kernel on Squeeze worked without problems.
Yesterday I went back to testing on Lenny 5.0.8. In the BusyBox shell I
ran "cat /proc/mdstat": the raid1 personality was listed, but no
associated MD devices were detected. Moreover, I was surprised not to
see any /dev/sdX devices. This made me think that the arrays could not
be assembled because the disks themselves were not detected.
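
The check described above can be sketched as a small shell function.
This is only an illustration: the name check_disks and its two optional
arguments are hypothetical, added so it can also be run against saved
copies of the files. By default it reads /proc/mdstat and
/proc/partitions, and it assumes the shell provides grep.

```shell
# Sketch: report how many md arrays and sd* devices the kernel knows about.
# check_disks and its arguments are hypothetical helpers, not part of any tool.
check_disks() {
    mdstat="${1:-/proc/mdstat}"
    partitions="${2:-/proc/partitions}"

    # Array lines in mdstat start with the array name, e.g. "md2 : active ...".
    arrays=$(grep -c '^md' "$mdstat" || true)
    # /proc/partitions lists one block device per line, name in the last column.
    disks=$(grep -c ' sd[a-z]' "$partitions" || true)

    echo "md arrays: ${arrays:-0}"
    echo "sd devices: ${disks:-0}"

    # With no disks visible there is nothing for mdadm to assemble.
    if [ "${disks:-0}" -eq 0 ]; then
        echo "no sd devices visible: arrays cannot be assembled"
    fi
}
```

Run without arguments in the BusyBox shell, it prints the two counts; if
the second one is zero, the missing arrays are a consequence of the
missing disks rather than an mdadm problem.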

So I checked the BIOS settings of the machine, in particular the SATA
controller configuration. It was set to "legacy" mode. After switching
to "AHCI" mode, the compiled kernel boots like a charm.

Maybe the precompiled kernel has some additional option that lets it
boot in "legacy" mode. On the other hand, I have not yet reviewed the G7
running Squeeze, but since both kernels worked there, I guess its
controller is configured in "AHCI" mode.
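
One way to test that guess would be to compare the storage-related
options of the two kernel configurations. A rough sketch, assuming both
kernels installed their configuration as /boot/config-<version> (the
usual Debian location, not confirmed in this thread); the function name
diff_storage_config and the option prefixes in the pattern are my own
choice for illustration.

```shell
# Sketch: show storage-related options that differ between two kernel configs.
# diff_storage_config is a hypothetical helper; adjust the pattern as needed.
diff_storage_config() {
    working="$1"
    failing="$2"
    pattern='^CONFIG_(ATA|SATA|PATA|SCSI|BLK_DEV_SD|BLK_DEV_MD|MD)'

    a=$(mktemp)
    b=$(mktemp)
    # Keep only enabled options ("# CONFIG_X is not set" lines do not match).
    grep -E "$pattern" "$working" | sort > "$a" || true
    grep -E "$pattern" "$failing" | sort > "$b" || true

    # "<" lines exist only in the working config, ">" only in the failing one.
    diff "$a" "$b" || true
    rm -f "$a" "$b"
}
```

For example:

    diff_storage_config /boot/config-2.6.32-bpo.5-amd64 \
                        /boot/config-2.6.32-layer7-imq-amd64

Any controller driver that appears only on the "<" side would be a
candidate for what lets the precompiled kernel boot in "legacy" mode.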



Thanks for your reply.

Regards,
Daniel
-- 
Fingerprint: BFB3 08D6 B4D1 31B2 72B9  29CE 6696 BF1B 14E6 1D37
Powered by Debian GNU/Linux Squeeze - Linux user #188.598
