Bug#286276: kernel-image-2.6.8-9-amd64-k8: Unable to mount md devices
On Wed, Dec 22, 2004 at 03:39:32PM +0900, Horms wrote:
> On Tue, Dec 21, 2004 at 10:15:54AM -0800, Marc Singer wrote:
> > On Mon, Dec 20, 2004 at 02:52:50AM +0100, Goswin von Brederlow wrote:
> > > Marc Singer <elf@buici.com> writes:
> > >
> > > > On running mdadm --assemble /dev/md0
> > > >
> > > > mdadm[4247]: segfault at 000000000000002c rip 000000000804b19e rsp 00000000ffffdb80 error 4
> > > > mdadm[4253]: segfault at 000000000000002c rip 000000000804b19e rsp 00000000ffffdb80 error 4
> > >
> > > That doesn't mean much. An unaligned access, for example, gets
> > > reported the same way.
> > >
> > > Could you paste a gdb stack backtrace, preferably after rebuilding
> > > mdadm with debug info?
> >
> > I've had a look at the mdadm side of this problem. It looks like it
> > is crashing because there is no configuration file. From mdadm.c
> >
> >         break;
> >     case ASSEMBLE:
> >         if (devs_found == 1 && ident.uuid_set == 0 &&
> >             ident.super_minor == UnSet && !scan ) {
> >             /* Only a device has been given, so get details from config file */
> >             mddev_ident_t array_ident = conf_get_ident(configfile, devlist->devname);
> >             mdfd = open_mddev(devlist->devname, array_ident->autof);
> >
> > conf_get_ident() is returning 0, which is then dereferenced in the
> > open_mddev() call, causing the segfault. configfile is NULL because
> > none was given. AFAICT, configfile never defaults. And even if it
> > did, there is no config file on my machine.
> >
> > I haven't yet looked into the code within the initrd to see what it
> > does. It is possible that the only way that my setup can work is if
> > the md driver is compiled in.
> >
> > Still, I wonder about the ioctl errors that I see with fdisk.
>
> I took a brief look at initrd and for reference have included
> how it handles mdadm below. I think that the important bit
> is that it passes not only the md device but its component
> devices to mdadm -A (--assemble). It seems to me that as your
> invocation does not do that, it is going into a code path
> in mdadm that has a bug in it and thus segfaults. In a nutshell,
> I don't think it is a kernel problem.
>
> --
> Horms
>
> getraid_mdadm() {
>     mdadm=$(mdadm -D "$device") || {
>         echo "$PROG: mdadm -D $device failed" >&2
>         exit 1
>     }
>     eval "$(
>         echo "$mdadm" | awk '
>             $1 == "Raid" && $2 == "Level" { print "echo " $4; next }
>             $1 == "Number" && $2 == "Major" { start = 1; next }
>             $1 == "UUID" { print "uuid=" $3; start = 0; next }
>             !start { next }
>             $2 == 0 && $3 == 0 { next }
>             { devices = devices " " $NF }
>             END { print "devices='\''" devices "'\''" }
>         '
>     )"
>
>     printf '%s\n' $devices > getroot
>     echo mdadm -A /devfs/md/$minor -R -u $uuid $devices \
>         > md$minor-script
>     echo /sbin/mdadm >&6
> }
So, if I understand correctly, if I were to add the UUID of the
devices then I'd probably have a working solution? I can certainly
try that.
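
For concreteness, here is a rough sketch of the kind of invocation I
have in mind, modeled on the command the initrd script above writes
out (mdadm -A <md device> -R -u <uuid> <components>). The helper name
build_assemble_cmd, the UUID, and the component device names are
placeholders of mine, not values from my machine. The function only
prints the command so that it can be inspected before anything is run.

```shell
#!/bin/sh
# Hypothetical helper (not part of mdadm or the initrd tools): print, but do
# not run, an assemble command that names both the array UUID and the
# component devices, in the same shape the initrd script emits.
build_assemble_cmd() {
    if [ "$#" -lt 3 ]; then
        echo "usage: build_assemble_cmd MD_DEVICE UUID COMPONENT..." >&2
        return 1
    fi
    md_device=$1
    uuid=$2
    shift 2
    # The remaining arguments are the component devices.
    echo "mdadm -A $md_device -R -u $uuid $*"
}

# Placeholder values for illustration only; the real UUID would come from
# the "UUID" line of `mdadm -D`, as in the awk script quoted above.
build_assemble_cmd /dev/md0 PLACEHOLDER-UUID /dev/sda1 /dev/sdb1
```

If the printed command looks right, it can be run by hand. Whether
this avoids the crashing code path is still an assumption until I
have tried it.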