
Bug#286276: kernel-image-2.6.8-9-amd64-k8: Unable to mount md devices



On Tue, Dec 21, 2004 at 10:15:54AM -0800, Marc Singer wrote:
> On Mon, Dec 20, 2004 at 02:52:50AM +0100, Goswin von Brederlow wrote:
> > Marc Singer <elf@buici.com> writes:
> > 
> > > On running mdadm --assemble /dev/md0
> > >
> > >   mdadm[4247]: segfault at 000000000000002c rip 000000000804b19e rsp 00000000ffffdb80 error 4
> > >   mdadm[4253]: segfault at 000000000000002c rip 000000000804b19e rsp 00000000ffffdb80 error 4
> > 
> > That doesn't mean much. Unaligned access gets reported as such as
> > well for example.
> > 
> > Could you paste a gdb stack backtrace, preferably after rebuilding
> > mdadm with debug info?
> 
> I've had a look at the mdadm side of this problem.  It looks like it
> is crashing because there is no configuration file.  From mdadm.c
> 
>           break;
>   case ASSEMBLE:
>           if (devs_found == 1 && ident.uuid_set == 0 &&
>               ident.super_minor == UnSet && !scan ) {
>                   /* Only a device has been given, so get details from config file */
>                   mddev_ident_t array_ident = conf_get_ident(configfile, devlist->devname);
>                   mdfd = open_mddev(devlist->devname, array_ident->autof);
> 
> The conf_get_ident() is returning 0 which is then dereferenced by the
> open_mddev() call and segfault'ing.  configfile is NULL because none
> was given.  AFAICT, configfile never defaults.  And even if it did,
> there is no config file on my machine.
> 
> I haven't yet looked into the code within the initrd to see what it
> does.  It is possible that the only way that my setup can work is if
> the md driver is compiled in.
> 
> Still, I wonder about the ioctl errors that I see with fdisk.

I took a brief look at the initrd and, for reference, have included
below how it handles mdadm. I think the important bit is that it
passes not only the md device but also its component devices to
mdadm -A (--assemble). Your invocation names only the md device,
so it seems to me that it takes the config-file code path in mdadm
that you quoted above, which has a bug in it and thus segfaults.
In a nutshell, I don't think it is a kernel problem.
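
To illustrate the difference, here is a rough sketch of an invocation
that names the component devices explicitly. The helper name
build_assemble_cmd and the device names /dev/sda1 and /dev/sdb1 are
made up for the example and are not taken from the bug report. Like
the initrd script, it only prints the mdadm command rather than
running it.

```shell
#!/bin/sh
# Hypothetical helper: print an "mdadm -A" command line that lists the
# md device followed by its component devices. With more than one
# device on the command line, mdadm should not need the config-file
# lookup from the ASSEMBLE code quoted earlier in this thread.
build_assemble_cmd() {
        md=$1
        shift
        # Refuse to build the single-device form that triggers the crash.
        if [ "$#" -eq 0 ]; then
                echo "build_assemble_cmd: no component devices given" >&2
                return 1
        fi
        printf '%s\n' "mdadm -A $md $*"
}

# Example device names are assumptions, substitute your own.
build_assemble_cmd /dev/md0 /dev/sda1 /dev/sdb1
```

Running the printed command by hand would be a quick way to check
whether assembly works once mdadm is kept away from the config-file
path.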

-- 
Horms

getraid_mdadm() {
        mdadm=$(mdadm -D "$device") || {
                echo "$PROG: mdadm -D $device failed" >&2
                exit 1
        }
        eval "$(
                echo "$mdadm" | awk '
                        $1 == "Raid" && $2 == "Level" { print "echo " $4; next }
                        $1 == "Number" && $2 == "Major" { start = 1; next }
                        $1 == "UUID" { print "uuid=" $3; start = 0; next }
                        !start { next }
                        $2 == 0 && $3 == 0 { next }
                        { devices = devices " " $NF }
                        END { print "devices='\''" devices "'\''" }
                '
        )"

        printf '%s\n' $devices > getroot
        echo mdadm -A /devfs/md/$minor -R -u $uuid $devices \
                > md$minor-script
        echo /sbin/mdadm >&6
}



