
Bug#286276: kernel-image-2.6.8-9-amd64-k8: Unable to mount md devices



On Wed, Dec 22, 2004 at 03:39:32PM +0900, Horms wrote:
> On Tue, Dec 21, 2004 at 10:15:54AM -0800, Marc Singer wrote:
> > On Mon, Dec 20, 2004 at 02:52:50AM +0100, Goswin von Brederlow wrote:
> > > Marc Singer <elf@buici.com> writes:
> > > 
> > > > On running mdadm --assemble /dev/md0
> > > >
> > > >   mdadm[4247]: segfault at 000000000000002c rip 000000000804b19e rsp 00000000ffffdb80 error 4
> > > >   mdadm[4253]: segfault at 000000000000002c rip 000000000804b19e rsp 00000000ffffdb80 error 4
> > > 
> > > That doesn't mean much on its own.  An unaligned access, for
> > > example, gets reported the same way.
> > > 
> > > Could you paste a gdb stack backtrace, preferably after rebuilding
> > > mdadm with debug info?
> > 
> > I've had a look at the mdadm side of this problem.  It looks like it
> > is crashing because there is no configuration file.  From mdadm.c
> > 
> >           break;
> >   case ASSEMBLE:
> >           if (devs_found == 1 && ident.uuid_set == 0 &&
> >               ident.super_minor == UnSet && !scan ) {
> >                   /* Only a device has been given, so get details from config file */
> >                   mddev_ident_t array_ident = conf_get_ident(configfile, devlist->devname);
> >                   mdfd = open_mddev(devlist->devname, array_ident->autof);
> > 
> > conf_get_ident() is returning 0, which is then dereferenced
> > (array_ident->autof) in the open_mddev() call, and that segfaults.
> > configfile is NULL because none was given.  AFAICT, configfile never
> > gets a default.  And even if it did, there is no config file on my
> > machine.
> > 
> > I haven't yet looked into the code within the initrd to see what it
> > does.  It is possible that the only way that my setup can work is if
> > the md driver is compiled in.
> > 
> > Still, I wonder about the ioctl errors that I see with fdisk.
> 
> I took a brief look at initrd and for reference have included
> how it handles mdadm below. I think that the important bit
> is that it passes not only the md device but its component
> devices to mdadm -A (--assemble). It seems to me that as your
> invocation does not do that, it is going into a code path
> in mdadm that has a bug in it and thus segfaults. In a nutshell,
> I don't think it is a kernel problem.
> 
> -- 
> Horms
> 
> getraid_mdadm() {
>         mdadm=$(mdadm -D "$device") || {
>                 echo "$PROG: mdadm -D $device failed" >&2
>                 exit 1
>         }
>         eval "$(
>                 echo "$mdadm" | awk '
>                         $1 == "Raid" && $2 == "Level" { print "echo " $4; next }
>                         $1 == "Number" && $2 == "Major" { start = 1; next }
>                         $1 == "UUID" { print "uuid=" $3; start = 0; next }
>                         !start { next }
>                         $2 == 0 && $3 == 0 { next }
>                         { devices = devices " " $NF }
>                         END { print "devices='\''" devices "'\''" }
>                 '
>         )"
> 
>         printf '%s\n' $devices > getroot
>         echo mdadm -A /devfs/md/$minor -R -u $uuid $devices \
>                 > md$minor-script
>         echo /sbin/mdadm >&6
> }

So, if I understand correctly, if I were to add the UUID of the
devices then I'd probably have a working solution?  I can certainly
try that.
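
For reference, here is a rough sketch of the kind of invocation I have in
mind, modelled on the mdadm -A line in the initrd script above.  The UUID
and the component devices below are placeholders, not values from my
machine, and build_assemble_cmd is just a hypothetical helper that prints
the command instead of running it.  The real UUID should be obtainable
with mdadm -E on one of the component devices.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) an mdadm assemble command that
# names the array, its UUID, and its component devices explicitly, so
# that mdadm does not have to look anything up in a config file.
build_assemble_cmd() {
        md=$1      # array device, e.g. /dev/md0
        uuid=$2    # array UUID, as reported by "mdadm -E <component>"
        shift 2    # the remaining arguments are the component devices
        echo "mdadm -A $md -R -u $uuid $*"
}

# Placeholder values; substitute the real UUID and partitions.
build_assemble_cmd /dev/md0 00000000:00000000:00000000:00000000 /dev/sda1 /dev/sdb1
```

Once the printed line looks right it can be run by hand.  Giving mdadm
both the UUID and the devices should keep it out of the "only a device
has been given" path that dereferences the missing config entry.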



