
Bug#446323: mdadm: recovery in infinite loop



On Wed, Oct 17, 2007 at 04:16:39PM -0500, Lukasz Szybalski wrote:
> On 10/15/07, Neil Brown <neilb@suse.de> wrote:
> >
> > As you say, the devices are exactly the same size, thanks.
> >
> > On Monday October 15, szybalski@gmail.com wrote:
> > >
> > > how do I undo?  mdadm /dev/md2 -f /dev/hda2
> > > So I could try the sync in init 1
> > > Lucas
> >
> > Well, you could:
> >   mdadm /dev/md2 -f /dev/hda2
> >   mdadm /dev/md2 -r /dev/hda2
> >
> > then when you are ready to try again
> >
> >   mdadm /dev/md2 -a /dev/hda2
> >
> Ok,
> So I went into
> init 1
> to get rid of any program that might want to access the hda2.
> I unmounted my '/files' which mounts my hda2 in '/' folder.
> 
> I watched it sync the drives for about 30 minutes, and then suddenly I got
> an unrecognizable error, error 5 I believe it was: unable to read sector LBA
> 88604764 on hdb1.
> 
> I guess that showed up on stderr or something, because I couldn't find a
> reference to it anywhere other than on the terminal screen.
> 
> What would be the proper command for testing every single block on
> that hard drive using e2fsck?
> 
> I used e2fsck -acf /dev/hdb2, but that took all night and it still
> wasn't finished, so I canceled it. What would be the proper options
> for this command to get this drive clean?
> 
> The weird part is that when I checked the drives last Thursday using
> Knoppix, nothing showed any problems or bad sectors.
> 
> 
> 
> > I think there must be something odd happening with the drive or
> > controller.  I notice that the two devices are on the same IDE
> > channel, which is sometimes a source of problems, though it shouldn't
> > behave like this.
> >
> > If you feel up to patching the kernel, recompiling, and experimenting,
> > I can send you a patch which should provide more detailed information
> > on what is happening.  Let me know what kernel version you will be
> > working with.
> Never done recompiling of a kernel before, but I guess if everything
> fails then we can try it. For now let me clean this drive and we go
> from there.
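
On the question quoted above about testing every block: as far as I know,
e2fsck -c simply calls badblocks(8) for a read-only scan, so running
badblocks directly on the disk gives the same coverage with visible
progress. A minimal sketch, assuming the whole disk is /dev/hdb as named
in the thread. The function name scan_plan is made up for illustration,
and the script only prints the commands; it does not run them.

```shell
#!/bin/sh
# Hypothetical helper: prints the commands for a read-only surface scan.
# Nothing touches the disk; the lines are meant to be reviewed and run by hand.
scan_plan() {
    disk="$1"   # whole-disk device, e.g. /dev/hdb (assumed from the thread)
    # badblocks is read-only by default; -s shows progress, -v is verbose.
    echo "badblocks -sv $disk"
    # I/O errors seen on the console usually also appear in the kernel log.
    echo "dmesg | grep -i ${disk##*/}"
}

scan_plan /dev/hdb
```

If the scan reports unreadable blocks, that points at the drive rather
than at md.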

What's the status of this bug? 
Does this error still occur with more recent kernel versions?

If you're running Etch, you could try to reproduce this bug
with the 2.6.24-based kernel added in 4.0r4:
http://packages.qa.debian.org/l/linux-2.6.24.html
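
To retest, the fail/remove/re-add cycle Neil suggested in the quoted text
can be repeated under the new kernel. A rough sketch, assuming the device
names from this thread (/dev/md2 and /dev/hda2). The function name
resync_plan is made up for illustration, and the script only prints the
commands so they can be checked before being run as root.

```shell
#!/bin/sh
# Dry-run sketch: prints the fail/remove/re-add cycle quoted above.
# The defaults are the device names used in this thread (an assumption
# for any other system); pass your own array and member as arguments.
resync_plan() {
    array="${1:-/dev/md2}"
    member="${2:-/dev/hda2}"
    echo "uname -r"                  # record the kernel version under test
    echo "mdadm $array -f $member"   # mark the member faulty
    echo "mdadm $array -r $member"   # remove it from the array
    echo "mdadm $array -a $member"   # add it back, which starts a resync
    echo "cat /proc/mdstat"          # check recovery progress
}

resync_plan
```

If recovery restarts over and over in /proc/mdstat, the bug is still
present with that kernel.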

Cheers,
        Moritz













