Re: how to reconstruct MD RAID device?



Clive Menzies --> debian-user (2005-08-24 16:27:38 +0100):
> On (24/08/05 16:31), Jukka Salmi wrote:
> > Clive Menzies --> debian-user (2005-08-24 15:08:11 +0100):
> > > Have you tried something like:
> > > 
> > > $ mdadm /dev/md2 -a /dev/hdb3
> > 
> > No. Unfortunately it's a production system, hence I'm a little bit
> > cautious with "trying" things... So, considering md2 is used as the
> > root file system device, is adding hdb3 to it as you describe dangerous?
> > What exactly does this command do? Does it also start reconstruction
> > onto hdb3? As you notice, I'm not familiar with Linux software RAID
> > at all...
> 
> Well, I'm no expert and I can understand your reluctance to experiment.
> 
> It's been a while since I set up 3 RAID servers but I did find the
> following links helpful:
> # http://rootraiddoc.alioth.debian.org/
> # http://juerd.nl/site.plp/debianraid
> # http://xtronics.com/reference/SATA-RAID-Debian.htm
> 
> As I understand it, adding hdb3 to the /dev/md2 will reassemble the
> array; however, if hdb3 has become corrupted in some way, it may fail if
> it can't recover but this should have no adverse impact on hda3.

I added the "non-fresh" device back to the degraded array:

$ mdadm /dev/md2 -a /dev/hdb3
mdadm: hot added /dev/hdb3

and could see hda3 being rebuilt onto the new spare:

Aug 24 17:11:12 sv005 kernel: md: trying to hot-add unknown-block(3,67) to md2 ... 
Aug 24 17:11:12 sv005 kernel: md: bind<hdb3>
Aug 24 17:11:12 sv005 kernel: RAID1 conf printout:
Aug 24 17:11:12 sv005 kernel:  --- wd:1 rd:2
Aug 24 17:11:12 sv005 kernel:  disk 0, wo:0, o:1, dev:hda3
Aug 24 17:11:12 sv005 kernel:  disk 1, wo:1, o:1, dev:hdb3
Aug 24 17:11:12 sv005 kernel: md: syncing RAID array md2
Aug 24 17:11:12 sv005 kernel: md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
Aug 24 17:11:12 sv005 kernel: md: using maximum available idle IO bandwith (but not more than 200000 KB/sec) for reconstruction.
Aug 24 17:11:12 sv005 kernel: md: using 128k window, over a total of 39107264 blocks.
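
For watching the rebuild without tailing the kernel log, here is a rough
sketch of a shell helper. It assumes /proc/mdstat reports progress on a
line of the form "recovery = N%" (or "resync = N%") inside the device's
stanza; the function name md_sync_percent is made up for illustration.

```shell
# Print the recovery/resync percentage of one md device.
# Reads /proc/mdstat-style text on stdin; prints nothing if that
# device has no sync in progress.
md_sync_percent() {
    awk -v dev="$1" '
        # A stanza header looks like "md2 : active raid1 ...".
        # Remember whether the stanza belongs to the requested device.
        $2 == ":" { in_dev = ($1 == dev); next }

        # Inside our stanza, pick out the progress figure.
        in_dev && match($0, /(recovery|resync) *= *[0-9.]+%/) {
            s = substr($0, RSTART, RLENGTH)
            sub(/^[a-z]+ *= */, "", s)   # strip the "recovery = " prefix
            print s
            exit
        }
    '
}
```

Called as "md_sync_percent md2 < /proc/mdstat", it should print the current
percentage, or nothing when no sync is running on that device.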

Excellent! Unfortunately, a few minutes later the sync failed because
of errors on the "good" disk:

Aug 24 17:16:06 sv005 kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Aug 24 17:16:06 sv005 kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=80293024, sector=80293024
Aug 24 17:16:06 sv005 kernel: end_request: I/O error, dev hda, sector 80293024
[...]
Aug 24 17:16:27 sv005 kernel: md: md2: sync done.
Aug 24 17:16:27 sv005 kernel: md: syncing RAID array md2
[...]
Aug 24 17:17:01 sv005 kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Aug 24 17:17:01 sv005 kernel: hda: dma_intr: error=0x01 { AddrMarkNotFound }, LBAsect=2078496, sector=2078496
Aug 24 17:17:01 sv005 kernel: hda: DMA disabled
Aug 24 17:17:01 sv005 kernel: hdb: DMA disabled
Aug 24 17:17:01 sv005 kernel: ide0: reset: success
Aug 24 17:17:01 sv005 kernel: hda: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
Aug 24 17:17:01 sv005 kernel: hda: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=2078496, sector=2078496
[...]

Seems I need to replace the disk or the controller. At least I now
know how to reconstruct a failed device.
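
To get an idea of how widespread the read errors on hda are, the failing
sectors can be pulled out of the kernel log. A minimal sketch, assuming
the log lines look like the ones above (the device name followed by a
colon, with an LBAsect=N field later on the line); the function name
error_sectors and the log path in the usage line are my own assumptions.

```shell
# List the distinct sectors for which the kernel logged an error on one
# IDE device. Reads kernel log text on stdin, prints one sector per line
# in ascending numeric order.
error_sectors() {
    # Keep only lines mentioning "<dev>: ... LBAsect=<number>" and
    # reduce each of them to the bare sector number.
    sed -n "s/.*$1: .*LBAsect=\([0-9][0-9]*\).*/\1/p" | sort -n -u
}
```

Usage would be something like "error_sectors hda < /var/log/kern.log". Many
different sectors would point at a dying disk, although this alone can't
tell a bad disk from a bad controller or cable.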

Thank you for your help!


Regards, Jukka

-- 
bashian roulette:
$ ((RANDOM%6)) || rm -rf ~


