Re: OT: Disk check on Raid1 area

To: daniel huhardeaux <daniel.huhardeaux@tootai.com>
Cc: Debian Users <debian-user@lists.debian.org>
Subject: Re: OT: Disk check on Raid1 area
From: Alvin Oga <aoga@Maggie.Linux-Consulting.com>
Date: Sun, 20 Apr 2003 14:20:33 -0700 (PDT)
Message-id: <Pine.LNX.3.96.1030420141102.2467A-100000@Maggie.Linux-Consulting.com>
In-reply-to: <3EA2B5F0.5070802@tootai.com>

hi ya

On Sun, 20 Apr 2003, daniel huhardeaux wrote:

> Hi,
> 
> I have a production server under woody with software raid1 setup on 3 
> IDE disks (partition type is fd). According to mdadm daemon and 
> /proc/mdstat, one of them (ide0 which is #3 in raidtab) seems to have 
> failure. At first, hda2 stopped (mirror of /boot) and 8 hours after hda3 
> (mirror of /). My question is: how to check/know that hda is really dead 
> *without* shutting down server (is still in production with the 2 spare 
> disks) and knowing that I just have an ssh access on it.

if it stopped working, /proc/mdstat should say that the array is running in "degraded mode"
	( degraded mode --> the array is still up, but with fewer working mirrors than configured )
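
a rough sketch of how to spot a degraded array from an ssh session by reading /proc/mdstat. the sample text is made up for illustration ( it is not output from the box in this thread ), and the function name degraded_arrays is hypothetical. it assumes the usual "[configured/active]" counter and the "(F)" flag that md prints next to a member it has marked faulty.

```python
import re

# Illustrative sample only (an assumption), NOT output from the server in this thread.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 hde2[1] hdc2[0] hda2[2](F)
      48064 blocks [3/2] [UU_]

md1 : active raid1 hde3[1] hdc3[0] hda3[2]
      19543936 blocks [3/3] [UUU]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return {array: [members flagged (F)]} for arrays running degraded."""
    result = {}
    current = None
    failed = []
    for line in mdstat_text.splitlines():
        head = re.match(r"^(md\d+)\s*:\s*(.*)$", line)
        if head:
            current = head.group(1)
            # Members the kernel has marked faulty carry an "(F)" suffix.
            failed = re.findall(r"(\w+)\[\d+\]\(F\)", head.group(2))
            continue
        # "[3/2]" means 3 members configured, 2 currently active.
        status = re.search(r"\[(\d+)/(\d+)\]", line)
        if current and status:
            configured, active = int(status.group(1)), int(status.group(2))
            if active < configured:
                result[current] = failed
            current = None
    return result


if __name__ == "__main__":
    print(degraded_arrays(SAMPLE_MDSTAT))
```

on the real machine you would feed it open("/proc/mdstat").read() instead of the sample.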

if ide0 is /dev/hda .. and md claims it is faulty...
	take the faulty partition out w/ raidhotremove /dev/mdX /dev/hda2
	( mdX = whichever md device that partition belongs to )
	and see if /proc/mdstat and other feedback is happy

	put it back in with raidhotadd /dev/mdX /dev/hda2
	and it should busily resync itself
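
to watch the resync after re-adding the partition, something like the sketch below can pull the progress figure out of /proc/mdstat. again the sample text is invented for illustration, the name resync_progress is hypothetical, and it assumes the progress line reads "recovery = NN.N%" or "resync = NN.N%".

```python
import re

# Illustrative sample only (an assumption), NOT output from the server in this thread.
SAMPLE_RESYNC = """\
md1 : active raid1 hda3[3] hde3[1] hdc3[0]
      19543936 blocks [3/2] [UU_]
      [==>..................]  recovery = 12.6% (2465792/19543936) finish=15.2min speed=18652K/sec
"""


def resync_progress(mdstat_text):
    """Return {array: percent done} for arrays that are currently rebuilding."""
    progress = {}
    current = None
    for line in mdstat_text.splitlines():
        head = re.match(r"^(md\d+)\s*:", line)
        if head:
            current = head.group(1)
            continue
        # The progress line belongs to the most recent "mdN :" header.
        match = re.search(r"(?:recovery|resync)\s*=\s*([\d.]+)%", line)
        if current and match:
            progress[current] = float(match.group(1))
    return progress


if __name__ == "__main__":
    print(resync_progress(SAMPLE_RESYNC))
```

an array that is not rebuilding simply does not show up in the result, so an empty dict means nothing is resyncing.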

you can also create a file and check that the file shows
up on hdc and also on hde ( assuming you're using only master disks )

if the other 2 "spare" drives get the data and hda does not, then
you have a dead disk and you now have to change your config for
a 1 spare and 1 drive mirrored ( raid1 ) setup

you will also need to power down ... and replace hda
one day soon

if you cannot afford a minute of down time, build another raid1 server
and start pointing the ip# to the new server instead of the one
with the dead drive .. then take the dead disk out, resync, and you're
back to biz as usual except you now have 2 independent live boxes


-- having a raid1 mirror or raid5 does not get around "downtime"
   if the disk dies ..  you will have to fix it (downtime) sooner or later
	- multiple independent machines do allow for no downtime
	as long as at least one box is up and running 24x7x365

c ya
alvin
