Re: OT: Disk check on Raid1 area
On Sun, 20 Apr 2003, daniel huhardeaux wrote:
> I have a production server under woody with software raid1 setup on 3
> IDE disks (partition type is fd). According to mdadm daemon and
> /proc/mdstat, one of them (ide0 which is #3 in raidtab) seems to have
> failure. At first, hda2 stopped (mirror of /boot) and 8 hours after hda3
> (mirror of /). My question is: how to check/know that hda is really dead
> *without* shuting down server (is still in production with the 2 spare
> disks) and knowing that I just have an ssh access on it.
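if a member stopped working, /proc/mdstat should mark it with (F) and
the array should say that it's running in "degraded mode" until a spare takes over
( degraded mode --> still running, but with fewer mirrors than configured )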
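if ide0 is /dev/hda .. and md claims it is faulty...
take each failed partition out w/ raidhotremove /dev/mdX /dev/hda2
( mdX being whichever array holds it, not the whole disk; same for hda3 )
and see if /proc/mdstat and other feedback is happy
put it back in with raidhotadd /dev/mdX /dev/hda2
and it should busily resync itself
if it gets kicked out again during the resync, the disk ( or its cable )
is most likely bad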
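the remove / re-add cycle above can be sketched as a small helper that only builds the command lines, so you can read them before anything touches the array. the device names /dev/md0 and /dev/hda2 are assumptions ( check /proc/mdstat for the real pairing ), and swap_commands / run are hypothetical names. both raidhotremove / raidhotadd ( raidtools ) and mdadm's --remove / --add take the md device plus the member partition.

```python
import subprocess


def swap_commands(md_device, member, tool="mdadm"):
    """Build the commands that drop `member` from `md_device` and re-add it."""
    if tool == "mdadm":
        return [
            ["mdadm", md_device, "--remove", member],
            ["mdadm", md_device, "--add", member],
        ]
    if tool == "raidtools":
        return [
            ["raidhotremove", md_device, member],
            ["raidhotadd", md_device, member],
        ]
    raise ValueError("unknown tool: %s" % tool)


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # dry run only: prints the commands for an assumed md0 / hda2 pairing
    run(swap_commands("/dev/md0", "/dev/hda2"))
```

keeping dry_run=True as the default is deliberate: on a production box you want to see the exact commands first, then run them by hand or flip the flag as root.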
you can also create a file and check that it shows
up on hdc and also on hde ( assuming you're using only master disks )
if the other 2 "spare" drives get the data and hda does not, then
you have a dead disk and you now have to change your config for
1 spare and 1 drive mirrored ( raid1 ) setup
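a different, read-only way to probe the suspect disk ( not the file test described above ) is to read the raw device from start to end and see where the kernel reports an I/O error. this is only a sketch: read_scan is a made-up name, it needs root to open something like /dev/hda, it will load the box while it runs, and a clean pass does not prove the disk is healthy.

```python
def read_scan(path, chunk_size=1024 * 1024, limit=None):
    """Read `path` sequentially without writing to it.

    Returns (bytes_read, error): error is None if the end of the device
    (or `limit` bytes) was reached, otherwise the OSError that stopped the read.
    """
    total = 0
    try:
        # buffering=0 gives raw reads, so errors surface at the failing chunk
        with open(path, "rb", buffering=0) as dev:
            while limit is None or total < limit:
                block = dev.read(chunk_size)
                if not block:
                    break  # end of device
                total += len(block)
    except OSError as exc:
        return total, exc
    return total, None
```

bytes_read tells you roughly how far into the disk the first unreadable area sits; pass limit to sample only the start of the disk on a busy server.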
you will also need to power down ... and replace hda
one day soon
if you cannot afford a minute of downtime, build another raid1 server
and start pointing the ip# to the new server instead of the one
with the dead drive .. then take the dead disk out, resync, and you're
back to biz as usual except you now have 2 independent live boxes
- having a raid1 mirror or raid5 does not get around "downtime"
  if the disk dies .. you will have to fix it (downtime) sooner or later
- multiple independent machines do allow for no downtime
  as long as at least one box is up and running 24x7