
Replacing failed drive in software RAID



Hi guys,

I'm using four 3TB drives, so I had to use GPT. Although I'm pretty sure I know what I need to do, I want to make sure so I don't lose data. Three of the drives are dying, so I'm going to replace them one by one.

This is the situation:
sda and sdb have four partitions.
sda1, sdb1 - 1MB partitions at the beginning
sda2, sdb2 - boot partition (RAID1 - md0)
sda3, sdb3 - root partition (RAID10 - md1)
sda4, sdb4 - data (RAID10 - md2)

sdc and sdd have three partitions:
sdc1, sdd1 - 1MB partitions at the beginning
sdc2, sdd2 - root partition (RAID10 - md1)
sdc3, sdd3 - data (RAID10 - md2)

There is one more unnecessary complication: I have root and swap logical volumes on md1 (sda3, sdb3, sdc2, sdd2). I don't know if I should take that into account when replacing the drives.

This is what I plan to do:

Replacing sda
1. Removing sda from all RAID devices
# mdadm --manage /dev/md0 --fail /dev/sda2
# mdadm --manage /dev/md0 --remove /dev/sda2

# mdadm --manage /dev/md1 --fail /dev/sda3
# mdadm --manage /dev/md1 --remove /dev/sda3

# mdadm --manage /dev/md2 --fail /dev/sda4
# mdadm --manage /dev/md2 --remove /dev/sda4
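
Before running the fail/remove commands above, it may be worth confirming that no array is already degraded, since failing a member of an array that has already lost its mirror partner would take the array down. A minimal sketch, assuming the usual /proc/mdstat layout where each array's status line ends in something like [UU] or [_U]; the function name degraded_arrays and the sample text are illustrative, not taken from the poster's machine:

```shell
# degraded_arrays [FILE]: print the name of every md array whose member
# status (e.g. [UU_U]) shows a missing device. FILE defaults to /proc/mdstat.
degraded_arrays() {
    awk '/^md[0-9]+ :/ { name = $1 }
         /\[[U_]*_[U_]*\]/ { print name }' "${1:-/proc/mdstat}"
}

# Illustrative sample only (made-up numbers), written to a temp file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Personalities : [raid1] [raid10]
md1 : active raid10 sda3[0] sdb3[1] sdc2[2] sdd2[3]
      1953260544 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]

md0 : active raid1 sdb2[1]
      497856 blocks super 1.2 [2/1] [_U]

unused devices: <none>
EOF
degraded_arrays "$sample"   # prints: md0
rm -f "$sample"
```

On the real system the function would be called with no argument, and an empty result would suggest all arrays currently have every member present.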

Checking the serial number of sda:
# hdparm -i /dev/sda
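
To pull just the serial number out of that output, something like the following could work. It is a sketch that assumes hdparm -i prints a field of the form SerialNo=..., which may vary between versions; the helper name serial_of and the sample line are made up for illustration:

```shell
# serial_of: read "hdparm -i" style text on stdin and print the value
# that follows "SerialNo=" (up to the next space or comma).
serial_of() {
    sed -n 's/.*SerialNo=\([^ ,]*\).*/\1/p'
}

# Intended use on the real machine (needs root):
#   hdparm -i /dev/sda | serial_of
# The symlinks under /dev/disk/by-id/ also embed model and serial:
#   ls -l /dev/disk/by-id/

# Illustrative input, not a real drive:
printf ' Model=EXAMPLE-MODEL, FwRev=00.00, SerialNo=EXAMPLE-SERIAL-0001\n' | serial_of
# prints: EXAMPLE-SERIAL-0001
```

Writing the serial down before shutting down makes it easier to match it against the label on the physical drive.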

2. Replacing failed drive
# halt
Replace the drive with the matching serial number.

3. Adding the new hard drive
Here I need to copy the partition table from sdb to the newly inserted sda. sfdisk won't work with GPT, so I'm installing gdisk.
# aptitude install gdisk
# sgdisk --backup=table /dev/sdb
# sgdisk --load-backup=table /dev/sda
# sgdisk -G /dev/sda
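
The direction of the copy matters here, because loading the backup onto the wrong disk would overwrite a healthy partition table. One way to double-check is a dry run that only prints the commands for review. This is a sketch: print_gpt_clone is a hypothetical helper, and it touches no disk.

```shell
# print_gpt_clone SRC DST: print (do not run) the sgdisk commands that copy
# the GPT from SRC to DST and then give DST fresh GUIDs.
print_gpt_clone() {
    src="$1"
    dst="$2"
    if [ "$src" = "$dst" ]; then
        echo "source and target must differ" >&2
        return 1
    fi
    echo "sgdisk --backup=table /dev/$src"
    echo "sgdisk --load-backup=table /dev/$dst"
    echo "sgdisk -G /dev/$dst"
}

# sdb is the surviving disk, sda is the new, empty one.
print_gpt_clone sdb sda
```

For sdc and sdd the same helper would be called with those two names, since they share the three-partition layout.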

# mdadm --manage /dev/md0 --add /dev/sda2
# mdadm --manage /dev/md1 --add /dev/sda3
# mdadm --manage /dev/md2 --add /dev/sda4

4. Check if synchronization is in progress:
# cat /proc/mdstat
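
Instead of re-running cat by hand, the check can be scripted. The sketch below assumes that a running rebuild shows up in /proc/mdstat as a line containing "recovery", "resync" or "reshape"; the function names and the POLL_SECONDS variable are illustrative:

```shell
# sync_in_progress [FILE]: succeed if FILE (default /proc/mdstat) shows
# a recovery, resync or reshape.
sync_in_progress() {
    grep -Eq 'recovery|resync|reshape' "${1:-/proc/mdstat}"
}

# wait_for_sync [FILE]: poll until no rebuild is reported any more.
# POLL_SECONDS is a made-up knob; it defaults to 60.
wait_for_sync() {
    while sync_in_progress "$@"; do
        sleep "${POLL_SECONDS:-60}"
    done
}
```

Calling wait_for_sync with no argument before pulling the next drive would block until the rebuild is no longer reported. mdadm also has a --wait option that waits for resync or recovery activity on the given arrays to finish.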


After the sync completes, I will do the same for all the other drives, so all of them will be WD Red drives.
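
Since sdc and sdd have a different partition numbering and are not part of md0, the commands for them are not a straight copy of the sda ones. The sketch below encodes the layout described at the top of this post and prints the --add commands for a given drive. members_for and print_readd are hypothetical helper names, and nothing is executed against the arrays:

```shell
# members_for DRIVE: print "array:partition" pairs for DRIVE, following the
# layout described in this post (sda/sdb: md0, md1, md2; sdc/sdd: md1, md2).
members_for() {
    case "$1" in
        sda|sdb) echo "md0:${1}2 md1:${1}3 md2:${1}4" ;;
        sdc|sdd) echo "md1:${1}2 md2:${1}3" ;;
        *) echo "unknown drive: $1" >&2; return 1 ;;
    esac
}

# print_readd DRIVE: print (do not run) the mdadm --add command for each
# partition of DRIVE.
print_readd() {
    pairs=$(members_for "$1") || return 1
    for pair in $pairs; do
        echo "mdadm --manage /dev/${pair%%:*} --add /dev/${pair#*:}"
    done
}

print_readd sdc
# prints:
#   mdadm --manage /dev/md1 --add /dev/sdc2
#   mdadm --manage /dev/md2 --add /dev/sdc3
```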

Did I overlook something? Is this going to work? I was also thinking about inserting one drive and copying the data from the RAID to it, so I have a backup if something goes wrong. Would that be the right thing to do, or would it just load the drives unnecessarily and accelerate their failure?

Regards,
Veljko

