
hosed raid 1 under lvm



calling all lvm/raid gurus....

I'm running Debian sarge, and I have (or had) a 200GB RAID 1 device with LVM2 on top. The RAID 1 was made from hdb1 and hdc1. Recently, while trying to change the size of a logical volume, I noticed that my VG was using /dev/hdc1 as a physical volume instead of /dev/md0:
$ pvscan
Found duplicate PV YDMoRSv4EKZXHNPuchid5hPIwmXDCBCm: using /dev/hdb1 not /dev/hdc1
     PV /dev/hda4   VG LVM    lvm2 [68.96 GB / 0    free]
     PV /dev/hdc1   VG data   lvm2 [186.30 GB / 4.00 MB free]
     Total: 2 [255.26 GB] / in use: 2 [255.26 GB] / in no VG: 0 [0   ]
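As an aside, I gather the duplicate-PV warning means LVM is scanning the raw partitions and picking one before it ever considers /dev/md0. Once the array is healthy again, I'm thinking a device filter in the devices section of /etc/lvm/lvm.conf along these lines should make LVM ignore the member partitions (the exact patterns are my guess):

   filter = [ "a|^/dev/md0$|", "r|^/dev/hd[bc]1$|", "a|.*|" ]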

Some additional digging turned up that my RAID device is inactive:

$ mdadm -D /dev/md0
   mdadm: md device /dev/md0 does not appear to be active.

but the md superblock still seems to be present on both devices:

$ mdadm -E /dev/hdb1
/dev/hdb1:
         Magic : a92b4efc
       Version : 00.90.00
          UUID : 9dc1df9b:771959ca:53ed751d:7b3b40ba
 Creation Time : Sun Apr 17 13:24:05 2005
    Raid Level : raid1
  Raid Devices : 2
 Total Devices : 2
Preferred Minor : 0

   Update Time : Fri May 13 16:52:37 2005
         State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
 Spare Devices : 0
      Checksum : 1dcab098 - correct
        Events : 0.482221


      Number   Major   Minor   RaidDevice State
this     1       3       65        1      active sync   /dev/.static/dev/hdb1

   0     0      22        1        0      active sync   /dev/.static/dev/hdc1
   1     1       3       65        1      active sync   /dev/.static/dev/hdb1

$ mdadm -E /dev/hdc1
/dev/hdc1:
         Magic : a92b4efc
       Version : 00.90.00
          UUID : 9dc1df9b:771959ca:53ed751d:7b3b40ba
 Creation Time : Sun Apr 17 13:24:05 2005
    Raid Level : raid1
  Raid Devices : 2
 Total Devices : 2
Preferred Minor : 0

   Update Time : Fri May 13 16:52:37 2005
         State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
 Spare Devices : 0
      Checksum : 1dcab069 - correct
        Events : 0.482221


      Number   Major   Minor   RaidDevice State
this     0      22        1        0      active sync   /dev/.static/dev/hdc1

   0     0      22        1        0      active sync   /dev/.static/dev/hdc1
   1     1       3       65        1      active sync   /dev/.static/dev/hdb1


I'm not sure how this happened, but I've come up with a couple of possibilities, both dating back several months:

1. I migrated these disks from a previous test box to this server, and I may have messed up the initial install into the new box.
2. I swapped hdb1 and hdc1 when doing some maintenance.

In any case, LVM recognized the UUID on one of the disks and carried on normally. Now I'm trying to fix the situation. All the data on the disks is backed up, so I could just start fresh, but I would like to repair things if I can. Here are the steps I've come up with. Please check them for pitfalls, missing commands, etc.
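Before touching anything, I want to double-check which device LVM is actually binding to that PV UUID. Something like this (using the LVM2 reporting fields) should show it:

   $ pvs -o pv_name,pv_uuid,vg_name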

Outline:
To fix the LVM side, I will treat the situation as if I were migrating the disks to a new system, except that instead of moving drives I will be repairing the RAID. To fix the RAID, I need to determine which disk LVM has been using, restart the array with that disk, and then add the second disk back (see the check sketched below).
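To pick the disk with the most recent data, I figure I can compare the Update Time and Events counters in the two superblocks; in the -E output above both disks show Events 0.482221, so they look equally current:

   $ mdadm -E /dev/hdb1 | grep -E 'Update Time|Events'
   $ mdadm -E /dev/hdc1 | grep -E 'Update Time|Events'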

Detail:
1. Unmount file systems

# umount /backup
# umount /share/ftp
# umount /share

2. Mark Volume Group as inactive

# vgchange -an data

3. Export Volume Group

# vgexport data
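If I understand vgexport correctly, the VG should now show up as exported; checking the attribute string (exported VGs carry an "x" there, I believe) would confirm:

# vgs -o vg_name,vg_attr data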

4. Mount disks and examine contents for recent data

# mount /dev/hdb1 /mnt/hdb
# mount /dev/hdc1 /mnt/hdc

5. Restart the array with the device holding the most recent data

# mdadm -A /dev/md0 /dev/hdb1    (or /dev/hdc1)
(Will this start the array degraded, or do I need to create a new array with "mdadm -C /dev/md0 --level=1 --raid-devices=2 /dev/hdb1 missing"?)
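For what it's worth, my reading of the man page is that a plain -A will refuse to start a two-disk array with only one member present, but --run should force it up degraded:

# mdadm -A --run /dev/md0 /dev/hdb1

Failing that, recreating it degraded over the same disk should keep the data (though the array gets a new UUID):

# mdadm -C /dev/md0 --level=1 --raid-devices=2 /dev/hdb1 missing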

6. Add the second device as a spare

# mdadm /dev/md0 --add /dev/hdc1    (or /dev/hdb1)
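As I understand it, with raid-devices=2 the added disk won't actually sit idle as a spare; it should immediately become the rebuild target and start resyncing. I can watch the rebuild with:

$ cat /proc/mdstat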

7. Fix LVM (this assumes that pvscan now recognizes PV /dev/md0 as being a part of VG data)

#  vgimport data
#  vgchange -ay data
#  mount /share
#  mount /share/ftp
#  mount /backup

8.  Set up mdadm in monitor mode and have a beer!
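For the monitoring piece I'm thinking of something like the following (the mail address is a placeholder): add a MAILADDR line to /etc/mdadm/mdadm.conf,

   MAILADDR root

and then run the monitor as a daemon:

# mdadm --monitor --scan --daemonise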


Thanks for the help in advance

Colin


