
Status of RAID (md)



Hi all!

Today I found this e-mail notification of a Fail event from mdadm:

---------------------------------------------------------------------
This is an automatically generated mail message from mdadm
running on antares

A Fail event had been detected on md device /dev/md2.

It could be related to component device /dev/sdd3.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid1] [raid6] [raid5] [raid4]
md2 : active raid5 sda3[0] sdd3[4](F) sdc3[2]
      2136170880 blocks level 5, 64k chunk, algorithm 2 [4/2] [U_U_]

md1 : active raid1 sda2[0] sdd2[3] sdc2[2]
      19534976 blocks [4/3] [U_UU]

md0 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
      979840 blocks [4/4] [UUUU]

unused devices: <none>

---------------------------------------------------------------------
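
(For reference: in the mdstat output above, "[4/2]" means the array is
configured for four devices but only two of them are active, and in
"[U_U_]" each "U" is a working member while each "_" is a missing one.)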

It seems that one disk of the RAID-5 has disappeared and another one
has failed. A closer inspection shows:

antares:~# mdadm --detail /dev/md2
/dev/md2:
        Version : 00.90
  Creation Time : Thu Dec 17 13:18:29 2009
     Raid Level : raid5
     Array Size : 2136170880 (2037.21 GiB 2187.44 GB)
  Used Dev Size : 712056960 (679.07 GiB 729.15 GB)
   Raid Devices : 4
  Total Devices : 3
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Sun Aug  1 16:38:59 2010
          State : clean, degraded
 Active Devices : 2
Working Devices : 2
 Failed Devices : 1
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : be723ed5:c2ac3c34:a640c0ed:43e24fc2
         Events : 0.726249

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       0        0        1      removed
       2       8       35        2      active sync   /dev/sdc3
       3       0        0        3      removed

       4       8       51        -      faulty spare   /dev/sdd3


That is to say, the array has four disks, and it has lost both the disk
now marked as a faulty spare (/dev/sdd3) and another disk. What is
unclear to me is why, with two disks still active, the RAID appears to
be broken: the filesystem on top of it is mounted read-only:

# pvs
  PV         VG     Fmt  Attr PSize PFree
  /dev/md2   backup lvm2 a-   1,99T    0

# lvs
  LV    VG     Attr   LSize Origin Snap%  Move Log Copy%  Convert
  space backup -wi-ao 1,99T


# mount
/dev/md1 on / type ext3 (rw,errors=remount-ro)
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
procbususb on /proc/bus/usb type usbfs (rw)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
/dev/mapper/backup-space on /space type ext3 (ro)
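
My assumption is that ext3 aborted the journal on /space after a write
hit one of the missing stripes and the filesystem was remounted
read-only; the kernel log should confirm that. Something like the
following should show the relevant messages (just a sketch, I have not
included the output here):

# dmesg | grep -iE 'md2|ext3|i/o error'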


Then, I tried to add the missing disk, but the situation did not change:


# mdadm --add /dev/md2 /dev/sdb3
mdadm: re-added /dev/sdb3
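
(I suppose "re-added" means mdadm found the old RAID superblock on
/dev/sdb3 and put the disk back into its former slot.)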


antares:~# mdadm --detail /dev/md2
/dev/md2:
        Version : 00.90
  Creation Time : Thu Dec 17 13:18:29 2009
     Raid Level : raid5
     Array Size : 2136170880 (2037.21 GiB 2187.44 GB)
  Used Dev Size : 712056960 (679.07 GiB 729.15 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Sun Aug  1 17:03:19 2010
          State : clean, degraded
 Active Devices : 2
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : be723ed5:c2ac3c34:a640c0ed:43e24fc2
         Events : 0.726256

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       0        0        1      removed
       2       8       35        2      active sync   /dev/sdc3
       3       0        0        3      removed

       4       8       19        -      spare   /dev/sdb3
       5       8       51        -      faulty spare   /dev/sdd3


# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md2 : active raid5 sdb3[4](S) sda3[0] sdd3[5](F) sdc3[2]
      2136170880 blocks level 5, 64k chunk, algorithm 2 [4/2] [U_U_]

md1 : active raid1 sda2[0] sdd2[3] sdc2[2]
      19534976 blocks [4/3] [U_UU]

md0 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
      979840 blocks [4/4] [UUUU]

unused devices: <none>
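
Note that /dev/sdb3 now shows up as "sdb3[4](S)", i.e. as a spare, and
no recovery/rebuild line appears in /proc/mdstat.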


# mount
/dev/md1 on / type ext3 (rw,errors=remount-ro)
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
procbususb on /proc/bus/usb type usbfs (rw)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
/dev/mapper/backup-space on /space type ext3 (ro)
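
If it helps, I can also post the superblock of each component and the
SMART data of the failed disk. I would gather them with something like
this (commands shown as a sketch, output omitted):

# mdadm --examine /dev/sdb3
# mdadm --examine /dev/sdd3
# smartctl -a /dev/sdd

The event counters from --examine should show how far each disk has
fallen behind the rest of the array.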



What could be the problem?
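
I have read that a forced assembly can sometimes bring back a member
that failed only recently, along the lines of the following (untested
on my side, and it would first require unmounting /space and
deactivating the LVM volume group, so please correct me):

# mdadm --stop /dev/md2
# mdadm --assemble --force /dev/md2 /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3

but I am wary of trying that on the backup volume without advice.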

Thanks in advance for your reply.

Regards,
Daniel
-- 
Fingerprint: BFB3 08D6 B4D1 31B2 72B9  29CE 6696 BF1B 14E6 1D37
Powered by Debian GNU/Linux Lenny - Linux user #188.598
