
Bug#791794: RAID device not active during boot




When all disks are available during boot, the system starts without problems:

   >  ls -l /dev/disk/by-uuid/ 
   total 0
   lrwxrwxrwx 1 root root 10 Jul 13 18:15 2138f67e-7b9e-4960-80d3-2ac2ce31d882 -> ../../sdc2
   lrwxrwxrwx 1 root root 10 Jul 13 18:15 21a660eb-729d-48fe-b9e3-140ae0ee79f4 -> ../../sdd2
   lrwxrwxrwx 1 root root  9 Jul 13 18:15 c4263f89-eb0c-4372-90ae-ce1a1545613e -> ../../md0
   lrwxrwxrwx 1 root root 10 Jul 13 18:15 cbeaebcb-2c55-48c0-b6bd-d5e8a5c4ac06 -> ../../sdb2
   lrwxrwxrwx 1 root root 10 Jul 13 18:15 ff2bae51-c5b8-41e3-855b-68ee57b61c0c -> ../../sda2


When starting the system with only two (instead of four) disks, I'm dropped into the emergency shell with the following error message:

   ALERT!  /dev/disk/by-uuid/c4263f89-eb0c-4372-90ae-ce1a1545613e does not exist.  Dropping to a shell!

... which seems to be consistent with the fact that the UUID for  /dev/md0  is not available ...

   (initramfs)  ls -l /dev/disk/by-uuid/
   total 0
   lrwxrwxrwx    1 0        0               10 Jul 13 15:20 cbeaebcb-2c55-48c0-b6bd-d5e8a5c4ac06 -> ../../sdb2 
   lrwxrwxrwx    1 0        0               10 Jul 13 15:20 ff2bae51-c5b8-41e3-855b-68ee57b61c0c -> ../../sda2


... which in turn is caused by the RAID device itself being inactive at that time:

   (initramfs)  cat /proc/mdstat
   Personalities :
   md0 : inactive sdb1[5](S) sda1[6](S)
         39028736 blocks super 1.2 
 
   unused devices: <none>


In order to re-activate  /dev/md0  I use the following commands:

   (initramfs)  mdadm --stop /dev/md0
   [  178.719551] md: md0 stopped.
   [  178.722463] md: unbind<sdb1>
   [  178.725386] md: export_rdev(sdb1) 
   [  178.728804] md: unbind<sda1> 
   [  178.731711] md: export_rdev(sda1) 
   mdadm: stopped /dev/md0

   (initramfs)  mdadm --assemble /dev/md0 
   [  214.171191] md: md0 stopped. 
   [  214.184471] md: bind<sda1> 
   [  214.195838] md: bind<sdb1> 
   [  214.218253] md: raid1 personality registered for level 1
   [  214.226156] md/raid1:md0: active with 1 out of 3 mirrors
   [  214.231651] md0: detected capacity change from 0 to 19982581760 
   [  214.247893]  md0: unknown partition table
   mdadm: /dev/md0 has been started with 1 drive (out of 3) and 1 spare.

   (initramfs)  cat /proc/mdstat
   Personalities : [raid1]
   md0 : active (auto-read-only) raid1 sdb1[5] sda1[6](S) 
         19514240 blocks super 1.2 [3/1] [U__]
 
   unused devices: <none>


... which makes the RAID device available in /dev/disk/by-uuid/ again:

   (initramfs)  ls -l /dev/disk/by-uuid/
   total 0
   lrwxrwxrwx    1 0        0                9 Jul 13 15:24 c4263f89-eb0c-4372-90ae-ce1a1545613e -> ../../md0
   lrwxrwxrwx    1 0        0               10 Jul 13 15:20 cbeaebcb-2c55-48c0-b6bd-d5e8a5c4ac06 -> ../../sdb2
   lrwxrwxrwx    1 0        0               10 Jul 13 15:20 ff2bae51-c5b8-41e3-855b-68ee57b61c0c -> ../../sda2


Now, if I  exit  the emergency shell, the system is able to boot without problems.

In bug report #784070 it is mentioned that "with the version of mdadm shipping with Debian Jessie, the --run parameter seems to be ignored when used in conjunction with --scan. According to the man page it is supposed to activate all arrays even if they are degraded. But instead, any arrays that are degraded are marked as 'inactive'. If the root filesystem is on one of those inactive arrays, the boot process is halted."
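
For reference, the invocation in question should be something like the following (just a sketch; the exact arguments used by the initramfs hook may differ):

   # --run is documented to start the arrays even when degraded,
   # but with the Jessie mdadm the degraded array stays inactive:
   mdadm --assemble --scan --run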

As suggested in the bug report (see message #109) I have changed the file  /usr/share/initramfs-tools/scripts/local-top/mdadm  and used the command  update-initramfs -u  in order to update  /boot/initrd.img-3.16.*  (you might first want to make a copy of that file before updating it).
After a reboot the system is able to start even if some of the disks belonging to the RAID array are missing (see the boot log from the serial console below):

   ...
   Begin: Running /scripts/init-premount ... done.
   Begin: Mounting root file system ... Begin: Running /scripts/local-top ... Begin: Assembling all MD arrays ... [   24.799665] random
   : nonblocking pool is initialized
   Failure: failed to assemble all arrays.
   done.
   Begin: Assembling all MD arrays ... Warning: failed to assemble all arrays...attempting individual starts
   Begin: attempting mdadm --run md0 ... [   24.883069] md: raid1 personality registered for level 1
   [   24.889111] md/raid1:md0: active with 2 out of 3 mirrors
   [   24.894598] md0: detected capacity change from 0 to 19982581760 
   mdadm: started array /dev/md/0 
   [   24.908255]  md0: unknown partition table
   Success: started md0 
   done. 
   done.  
   Begin: Running /scripts/local-premount ... done. 
   Begin: Checking root file system ... fsck from util-linux 2.25.2 
   /dev/md0: clean, 36905/1220608 files, 398026/4878560 blocks
   done.
   ...
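
Judging from the "attempting individual starts" messages above, the change adds a fallback that retries each array individually with  mdadm --run  after the initial  mdadm --assemble --scan --run  has failed. The real patch is in message #109 of bug #784070; the following is only a rough sketch of that logic, not the actual code (the log_* helpers come from the initramfs-tools  /scripts/functions  file):

   # Hypothetical sketch of the fallback in /scripts/local-top/mdadm,
   # modelled on the boot messages above; see message #109 for the real patch.
   if ! mdadm --assemble --scan --run; then
      log_warning_msg "failed to assemble all arrays...attempting individual starts"
      for dev in /dev/md?*; do
         array="${dev#/dev/}"
         log_begin_msg "attempting mdadm --run $array"
         if mdadm --run "$dev"; then
            log_success_msg "started $array"
         else
            log_failure_msg "failed to start $array"
         fi
         log_end_msg
      done
   fi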


Problem solved ...
... and many thanks to Phil.



PS:
There is still one thing I do not understand:
The file  etc/mdadm/mdadm.conf  (within initrd.img.*) contains a UUID (see below) ...

   ARRAY /dev/md/0 metadata=1.2 UUID=92da2301:37626555:6e73a527:3ccc045f name=debian:0
      spares=1

... which seems to be different from the output of  ls -l /dev/disk/by-uuid:

   lrwxrwxrwx 1 root root  9 Jul 14 11:27 c4263f89-eb0c-4372-90ae-ce1a1545613e -> ../../md0
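
Possibly these are simply two different identifiers: the UUID in  mdadm.conf  is the RAID array UUID stored in the md superblock, while  /dev/disk/by-uuid/  lists the UUID of the filesystem that lives on  /dev/md0 . Something like this should show both (untested sketch):

   # array UUID (should match mdadm.conf):
   mdadm --detail /dev/md0 | grep UUID
   # filesystem UUID (should match /dev/disk/by-uuid/):
   blkid /dev/md0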
















