
Re: Debian machine not booting



For further information:
/dev/sdb3:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : a529cd1b:c055887e:bfe78010:bc810f04
  Creation Time : Fri Nov 20 09:37:34 2009
     Raid Level : raid1
  Used Dev Size : 972550912 (927.50 GiB 995.89 GB)
     Array Size : 972550912 (927.50 GiB 995.89 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 126

    Update Time : Tue Jul  2 13:49:18 2013
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 6203fa40 - correct
         Events : 1036616


      Number   Major   Minor   RaidDevice State
this     0       8       19        0      active sync   /dev/sdb3

   0     0       8       19        0      active sync   /dev/sdb3
   1     1       8       35        1      active sync   /dev/sdc3
/dev/sdc3:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : a529cd1b:c055887e:bfe78010:bc810f04
  Creation Time : Fri Nov 20 09:37:34 2009
     Raid Level : raid1
  Used Dev Size : 972550912 (927.50 GiB 995.89 GB)
     Array Size : 972550912 (927.50 GiB 995.89 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 126

    Update Time : Tue Jul  2 13:49:18 2013
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 6203fa52 - correct
         Events : 1036616


      Number   Major   Minor   RaidDevice State
this     1       8       35        1      active sync   /dev/sdc3

   0     0       8       19        0      active sync   /dev/sdb3
   1     1       8       35        1      active sync   /dev/sdc3
/dev/sdd1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : a544829f:33778728:79870439:241c5c51
           Name : Hawaiian:1  (local to host Hawaiian)
  Creation Time : Thu Jan 31 22:43:49 2013
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 3907021954 (1863.01 GiB 2000.40 GB)
     Array Size : 3907021682 (1863.01 GiB 2000.40 GB)
  Used Dev Size : 3907021682 (1863.01 GiB 2000.40 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 1e0de6be:bbcc874e:e00e2caa:593de9b1

    Update Time : Tue Jul  2 13:51:19 2013
       Checksum : a8cf720f - correct
         Events : 108


   Device Role : Active device 0
   Array State : A. ('A' == active, '.' == missing)
/dev/sde1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : a544829f:33778728:79870439:241c5c51
           Name : Hawaiian:1  (local to host Hawaiian)
  Creation Time : Thu Jan 31 22:43:49 2013
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 3907021954 (1863.01 GiB 2000.40 GB)
     Array Size : 3907021682 (1863.01 GiB 2000.40 GB)
  Used Dev Size : 3907021682 (1863.01 GiB 2000.40 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 926788c3:9dfbf62b:26934208:5a72d05d

    Update Time : Tue Jul  2 13:51:05 2013
       Checksum : 94e2b4a1 - correct
         Events : 114


   Device Role : Active device 1
   Array State : .A ('A' == active, '.' == missing)


Thanks
James


On 2 July 2013 13:52, James Allsopp <jamesaallsopp@googlemail.com> wrote:
One other point: sda isn't the boot hard drive; the boot partitions are sdb1 and sdc1, and these should be the same (I thought I'd mirrored them, to be honest).
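
I suppose one way to check would be to run mdadm --examine on both and
see whether they report the same UUID:

  mdadm --examine /dev/sdb1 /dev/sdc1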

I tried mdadm --detail /dev/sdd1 but it didn't work. I have these results, if they help:
/dev/md1:
        Version : 1.2
  Creation Time : Thu Jan 31 22:43:49 2013
     Raid Level : raid1
     Array Size : 1953510841 (1863.01 GiB 2000.40 GB)
  Used Dev Size : 1953510841 (1863.01 GiB 2000.40 GB)
   Raid Devices : 2
  Total Devices : 1
    Persistence : Superblock is persistent

    Update Time : Tue Jul  2 13:49:55 2013
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           Name : Hawaiian:1  (local to host Hawaiian)
           UUID : a544829f:33778728:79870439:241c5c51
         Events : 112


    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8       65        1      active sync   /dev/sde1
/dev/md127:
        Version : 1.2
  Creation Time : Thu Jan 31 22:43:49 2013
     Raid Level : raid1
     Array Size : 1953510841 (1863.01 GiB 2000.40 GB)
  Used Dev Size : 1953510841 (1863.01 GiB 2000.40 GB)
   Raid Devices : 2
  Total Devices : 1
    Persistence : Superblock is persistent

    Update Time : Tue Jul  2 13:49:29 2013
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           Name : Hawaiian:1  (local to host Hawaiian)
           UUID : a544829f:33778728:79870439:241c5c51
         Events : 106


    Number   Major   Minor   RaidDevice State
       0       8       49        0      active sync   /dev/sdd1
       1       0        0        1      removed

How should I proceed from here?
James



On 2 July 2013 09:50, James Allsopp <jamesaallsopp@googlemail.com> wrote:
Thanks Bob, I'll get back to you after I've followed your instructions. I think I'm going to have to learn to type with crossed fingers!

I think I initially sorted out all my partitions manually, rather than using the installer to do it automatically.
Really appreciated,
James


On 2 July 2013 00:46, Bob Proulx <bob@proulx.com> wrote:
James Allsopp wrote:
> Personalities : [raid0] [raid1] [raid6] [raid5] [raid4]
> md126 : active raid1 sdb3[0] sdc3[1]
>       972550912 blocks [2/2] [UU]

So sdb3 and sdc3 are assembled into /dev/md126.  That seems good.  One
full array is assembled.

Is /dev/md126 your preferred name for that array?  I would guess not.
Usually it is /dev/md0 or some such.  But when that name is not
available because it is already in use then mdadm will rotate up to a
later name like /dev/md126.

You can fix this by using mdadm with --update=super-minor to force it
back to the desired name.  Something like this using your devices:

  mdadm --assemble /dev/md0 --update=super-minor /dev/sdb3 /dev/sdc3

But that can only be done at assembly time.  If it is already
assembled then you would need to stop the array first and then
assemble it again.
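
For example, if it is already up as /dev/md126:

  mdadm --stop /dev/md126

and then run the --assemble command above.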

> md127 : active raid1 sdd1[0]
>       1953510841 blocks super 1.2 [2/1] [U_]
>
> md1 : active raid1 sde1[1]
>       1953510841 blocks super 1.2 [2/1] [_U]

I think this array now has a split-brain problem.  At this point
the original single mirrored array has had both halves of the mirror
assembled separately, and both are running.  So now you have two
clones of each other and both are active, meaning that each thinks it
is newer than the other.  Is that right?  In which case you will
eventually need to pick one and call it the master.  I think sde1 is
the natural master since it is assembled on /dev/md1.

> cat /etc/mdadm/mdadm.conf
> ...
> # definitions of existing MD arrays
> ARRAY /dev/md0 UUID=a529cd1b:c055887e:bfe78010:bc810f04

Only one array specified.  That is definitely part of your problem.
You should have at least two arrays specified there.
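
With both arrays defined it would look something like this (the UUIDs
are whatever --detail or --examine reports for each array, so
substitute your own):

  ARRAY /dev/md0 UUID=<uuid of the sdb3/sdc3 array>
  ARRAY /dev/md1 UUID=<uuid of the sdd1/sde1 array>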

> mdadm --detail --scan:
>
> ARRAY /dev/md/0_0 metadata=0.90 UUID=a529cd1b:c055887e:bfe78010:bc810f04

That mdadm --scan only found one array is odd.

> fdisk -l
>
> Disk /dev/sda: 120.0 GB, 120033041920 bytes
> 255 heads, 63 sectors/track, 14593 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x0002ae52
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sda1               1       14593   117218241   83  Linux

I take it that this is your boot disk?  Your boot disk is not RAID?

I don't like that the first used sector is 1.  With the previous
debian-installer that would have been 63, to leave space for the MBR
and other things.  But that is a different issue.

> Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes
> 255 heads, 63 sectors/track, 243201 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
> Sector size (logical/physical): 512 bytes / 4096 bytes
> I/O size (minimum/optimal): 4096 bytes / 4096 bytes
                              ^^^^         ^^^^

That is an Advanced Format 4k sector drive.  Meaning that the
partitions should start on a 4k sector alignment.  The
debian-installer would do this automatically.

> Disk identifier: 0xe044b9be
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdd1               1      243201  1953512001   fd  Linux raid autodetect
                      ^^^^^
> /dev/sde1               1      243201  1953512001   fd  Linux raid autodetect
                      ^^^^^
> Partition 1 does not start on physical sector boundary.


I don't recall if the first sector is 0 or 1, but I think the first
sector is 0 for the partition table.  Meaning that sector 1 is not
going to be 4k aligned.  (Can someone double-check me on this?)
If so, this will require a lot of read-modify-write, causing
performance problems for those drives.

The new standard for sector alignment would start at 2048 to leave
space for the partition table and other things and still be aligned
properly.
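
If you want to double check an existing partition, parted can report
alignment directly (assuming parted is installed; partition 1 on
/dev/sdd here):

  parted /dev/sdd align-check opt 1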

> I don't know if this helps or where to go from here, but I think I need to
> get the mdadm up and running properly before I do anything.

Probably a good idea.

> If there's any commands you need me to run, please ask,

How are you booted now?  Are you root on the system through something
like the debian-installer rescue boot?  Or did you use a live cd or
something?

Please run:

  # mdadm --detail /dev/sdd1
  # mdadm --detail /dev/sde1

Those look to be the two halves of the split-brain second array.  They
will list something at the bottom that looks like:

        Number   Major   Minor   RaidDevice State
  this     1       8       17        1      active sync   /dev/sdb1

     0     0       8        1        0      active sync   /dev/sda1
     1     1       8       17        1      active sync   /dev/sdb1

Except in your case each will list one drive and will probably have
the other drive listed as removed.  But importantly it will list the
UUID of the array in the listing.

            Magic : a914bfec
          Version : 0.90.00
             UUID : b8eb34b1:bcd37664:2d9e4c59:117ab348
    Creation Time : Fri Apr 30 17:21:12 2010
       Raid Level : raid1
    Used Dev Size : 497856 (486.27 MiB 509.80 MB)
       Array Size : 497856 (486.27 MiB 509.80 MB)
     Raid Devices : 2
    Total Devices : 2
  Preferred Minor : 0

Check each physical volume and confirm from the UUID and the other
stats that the same array has been forked and is running on both.
The data in that header should be the same for both halves of the
cloned and split mirror.
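
A quick way to compare the important fields side by side:

  mdadm --examine /dev/sdd1 /dev/sde1 | grep -E 'UUID|Events|Update Time'

The Array UUID should be identical on both; the Events counters show
which half has been written to most recently.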

Corrective Action:

I _think_ you should stop the array on /dev/md127.  Then add that disk
to the array running on /dev/md1.  Don't do this until you have
confirmed that the two drives are clones of each other.  If they are
split then you need to join them.  I think something like this:

  mdadm --stop /dev/md127
  mdadm --manage /dev/md1 --add /dev/sdd1

Be sure to double check all of my device nodes and make sure you agree
with them before you run these commands.  But I think those are what
you want to do.  That will basically destroy whatever is currently on
sdd1 and sync sde1 onto sdd1.
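
While the resync runs you can watch its progress with:

  cat /proc/mdstat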

At that point you should have both arrays running.  You could stop
there and live with /dev/md126 but I think you want to fix the device
minor numbering on /dev/md126 by stopping the array and assembling it
again with the correct name.

  mdadm --stop /dev/md126
  mdadm --assemble /dev/md0 --update=super-minor /dev/sdb3 /dev/sdc3

At that point you should have two arrays up and running on /dev/md0
and /dev/md1 and both should have the low level lvm physical volumes
needed to assemble the lvm volume groups.  Run the --scan again.

  mdadm --detail --scan

Any errors at this time?  Hopefully it will list two arrays.  If not
then something is still wrong.  Here are some additional commands to
get the same information anyway.

  mdadm --detail /dev/md0
  mdadm --detail /dev/md1

  mdadm --examine /dev/sdb3
  mdadm --examine /dev/sdc3

  mdadm --examine /dev/sdd1
  mdadm --examine /dev/sde1

If that turns out favorable then edit the /etc/mdadm/mdadm.conf file
and update the list of ARRAY lines there.  I don't have the UUID
numbers from your system so can't suggest anything.  But the above
will list out the UUID numbers for the arrays.  Use them to update the
mdadm.conf file.
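
One common way to get candidate ARRAY lines is to append the scan
output to the file and then tidy it up by hand (back the file up
first):

  mdadm --detail --scan >> /etc/mdadm/mdadm.conf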

Then after updating that file, update the initramfs.  I usually
recommend using dpkg-reconfigure on the current kernel package, but
'update-initramfs -u' is okay too if you prefer.  The important
concept is that the initrd needs to be rebuilt to include the new
arrays listed in mdadm.conf, so that the arrays are assembled at
initramfs time.

  dpkg-reconfigure linux-image-$(uname -r)

At this point if everything worked then you should be good to go.  I
would cross your fingers and reboot.  If all is good then it should
reboot okay.

Just as additional debugging, after both arrays are up and online you
can activate the LVM manually.  I would probably try letting the
system reboot first.  But here are some low-level commands to debug
things further, as hints of where to look next in case they are
needed.

  modprobe dm-mod
  vgscan
  vgchange -aly

That should activate the LVM.  You should have devices in
/dev/mapper/* corresponding to your logical volumes, and you should
be able to see a listing of the logical volumes on the system with:

  lvs

Good luck!
Bob



