
Re: Debian machine not booting



Thanks, Bob, for your e-mail; it was really helpful. I think you've identified the nub of the problem: not updating mdadm.conf and the initramfs. However, things are a bit unusual on this side. I'm not sure whether the rescue disk or I screwed something up, but the second RAID pair, which /home was extended onto, has split into two separate arrays. Here's a summary:

cat /proc/mdstat:

Personalities : [raid0] [raid1] [raid6] [raid5] [raid4]
md126 : active raid1 sdb3[0] sdc3[1]
      972550912 blocks [2/2] [UU]
     
md127 : active raid1 sdd1[0]
      1953510841 blocks super 1.2 [2/1] [U_]
     
md1 : active raid1 sde1[1]
      1953510841 blocks super 1.2 [2/1] [_U]
     
unused devices: <none>

cat /etc/mdadm/mdadm.conf
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default, scan all partitions (/proc/partitions) for MD superblocks.
# alternatively, specify devices to scan, using wildcards if desired.
DEVICE partitions

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md0 UUID=a529cd1b:c055887e:bfe78010:bc810f04

# This file was auto-generated on Mon, 11 Jan 2010 22:18:22 +0000
# by mkconf 3.0.3-2

mdadm --detail --scan:

ARRAY /dev/md/0_0 metadata=0.90 UUID=a529cd1b:c055887e:bfe78010:bc810f04

ls -l /dev/disk/by-uuid
total 0
lrwxrwxrwx 1 root root 10 Jun 30 23:25 5e39b4bc-3b24-4df3-978d-1b3d3dca97da -> ../../sdb1
lrwxrwxrwx 1 root root 10 Jun 30 23:25 93a8d1f1-96f2-4169-852a-b37100b3e497 -> ../../sda1
lrwxrwxrwx 1 root root 10 Jun 30 23:25 a5c8d2c0-e454-4288-9987-ea7712242858 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Jun 30 23:25 ba9f44ad-d43e-4863-801d-2de96d80ca08 -> ../../sdc2
lrwxrwxrwx 1 root root 10 Jun 30 23:25 ea2afa32-26b3-42af-83a3-57efc3ae3dce -> ../../sdb2

fdisk -l

Disk /dev/sdb: 1000.2 GB, 1000203804160 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xf229fe3e

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          37      297171   83  Linux
/dev/sdb2              38         524     3911827+  82  Linux swap / Solaris
/dev/sdb3             525      121601   972551002+  fd  Linux raid autodetect

Disk /dev/sda: 120.0 GB, 120033041920 bytes
255 heads, 63 sectors/track, 14593 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0002ae52

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1       14593   117218241   83  Linux

Disk /dev/sdc: 1000.2 GB, 1000203804160 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00049c5c

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1          37      297171   83  Linux
/dev/sdc2              38         524     3911827+  82  Linux swap / Solaris
/dev/sdc3             525      121601   972551002+  fd  Linux raid autodetect

Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0xe044b9be

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1      243201  1953512001   fd  Linux raid autodetect
Partition 1 does not start on physical sector boundary.

Disk /dev/sde: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0xcfa9d090

   Device Boot      Start         End      Blocks   Id  System
/dev/sde1               1      243201  1953512001   fd  Linux raid autodetect
Partition 1 does not start on physical sector boundary.

Disk /dev/md1: 2000.4 GB, 2000395101184 bytes
2 heads, 4 sectors/track, 488377710 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Alignment offset: 512 bytes
Disk identifier: 0x00000000


Disk /dev/md127: 2000.4 GB, 2000395101184 bytes
2 heads, 4 sectors/track, 488377710 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Alignment offset: 512 bytes
Disk identifier: 0x00000000


Disk /dev/md126: 995.9 GB, 995892133888 bytes
2 heads, 4 sectors/track, 243137728 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/dm-0: 10.5 GB, 10485760000 bytes
255 heads, 63 sectors/track, 1274 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/dm-1: 36.7 GB, 36700160000 bytes
255 heads, 63 sectors/track, 4461 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/dm-2: 1375.7 GB, 1375731712000 bytes
255 heads, 63 sectors/track, 167256 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Alignment offset: 512 bytes
Disk identifier: 0x00000000


Disk /dev/dm-3: 10.5 GB, 10485760000 bytes
255 heads, 63 sectors/track, 1274 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

------------------------------------------------------

I don't know if this helps or where to go from here, but I think I need to get mdadm up and running properly before I do anything else.
Some of the commands produced errors which didn't get written to the file above.

E.g., mdadm --detail --scan:
mdadm: cannot open /dev/md/Hawaiian:1: No such file or directory
mdadm: cannot open /dev/md/1: No such file or directory
ARRAY /dev/md/0_0 metadata=0.90 UUID=a529cd1b:c055887e:bfe78010:bc810f04
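
I'm guessing the next step is to compare the superblocks on the two halves of the split pair, something along these lines (just a sketch, I haven't run these yet):

  mdadm --examine /dev/sdd1    # array UUID, name and event count of one half
  mdadm --examine /dev/sde1    # same for the other half

If both halves report the same array UUID, presumably they could be re-assembled into a single array, but I'll hold off on stopping or assembling anything until you confirm.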

If there are any commands you need me to run, please ask.

Thanks,
James


On 18 June 2013 20:47, Bob Proulx <bob@proulx.com> wrote:
James Allsopp wrote:
> I have a debian machine which was on for a long time (~months). Just moved
> house and rebooted and now it doesn't boot.

Bummer.

> My 4 hard drives are organised in pairs of RAID 1 (mirrored) with LVM
> spanning them. Originally there was just one pair, but then I got two new
> hard drives and added them. I then increased the space of VolGroup-LogVol03
> to cover these new drives and increased the space of Home (/ was on one of
> the other logical volumes). This all worked fine for ages.

Sounds fine.  Assuming that it booted after those changes.

> When I boot all four drives are detected in BIOS and I've checked all the
> connections.

Good.

> It gets to "3 logical volumes in volume group "VolGroup" now active" which
> sounds good.

That does sound good.

> Then here's the error:
> "fsck.ext4: No such file or directory while trying to open
> /dev/mapper/VolGroup-LogVol03
> /dev/mapper/VolGroup-LogVol03:
> The superblock could not be read or does not describe a correct ext2
> ........."

Hmm...  I am not familiar with that error.  But searching the web
found several stories about it.  Most concerned recent changes to the
system that prevented it from booting.

> My 4 hard drives are organised in pairs of RAID 1 (mirrored) with LVM
> spanning them. Originally there was just one pair, but then I got two
> new hard drives and added them. I then increased the space of
> VolGroup-LogVol03 to cover these new drives and increased the space of
> Home (/ was on one of the other logical volumes). This all
> worked fine for ages.

And you rebooted in that time period?  Otherwise these changes, if not
done completely correctly, seem primed to have triggered your current
problem independent of any other action.  You say the machine was on
for a long time.  If you had not rebooted in all of that time then
this may have been a hang-fire problem for all of that time.

> I'm wondering if some of the drive id's have been switched.

If you mean the drive UUIDs then no those would not have changed.

> Any help would be really appreciated. I'm worried I've lost all my data on
> home

First, do not despair.  You should be able to get your system working
again.  You are probably simply missing the extra raid pair
configuration.

I strongly recommend using the debian-installer rescue mode to gain
control of your system again.  It works well and is readily
available.  Use a standard Debian installation disk.  Usually we
recommend the netinst disk because it is the smallest image.  But any
of the netinst, CD#1, or DVD#1 images will work fine for rescue mode,
since at that point it is not actually installing but merely booting
your system, so the difference between them does not matter.  You
have a disk?  Go fish it out and boot it.

Here is the official documentation for it:

  http://www.debian.org/releases/stable/i386/ch08s07.html.en

But that is fairly terse.  Let me say that the rescue mode looks just
like the install mode initially.  It will ask you keyboard and locale
questions and you might wonder if you are rescuing or installing!  But
it will have "Rescue" in the upper left corner so that you can tell
that you are not in install mode and be assured.  Get the tool set up
with keyboard, locale, timezone, and similar, and eventually it will
give you a menu with a list of actions.  Here is a quick run-through.

  Advanced options...
  Rescue mode
  keyboard dialog
  ...starts networking...
  hostname dialog
  domainname dialog
  ...apt update release files...
  ...loading additional components, Retrieving udebs...
  ...detecting disks...

Then eventually it will get to an "Enter rescue mode" menu that will
ask what device to use as a root file system.  It will list the
partitions that it has automatically detected.  If you have used
RAID then one of the menu entries near the bottom will be
"Assemble RAID array" and you should assemble the raid at that point.
That will bring up the next dialog menu asking for partitions to
assemble.  Select the partitions appropriate for your system, then
continue.  Since you have two RAID configurations I think you will
need to do this twice, once for each.  I believe that you won't be
able to use the option to automatically select partitions, but I am
not sure.  In any case get both raid arrays up and online at this
step before proceeding.
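
If the menu does not offer to assemble one of the arrays, the same
thing can be done by hand from the rescue shell.  This is only a
sketch with placeholder device names; check the real member
partitions first and make sure both halves carry the same array UUID
before assembling:

  mdadm --examine /dev/sdX1 /dev/sdY1              # compare array UUIDs
  mdadm --assemble /dev/md1 /dev/sdX1 /dev/sdY1    # assemble the pair

Substitute your actual partitions for sdX1 and sdY1.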

At that point it presents a menu "Execute a shell in /dev/...".  That
should get you a shell on your system with the root partition
mounted.  It is a /bin/sh shell.  At that point I usually start bash
so as to have command line recall and editing.  Then mount all of
the additional disks.

  # /bin/bash
  root@hostname:~# mount -a

At that point you have a root superuser shell on the system and can
make system changes.  After doing what needs doing you can reboot to
the system.  Remove the Debian install media and boot to the normal
system and see if the changes were able to fix the problem.

Now what is your original problem?  I think (not sure) you have added
a second raid pair but have not propagated the changes completely
through the boot system.

Basically, make sure that mdadm.conf is updated correctly and rebuild
the initramfs to make sure that it includes the new configuration.

  /etc/mdadm/mdadm.conf

  dpkg-reconfigure linux-image-$(uname -r)
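
A sketch of what that usually looks like once you have a root shell
on the rescued system.  The ARRAY lines must come from your own scan,
not copied from anyone else's mail:

  mdadm --detail --scan >> /etc/mdadm/mdadm.conf   # append running array definitions
  # then edit mdadm.conf and remove any duplicated ARRAY lines
  update-initramfs -u                              # regenerate the initramfs

Either update-initramfs -u or the dpkg-reconfigure above will
regenerate the initramfs; use whichever you prefer.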

Here are some previous messages on this topic.

  https://lists.debian.org/debian-user/2013/01/msg00195.html

  https://lists.debian.org/debian-user/2013/01/msg00392.html

Start by getting your system booted using rescue mode and then work
through the problems of the raid arrays not being assembled at boot
time.  Come back here and report your progress.

Bob

