
Fwd: Do others see install failures on DNS-323 to ext2 FS from Debian 7.0 netboot.img?



Martin, and list,

I'm very grateful for your work on the DNS-323 and your efforts to
make an ssh-console installer available. I've also learned a lot from
the debian-arm mailing list gurus along the way -- thanks to you, too.

Summary: the current installer for the DNS-323, when installing to an
ext2 filesystem, cannot make the system bootable because the necessary
links are not created in /dev/disk/by-uuid/ . (Ext2 seems to be the only
Linux-native FS available in the installer.) A human can fix this from
the installer shell, but if the system is simply rebooted, recovery will
require serial console intervention and/or skilled reworking of the boot
disk.

Are we alone in seeing this? Should I file a bug or re-open an existing bug?

-------------------------

Long Description with logs:

I believe I have observed a bug that has bitten the recent ssh-console
installer for Debian 7.0 on the D-Link DNS-323. I'm sorry the
description is so long but it's complicated for a simple user like me.

The alleged bug was previously reported by others in other contexts
but not really dealt with as yet:
 -- http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=721485 [Paul]
 -- http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=701223 [Alexander]

This description includes work by me (dwarf in the transcript
below) and a pseudonymous associate (timdau). The failure occurred on
my DNS-323 B1 machine and also on one belonging to another colleague.
We both started with stock firmwares from D-Link and used the current
netboot.img from Martin's DNS-323 page, 6638688 bytes in size.

The Debian installer runs well (although lacking modern FS options
like ext3 and ext4) until it tries to make the new system bootable.

Installer is running in :
      Low memory mode

  Configuring flash memory to boot the system
  Installed flash-kernel
[ Installer locks up here ^^ ]

Tail of /var/log/syslog:  [Likely you want to jump to end of log]

Oct  8 21:27:48 in-target: update-initramfs: deferring update (trigger
activated)
Oct  8 21:27:48 in-target: Processing triggers for python-support ...
Oct  8 21:27:51 in-target: Processing triggers for ca-certificates ...
Oct  8 21:27:51 in-target: Updating certificates in /etc/ssl/certs...
Oct  8 21:28:36 in-target: 158 added, 0 removed; done.
Oct  8 21:28:36 in-target: Running hooks in /etc/ca-certificates/update.d....
Oct  8 21:28:36 in-target: done.
Oct  8 21:28:37 in-target: Processing triggers for sgml-base ...
Oct  8 21:28:37 in-target: Processing triggers for initramfs-tools ...
Oct  8 21:28:37 in-target: update-initramfs: Generating
/boot/initrd.img-3.2.0-4-orion5x
Oct  8 21:29:10 pkgsel: finishing up
Oct  8 21:29:11 main-menu[1464]: DEBUG: resolver (libgcc1): package
doesn't exist (ignored)
Oct  8 21:29:11 main-menu[1464]: INFO: Menu item
'flash-kernel-installer' selected
Oct  8 21:29:20 in-target: Reading package lists...
Oct  8 21:29:20 in-target:
Oct  8 21:29:20 in-target: Building dependency tree...
Oct  8 21:29:26 in-target:
Oct  8 21:29:26 in-target: Reading state information...
Oct  8 21:29:26 in-target:
Oct  8 21:29:28 in-target: The following extra packages will be installed:
Oct  8 21:29:28 in-target:   devio
Oct  8 21:29:28 in-target: Suggested packages:
Oct  8 21:29:28 in-target:   u-boot-tools
Oct  8 21:29:28 in-target: The following NEW packages will be installed:
Oct  8 21:29:28 in-target:   devio flash-kernel
Oct  8 21:29:29 in-target: 0 upgraded, 2 newly installed, 0 to remove
and 0 not upgraded.
Oct  8 21:29:29 in-target: Need to get 42.4 kB of archives.
Oct  8 21:29:29 in-target: After this operation, 223 kB of additional
disk space will be used.
Oct  8 21:29:29 in-target: Get:1 http://ftp.us.debian.org/debian/
wheezy/main devio armel 1.2-1+b1 [16.9 kB]
Oct  8 21:29:29 in-target: Get:2 http://ftp.us.debian.org/debian/
wheezy/main flash-kernel armel 3.3 [25.5 kB]
Oct  8 21:29:42 in-target: Fetched 42.4 kB in 0s (66.1 kB/s)
Oct  8 21:29:43 in-target: Selecting previously unselected package devio.
Oct  8 21:29:43 in-target: (Reading database ...
Oct  8 21:29:43 in-target: 22721 files and directories currently installed.)
Oct  8 21:29:43 in-target: Unpacking devio (from
.../devio_1.2-1+b1_armel.deb) ...
Oct  8 21:29:44 in-target: Selecting previously unselected package flash-kernel.
Oct  8 21:29:44 in-target: Unpacking flash-kernel (from
.../flash-kernel_3.3_armel.deb) ...
Oct  8 21:29:44 in-target: Processing triggers for man-db ...
Oct  8 21:29:56 in-target: Setting up devio (1.2-1+b1) ...
Oct  8 21:29:56 in-target: Setting up flash-kernel (3.3) ...
Oct  8 21:30:06 in-target: update-initramfs: Generating
/boot/initrd.img-3.2.0-4-orion5x
Oct  8 21:30:11 in-target: UUID 422805e4-bd52-4ee8-8fce-2abd5079a1ea
doesn't exist in /dev/disk/by-uuid
Oct  8 21:30:11 in-target: Warning: root device
/dev/disk/by-uuid/422805e4-bd52-4ee8-8fce-2abd5079a1ea does not exist
Oct  8 21:30:11 in-target:
Oct  8 21:30:11 in-target: Press Ctrl-C to abort build, or Enter to continue

On my colleague's machine with an apparently identical problem, the
relevant lines were:

Oct  7 18:41:56 in-target: UUID 13f9b045-eb8a-4c4e-8f4d-4e999d2d0061
doesn't exist in /dev/disk/by-uuid
Oct  7 18:41:56 in-target: Warning: root device
/dev/disk/by-uuid/13f9b045-eb8a-4c4e-8f4d-4e999d2d0061 does not exist
Oct  7 18:41:56 in-target:
Oct  7 18:41:56 in-target: Press Ctrl-C to abort build, or Enter to continue

It stalls there because the prompt goes to syslog rather than to a
console, so there is no way to answer it.
This is the famous 'installer line 33' mentioned in
    http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=721485

Previous reports suggest perhaps there is a missing "udevadm trigger"
for some reason, apparently restricted to ext2 filesystems.
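
For anyone who wants to test that theory, here is a minimal sketch of
replaying block-device events from the installer shell. It assumes
udevadm is available there; retrigger_block_devices is a hypothetical
helper and UDEVADM is a hypothetical override used only for dry runs.
I have not verified that this repopulates /dev/disk/by-uuid on the
DNS-323.

```shell
#!/bin/sh
# Sketch only: ask udev to replay "change" events for block devices,
# then wait for the event queue to drain.
# UDEVADM is a hypothetical override; by default the real udevadm runs.
retrigger_block_devices() {
    udevadm_cmd=${UDEVADM:-udevadm}
    "$udevadm_cmd" trigger --subsystem-match=block --action=change &&
    "$udevadm_cmd" settle
}

# Intended use from the installer shell, followed by a manual check:
#   retrigger_block_devices
#   ls -al /dev/disk/by-uuid
```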

For context, I can offer some information from the installer's
emergency shell after the failure:

/dev/disk/by-id # uname -a
Linux exile 3.2.0-4-orion5x #1 Debian 3.2.46-1 armv5tel GNU/Linux

/dev/disk/by-id # free
             total         used         free       shared      buffers
Mem:         60620        57676         2944            0         1124
-/+ buffers:              56552         4068
Swap:       248828        13588       235240

/dev/disk/by-id # df -h
Filesystem                Size      Used Available Use% Mounted on
[...]
/dev/sda2                 1.8T    772.7M      1.7T   0% /target
/dev/sda2                 1.8T    772.7M      1.7T   0% /dev/.static/dev


/dev/disk/by-id # mount
rootfs on / type rootfs (rw)
none on /run type tmpfs (rw,nosuid,relatime,size=6064k,mode=755)
none on /proc type proc (rw,relatime)
none on /sys type sysfs (rw,relatime)
tmpfs on /dev type tmpfs (rw,relatime,mode=755)
none on /dev/pts type devpts (rw,relatime,gid=5,mode=620,ptmxmode=000)
/dev/sda2 on /target type ext2 (rw,relatime,errors=remount-ro)
/dev/sda2 on /dev/.static/dev type ext2 (rw,relatime,errors=remount-ro)
tmpfs on /target/dev type tmpfs (rw,relatime,mode=755)
proc on /target/proc type proc (rw,relatime)
sysfs on /target/sys type sysfs (rw,relatime)
devpts on /target/dev/pts type devpts
(rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
none on /target/run type tmpfs (rw,nosuid,relatime,size=6064k,mode=755)


Based on the error message it seemed like this should be key:

/dev/disk # blkid
/dev/sda2: UUID="422805e4-bd52-4ee8-8fce-2abd5079a1ea" TYPE="ext2"
/dev/sda1: UUID="03e471c7-c4c9-477d-a6fd-0544b93bf3fe" TYPE="swap"
/dev/mtdblock0: TYPE="minix"
/dev/mtdblock1: TYPE="minix"
/dev/sdb1: TYPE="swap"
/dev/sdb2: UUID="ab0c42a7-ad0a-4f80-9795-d6a7aa551883" TYPE="ext2"
/dev/sdb4: UUID="446aa69a-f6ef-4d12-9de7-46174b332365" TYPE="ext2"

Don't let sdb* confuse you; that disk is present but has no relevance
to this install.

/dev/disk # ls -al *
by-id:
drwxr-xr-x    2 root     root           460 Oct  8 20:15 .
drwxr-xr-x    5 root     root           100 Oct  8 14:42 ..
[...]
lrwxrwxrwx    1 root     root             9 Oct  8 20:39
ata-TOSHIBA_DT01ACA200_73UUARRKS -> ../../sda
lrwxrwxrwx    1 root     root            10 Oct  8 20:15
ata-TOSHIBA_DT01ACA200_73UUARRKS-part1 -> ../../sda1
lrwxrwxrwx    1 root     root            10 Oct  8 20:15
ata-TOSHIBA_DT01ACA200_73UUARRKS-part2 -> ../../sda2
[...]
lrwxrwxrwx    1 root     root             9 Oct  8 20:39
scsi-SATA_TOSHIBA_DT01ACA_73UUARRKS -> ../../sda
lrwxrwxrwx    1 root     root            10 Oct  8 20:15
scsi-SATA_TOSHIBA_DT01ACA_73UUARRKS-part1 -> ../../sda1
lrwxrwxrwx    1 root     root            10 Oct  8 20:15
scsi-SATA_TOSHIBA_DT01ACA_73UUARRKS-part2 -> ../../sda2
[...]

by-path:
[...]

by-uuid:
drwxr-xr-x    2 root     root            80 Oct  8 14:42 .
drwxr-xr-x    5 root     root           100 Oct  8 14:42 ..
lrwxrwxrwx    1 root     root            10 Oct  8 19:58
446aa69a-f6ef-4d12-9de7-46174b332365 -> ../../sdb4
lrwxrwxrwx    1 root     root            10 Oct  8 19:58
ab0c42a7-ad0a-4f80-9795-d6a7aa551883 -> ../../sdb2

Just as the installer said, no links there to sda*.
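
The comparison between the blkid output and the by-uuid directory can
be automated. This is a minimal sketch, assuming blkid prints lines in
the format shown above; missing_uuid_links is a hypothetical helper
name, not something shipped with the installer.

```shell
#!/bin/sh
# Sketch only: print "UUID device" for every UUID that blkid reports
# but that has no symlink in the given directory.
missing_uuid_links() {
    # $1: directory to check (normally /dev/disk/by-uuid)
    # stdin: lines in the format printed by blkid
    dir=$1
    # Keep only lines that carry a UUID="..." field; emit "device uuid".
    sed -n 's/^\([^:]*\):.* UUID="\([^"]*\)".*/\1 \2/p' |
    while read -r dev uuid; do
        [ -L "$dir/$uuid" ] || echo "$uuid $dev"
    done
}

# Intended use from the installer shell:
#   blkid | missing_uuid_links /dev/disk/by-uuid
```

Devices without a UUID field, such as the mtdblock entries above, are
skipped.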

The bug description per se ends here; what follows describes how I
solved the problem from an emergency shell, and why my solution
confirms the problem is as shown above.
--------------------------------------------------------------------

/target/etc # more fstab
# /etc/fstab: static file system information.
#
[...]
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
# / was on /dev/sda2 during installation
UUID=422805e4-bd52-4ee8-8fce-2abd5079a1ea /               ext2
errors=remount-ro 0       1
# swap was on /dev/sda1 during installation
UUID=03e471c7-c4c9-477d-a6fd-0544b93bf3fe none            swap    sw
           0       0

Based on the above info, I manually created the softlinks that were
missing, for a root FS and a swap partition.

/dev/disk/by-uuid # ln -s ../../sda2  422805e4-bd52-4ee8-8fce-2abd5079a1ea
/dev/disk/by-uuid # ln -s ../../sda1 03e471c7-c4c9-477d-a6fd-0544b93bf3fe

/dev/disk/by-uuid # ls -al
[...]
lrwxrwxrwx    1 root     root            10 Oct  9 01:36
03e471c7-c4c9-477d-a6fd-0544b93bf3fe -> ../../sda1
lrwxrwxrwx    1 root     root            10 Oct  9 01:36
422805e4-bd52-4ee8-8fce-2abd5079a1ea -> ../../sda2
lrwxrwxrwx    1 root     root            10 Oct  8 19:58
446aa69a-f6ef-4d12-9de7-46174b332365 -> ../../sdb4
lrwxrwxrwx    1 root     root            10 Oct  8 19:58
ab0c42a7-ad0a-4f80-9795-d6a7aa551883 -> ../../sdb2
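
The two ln commands above can be wrapped so that they are safe to
repeat. This is a minimal sketch; link_uuid is a hypothetical helper
name. Since /dev is a tmpfs in the installer (see the mount output
earlier), I assume such links only need to exist while
update-initramfs runs and will not survive a reboot.

```shell
#!/bin/sh
# Sketch only: create one by-uuid symlink by hand, if it is missing.
link_uuid() {
    # $1: by-uuid directory
    # $2: filesystem UUID
    # $3: device node, e.g. /dev/sda2
    dir=$1 uuid=$2 dev=$3
    # Relative target in the same style as the existing links (../../sda2).
    target="../../${dev#/dev/}"
    [ -L "$dir/$uuid" ] || ln -s "$target" "$dir/$uuid"
}

# Equivalent to the manual commands in this report:
#   link_uuid /dev/disk/by-uuid 422805e4-bd52-4ee8-8fce-2abd5079a1ea /dev/sda2
#   link_uuid /dev/disk/by-uuid 03e471c7-c4c9-477d-a6fd-0544b93bf3fe /dev/sda1
```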

Links are there now but the installer is stuck waiting for input, so...

21:38 <@dwarf> I didn't know what to do next. :)
21:38 <@dwarf> I should manually run update-intramfs?
21:39 <@timdau> you can try the installer
21:39 <@dwarf> ... it's waiting for an input that can't come, right now.
21:39 <@timdau> right, you can kill that and it will return.

ps shows several things waiting that I might kill:
  959 root      1640 S    {debian-installe} /bin/sh /sbin/debian-installer
 5227 root      1740 S    {update-initramf} /bin/sh
/usr/sbin/update-initramfs -u
 5519 root      1740 S    {flash_kernel_se} /bin/sh
/usr/share/initramfs-tools/hooks/flash_kernel_set_root

21:43 <@timdau> are you inside the chroot?  [...] run: chroot /target /bin/bash
21:44 <@timdau> and then pstree :)
21:45 <@dwarf> I couldn't figure out how to run executables from
within /target :) Thanks.
21:46 <@dwarf> So, one likely line from the pstree reads:

sshd─┬─sshd──debian-installe─debconf─main-menu─udpkg─flash-kernel-in─in-target─log-output──update-initramf──mkinitramfs──flash_kernel_se

21:47 <@timdau> right, so kill 5519

I kill 5519.

Installer tries to pick up where it left off and says:
__
| An installation step failed. You can try to run the failing item again
| from the menu, or skip it and choose something else. The failing
| step is: Make the system bootable
---

The installer then fails back to the main menu, which is good.

I try again to
     ( ) Make the system bootable
but it fails again. This time the installer doesn't halt silently: I
quickly get the same failure message as above. The syslog shows:

Oct  9 01:52:22 main-menu[1464]: INFO: Menu item
'flash-kernel-installer' selected
Oct  9 01:52:31 in-target: Reading package lists...
Oct  9 01:52:31 in-target: Building dependency tree...
Oct  9 01:52:37 in-target: Reading state information...
Oct  9 01:52:40 in-target: flash-kernel is already the newest version.
Oct  9 01:52:40 in-target: 0 upgraded, 0 newly installed, 0 to remove
and 0 not upgraded.
Oct  9 01:52:43 in-target: update-initramfs: Generating
/boot/initrd.img-3.2.0-4-orion5x
Oct  9 01:53:12 in-target: flash-kernel: installing version 3.2.0-4-orion5x
Oct  9 01:53:14 in-target: Generating kernel u-boot image...
/usr/sbin/flash-kernel: 232: /usr/sbin/flash-kernel: mkimage: not
found
Oct  9 01:53:14 flash-kernel-installer: error: flash-kernel failed
Oct  9 01:53:14 main-menu[1464]: WARNING **: Configuring
'flash-kernel-installer' failed with error code 1
Oct  9 01:53:14 main-menu[1464]: WARNING **: Menu item
'flash-kernel-installer' failed.

I quote the log to timdau, who says I need to chroot again and, within
/target, install u-boot-tools. (I'm not sure why this is necessary.)

root@exile:/dev/disk/by-uuid# apt-get install u-boot-tools
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
  u-boot-tools
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 58.1 kB of archives.
After this operation, 160 kB of additional disk space will be used.
Get:1 http://ftp.us.debian.org/debian/ wheezy/main u-boot-tools armel
2012.04.01-2 [58.1 kB]
Fetched 58.1 kB in 0s (143 kB/s)
Can not write log, openpty() failed (/dev/pts not mounted?)
Selecting previously unselected package u-boot-tools.
(Reading database ... 22748 files and directories currently installed.)
Unpacking u-boot-tools (from .../u-boot-tools_2012.04.01-2_armel.deb) ...
Processing triggers for man-db ...
Can not write log, openpty() failed (/dev/pts not mounted?)
Setting up u-boot-tools (2012.04.01-2) ...
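
A possible explanation, going only by the logs above: the first syslog
excerpt lists u-boot-tools under "Suggested packages" for flash-kernel
3.3, so it was not installed automatically, yet flash-kernel later
calls mkimage, which u-boot-tools provides. A minimal sketch of
checking for the tool before retrying the menu item; have_cmd is a
hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: succeed if the named command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Intended use, after "chroot /target /bin/bash" and with have_cmd
# defined inside that chroot shell:
#   have_cmd mkimage || apt-get install u-boot-tools
```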

[...]

On the next try, the attempt to make the system bootable works, and
syslog shows the installer is now happy:
Oct  9 02:08:17 finish-install: info: Running
/usr/lib/finish-install.d/10update-initramfs
Oct  9 02:08:18 finish-install: info: Running
/usr/lib/finish-install.d/20final-message

We reboot, and ssh in as a regular user.
Oh happy day!

Discussion ensued as to whether this was a serious bug, and if so how
it had escaped detection so far.

21:55 <@timdau> looking back, you should have tried with squeeze or
lenny or something.
21:56 <@dwarf> Yes. I realized yesterday, after vio's install fell
apart, that I *have* a wheezy DNS-323, but I didn't install it.
21:57 <@dwarf> I installed squeeze, which worked fine and was current
at the time, and then when wheezy came out, I upgraded. Which also
worked fine.

I wrote a summary email to one of the bug filers who noted the various
factors that have to be in place for the failure to manifest itself --
low memory, ext2 FS, and possibly a very bare task selection.

I have preserved a few additional logs and I could probably reproduce
the problem on demand if it isn't happening elsewhere and you'd like
to see it happen.

My system now works fine.

My colleague now has an apparently-bricked DNS-323 and is working on
putting together a serial cable. I'm going to see if I can recover his
system just by pulling the drive and mucking with /etc/fstab, /boot,
/dev/disk/by-uuid, etc; this seems easier to me than soldering. :)

I have to acknowledge that I haven't demonstrated the bug is only on
ARM. It's possible the installer has this bug on x86 too, but doesn't
manifest (often?) because Linux is rarely installed headless on x86,
or because an ext2 root FS is rarely selected or ...

My apologies for being so verbose. Any comments? Should I re-open one
of the bugs above?

Thanks,
tai viinikka

--
tai@eastpole.ca   ::::   East Pole Productions

