
Re: boot times out after dist-upgrade on Stretch




On Fri, 17 Jun 2016 at 03:12, Borden Rhodes <jrvp@bordenrhodes.com> wrote:
Thank you for getting back to me, Sven,

I normally run apt-get update; apt-get dist-upgrade immediately after
my computer boots. According to the messages log, I turned the
computer on about 5 minutes before running that command and the last
log entry was about 3.5 hours later at 22:59. I hadn't fiddled with
any other settings during that boot. Unfortunately, without /var
loading on the dead boots, I can't get any log information except when
I successfully boot into the recovery console.

I should have mentioned that I also tried booting from the 4.5 kernel
and got the exact same symptoms. I also tried running update-grub in
case it had made a mistake whilst installing the 4.6 kernel.

Notwithstanding better ideas, I'm thinking of doing the Windows Safe
Mode troubleshooting method where I work out the systemd differences
between the default and recovery targets and gradually add services
until I find the one that breaks the system. I'm inferring that, since
recovery mode works but normal mode doesn't, then one of the
targets/services in normal mode is to blame for the lock up. I don't
suppose I could trouble the list for a resource on how to do that?

> Date: Wed, 15 Jun 2016 19:35:34 +0200
> From: Sven Joachim <svenjoac@gmx.de>
> To: debian-user@lists.debian.org
> Subject: Re: boot times out after dist-upgrade on Stretch
> Message-ID: <8760tapd7d.fsf@turtle.gmx.de>
> Content-Type: text/plain
>
> On 2016-06-15 07:58 +0000, Borden Rhodes wrote:
>
>> I ran apt dist-upgrade on Stretch (with a few Sid packages) which made
>> the following changes:
>>
>> Start-Date: 2016-06-14  19:42:39
>> Commandline: apt-get dist-upgrade
>> Requested-By: me (1000)
>> Install: libdw1:amd64 (0.163-5.1, automatic),
>> linux-image-4.6.0-1-amd64:amd64 (4.6.1-1, automatic)
>> Upgrade: wwwconfig-common:amd64 (0.2.2, 0.3.0), libcomerr2:amd64
>> (1.43-3, 1.43.1-1), libcomerr2:i386 (1.43-3, 1.43.1-1), libcups2:amd64
>> (2.1.3-5, 2.1.3-6), fuse2fs:amd64 (1.43-3, 1.43.1-1), e2fsprogs:amd64
>> (1.43-3, 1.43.1-1), boinc-client:amd64 (7.6.32+dfsg-2, 7.6.33+dfsg-1),
>> libbabeltrace1:amd64 (1.3.2-1, 1.4.0-1), cups-server-common:amd64
>> (2.1.3-5, 2.1.3-6), e2fslibs:amd64 (1.43-3, 1.43.1-1),
>> cups-common:amd64 (2.1.3-5, 2.1.3-6), libspice-server1:amd64
>> (0.12.6-4, 0.12.6-4.1), boinc-manager:amd64 (7.6.32+dfsg-2,
>> 7.6.33+dfsg-1), libss2:amd64 (1.43-3, 1.43.1-1), live-config-doc:amd64
>> (5.20151121, 5.20160608), libdatetime-timezone-perl:amd64
>> (1:1.98-1+2016d, 1:2.00-1+2016d), cups-ppdc:amd64 (2.1.3-5, 2.1.3-6),
>> libcupsmime1:amd64 (2.1.3-5, 2.1.3-6), python-paramiko:amd64
>> (1.16.0-1, 2.0.0-1), linux-image-amd64:amd64 (4.5+73, 4.6+74),
>> libboinc7:amd64 (7.6.32+dfsg-2, 7.6.33+dfsg-1), libcupsppdc1:amd64
>> (2.1.3-5, 2.1.3-6), libbabeltrace-ctf1:amd64 (1.3.2-1, 1.4.0-1),
>> live-config:amd64 (5.20151121, 5.20160608), cups-bsd:amd64 (2.1.3-5,
>> 2.1.3-6), cups-core-drivers:amd64 (2.1.3-5, 2.1.3-6),
>> cups-daemon:amd64 (2.1.3-5, 2.1.3-6), libcupsimage2:amd64 (2.1.3-5,
>> 2.1.3-6), cups:amd64 (2.1.3-5, 2.1.3-6), boinc:amd64 (7.6.32+dfsg-2,
>> 7.6.33+dfsg-1), libcupscgi1:amd64 (2.1.3-5, 2.1.3-6),
>> cups-client:amd64 (2.1.3-5, 2.1.3-6), live-config-systemd:amd64
>> (5.20151121, 5.20160608), libjpeg62-turbo:amd64 (1:1.4.2-2,
>> 1:1.5.0-1), libjpeg62-turbo:i386 (1:1.4.2-2, 1:1.5.0-1), xterm:amd64
>> (324-2, 325-1)
>> End-Date: 2016-06-14  19:46:44
>
> The only package related to the boot process seems to be
> linux-image-4.6.0-1-amd64.  However, there could be others which were
> upgraded earlier.  When did you last boot before this upgrade?
>
>> The system worked normally until I rebooted a few hours later. After
>> entering my encryption password (more on that later), boot up stalls
>> with a message saying that "A start job is running for" and then
>> switches between sda5_crypt.device, x2dhome.device, x2dvar.device,
>> x2dtmp.device, <UUID-for-my-root-partition>.device and
>> <UUID-for-my-dm-crypt-partition>.device.
>>
>> After 90 seconds, the start-up jobs time out and the boot tries to
>> start an emergency shell. However, the prompt never appears, nor does
>> the console respond to ^C or ^D as some suggest it might. CTRL+ALT+DEL
>> still works, though, so I know the system isn't completely locked up.
>>
>> The only error messages I can read after that, as earlier ones would
>> get truncated, are that systemd-tmpfiles-setup.service,
>> binfmt-support.service and networking.service all failed to start.
>
> Those probably fail because /tmp and /var could not be mounted.
>
>> I can, however, boot into single user recovery without the stall,
>> timeout or any error messages.
>>
>> I think it's relevant to note that my hard drive has a msdos partition
>> table (and a legacy BIOS), a LVM partition containing dm-crypt'd
>> partitions, each of which is formatted with a btrfs file system. Put
>> another way, here's my fstab:
>> # <file system> <mount point>   <type>  <options>       <dump>  <pass>
>> /dev/mapper/LVG-root /               btrfs   defaults        0       1
>> # /boot was on /dev/sda1 during installation
>> UUID=<UUID here> /boot           btrfs   defaults        0       2
>> /dev/mapper/LVG-home /home           btrfs   defaults        0       2
>> /dev/mapper/LVG-tmp /tmp            btrfs   defaults        0       2
>> /dev/mapper/LVG-var /var            btrfs   defaults        0       2
>> /dev/sr0        /media/cdrom0   udf,iso9660 user,noauto     0       0
>>
>> What hasn't worked:
>> - One site suggested that systemd requires acl. I added acl to all of
>> the options in fstab without success.
>
> That's a red herring; acl is only needed to tune the permissions for the
> journal.
>
>> - Another user on Arch had very similar symptoms to mine:
>> https://bbs.archlinux.org/viewtopic.php?id=210008 . However, my system
>> doesn't have mkinitcpio, so I can't try the solution that worked for
>> him. However, I have initramfs, so maybe adapting his solution would
>> work. I'd need guidance as to how so that I don't waste hours
>> experimenting with config files.
>
> I guess lvm already works in the initramfs, otherwise your root
> filesystem could not be mounted.
>
>> Could I get direction on how to troubleshoot this?
>
> Does the problem show up when you boot with the previous kernel
> (probably 4.5)?
>
> Cheers,
>        Sven


The services needed for a full boot are linked in /etc/systemd/system/multi-user.target.wants/ (from memory; I am not in front of my machine right now). I'd suggest noting down all the links in that directory, removing them all, then adding them back one by one until the boot breaks. The last link you restored is your culprit.
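
Something along these lines might save some typing. It is only a rough sketch: the function names and the holding directory are made up for illustration, and it assumes the links live in /etc/systemd/system/multi-user.target.wants/ as on a stock systemd layout. It only touches the symlinks in the one directory you give it, so units pulled in from elsewhere are not covered.

```shell
#!/bin/sh
# Sketch only: helper functions for bisecting the units wanted by a target.
# The function names and the holding directory are hypothetical.

# Print the name of every symlink in the given wants directory ($1).
list_wants() {
    for link in "$1"/*; do
        if [ -L "$link" ]; then
            basename "$link"
        fi
    done
    return 0
}

# Move every symlink from the wants directory ($1) into a holding
# directory ($2), so none of those units is pulled in on the next boot.
park_wants() {
    mkdir -p "$2"
    for link in "$1"/*; do
        if [ -L "$link" ]; then
            mv "$link" "$2"/
        fi
    done
    return 0
}

# Move one parked symlink ($3) from the holding directory ($1) back into
# the wants directory ($2), so that unit is started again on the next boot.
restore_one() {
    mv "$1/$3" "$2"/
}
```

From the recovery console, as root, you would source the file and then run, for example, `park_wants /etc/systemd/system/multi-user.target.wants /root/wants.parked`, reboot, and call `restore_one /root/wants.parked /etc/systemd/system/multi-user.target.wants <name>` for one link at a time between reboots. `list_wants /root/wants.parked` shows what is still parked.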

Mark
