Re: Boot stalling in jessie with systemd



Quoting Don Armstrong (don@debian.org):
> On Tue, 28 Apr 2015, David Wright wrote:
> > There are one or two services that stall on odd occasions, but most
> > have a timeout which is honoured. binfmt-support has an indefinite
> > timeout. I have no idea what it's meant to do. All its configuration
> > directories/files are empty.
> 
> They really shouldn't be empty. If binfmt-support is hanging
> indefinitely, something is wrong. Is this a Debian kernel? Or your own?

$ uname -a
Linux west 3.16.0-4-686-pae #1 SMP Debian 3.16.7-ckt9-3~deb8u1 (2015-04-24) i686 GNU/Linux
$

> Is the binfmt_misc module loaded?

This seems to be the root of the problem. It's missing from the
lsmod output. Sometimes it's the only module missing, sometimes a
variety of others are missing too.
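
For repeated checks after each boot, something along these lines might
do. It is only a sketch: module_loaded is a name I made up, and it reads
/proc/modules (the file lsmod formats), so a driver compiled into the
kernel rather than built as a module would not show up.

```shell
# module_loaded NAME [MODULES_FILE]
# Succeeds if NAME is listed as a loaded module.
# MODULES_FILE defaults to /proc/modules; the optional argument
# exists only so the function can be tried on a saved copy.
module_loaded() {
    name=$1
    file=${2:-/proc/modules}
    # The first field of each line is the module name; match it exactly
    # so that e.g. "binfmt" does not match "binfmt_misc".
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$file"
}
```

Usage would be "module_loaded binfmt_misc && echo loaded || echo missing".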

> What do you see in dmesg?

Comparing a good one with a bad one, I see
 [  196.343851] perf interrupt took too long (2519 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
 [ 1402.652220] perf interrupt took too long (5005 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
in the bad, whereas the good has
 [   41.997465] systemd-journald[161]: File /var/log/journal/9...0/system.journal corrupted or uncleanly shut down, renaming and replacing.
 [  104.143727] systemd-journald[161]: File /var/log/journal/9...0/user-1000.journal corrupted or uncleanly shut down, renaming and replacing.
as it cleans up after the hard reset. The chip is hotter in the good
boot (75°C rather than 44°C), which might tie in with how often a first
boot fails whereas the second succeeds (though not invariably).

> Is /proc mounted? Is /proc/sys/fs/binfmt_misc mounted? What is in
> /proc/sys/fs/binfmt_misc/status?

Well, here's a consequence of the missing module, I guess.
$ ls -lR /proc/sys/fs
/proc/sys/fs:                             (ssh session freezes)
Connection to west closed by remote host. (ssh killed locally)
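
A way to answer the "is it mounted" question without walking into
/proc/sys/fs might be to read /proc/mounts instead. This is a sketch
under assumptions: mount_types is a made-up name, and I am only
guessing that the freeze comes from touching the binfmt_misc mount
point (systemd puts an automount there), not that this is proven.

```shell
# mount_types PATH [MOUNTS_FILE]
# Prints the filesystem type of every mount whose mount point is PATH,
# one per line. MOUNTS_FILE defaults to /proc/mounts.
mount_types() {
    path=$1
    file=${2:-/proc/mounts}
    # Fields are: device, mount point, filesystem type, options, ...
    awk -v p="$path" '$2 == p { print $3 }' "$file"
}
```

Running "mount_types /proc/sys/fs/binfmt_misc" would then list whatever
is mounted there (an autofs entry, a binfmt_misc entry, both, or
nothing) without listing the directory itself.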

> There's also the possibility that you've got corruption of some sort or
> another on this machine. What does ls -l /var/lib/binfmts/; look like?

$ ls -lR /var/lib/binfmts
/var/lib/binfmts:
total 16
-rw-r--r-- 1 root root 47 Oct  3  2014 jar
-rw-r--r-- 1 root root 58 Mar  6 17:28 python2.7
-rw-r--r-- 1 root root 58 Oct 23  2014 python3.4
-rw-r--r-- 1 root root 51 Oct 31 10:42 sbcl
$

> Checking the output of debsums -s; or similar will help to see if
> there's something specific which has been corrupted.

Script started on Tue 28 Apr 2015 14:47:23 CDT
# debsums -l
# debsums -ca
/etc/issue
/etc/default/cups
/etc/exim4/passwd.client
/etc/kbd/config
# date
Tue 28 Apr 15:29:22 CDT 2015
# exit

The slowness of logging in may be because systemd gets into a loop. The line
 systemd[1]: Looping too fast. Throttling execution a little.
appears countless times in the logs.
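
To put a number on "countless", the messages could be counted per boot.
A sketch only; count_throttle_messages is a name invented here, and it
simply counts lines containing the message text in whatever it is fed.

```shell
# count_throttle_messages [FILE...]
# Prints how many input lines contain systemd's throttling message.
# Reads the named files, or standard input if none are given.
count_throttle_messages() {
    # "n + 0" makes awk print 0 rather than an empty string
    # when there are no matching lines.
    awk '/Looping too fast/ { n++ } END { print n + 0 }' "$@"
}
```

For the current boot that would be
"journalctl -b | count_throttle_messages".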

Even when it is working normally, the system is generally very much
slower under jessie than under wheezy. It seems to be thrashing the disk
much more. The parallelisation of booting may work against me here. But everything
that involves the disk happens more slowly. (To be fair, wheezy is
on ext4, jessie is ext3.) This is a 1.2GHz dual core, and it's
easily outperformed at booting/starting X by a 1.5GHz Pentium M.

Things I don't understand (apart from the module not loading):

. Why can I log in through ssh but not at the console? Surely having a
  console is a top priority on any system.

. Why can't systemd reboot/shutdown/poweroff? It really can't require
  java, python or common lisp so this issue shouldn't affect it.

Would I be right in filing bugs against both the kernel (for the
module loading) and systemd (for these two points)?

My workaround at the moment is obviously to put binfmt_misc in
/etc/modules.
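
For the record, the workaround can be applied so that running it twice
does no harm. This is a sketch: ensure_module_listed is a made-up name,
it needs root to write to the real /etc/modules, and it only recognises
a line consisting of exactly the module name.

```shell
# ensure_module_listed NAME [FILE]
# Appends NAME to FILE (default /etc/modules) unless a line consisting
# of exactly NAME is already present.
ensure_module_listed() {
    name=$1
    file=${2:-/etc/modules}
    # grep: -q quiet, -x whole-line match, -F literal string (no regex).
    if ! grep -qxF "$name" "$file" 2>/dev/null; then
        printf '%s\n' "$name" >> "$file"
    fi
}
```

As root, "ensure_module_listed binfmt_misc" adds the line once and does
nothing on later runs.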

Thanks for your quick help, and apologies for the time taken to
investigate and reply.

Cheers,
David.

