[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#425397: some debug info



I can reproduce this bug pretty readily, but unfortunately 
not with the strace patch for pkgsel.postinst - tis a heisenbug.

Here's my notes so far - let me know if you have ideas on
how I can make more progress.

The system is hung in the "Select and install software" menu
18% complete, message says "Retrieving file 66 of 78".

ps -ef output shows:
  root@dl380g5:/proc# ps -ef | cat
  UID        PID  PPID  C STIME TTY          TIME CMD
  root         1     0  0 10:40 ?        00:00:01 busybox init
  root         2     1  0 10:40 ?        00:00:00 [ksoftirqd/0]
  root         3     1  0 10:40 ?        00:00:00 [events/0]
  root         4     1  0 10:40 ?        00:00:00 [khelper]
  root         5     1  0 10:40 ?        00:00:00 [kthread]
  root         8     5  0 10:40 ?        00:00:00 [kblockd/0]
  root         9     5  0 10:40 ?        00:00:00 [kacpid]
  root       125     5  0 10:40 ?        00:00:00 [kseriod]
  root       169     5  0 10:40 ?        00:00:00 [kswapd0]
  root       170     5  0 10:40 ?        00:00:00 [aio/0]
  root       448     1  0 10:40 ?        00:00:00 udevd --daemon
  root       664     5  0 10:40 ?        00:00:00 [khubd]
  root       900     1  0 10:40 ?        00:00:00 /sbin/syslogd -m 0 -O /var/log/syslog -S
  root       902     1  0 10:40 ?        00:00:00 /sbin/klogd -c 2
  root      1252     1  0 10:40 ttyS1    00:00:00 /bin/sh /sbin/debian-installer
  root      1264     1  0 10:40 tty4     00:00:00 /usr/bin/tail -f /var/log/syslog
  root      1265     1  0 10:40 tty2     00:00:00 busybox init
  root      1266     1  0 10:40 tty3     00:00:00 busybox init
  root      1281  1252  0 10:40 ttyS1    00:00:10 debconf -o d-i /usr/bin/main-menu
  root      1282  1281  0 10:40 ttyS1    00:00:00 /usr/bin/main-menu
  root      3048     1  0 10:41 ?        00:00:00 dhclient -e eth0
  root      8687     5  0 10:42 ?        00:00:00 [jfsIO]
  root      8689     5  0 10:42 ?        00:00:00 [jfsCommit]
  root      8690     5  0 10:42 ?        00:00:00 [jfsSync]
  root      8817     5  0 10:42 ?        00:00:00 [xfslogd/0]
  root      8819     5  0 10:42 ?        00:00:00 [xfsdatad/0]
  root     14533     1  0 10:43 ?        00:00:00 /usr/sbin/sshd
  root     14547 14533  0 10:43 ?        00:00:00 sshd: installer@pts/0
  root     14550 14547  0 10:43 ?        00:00:00 /bin/sh /sbin/debian-installer /bin/network-console-menu
  root     14572 14550  0 10:43 ?        00:00:00 debconf -o d-i /bin/network-console-menu
  root     14573 14572  0 10:43 ?        00:00:00 main-menu
  root     17197  1282  0 10:43 ttyS1    00:00:00 udpkg --configure --force-configure di-utils-shell
  root     17198 17197  0 10:43 ttyS1    00:00:00 /bin/sh -e /var/lib/dpkg/info/di-utils-shell.postinst configure
  root     17201 17198  0 10:43 ttyS1    00:00:00 /bin/sh -e /bin/debconf-disconnect /bin/sh
  root     17202 17201  0 10:43 ttyS1    00:00:00 /bin/sh
  root     24715     5  0 10:44 ?        00:00:00 [pdflush]
  root     24716     5  0 10:44 ?        00:00:00 [pdflush]
  root     26497     5  0 10:44 ?        00:00:00 [kjournald]
  root      6179 14573  0 10:50 ?        00:00:00 udpkg --configure --force-configure pkgsel
  root      6180  6179  0 10:50 ?        00:00:00 /bin/sh /var/lib/dpkg/info/pkgsel.postinst configure
  root      6371  6180  0 10:50 ?        00:00:00 /bin/sh /bin/in-target sh -c DEBCONF_APT_PROGRESS_DB_REPLACE="$DEBCONF_DB_REPLACE";?DEBCONF_APT_PROGRESS_DB_OVERRIDE="$DEBCONF_DB_OVERRIDE";?export DEBCONF_APT_PROGRESS_DB_REPLACE DEBCONF_APT_PROGRESS_DB_OVERRIDE;?DEBCONF_DB_REPLACE=configdb;?DEBCONF_DB_OVERRIDE='Pipe{infd:none outfd:none}';?export DEBCONF_DB_REPLACE DEBCONF_DB_OVERRIDE; tasksel --new-install --debconf-apt-progress='--from 5 --to 95 --logstderr'
  root      6417  6371  0 10:50 ?        00:00:00 log-output -t in-target chroot /target sh -c DEBCONF_APT_PROGRESS_DB_REPLACE="$DEBCONF_DB_REPLACE";?DEBCONF_APT_PROGRESS_DB_OVERRIDE="$DEBCONF_DB_OVERRIDE";?export DEBCONF_APT_PROGRESS_DB_REPLACE DEBCONF_APT_PROGRESS_DB_OVERRIDE;?DEBCONF_DB_REPLACE=configdb;?DEBCONF_DB_OVERRIDE='Pipe{infd:none outfd:none}';?export DEBCONF_DB_REPLACE DEBCONF_DB_OVERRIDE; tasksel --new-install --debconf-apt-progress='--from 5 --to 95 --logstderr'
  root      6418  6417  0 10:50 ?        00:00:00 sh -c DEBCONF_APT_PROGRESS_DB_REPLACE="$DEBCONF_DB_REPLACE";?DEBCONF_APT_PROGRESS_DB_OVERRIDE="$DEBCONF_DB_OVERRIDE";?export DEBCONF_APT_PROGRESS_DB_REPLACE DEBCONF_APT_PROGRESS_DB_OVERRIDE;?DEBCONF_DB_REPLACE=configdb;?DEBCONF_DB_OVERRIDE='Pipe{infd:none outfd:none}';?export DEBCONF_DB_REPLACE DEBCONF_DB_OVERRIDE; tasksel --new-install --debconf-apt-progress='--from 5 --to 95 --logstderr'
  root      6419  6418  0 10:50 ?        00:00:00 /usr/bin/perl /usr/bin/tasksel --new-install --debconf-apt-progress=--from 5 --to 95 --logstderr
  root      6961  6419  0 10:52 ?        00:00:00 /usr/bin/perl -w /usr/share/debconf/frontend /usr/bin/debconf-apt-progress --from 5 --to 95 --logstderr -- aptitude -q --without-recommends -y install ~pstandard ~prequired ~pimportant
  root      6963  6961  0 10:52 ?        00:00:00 /usr/bin/perl -w /usr/bin/debconf-apt-progress --from 5 --to 95 --logstderr -- aptitude -q --without-recommends -y install ~pstandard ~prequired ~pimportant
  root      6964  6963  0 10:52 ?        00:00:00 aptitude -o APT::Status-Fd=4 -o APT::Keep-Fds::=5 -o APT::Keep-Fds::=6 -q --without-recommends -y install ~pstandard ~prequired ~pimportant
  root      6969  6964  0 10:52 ?        00:00:00 /bin/sh -c /usr/sbin/dpkg-preconfigure --apt || true
  root      6970  6969  0 10:52 ?        00:00:00 /usr/bin/perl -w /usr/sbin/dpkg-preconfigure --apt
  root      6975 17202  0 10:52 ttyS1    00:00:00 bash
  root      7048  6975  0 11:07 ttyS1    00:00:00 ps -ef
  root      7049  6975  0 11:07 ttyS1    00:00:00 cat

So let's take a look at 6970.

  root@dl380g5:/proc# strace -p 6970
  Process 6970 attached - interrupt to quit
  read(5, 

It's stuck reading fd 5

  root@dl380g5:/proc# ls -l /proc/6970/fd/5
  lr-x------ 1 root root 64 Jun  5 10:54 /proc/6970/fd/5 -> pipe:[51837]

Which is a pipe. Let's assume 51837 is a unique identifier for this pipe.

  root@dl380g5:/proc# ls -l */fd/* | grep 51837       
  ls: 7057/fd/255: No such file or directory
  ls: 7057/fd/7: No such file or directory
  ls: self/fd/255: No such file or directory
  ls: self/fd/7: No such file or directory
  l-wx------ 1 root root 64 Jun  5 10:56 6961/fd/8 -> pipe:[51837]
  lr-x------ 1 root root 64 Jun  5 10:56 6963/fd/0 -> pipe:[51837]
  lr-x------ 1 root root 64 Jun  5 10:56 6964/fd/5 -> pipe:[51837]
  lr-x------ 1 root root 64 Jun  5 10:56 6969/fd/5 -> pipe:[51837]
  lr-x------ 1 root root 64 Jun  5 10:54 6970/fd/5 -> pipe:[51837]

Only one process has that open for writing, 6961, which is debconf-apt-progress.
Where is it stuck?

  root@dl380g5:/proc# strace -p 6961 
  Process 6961 attached - interrupt to quit
  read(9, 

Its stuck reading fd 9, which is...

  root@dl380g5:/proc# ls -l 6961/fd/9
 lr-x------ 1 root root 64 Jun  5 10:56 6961/fd/9 -> pipe:[51838]

Another pipe.

  root@dl380g5:/proc# ls -l */fd/* | grep 51838
  ls: 7063/fd/255: No such file or directory
  ls: 7063/fd/7: No such file or directory
  ls: self/fd/255: No such file or directory
  ls: self/fd/7: No such file or directory
  lr-x------ 1 root root 64 Jun  5 10:56 6961/fd/9 -> pipe:[51838]
  l-wx------ 1 root root 64 Jun  5 10:56 6963/fd/1 -> pipe:[51838]
  l-wx------ 1 root root 64 Jun  5 10:56 6964/fd/6 -> pipe:[51838]
  l-wx------ 1 root root 64 Jun  5 10:56 6969/fd/6 -> pipe:[51838]
  l-wx------ 1 root root 64 Jun  5 10:54 6970/fd/6 -> pipe:[51838]

hrm.. is this a circular dependency? Let's see if we can break a deadlock:

  root@dl380g5:/proc# echo > 6961/fd/8

Yep, that does it - the installation fails and causes me to repeat
the "Select and install software" step again - although this time, it succeeds.

Checking syslog, I see:

  Jun  5 16:52:19 in-target: Get:78 http://debian-mirror.fc.hp.com etch/main whois 4.7.20 [47.7kB]
  Jun  5 17:15:47 in-target: Use of uninitialized value in string eq at /usr/share/perl5/Debconf/FrontEnd/Passthrough.pm line 109, <GEN0> line 1.

So maybe my echo caused that error?

The other interesting thing about my environment is that I'm behind a sucky
proxy - there are a number of entries like this scattered throughout syslog:

  Jun  5 16:52:17 in-target: Err http://debian-mirror.fc.hp.com etch/main mime-support 3.39-1
  Jun  5 16:52:17 in-target:   Error reading from server - read (104 Connection reset by peer)

-- 
dann frazier




Reply to: