
Re: Bug#773061: aptitude-robot: Hangs with dpkg zombies under some (not yet 100% clear) circumstances



Control: tag -1 + patch

Hi,

TL;DR for those getting this mail via deity@lists.debian.org: Since
apt 1.0.9.4 aptitude-robot-session hangs at (simplified) "yes '' |
aptitude full-upgrade -y -q" with a dpkg zombie process as child. For
more details please have a look at the bug's history at
https://bugs.debian.org/773061

Axel Beckert wrote:
> > So I assume that the following patch will fix the issue:
> 
> Despite it may not look so, that patch is effectively just a revert of
> 169ee18d77a6a80248bdbd1d95cf626638219cb5 and hence only replaces one
> issue with another, because
> 
> → echo bar > /tmp/bar.txt
> → bash -c 'echo foo | cat -v < /tmp/bar.txt'
> bar
> → dash -c 'echo foo | cat -v < /tmp/bar.txt'
> bar
> 
> i.e. the initial "yes '' |" in the patch would be ignored.
> 
> Need to dig deeper...

Digging deeper revealed two things:

1) The above simplification is too simple. It does not take into
   account that the tool in the middle may fiddle around with file
   descriptors and pseudo ttys besides plain STDIN/STDOUT handling as
   "cat" does.

2) I now know that the trigger for these issues wasn't the dpkg
   1.17.13 → 1.17.21 update, but the apt 1.0.9.3 → 1.0.9.4 update.
   Downgrading apt to 1.0.9.3 makes the issue vanish.

   I suspect that either the fix for https://bugs.debian.org/767774 or
   the one for https://bugs.debian.org/765687 is the actual culprit.
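
Regarding 1), here is a minimal sketch for checking what a command in
the middle of such a pipeline actually gets as STDIN. stdin_kind is a
hypothetical helper made up for this mail, not part of apt, aptitude
or dpkg, and it relies on /dev/stdin as provided on Linux. It only
looks at file descriptor 0, so it says nothing about pseudo ttys a
tool may open on its own.

```shell
#!/bin/sh
# stdin_kind: hypothetical helper, reports what file descriptor 0
# (STDIN) of the current command is connected to.
stdin_kind() {
    if [ -t 0 ]; then
        echo tty            # interactive terminal or pseudo tty
    elif [ -p /dev/stdin ]; then
        echo pipe           # e.g. fed by "yes '' |"
    elif [ -f /dev/stdin ]; then
        echo file           # redirected from a regular file
    else
        echo other          # socket, /dev/null, ...
    fi
}

demo_file=$(mktemp)
echo bar > "$demo_file"

# STDIN is the pipe coming from echo.
echo foo | stdin_kind

# The explicit redirection replaces the pipe, which is the same
# effect as in the "cat -v < /tmp/bar.txt" example quoted above.
echo foo | stdin_kind < "$demo_file"

rm -f "$demo_file"
```

On a Linux system I'd expect this to report "pipe" for the first call
and "file" for the second one, matching the cat experiment.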

I now wonder if I should apply the initially proposed patch to
aptitude-robot or if this is something which needs to be fixed in apt
anyway. In the latter case, I'd prefer to not touch aptitude-robot and
wait for the apt fix.

About the first date of occurrence:

1.0.9.4 was uploaded on 3rd of December 2014 to Unstable and hit
Jessie only on 9th of December. My initial statement of having
observed the issue since the 4th of December was based upon a single
"yes" process with no corresponding aptitude process anymore. I now
suspect that this was a wrong guess.

I've now checked our monitoring logs: The first times
aptitude-robot-session was no longer able to install updates on Jessie
were:

* Wed Dec 10 06:39:53 2014
* Wed Dec 10 06:38:11 2014
* Wed Dec 10 06:36:40 2014

(On one of the four machines there were other issues which masked the
check for pending updates.)

After apt 1.0.9.4 propagated to testing, it took one
aptitude-robot-session run to install the apt update, so it was only
the second aptitude-robot-session run after the propagation which
actually hung. That fits quite well with apt having migrated on the
9th of December:
https://packages.qa.debian.org/a/apt/news/20141209T163914Z.html

		Regards, Axel
-- 
 ,''`.  |  Axel Beckert <abe@debian.org>, http://people.debian.org/~abe/
: :' :  |  Debian Developer, ftp.ch.debian.org Admin
`. `'   |  1024D: F067 EA27 26B9 C3FC 1486  202E C09E 1D89 9593 0EDE
  `-    |  4096R: 2517 B724 C5F6 CA99 5329  6E61 2FF9 CD59 6126 16B5

