[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#844494: reopen: Should close stderr when becoming multiplex master



Package: openssh-client
Version: 1:7.3p1-3+b1
Severity: normal

Dear Maintainer,

I think that the bug
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=714526 is still
live, after upgrading to 1:7.3p1-3+b1 (should have been fixed still
1:7.2p2-6).

I straced SSH with and without `ControlMaster` and extracted revelant
bits of information, redacting it, and removing useless lines. First,
without `ControlMaster`:

```
# Here, Python 3 Popen prepares pipes for stdin, stdout, and stderr:
7880  1479228055.059186 pipe2([8, 9], O_CLOEXEC) = 0
7880  1479228055.059226 pipe2([10, 11], O_CLOEXEC) = 0
7880  1479228055.059258 pipe2([12, 13], O_CLOEXEC) = 0
# Then ssh is cloned:
7880  1479228055.060158 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f7bfd7fa9d0) = 7949
7949  1479228055.121259 <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fd3f2a63650) = 7954
7954  1479228055.121135 execve("/bin/sh", ["/bin/sh", "-c", "exec ssh user@gate.redacted -W dev2.redacted:22"], [/* 16 vars */] <unfinished ...>
# Normal exit
7949  1479228055.592488 +++ exited with 0 +++
7871  1479228055.592519 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=7949, si_uid=1003, si_status=0, si_utime=0, si_stime=2} ---
7954  1479228055.592786 exit_group(129) = ?
# Python receive EOF on stdout AND stderr
7880  1479228055.694455 read(12,  <unfinished ...>
7880  1479228055.694480 <... read resumed> "", 32768) = 0
7880  1479228055.738380 read(10,  <unfinished ...>
7880  1479228055.738410 <... read resumed> "", 32768) = 0
# So, as the process is exited, AND stdout/stderr had their EOF, Python 3 subprocess.Popen.communicate allows itself to wait for it:
7880  1479228055.924898 wait4(7949,  <unfinished ...>
7880  1479228055.925334 <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 7949
```

Now, with `ControlMaster`:

```
# Same pipes for stdin, stdout, and stderr from Python Popen:
6749  1479227922.771783 pipe2([8, 9], O_CLOEXEC) = 0
6749  1479227922.771831 pipe2([10, 11], O_CLOEXEC) = 0
6749  1479227922.771865 pipe2([12, 13], O_CLOEXEC) = 0
# A bunch of clones, SSH starting its master process:
6749  1479227922.774839 <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fa1957fa9d0) = 6824
6824  1479227922.818219 <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7ff4c6f11650) = 6827
6827  1479227922.820940 execve("/bin/sh", ["/bin/sh", "-c", "exec ssh user@redacted -W redacted:22"], [/* 16 vars */] <unfinished ...>
6824  1479227923.204918 <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7ff4c6f11650) = 6856
6856  1479227923.205441 <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7ff4c6f11650) = 6857
6857  1479227923.206004 <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7ff4c6f11650) = 6858
# Normal exit after work is done:
6824  1479227923.335999 +++ exited with 0 +++
6749  1479227923.336042 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=6824, si_uid=1003, si_status=0, si_utime=0, si_stime=0} ---
# Python is getting EOF on STDOUT
6749  1479227923.336086 read(10, "", 32768) = 0
# But here, process is a zombie (dead but not awaited), after like a minute,
# I send a SIGINT and a SIGQUIT to the Python process, so everyone dies and finally:
6749  1479227986.358448 read(12, "", 32768) = 0
6827  1479227986.328027 exit_group(130) = ?
6858  1479227986.329254 exit_group(255) = ?
6749  1479227986.578489 wait4(6824,  <unfinished ...>
6749  1479227986.578546 <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 6824
```

So here it looks like the bug is exactly the same as before: stdout is
closed but stderr is kept open. As Python's
``subprocess.Popen.communicate`` want to wait for both to be closed before
waiting for the process, it yields to stuck Python process and zombies.

-- System Information:
Debian Release: 8.6
  APT prefers stable
  APT policy: (900, 'stable'), (200, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 3.16.0-4-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8) (ignored: LC_ALL set to en_US.utf8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages openssh-client depends on:
ii  adduser           3.113+nmu3
ii  dpkg              1.17.27
ii  libc6             2.24-5
ii  libedit2          3.1-20140620-2
ii  libgssapi-krb5-2  1.14.3+dfsg-2
ii  libselinux1       2.3-2
ii  libssl1.0.2       1.0.2j-1
ii  passwd            1:4.2-3+deb8u1
ii  zlib1g            1:1.2.8.dfsg-2+b1

Versions of packages openssh-client recommends:
ii  xauth  1:1.0.9-1

Versions of packages openssh-client suggests:
pn  keychain      <none>
pn  libpam-ssh    <none>
pn  monkeysphere  <none>
pn  ssh-askpass   <none>

-- no debconf information


Reply to: