Bug#1109742: upgrade-reports: No new SSH connections possible during large part of upgrade to Debian Trixie
Hi,
On Thu, Jul 24, 2025 at 03:53:05PM +0100, Colin Watson wrote:
> Control: affects -1 openssh-server
> 
> [TL;DR: I think it may not be possible to properly solve this without a
> bookworm update as well as a change to trixie.]
> 
> On Thu, Jul 24, 2025 at 01:19:40PM +0100, Colin Watson wrote:
> > On Tue, Jul 22, 2025 at 07:42:07PM +0200, Manfred Stock wrote:
> > > Further Comments/Problems: I've upgraded several Bookworm systems to
> > > Trixie so far, which went pretty smooth. But there's one thing I keep
> > > noticing, and which I observed a bit more closely while upgrading the
> > > system I'm sending this report from: Starting at roughly the time when
> > > dpkg says something like
> > > 
> > > Unpacking openssh-server (1:10.0p1-5) over (1:9.2p1-2+deb12u6) ...
> > > 
> > > I'm not able anymore to open new SSH connections to the system I'm
> > > upgrading. The SSH daemon is still running, and the existing connections
> > > also still work, but new connections fail with
> > > 
> > > kex_exchange_identification: read: Connection reset by peer
> > > Connection reset by fd... port 22
> > > 
> > > on the client.
> [...]
> > Thanks for the report.  This will be due to the split of sshd-session
> > from the main sshd binary; the old sshd re-executed itself with
> > different arguments, but the new sshd executes sshd-session instead and
> > has removed support for the parameters that it used to rely on during
> > re-execution.
> > 
> > I'll have to set up a suitable environment to test this, but my best
> > idea for now is to have openssh-server.preinst take a copy of the old
> > sshd binary before dpkg unpacks the new files, and patch sshd to re-exec
> > that copy if it exists and it receives the -R option.  The postinst can
> > then remove the copy after it's restarted the new sshd.
> 
> This approach failed in my first test.  To control the order of operations,
> I just ran "dpkg --unpack" on the new .deb, and then the new /usr/sbin/sshd
> failed before it got as far as re-execing the temporary copy because it
> needed a newer libc.  apt might not do that in normal situations, but we
> clearly want to avoid this.
> 
> My next approach was to try a temporary diversion of /usr/sbin/sshd.  This
> works, although it involves a slightly odd invocation for openssh-server to
> be able to divert one of its own files.  See the "openssh_10.0p1-6.debdiff"
> attachment.  Would the release team accept something like this for trixie?
> 
> 
> However, this isn't the whole story.  Once the new libssl3t64 is unpacked,
> new connections fail with "OpenSSL version mismatch. Built against 30000100,
> you have 30500010".  This part of the problem can't be fixed by a change in
> trixie, because the problem is that the _old_ sshd, before restarting, fails
> to tolerate newer minor versions of OpenSSL.  This was fixed upstream in
> OpenSSH 9.4, and if I'd noticed previously that this would be an upgrade
> problem I'd already have included it in a bookworm update.
> 
> So, I think we also need to fix that in bookworm.  See the
> "openssh_9.2p1-2+deb12u7.debdiff" attachment (for brevity I've pruned some
> noise from git-dpm that just updates some commit IDs in patches).
> 
> Timing-wise, this is tricky.  IMO we really need to get this out before
> trixie releases to minimize the chance of users running into this if they
> rush to upgrade.  Would the security team be willing to consider pushing
> this out via -security?  Failing that, we'd have to wait until the next
> point release of bookworm, which I think would be unfortunate given that the
> consequences of sshd being broken between unpack and configure can include a
> failed remote upgrade with no way to access the system (if you forget to
> maintain a separate ssh connection, or if your network connection is
> interrupted).
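To illustrate the OpenSSL mismatch quoted above, here is a minimal
sketch. It is not the actual OpenSSH code. It assumes the two numbers
in the error message are hexadecimal OpenSSL version numbers in the
OpenSSL 3 layout 0xMNN00PP0L (major, minor, patch). The function names
and both policies are hypothetical: "strict" stands in for the old
behaviour of rejecting a different minor version, and "tolerant" for a
check that accepts a newer minor version within the same major.

```python
def decode(version):
    """Split an OpenSSL 3 style version number into (major, minor, patch)."""
    major = (version >> 28) & 0xF
    minor = (version >> 20) & 0xFF
    patch = (version >> 4) & 0xFF
    return major, minor, patch


def strict_compatible(built, runtime):
    """Hypothetical strict policy: same major and minor, patch not older."""
    bmaj, bmin, bpatch = decode(built)
    rmaj, rmin, rpatch = decode(runtime)
    return (bmaj, bmin) == (rmaj, rmin) and rpatch >= bpatch


def tolerant_compatible(built, runtime):
    """Hypothetical tolerant policy: same major, runtime not older overall."""
    bmaj, bmin, bpatch = decode(built)
    rmaj, rmin, rpatch = decode(runtime)
    return bmaj == rmaj and (rmin, rpatch) >= (bmin, bpatch)


# Values taken from the error message, read as hexadecimal.
built = 0x30000100
runtime = 0x30500010

print(decode(built), decode(runtime))       # (3, 0, 16) (3, 5, 1)
print(strict_compatible(built, runtime))    # False
print(tolerant_compatible(built, runtime))  # True
```

Under these assumptions the old sshd sees 3.0.16 at build time and
3.5.1 at run time, so a strict check refuses new connections as soon as
the new libssl3t64 is unpacked, while a tolerant check would keep
accepting them.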
IMHO, as this is not a security update, releasing it via a DSA would be
wrong. The correct target would be the next bookworm point release,
with the updates released earlier via an SUA for the reasons given
above (the release team obviously has to agree with that suggestion).
IMHO stable-updates would be a perfect fit for this use case, correct?
Regards,
Salvatore