Bug#458154: Processed: Re: Bug#458154: network-console: long time-out time during install
On Mon, Jan 07, 2008 at 12:12:21AM +0100, Frans Pop wrote:
> However, the fact remains that _we_ have so far not been able to reproduce
> the issue. As I've said earlier, I've had an SSH install sitting unused for
> over 4 hours without the connection being lost, with basically default SSH
> settings both on the SSH client machine and in the installer.
> So the question still is _why_ ssh drops the connection in your case.
It's common enough for the network to be at fault here (depending on
your preferred definition of "fault"); for example, entries in NAT
tables can time out, which will cause the connection to die when you
next come back to it and try to send packets. Setting
ServerAliveInterval on the client side, as Del did, is probably the best
workaround.
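For reference, a minimal sketch of what that client-side setting might look
like in ~/.ssh/config. The host alias "d-i-target" and the values 60 and 3
are illustrative assumptions, not what Del used; ServerAliveInterval and
ServerAliveCountMax are standard ssh_config options.

```
# ~/.ssh/config on the machine you are connecting from
# "d-i-target" is a hypothetical alias for the machine being installed
Host d-i-target
    # Send a keepalive probe through the encrypted channel after 60s of silence
    ServerAliveInterval 60
    # Give up after 3 unanswered probes (roughly 3 minutes in total)
    ServerAliveCountMax 3
```

The same thing can be passed for a single session on the command line with
"-o ServerAliveInterval=60", which avoids changing the default for every
other connection.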
ServerAliveInterval is not enabled by default because it has negative
consequences for people with connections that are unreliable in a
different way. Bob Proulx recently put it like this on the
openssh-unix-dev mailing list:
    One of the issues with setting a "keepalive" diddle is that it is also
    a "makedead" diddle. If the connection is not online at that moment
    then the diddle packet will cause the connection failure to be noticed
    and will make it die. This causes many people to not refer to this so
    much as a keepalive but as a makedead. It makes the connection dead.
    Note that BatchMode sets keepalives automatically.
    Many people who now have connections that stay alive okay without a
    diddle packet would, if it were globally enabled, find that their
    connections die because the network connection timed out at times that
    they did not care about using it. The diddle would make their
    connections dead. Without the diddle then the connection only dies if
    it is offline when real data is needed to be transferred. It will
    survive brief periods of the network being offline when nothing is
    happening. It only has problems if there are real problems. With a
    forced keepalive diddle packet sent periodically it may die due to
    synthesized data. This may happen at times when nothing would have
    been active without the keepalive setting and the connection would
    have survived it okay.
    There are two valid sides to this problem. There is no clear solution
    that solves both problems at the same time. Neither is clearly right
    with the other clearly wrong. This is what makes it a religious war
    between the two opposing viewpoints. There is no single right answer.
    It is a value judgement as to which one is more important or more
    common than the other one. In these situations the status quo is
    often the path of least resistance because it thrashes the least
    number of people.
> Also, the solution you propose is on the _client_ side, so is not something
> we can fix in the installer. The only thing we could do at this point is
> document it.
Setting ClientAliveInterval in the installer's sshd configuration would
have a similar effect, but suffers from the same trade-off mentioned
above. We'd simply get a different set of bugs of approximately the same
severity from a different set of people.
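To make the comparison concrete, the server-side equivalent would be
something along these lines in the sshd configuration. This is only a sketch
of the option under discussion, with assumed values, and not a proposed
change to the installer.

```
# sshd_config fragment (values are illustrative assumptions)
# Probe the client through the encrypted channel after 60s of silence
ClientAliveInterval 60
# Disconnect after 3 unanswered probes
ClientAliveCountMax 3
```

With these values sshd would drop a client that has been unreachable for
about three minutes, which is exactly the "makedead" behaviour described
above for anyone whose link is only briefly down.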
I agree that documenting this is the best approach.
Colin Watson [email@example.com]