
Re: Idle TCP connections freeze



Hello,

Nikolaus Rath wrote:
> 
> I'm having trouble with an internet connection that seems to randomly
> "freeze" arbitrary tcp connections when they have not been used for a
> while. The connections stay established, but no data is coming through.

How long is "a while", at a minimum?

> When this happens, netstat still shows the connection status as
> `ESTABLISHED` on both the local computer:
> 
>     Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name Timer
>     tcp        0     53 192.168.0.10:41129      173.255.235.238:143     ESTABLISHED 8219/gnutls-cli  on (79.31/13/0)
> 
> ..and the remote server:
> 
>     Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name Timer
>     tcp        0      0 173.255.235.238:143     68.5.174.98:41129       ESTABLISHED 5303/imapd       off (0.00/0/0)

It appears that the client has a private address and the server has a
public address (the server sees 68.5.174.98 as its peer, not
192.168.0.10). So I guess that there is a NAT device between them, and
its stateful NAT engine may be the cause of the problem, by deleting
connections from its translation table after a period of inactivity.

> When I look at a packet capture of this connection on the client side,
> there is a long (expected) period of inactivity that seems to trigger
> the problem, then the local end tries to transmit some data again but
> never receives an ACK. Instead, 15 TCP Retransmissions go out, with
> intervals increasing from 0.3 seconds to 120 seconds. No activity is
> captured after that.

Can you do a packet capture on the server side as well?

> Does anyone have a suggestion of how I could debug this further to find
> out where the problem lies and how to fix it?
> 
> Also, is there some way to globally reduce the timeout on client
> and/or server to reduce the time before the local application aborts?

The Linux kernel supports TCP keepalive, with system-wide timer
settings. However, the application must enable it on a per-socket
basis, and the idle time before the first probe defaults to 2 hours in
Linux (the minimum recommended value), which is quite high: the
inactivity timeout in your NAT device may be shorter. The best
workaround for this is to generate traffic with some kind of
application-level keepalive, either defined in the application
protocol, as in SSH, or by periodically sending dummy commands or data.
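
For an application you control, per-socket keepalive could be enabled
roughly as in the sketch below. It is only an illustration: the helper
name enable_keepalive and the timer values (300 s idle, 60 s between
probes, 5 probes) are assumptions of mine, not recommendations, and
would have to be chosen below the NAT device's actual timeout. The
TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT options are Linux-specific,
so the code checks for them before use.

```python
import socket


def enable_keepalive(sock, idle=300, interval=60, count=5):
    """Enable TCP keepalive on one socket (hypothetical helper).

    idle, interval and count are illustrative values, not recommendations.
    """
    # Keepalive is off by default; it has to be enabled per socket.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Linux-specific per-socket overrides of the system-wide timers.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of inactivity before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
        # Seconds between successive unanswered probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
        # Number of unanswered probes before the connection is dropped.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    return sock


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("SO_KEEPALIVE enabled:",
          s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

This only helps with programs that let you set socket options (or that
you can modify), which is why an application-level keepalive is usually
the more practical workaround.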

