
Bug#590935: NFS client cannot access a share when the TCP connection status is TIME_WAIT



Package: nfs-common
Version: 1:1.1.2-6lenny2

After 5 minutes of inactivity on an NFS share, the TCP connection between the client port (779 in the transcript below) and the NFS server port (2049) switches from "ESTABLISHED" to "TIME_WAIT", which is normal. With the default TIME_WAIT timeout, the connection then remains in this state for one minute (60 seconds, which is twice the MSL). If the same NFS share is accessed again during this minute, an Input/output error is returned. Once the minute has elapsed, the connection is re-established normally with the same client port number (779 in the transcript below). Below is a transcript:

########BEGINNING OF TRANSCRIPT ###############
# netstat -na | grep 10.0.0.1 ; date
Fri Jul 30 10:22:18 CEST 2010
# mount -t nfs
10.0.0.1:/export/test on /share/test type nfs (rw,intr,rsize=8192,wsize=8192,addr=10.0.0.1)
# netstat -na | grep 10.0.0.1
# date; time ls /share/test/ ; date
Fri Jul 30 10:22:55 CEST 2010
testfile

real    0m0.003s
user    0m0.000s
sys     0m0.000s
Fri Jul 30 10:22:55 CEST 2010
# netstat -na | grep 10.0.0.1 ; date
tcp        0      0 10.0.0.2:779       10.0.0.1:2049      ESTABLISHED
Fri Jul 30 10:23:58 CEST 2010
# netstat -na | grep 10.0.0.1 ; date
tcp        0      0 10.0.0.2:779       10.0.0.1:2049      TIME_WAIT  
Fri Jul 30 10:28:08 CEST 2010
# date; time ls /share/test/ ; date
Fri Jul 30 10:28:16 CEST 2010
ls: cannot access /share/test/: Input/output error

real    0m0.186s
user    0m0.000s
sys     0m0.056s
Fri Jul 30 10:28:16 CEST 2010
# netstat -na | grep 10.0.0.1 ; date
tcp        0      0 10.0.0.2:779       10.0.0.1:2049      TIME_WAIT  
Fri Jul 30 10:28:23 CEST 2010
# netstat -na | grep 10.0.0.1 ; date
Fri Jul 30 10:29:15 CEST 2010
# date; time ls /share/test/ ; date
Fri Jul 30 10:29:19 CEST 2010
testfile

real    0m0.003s
user    0m0.000s
sys     0m0.000s
Fri Jul 30 10:29:19 CEST 2010
# netstat -na | grep 10.0.0.1 ; date
tcp        0      0 10.0.0.2:779       10.0.0.1:2049      ESTABLISHED
Fri Jul 30 10:29:22 CEST 2010
# 
############END OF TRANSCRIPT ###############
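
For anyone who wants to automate the check done by hand with netstat in the transcript above, here is a minimal sketch. It assumes the `netstat -na` column layout shown in the transcript (protocol, two queue counters, local address, remote address, state). The function names are illustrative and not part of any NFS tool.

```python
def nfs_connection_states(netstat_output, server_ip, server_port=2049):
    """Return the TCP states of all connections to server_ip:server_port.

    Assumes `netstat -na` rows of the form:
    proto recv-q send-q local-address remote-address state
    """
    remote = f"{server_ip}:{server_port}"
    states = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, UDP rows and anything too short to carry a state.
        if len(fields) >= 6 and fields[0].startswith("tcp") and fields[4] == remote:
            states.append(fields[5])
    return states


def in_time_wait(netstat_output, server_ip, server_port=2049):
    """True if any connection to the NFS server is currently in TIME_WAIT."""
    return "TIME_WAIT" in nfs_connection_states(netstat_output, server_ip, server_port)
```

The input text could come from `subprocess.run(["netstat", "-na"], capture_output=True, text=True).stdout`. Calling `in_time_wait(text, "10.0.0.1")` just before `ls /share/test/` would show whether the access is being attempted inside the TIME_WAIT window.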

I am using Debian Lenny 2.6.26-2-amd64 #1 SMP Sun Jun 20 20:16:30 UTC 2010 x86_64 GNU/Linux.

It should be noted that on other systems/versions (latest updates of Red Hat 5.5, Ubuntu 10.04, Debian Squeeze/Sid), the behavior is slightly different: when the connection is re-established during the "TIME_WAIT minute", another port number (the client port number minus one) is used and the NFS share can be accessed without error.
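
To make the observed difference concrete, here is a small model of the behavior seen on those other systems. It is only an illustration of what I observe from the outside, not the kernel's actual port selection code. The function name and the `lowest` bound are assumptions chosen for the example.

```python
def pick_client_port(last_port, ports_in_time_wait, lowest=512):
    """Illustrative model only: walk downward from last_port and return
    the first port that is not in TIME_WAIT, or None if none is free.

    `lowest` is an assumed lower bound for the search, not a kernel value.
    """
    port = last_port
    while port >= lowest:
        if port not in ports_in_time_wait:
            return port
        port -= 1
    return None
```

With port 779 still in TIME_WAIT, `pick_client_port(779, {779})` yields 778, which matches the "client port number minus one" seen on the other systems. On Lenny the client instead appears to retry with 779 itself, and the access fails.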

Sincerely,
Jean-Francois C. Weber
Linux System Engineer
Phone: +33 1 70 44 04 17
jeanfrancois.weber@sfr.com
6 rue Nieuport
78140 Velizy-Villacoublay, France
www.sfrbusinessteam.fr


