--- Begin Message ---
Package: libxau6
Version: 1:1.0.3-3
Severity: normal
There is a bug in >= 2.6.24 kernels[1], where a stat() in the NFS client
may return -ESTALE on an .Xauthority file that has been atomically
renamed from another host (i.e., .Xauthority now has a different inode;
this happens when sshing to another host, for example). A subsequent
open() on the file without stat()ing it first (e.g., with 'xauth list' on
the client) will succeed, and will update the NFS client's attribute
cache.
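To illustrate why an atomic rename leaves .Xauthority with a different
inode, here is a minimal sketch in C. It is not taken from xauth or
libXau; the helper name replace_atomically() and its parameters are
hypothetical and exist only for this example. It writes a temporary
file and renames it over the target, recording the inode number before
and after.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of xauth or libXau): replace `path` by
 * writing new content to `tmp` and renaming `tmp` over `path`.
 * Stores the inode number of `path` before and after the rename.
 * Returns 0 on success, -1 on error (errno set by the failing call). */
int replace_atomically(const char *path, const char *tmp,
                       ino_t *before, ino_t *after)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    *before = st.st_ino;

    /* The new content goes into a separate file, hence a separate inode. */
    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, "new\n", 4) != 4) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) != 0) {
        unlink(tmp);
        return -1;
    }

    /* rename() swaps the directory entry in one step; the name `path`
     * now refers to the inode that was created for `tmp`. */
    if (rename(tmp, path) != 0) {
        unlink(tmp);
        return -1;
    }

    if (stat(path, &st) != 0)
        return -1;
    *after = st.st_ino;
    return 0;
}
```

A client that cached a file handle for the old inode is then holding a
handle to a file that no longer exists under that name, which is the
situation in which the ESTALE shows up.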
This bug can probably be worked around in the xauthority library (as
well as fixed in the kernel, since some people will be able to upgrade
one and not the other in a production environment): if stat()
returns -ESTALE, the file should be opened (and perhaps read from) and
closed again before retrying the stat().
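The workaround suggested above could look roughly like the following
sketch in C. This is not libXau code; the function name
stat_retry_estale() is hypothetical, and whether opening the file
really refreshes the attribute cache on an affected kernel is taken
from the description above rather than verified here. The trace below
shows access() rather than stat() failing; the same retry pattern
would apply to that call.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not part of libXau): stat() a path and, if the
 * call fails with ESTALE, open and read the file once before retrying.
 * Returns what stat() returns: 0 on success, -1 on error with errno set. */
int stat_retry_estale(const char *path, struct stat *st)
{
    int rc = stat(path, st);
    if (rc == 0 || errno != ESTALE)
        return rc;

    /* Assumption from the report: a plain open() succeeds even though
     * stat() failed, and it updates the NFS client's attribute cache. */
    int fd = open(path, O_RDONLY);
    if (fd >= 0) {
        char byte;
        ssize_t n = read(fd, &byte, 1);  /* "perhaps read from" */
        (void)n;                         /* result is irrelevant here */
        close(fd);
    }

    /* Retry exactly once; a second ESTALE is reported to the caller. */
    return stat(path, st);
}
```

The retry is limited to a single attempt so that a genuinely stale
handle still surfaces as an error instead of looping.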
See the following[2] trace of an xterm on the NFS client, in which
access() on .Xauthority fails with ESTALE.
The bug is rare enough (although it's happened to me 3 times today)
that I haven't yet worked out the exact sequence needed to reproduce it.
I suspect it involves an X operation on the NFS client, followed
quickly (within the NFS cache timeout) by an ssh into a remote host,
followed quickly by opening a new xterm on the client. Although I say
"quickly", it seems that the buggy NFS client may be caching the stale
handle longer than it's meant to.
[1] Debian bug 508866 and Ubuntu bug 269954:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/269954
[2]
open("/proc/meminfo", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4cd0a2a000
read(3, "MemTotal: 3096244 kB\nMemFree"..., 1024) = 774
close(3) = 0
munmap(0x7f4cd0a2a000, 4096) = 0
socket(PF_FILE, SOCK_STREAM, 0) = 3
getsockopt(3, SOL_SOCKET, SO_TYPE, [68719476737], [4]) = 0
connect(3, {sa_family=AF_FILE, path="/tmp/.X11-unix/X0"...}, 110) = 0
getpeername(3, {sa_family=AF_FILE, path="/tmp/.X11-unix/X0"...}, [139964394242068]) = 0
uname({sys="Linux", node="aatpc2", ...}) = 0
access("/home/aatlxz/twc/.Xauthority", R_OK) = -1 ESTALE (Stale NFS file handle)
fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
fcntl(3, F_SETFD, FD_CLOEXEC) = 0
select(4, [3], [3], NULL, NULL) = 1 (out [3])
writev(3, [{"l\0\v\0\0\0\0\0\0\0"..., 10}, {"\0\0"..., 2}], 2) = 12
read(3, 0x198e160, 8) = -1 EAGAIN (Resource temporarily unavailable)
select(4, [3], NULL, NULL, NULL) = 1 (in [3])
read(3, "\0\26\v\0\0\0\6\0"..., 8) = 8
read(3, "No protocol specified\n\0\0"..., 24) = 24
write(2, "No protocol specified\n"..., 22No protocol specified
) = 22
close(3) = 0
open("/usr/lib/X11/XtErrorDB", O_RDONLY) = -1 ENOENT (No such file or directory)
-- System Information:
Debian Release: lenny/sid
APT prefers testing
APT policy: (500, 'testing'), (500, 'stable')
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.26-1-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_AU.UTF-8, LC_CTYPE=en_AU.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Versions of packages libxau6 depends on:
ii libc6 2.7-16 GNU C Library: Shared libraries
libxau6 recommends no packages.
libxau6 suggests no packages.
-- no debconf information
--- End Message ---
--- Begin Message ---
- To: 508867-done@bugs.debian.org
- Subject: Re: Bug#508867: libxau6: ESTALE on nfs mounts
- From: Julien Cristau <jcristau@debian.org>
- Date: Fri, 1 Nov 2024 16:22:41 +0100
- Message-id: <ZyTyQeeI5pRzeQn3@carotte>
- In-reply-to: <20081216041345.18052.82807.reportbug@aatpc2.aao.gov.au>
- References: <20081216041345.18052.82807.reportbug@aatpc2.aao.gov.au>
Hi,
It's been over 15 years, so clearly we're not going to patch this in
debian, even as a stopgap...
If this is still an issue, it should be handled upstream
(https://gitlab.freedesktop.org/xorg/lib/libxau) first, and then filter
through to the distro.
Closing.
Cheers,
Julien
On Tue, Dec 16, 2008 at 15:13:45 +1100, Tim Connors wrote:
> Package: libxau6
> Version: 1:1.0.3-3
> Severity: normal
>
> There is a bug in >= 2.6.24 kernels[1], where a stat() in the nfs client
> may return -ESTALE on an .Xauthority file that has been atomically
> renamed from another host (ie, .Xauthority now has a different inode;
> this happens when sshing to another host for example). A subsequent
> open on the file without stat()ing it first (eg, with 'xauth list' on
> the client) will succeed, and will update the nfs client's attribute
> cache.
>
> This bug can probably be worked around in the xauthority library (as
> well as fixed in the kernel, since some people will be able to upgrade
> one and not the other in a production environment) such that if stat()
> returns -ESTALE, it should be reopened (and perhaps read from) before
> closing and retrying the stat() again.
>
> See the following[2] trace of an xterm on the nfs client.
>
> The bug is rare enough (although it's happened to me 3 times today)
> that I haven't yet seen the exact sequence necessary to recreate it -
> I suspect it involves an X operation on the nfs client, followed
> quickly (within the nfs cache timeout) by an ssh into a remote host,
> followed by quickly opening a new xterm on the client. Although I say
> "quickly", it seems that the buggy nfs client may be caching the stale
> handle longer than it's meant to.
>
> [1] debian bug 508866 and ubuntu bug 269954:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/269954
>
>
> [2]
> open("/proc/meminfo", O_RDONLY) = 3
> fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4cd0a2a000
> read(3, "MemTotal: 3096244 kB\nMemFree"..., 1024) = 774
> close(3) = 0
> munmap(0x7f4cd0a2a000, 4096) = 0
> socket(PF_FILE, SOCK_STREAM, 0) = 3
> getsockopt(3, SOL_SOCKET, SO_TYPE, [68719476737], [4]) = 0
> connect(3, {sa_family=AF_FILE, path="/tmp/.X11-unix/X0"...}, 110) = 0
> getpeername(3, {sa_family=AF_FILE, path="/tmp/.X11-unix/X0"...}, [139964394242068]) = 0
> uname({sys="Linux", node="aatpc2", ...}) = 0
> access("/home/aatlxz/twc/.Xauthority", R_OK) = -1 ESTALE (Stale NFS file handle)
> fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
> fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
> fcntl(3, F_SETFD, FD_CLOEXEC) = 0
> select(4, [3], [3], NULL, NULL) = 1 (out [3])
> writev(3, [{"l\0\v\0\0\0\0\0\0\0"..., 10}, {"\0\0"..., 2}], 2) = 12
> read(3, 0x198e160, 8) = -1 EAGAIN (Resource temporarily unavailable)
> select(4, [3], NULL, NULL, NULL) = 1 (in [3])
> read(3, "\0\26\v\0\0\0\6\0"..., 8) = 8
> read(3, "No protocol specified\n\0\0"..., 24) = 24
> write(2, "No protocol specified\n"..., 22No protocol specified
> ) = 22
> close(3) = 0
> open("/usr/lib/X11/XtErrorDB", O_RDONLY) = -1 ENOENT (No such file or directory)
>
>
> -- System Information:
> Debian Release: lenny/sid
> APT prefers testing
> APT policy: (500, 'testing'), (500, 'stable')
> Architecture: amd64 (x86_64)
>
> Kernel: Linux 2.6.26-1-amd64 (SMP w/2 CPU cores)
> Locale: LANG=en_AU.UTF-8, LC_CTYPE=en_AU.UTF-8 (charmap=UTF-8)
> Shell: /bin/sh linked to /bin/bash
>
> Versions of packages libxau6 depends on:
> ii libc6 2.7-16 GNU C Library: Shared libraries
>
> libxau6 recommends no packages.
>
> libxau6 suggests no packages.
>
> -- no debconf information
>
>
--- End Message ---