
Bug#757474: marked as done (libc6: amd copying a SVCXPRT structure leads to libc's RPC code sending packets of incorrect length)



Your message dated Sat, 4 Sep 2021 22:15:48 +0200
with message-id <YTPT9Pgd5SJpXzG5@aurel32.net>
and subject line Re: Bug#757474: libc6: amd copying a SVCXPRT structure leads to libc's RPC code sending packets of incorrect length
has caused the Debian Bug report #757474,
regarding libc6: amd copying a SVCXPRT structure leads to libc's RPC code sending packets of incorrect length
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
757474: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=757474
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: libc6
Version: 2.13-38+deb7u3
Severity: normal
Tags: upstream patch

This is really a problem with amd (am-utils), not with eglibc. It is hard to solve on amd's side (see the topic "NFS v2 RPC reply on LOOKUP" on the am-utils list), but it can easily be hacked around on eglibc's side.

The symptom is that an amd NFS mount (typically on user login) stalls for 5 or 10 seconds.

The root problem is that amd occasionally copies (the contents of) a SVCXPRT structure to store it away and be able to respond in the background. This is probably illegal, but "used to work" with the traditional SUN RPC implementation.

Now eglibc stores both an iovec and a msghdr structure in a private part of the SVCXPRT, with the embedded msghdr's msg_iov field set to point at the corresponding embedded iovec. When the structure is copied, the msg_iov field of the copy's msghdr still points at the original SVCXPRT's embedded iovec, not at the one embedded in the copy. If the copy is then used to transmit a reply, the desired length is stored in the copy's embedded iovec, but sendmsg() follows msg_iov and so actually uses the iovec embedded in the original SVCXPRT, whose fields have not been set to the desired values.
As a result, sendmsg() transmits a reply of incorrect length and does not return the expected value, which causes a second (error) reply to be sent, confusing the client. The client then discards the reply and resends the request after a (five-second) timeout. By that point, amd has probably finished the mount operation, so it does not background the request, replies correctly, and everything works as expected.
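
The mechanism above can be reproduced outside of libc with a few lines of C. This is only a minimal sketch: struct fake_xprt, fake_xprt_init() and len_seen_through_copy() are hypothetical names made up for illustration, and the layout is a stand-in, not the real private area of eglibc's SVCXPRT.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical stand-in for the private part of a SVCXPRT (not the real
   eglibc layout): an iovec and a msghdr embedded side by side. */
struct fake_xprt {
    struct iovec  iov;
    struct msghdr msg;
};

/* Set up the internal pointer the way the report describes it. */
static void fake_xprt_init(struct fake_xprt *x)
{
    assert(x != NULL);
    memset(x, 0, sizeof *x);
    x->msg.msg_iov = &x->iov;   /* msghdr points at its sibling iovec */
    x->msg.msg_iovlen = 1;
}

/* Copy a transport whose iovec holds orig_len, set wanted_len on the copy,
   and return the length sendmsg() would actually see through msg_iov. */
static size_t len_seen_through_copy(size_t orig_len, size_t wanted_len)
{
    struct fake_xprt orig, copy;

    fake_xprt_init(&orig);
    orig.iov.iov_len = orig_len;

    memcpy(&copy, &orig, sizeof copy);   /* bitwise copy, as amd does */
    copy.iov.iov_len = wanted_len;       /* length set on the copy's iovec */

    /* copy.msg.msg_iov still points at orig.iov, so this yields orig_len. */
    return copy.msg.msg_iov[0].iov_len;
}
```

With these assumptions, len_seen_through_copy(100, 24) returns 100 rather than 24: the copy's own iovec carries the new length, but the copied msghdr never looks at it.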

The problem can obviously be hacked around by forcing the embedded msghdr's msg_iov field to point to the embedded iovec before passing the msghdr to sendmsg(), which the attached (one-line) patch does.
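
The effect of that workaround on the number of bytes actually sent can be sketched with a socketpair. Again, this is an illustration under stated assumptions, not the libc code: struct mini_xprt, mini_xprt_init(), mini_xprt_send() and reply_via_copy() are hypothetical, and an AF_UNIX datagram socketpair stands in for the UDP socket.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical miniature transport: buffer, iovec and msghdr in one struct. */
struct mini_xprt {
    char          buf[64];
    struct iovec  iov;
    struct msghdr msg;
};

static void mini_xprt_init(struct mini_xprt *x)
{
    memset(x, 0, sizeof *x);
    x->msg.msg_iov = &x->iov;   /* internal pointer, goes stale on copy */
    x->msg.msg_iovlen = 1;
}

/* Send slen bytes from x->buf. With repoint != 0, msg_iov is forced back
   to the embedded iovec first, mirroring the one-line workaround. */
static ssize_t mini_xprt_send(int sock, struct mini_xprt *x, size_t slen,
                              int repoint)
{
    x->iov.iov_base = x->buf;
    x->iov.iov_len = slen;
    if (repoint)
        x->msg.msg_iov = &x->iov;
    return sendmsg(sock, &x->msg, 0);
}

/* Copy a transport whose iovec still holds stale_len, then send a reply of
   slen bytes through the copy. Returns the byte count reported by
   sendmsg(), or -1 on error. */
static ssize_t reply_via_copy(size_t stale_len, size_t slen, int repoint)
{
    int sv[2];
    struct mini_xprt orig, copy;
    ssize_t sent;

    if (stale_len > sizeof orig.buf || slen > sizeof orig.buf)
        return -1;
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
        return -1;

    mini_xprt_init(&orig);
    orig.iov.iov_base = orig.buf;
    orig.iov.iov_len = stale_len;

    copy = orig;   /* structure copy: copy.msg.msg_iov == &orig.iov */
    sent = mini_xprt_send(sv[0], &copy, slen, repoint);

    close(sv[0]);
    close(sv[1]);
    return sent;
}
```

In this sketch, reply_via_copy(40, 24, 0) reports 40 bytes sent (the stale length from the original), while reply_via_copy(40, 24, 1) reports the intended 24. Re-pointing on every send costs one pointer store and leaves callers that never copy the structure unaffected.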

-- System Information:
Debian Release: 7.6
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.10.42.wap (SMP w/2 CPU cores)
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages libc6:amd64 depends on:
ii  libc-bin  2.13-38+deb7u3
ii  libgcc1   1:4.7.2-5

libc6:amd64 recommends no packages.

Versions of packages libc6:amd64 suggests:
ii  debconf [debconf-2.0]  1.5.49
pn  glibc-doc              <none>
ii  locales                2.13-38+deb7u3

-- debconf information excluded
Index: sunrpc/svc_udp.c
===================================================================
--- sunrpc/svc_udp.c	(revision 3768)
+++ sunrpc/svc_udp.c	(revision 3769)
@@ -329,6 +329,7 @@
 	  iovp = (struct iovec *) &xprt->xp_pad [0];
 	  iovp->iov_base = rpc_buffer (xprt);
 	  iovp->iov_len = slen;
+	  mesgp->msg_iov = iovp; /* hack around clients like amd that memcpy() a SVCXPRT structure */
 	  sent = __sendmsg (xprt->xp_sock, mesgp, 0);
 	}
       else

--- End Message ---
--- Begin Message ---
Version: 2.32-0experimental0

On 2014-08-08 17:24, Edgar Fuß wrote:
> Package: libc6
> Version: 2.13-38+deb7u3
> Severity: normal
> Tags: upstream patch
> 
> This is really a problem with amd (am-utils), not with eglibc. It is hard to solve on amd's side (see the topic "NFS v2 RPC reply on LOOKUP" on the am-utils list), but it can easily be hacked around on eglibc's side.
> 
> The symptom is that an amd NFS mount (typically on user login) stalls for 5 or 10 seconds.
> 
> The root problem is that amd occasionally copies (the contents of) a SVCXPRT structure to store it away and be able to respond in the background. This is probably illegal, but "used to work" with the traditional SUN RPC implementation.
> 
> Now eglibc stores both an iovec and a msghdr structure in a private part of the SVCXPRT, with the embedded msghdr's msg_iov field set to point at the corresponding embedded iovec. When the structure is copied, the msg_iov field of the copy's msghdr still points at the original SVCXPRT's embedded iovec, not at the one embedded in the copy. If the copy is then used to transmit a reply, the desired length is stored in the copy's embedded iovec, but sendmsg() follows msg_iov and so actually uses the iovec embedded in the original SVCXPRT, whose fields have not been set to the desired values.
> As a result, sendmsg() transmits a reply of incorrect length and does not return the expected value, which causes a second (error) reply to be sent, confusing the client. The client then discards the reply and resends the request after a (five-second) timeout. By that point, amd has probably finished the mount operation, so it does not background the request, replies correctly, and everything works as expected.
> 
> The problem can obviously be hacked around by forcing the embedded msghdr's msg_iov field to point to the embedded iovec before passing the msghdr to sendmsg(), which the attached (one-line) patch does.
> 

SunRPC support has been removed from glibc 2.32. Closing the bug
accordingly.

Regards,
Aurelien

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

--- End Message ---
