
Bug#757474: libc6: amd copying a SVCXPRT structure leads to libc's RPC code sending packets of incorrect length

Package: libc6
Version: 2.13-38+deb7u3
Severity: normal
Tags: upstream patch

This is really a problem with amd (am-utils), not with eglibc. It is hard to solve on amd's side (see the topic "NFS v2 RPC reply on LOOKUP" on the am-utils list), but it can easily be hacked around on eglibc's side.

The symptom is that an amd NFS mount (typically on user login) stalls for 5 or 10 seconds.

The root problem is that amd occasionally copies (the contents of) a SVCXPRT structure to store it away and be able to respond in the background. This is probably illegal, but "used to work" with the traditional SUN RPC implementation.

Now, eglibc stores both an iovec and a msghdr structure in a private part of the SVCXPRT, with the embedded msghdr's msg_iov field set to point at the corresponding embedded iovec. When the structure is copied, the copy's embedded msghdr still has its msg_iov pointing at the original SVCXPRT's embedded iovec, not at the one embedded in the copy. If the copy is then used to transmit a reply, the length in the copy's embedded iovec is set to the desired value, but sendmsg() actually uses the iovec embedded in the original SVCXPRT, whose fields have not been set to the desired values.
As a result, sendmsg() transmits a reply of incorrect length and does not return the expected value, which causes a second (error) reply to be sent, confusing the client. The client then discards the reply and resends the request after a (five-second) timeout. By that point, amd has probably finished the mount operation, so it does not background the request and replies correctly, and everything works as expected.
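
The stale pointer can be reproduced outside of eglibc with a minimal sketch. The struct fake_xprt and both helper functions below are hypothetical stand-ins invented for illustration; they do not match eglibc's real SVCXPRT layout, they only mimic an iovec and a msghdr stored side by side with the msghdr pointing at its sibling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical stand-in for the private area of a SVCXPRT:
   an iovec and a msghdr embedded in the same structure. */
struct fake_xprt {
    struct iovec  iov;
    struct msghdr msg;
};

/* Set up the self-reference: msg.msg_iov points at the sibling iov. */
static void fake_xprt_init(struct fake_xprt *x, void *buf, size_t len)
{
    assert(x != NULL);
    memset(x, 0, sizeof *x);
    x->iov.iov_base = buf;
    x->iov.iov_len = len;
    x->msg.msg_iov = &x->iov;
    x->msg.msg_iovlen = 1;
}

/* Store the reply length in x's own iovec, then report the length
   that sendmsg() would read by following msg_iov. */
static size_t length_seen_by_sendmsg(struct fake_xprt *x, size_t reply_len)
{
    x->iov.iov_len = reply_len;
    return x->msg.msg_iov->iov_len;
}
```

If a fake_xprt is initialised with a length of 8 and then duplicated with memcpy(), calling length_seen_by_sendmsg() on the copy with a reply length of 32 still reports 8, because the copy's msg_iov continues to point into the original.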

The problem can obviously be hacked around by forcing the embedded msghdr's msg_iov field to point to the embedded iovec before passing the msghdr to sendmsg(), which the attached (one-line) patch does.
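
The effect of re-pointing msg_iov just before the send can be tried in isolation with a rough sketch like the following. It is not the eglibc code: struct fake_xprt, fake_xprt_init() and fake_send_reply() are hypothetical names made up for this example, and the repoint flag exists only so that both behaviours can be compared.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical stand-in for the private area of a SVCXPRT
   (not eglibc's real layout). */
struct fake_xprt {
    struct iovec  iov;
    struct msghdr msg;
};

/* Set up the self-reference: msg.msg_iov points at the sibling iov. */
static void fake_xprt_init(struct fake_xprt *x, void *buf, size_t len)
{
    assert(x != NULL);
    memset(x, 0, sizeof *x);
    x->iov.iov_base = buf;
    x->iov.iov_len = len;
    x->msg.msg_iov = &x->iov;
    x->msg.msg_iovlen = 1;
}

/* Send reply_len bytes through x on a connected socket. With repoint
   set, msg_iov is first forced back to x's own iovec, which is the
   idea behind the one-line workaround. Returns sendmsg()'s result. */
static ssize_t fake_send_reply(int sock, struct fake_xprt *x,
                               size_t reply_len, int repoint)
{
    x->iov.iov_len = reply_len;
    if (repoint)
        x->msg.msg_iov = &x->iov;
    return sendmsg(sock, &x->msg, 0);
}
```

On a connected datagram socket (for example one end of a socketpair()), sending through a memcpy()'d copy without repoint transmits the stale length held in the original, while the same call with repoint set transmits the requested length.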

-- System Information:
Debian Release: 7.6
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.10.42.wap (SMP w/2 CPU cores)
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages libc6:amd64 depends on:
ii  libc-bin  2.13-38+deb7u3
ii  libgcc1   1:4.7.2-5

libc6:amd64 recommends no packages.

Versions of packages libc6:amd64 suggests:
ii  debconf [debconf-2.0]  1.5.49
pn  glibc-doc              <none>
ii  locales                2.13-38+deb7u3

-- debconf information excluded
Index: sunrpc/svc_udp.c
--- sunrpc/svc_udp.c	(revision 3768)
+++ sunrpc/svc_udp.c	(revision 3769)
@@ -329,6 +329,7 @@
 	  iovp = (struct iovec *) &xprt->xp_pad [0];
 	  iovp->iov_base = rpc_buffer (xprt);
 	  iovp->iov_len = slen;
+	  mesgp->msg_iov = iovp; /* hack around clients like amd that memcpy() a SVCXPRT structure */
 	  sent = __sendmsg (xprt->xp_sock, mesgp, 0);
