
Bug#633793: marked as done (libc6: [sparc] getaddrinfo() times out, succeeds on retry)



Your message dated Thu, 21 Jul 2011 15:01:24 +0200
with message-id <20110721130124.GG8346@hall.aurel32.net>
and subject line Re: Bug#633793: (no subject)
has caused the Debian Bug report #633793,
regarding libc6: [sparc] getaddrinfo() times out, succeeds on retry
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
633793: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=633793
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: libc6
Version: 2.11.2-10
Severity: important


getaddrinfo() randomly encounters 5 to 10 second delays in DNS name
resolution, and sometimes fails.  Using strace, I was able to track this
down to timeouts in poll() while waiting for the UDP socket to become
readable.  Adjusting the timeout or attempts options in /etc/resolv.conf
influences the timeout behavior.

Running tcpdump on the client shows that the DNS server is sending
responses to the client's queries, but the resolver seems to ignore them.

When running strace on the following Python script:

  import socket
  socket.getaddrinfo('www.google.com', 80)

I get the following output from strace (note the Timeouts):

     0.000298 socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 3
     0.000213 connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("130.127.255.250")}, 16) = 0
     0.000251 gettimeofday({1310583673, 595304}, NULL) = 0
     0.000180 poll([{fd=3, events=POLLOUT}], 1, 0) = 1 ([{fd=3, revents=POLLOUT}])
     0.000231 send(3, "D\230\1\0\0\1\0\0\0\0\0\0\3www\6google\3com\0\0\1\0\1", 32, MSG_NOSIGNAL) = 32
     0.000307 poll([{fd=3, events=POLLIN|POLLOUT}], 1, 5000) = 1 ([{fd=3, revents=POLLOUT}])
     0.000239 send(3, "\237K\1\0\0\1\0\0\0\0\0\0\3www\6google\3com\0\0\34\0\1", 32, MSG_NOSIGNAL) = 32
     0.000303 gettimeofday({1310583673, 596567}, NULL) = 0
     0.000262 poll([{fd=3, events=POLLIN}], 1, 4998) = 1 ([{fd=3, revents=POLLIN}])
     0.000453 ioctl(3, 0x4004667f, 0xff98bc50) = 0
     0.000287 recvfrom(3, "D\230\201\200\0\1\0\7\0\4\0\4\3www\6google\3com\0\0\1\0\1"..., 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("130.127.255.250")}, [16]) = 284
     0.000538 gettimeofday({1310583673, 598108}, NULL) = 0
     0.000174 poll([{fd=3, events=POLLIN}], 1, 4997) = 0 (Timeout)
     5.002406 gettimeofday({1310583678, 600685}, NULL) = 0
     0.000173 poll([{fd=3, events=POLLOUT}], 1, 0) = 1 ([{fd=3, revents=POLLOUT}])
     0.000225 send(3, "D\230\1\0\0\1\0\0\0\0\0\0\3www\6google\3com\0\0\1\0\1", 32, MSG_NOSIGNAL) = 32
     0.000301 poll([{fd=3, events=POLLIN}], 1, 5000) = 1 ([{fd=3, revents=POLLIN}])
     0.001226 ioctl(3, 0x4004667f, 0xff98bc50) = 0
     0.000359 recvfrom(3, "D\230\201\200\0\1\0\7\0\4\0\4\3www\6google\3com\0\0\1\0\1"..., 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("130.127.255.250")}, [16]) = 284
     0.000531 gettimeofday({1310583678, 603502}, NULL) = 0
     0.000210 poll([{fd=3, events=POLLOUT}], 1, 4997) = 1 ([{fd=3, revents=POLLOUT}])
     0.000239 send(3, "\237K\1\0\0\1\0\0\0\0\0\0\3www\6google\3com\0\0\34\0\1", 32, MSG_NOSIGNAL) = 32
     0.000296 gettimeofday({1310583678, 604250}, NULL) = 0
     0.000176 poll([{fd=3, events=POLLIN}], 1, 4996) = 0 (Timeout)
     5.001422 close(3)                  = 0
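
To make the two send() payloads in the trace easier to read, here is a minimal sketch (Python 3, standard library only) that decodes them as DNS queries. The helper names parse_query and describe are illustrative and not part of any library; the byte strings are copied from the strace output, and the parser assumes a single uncompressed question per packet.

```python
import struct

# Payloads copied from the two distinct send() calls in the strace output.
# strace prints non-printable bytes as octal escapes, which Python bytes
# literals accept unchanged.
QUERY_1 = b"D\230\1\0\0\1\0\0\0\0\0\0\3www\6google\3com\0\0\1\0\1"
QUERY_2 = b"\237K\1\0\0\1\0\0\0\0\0\0\3www\6google\3com\0\0\34\0\1"

# Only the two record types that appear in this trace.
QTYPE_NAMES = {1: "A", 28: "AAAA"}


def parse_query(packet):
    """Return (id, name, qtype, qclass) for a single-question DNS query."""
    # The 12-byte header is six big-endian 16-bit fields.
    ident, _flags, qdcount, _an, _ns, _ar = struct.unpack("!6H", packet[:12])
    if qdcount != 1:
        raise ValueError("expected exactly one question")
    labels = []
    pos = 12
    # The name is a sequence of length-prefixed labels ending in a zero byte.
    while packet[pos] != 0:
        length = packet[pos]
        labels.append(packet[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    qtype, qclass = struct.unpack("!2H", packet[pos + 1:pos + 5])
    return ident, ".".join(labels), qtype, qclass


def describe(packet):
    """Format a query as a short human-readable string."""
    ident, name, qtype, _qclass = parse_query(packet)
    return "id=0x%04x name=%s type=%s" % (ident, name, QTYPE_NAMES.get(qtype, qtype))
```

Decoded this way, the first payload is an A query with ID 0x4498 and the second is an AAAA query with ID 0x9f4b, both sent on the same socket. Both recvfrom() calls in the trace return a payload starting with "D\230", the ID of the A query; no recvfrom() in this capture returns a payload starting with "\237K".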

Hard-coding an entry for www.google.com into /etc/hosts results in no
timeout from getaddrinfo().

Additionally, gethostbyname() does not seem to cause any delays,
regardless of whether the host is listed in /etc/hosts or resolved
through DNS.
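
One way to reproduce this comparison is to time both calls for the same name. The following is a rough sketch for Python 3; timed and compare_lookups are hypothetical helper names, and the default port of 80 simply mirrors the script above.

```python
import socket
import time


def timed(fn, *args):
    """Call fn(*args); return (elapsed_seconds, result_or_exception)."""
    start = time.monotonic()
    try:
        outcome = fn(*args)
    except OSError as exc:  # socket.gaierror and socket.herror derive from OSError
        outcome = exc
    return time.monotonic() - start, outcome


def compare_lookups(host, port=80):
    """Time getaddrinfo() and gethostbyname() for the same host name."""
    return {
        "getaddrinfo": timed(socket.getaddrinfo, host, port)[0],
        "gethostbyname": timed(socket.gethostbyname, host)[0],
    }
```

Calling compare_lookups('www.google.com') several times on an affected host should show whether the multi-second delay appears only in the getaddrinfo entry. Because the delay is intermittent, a single run may not be conclusive.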

This problem is not specific to Python.  We first noticed the problem in
the LDAP client, but were able to track it down to a general problem with
getaddrinfo().

-- System Information:
Debian Release: 6.0.2
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: sparc (sparc64)

Kernel: Linux 2.6.32 (SMP w/1 CPU core)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages libc6 depends on:
ii  libc-bin                      2.11.2-10  Embedded GNU C Library: Binaries
ii  libgcc1                       1:4.4.5-8  GCC support library

libc6 recommends no packages.

Versions of packages libc6 suggests:
ii  debconf [debconf-2.0]         1.5.36.1   Debian configuration management sy
pn  glibc-doc                     <none>     (no description available)
ii  locales                       2.11.2-10  Embedded GNU C Library: National L

-- debconf information:
  glibc/upgrade: true
  glibc/restart-services:
  glibc/disable-screensaver:
  glibc/restart-failed:



--- End Message ---
--- Begin Message ---
On Wed, Jul 20, 2011 at 04:15:51PM -0400, Scott Duckworth wrote:
> We just discovered that we are also experiencing this same issue on
> CentOS 6 and Fedora 14.  We are not experiencing the issue under CentOS
> 5 and Ubuntu 10.04.  So this issue does not appear to be Debian-specific.
> 
> Furthermore, we also discovered that using a different DNS server (the
> Google DNS servers), we do not experience the problem.  It is likely
> that this is a problem with our organization's DNS servers.
> 
> From a Debian point of view, this bug can probably be closed.  However,
> when we get the problem figured out I will try to update this bug with
> the solution.
>

Ok, thanks for the update, I am closing the bug for now. Feel free to
reopen it if you have more details.

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net


--- End Message ---
