
getaddrinfo() return value chaos



Continuing on from the "boot ordering and resolvconf" thread;
cc:ed to Helmut in case this gets filtered again; bcc:ed to
683061@bugs.debian.org since this is relevant for how that
issue is addressed...

Executive summary: getaddrinfo() returns different values
depending on the OS and on nsswitch.conf settings, making it
very difficult to use getaddrinfo() return values to decide how
to handle an error.

Here are the results of further experiments with getaddrinfo().
I am using the attached x.c program.  It tries to look up the
valid domain name 'www.google.com' and an invalid name
four times:
* once with empty /etc/resolv.conf (in which case the resolver tries 127.0.0.1:53)
* once with /etc/resolv.conf pointing to a working nameserver on my LAN
* once with empty /etc/resolv.conf (again)
* once with /etc/resolv.conf pointing to an IP address where there is no working nameserver

OS is Debian 7.0.

================================
# ./a.out
Making resolv.conf empty
Results of looking up www.google.com: status = -2, errno = 2
Results of looking up a bogus name: status = -2, errno = 2
Writing nameserver option to resolv.conf
Results of looking up www.google.com: status = 0, errno = 101
Results of looking up a bogus name: status = -2, errno = 2
Making resolv.conf empty
Results of looking up www.google.com: status = -2, errno = 2
Results of looking up a bogus name: status = -2, errno = 2
Writing incorrect nameserver option to resolv.conf
Results of looking up www.google.com: status = -2, errno = 2
Results of looking up a bogus name: status = -2, errno = 2
================================

As I saw before, the status on failure is always -2 (EAI_NONAME).
The manpage doesn't say that errno is significant in that case.
(It is significant when status is -11, EAI_SYSTEM.)
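
To make output like the above easier to read, the two numbers can be
turned into text. This is a minimal sketch and not part of x.c; the
helper name describe_gai() is made up for illustration. It follows the
manpage in treating errno as meaningful only when the status is
EAI_SYSTEM.

```c
#define _POSIX_C_SOURCE 200112L
#include <errno.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of x.c: format a getaddrinfo() result.
 * saved_errno should be captured immediately after the call. */
static void describe_gai(int status, int saved_errno, char *buf, size_t len)
{
    if (status == 0)
        snprintf(buf, len, "success");
    else if (status == EAI_SYSTEM)
        /* Only in this case does the manpage say errno carries the cause. */
        snprintf(buf, len, "EAI_SYSTEM: %s", strerror(saved_errno));
    else
        /* For every other status errno is unspecified, so it is ignored. */
        snprintf(buf, len, "%s", gai_strerror(status));
}
```

With the Debian 7.0 results above this would print the EAI_NONAME
message for every failure, which is exactly the problem: the text is
the same whatever the real cause was.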

Helmut got different results. Is the difference between my machine
and Helmut's machine attributable to some diff in nsswitch.conf,
perhaps?  I have:

    hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4

I tested next with

    hosts: dns

and got different results.

================================
Making resolv.conf empty
Results of looking up www.google.com: status = -2, errno = 111
Results of looking up a bogus name: status = -2, errno = 111
Writing nameserver option to resolv.conf
Results of looking up www.google.com: status = 0, errno = 101
Results of looking up a bogus name: status = -2, errno = 101
Making resolv.conf empty
Results of looking up www.google.com: status = -2, errno = 111
Results of looking up a bogus name: status = -2, errno = 111
Writing incorrect nameserver option to resolv.conf
Results of looking up www.google.com: status = -2, errno = 111
Results of looking up a bogus name: status = -2, errno = 111
================================

Now errno for empty or incorrect resolv.conf is 111 (ECONNREFUSED).
And with correct resolv.conf and bogus domain name errno is 101
(ENETUNREACH).  That doesn't make too much sense but as I
said I don't think we are supposed to pay attention to errno if
status is -2.

Next I ran the program on Ubuntu 13.04 with "hosts: dns".

================================
Making resolv.conf empty
Results of looking up www.google.com: status = -11, errno = 111
Results of looking up a bogus name: status = -11, errno = 111
Writing nameserver option to resolv.conf
Results of looking up www.google.com: status = 0, errno = 101
Results of looking up a bogus name: status = -2, errno = 101
Making resolv.conf empty
Results of looking up www.google.com: status = -11, errno = 111
Results of looking up a bogus name: status = -11, errno = 111
Writing incorrect nameserver option to resolv.conf
Results of looking up www.google.com: status = -11, errno = 110
Results of looking up a bogus name: status = -11, errno = 110
================================

This is different in two ways. First the status is -11 (EAI_SYSTEM)
instead of -2 (EAI_NONAME) when no nameserver can be reached.
Second, there is now a difference between the empty-resolv.conf
case and the resolv.conf-with-bogus-address case. In the latter
case errno is 110 (ETIMEDOUT) instead of 111 (ECONNREFUSED).
This is better.
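
On a libc that behaves like the Ubuntu 13.04 one above, a program could
map the status/errno pair to a cause roughly as follows. This is only a
sketch: the enum and function names are hypothetical, and the mapping is
derived from the single run shown above, so it would not work on
Debian 7.0, where every failure comes back as -2.

```c
#define _POSIX_C_SOURCE 200112L
#include <errno.h>
#include <netdb.h>

/* Hypothetical classification of a getaddrinfo() result. */
enum lookup_result {
    LOOKUP_OK,            /* status 0 */
    LOOKUP_NO_SUCH_NAME,  /* a nameserver answered: the name does not exist */
    LOOKUP_REFUSED,       /* nothing listening at the nameserver address */
    LOOKUP_TIMEOUT,       /* the nameserver address did not answer */
    LOOKUP_OTHER          /* anything not seen in the run above */
};

/* saved_errno must be captured immediately after getaddrinfo() returns. */
static enum lookup_result classify_lookup(int status, int saved_errno)
{
    if (status == 0)
        return LOOKUP_OK;
    if (status == EAI_NONAME)
        return LOOKUP_NO_SUCH_NAME;
    if (status == EAI_SYSTEM) {
        if (saved_errno == ECONNREFUSED)
            return LOOKUP_REFUSED;
        if (saved_errno == ETIMEDOUT)
            return LOOKUP_TIMEOUT;
    }
    return LOOKUP_OTHER;
}
```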

Debian 7.0 has libc6 version 2.13-37.
Ubuntu 13.04 has libc6 version 2.17-0ubuntu5.
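
Since the behaviour depends on the libc version, a test program can
report which one it is actually running against. A small sketch,
assuming glibc or eglibc (gnu_get_libc_version() is a GNU extension);
the wrapper name is made up. Only the upstream version is returned
(e.g. "2.13"), not the distribution's package revision.

```c
#include <gnu/libc-version.h>
#include <stdio.h>

/* Hypothetical wrapper: print the run-time C library version next to
 * the lookup results so that runs on different systems can be compared. */
static const char *report_libc_version(void)
{
    const char *version = gnu_get_libc_version();

    printf("glibc version: %s\n", version);
    return version;
}
```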

What's behind all this?  [google, google...]  I see that in
November 2012 a change was made upstream

  http://sourceware.org/git/?p=glibc.git;a=commitdiff;h=cfde9b463d63092ff0908d4c2748ace648e2ead8

in response to a complaint that the wrong status was returned
in the case of an internal error (out-of-fds).

  http://sourceware.org/bugzilla/show_bug.cgi?id=14719

This is possibly the reason that I get status -11 instead of
-2 on Ubuntu.

But the result of the change was that getaddrinfo() returned
EAI_SYSTEM in too many cases.

  http://sourceware.org/bugzilla/show_bug.cgi?id=15339

This has recently (May 2013) been fixed

  http://sourceware.org/bugzilla/show_bug.cgi?id=15635
  http://sourceware.org/git/?p=glibc.git;a=commitdiff;h=3d04f5db20c8f0d1ba3881b5f5373586a18cf188

such that (IIUC) now getaddrinfo() should return different
status/errno combinations for internal errors, network errors
and name resolution failures. I haven't tested the upstream
code yet.

I don't get the impression that the handling of return values
by the various eglibc layers has been well thought out and
documented; the developers seem to be making changes ad-hoc.

In any case, because of all these differences and changes we
won't have a good, stable getaddrinfo() interface to program
against until Jessie.  In the meantime a program that needs to
distinguish between different causes for a name resolution
failure will have to do more than just check the status and
errno from getaddrinfo().
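
One possible extra check, given here only as an illustration and not
as something proposed in this thread, is to look at resolv.conf itself
before interpreting a failure. The helper name is hypothetical. It
relies on the resolv.conf(5) rule that a keyword must start the line.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count "nameserver" lines in a resolv.conf-style
 * stream. Lines longer than the buffer are not handled specially. */
static int count_nameservers(FILE *fp)
{
    char line[512];
    int n = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        /* The keyword must start the line and be followed by whitespace. */
        if (strncmp(line, "nameserver", 10) == 0 &&
            (line[10] == ' ' || line[10] == '\t'))
            n++;
    }
    return n;
}
```

A count of zero corresponds to the empty-resolv.conf case above, where
the resolver falls back to 127.0.0.1:53.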
--
Thomas
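/* x.c: must be run as root, because it overwrites /etc/resolv.conf. */
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>   /* sleep() */

/* Look up a valid and a bogus name; print getaddrinfo() status and errno.
 * errno is not reset before each call, so a printed value may be left
 * over from an earlier call. */
static void check_google(void)
{
    struct addrinfo *res = NULL;
    int status;

    status = getaddrinfo("www.google.com", NULL, NULL, &res);
    printf("Results of looking up www.google.com: status = %d, errno = %d\n", status, errno);
    if (status == 0)
        freeaddrinfo(res);   /* the result list is only valid on success */

    res = NULL;
    status = getaddrinfo("sjfkdsjfswfloo0f02938sjf28398sd.com", NULL, NULL, &res);
    printf("Results of looking up a bogus name: status = %d, errno = %d\n", status, errno);
    if (status == 0)
        freeaddrinfo(res);
}

/* Replace the contents of /etc/resolv.conf, then wait a second. */
static void write_resolv_conf(const char *contents)
{
    FILE *fp = fopen("/etc/resolv.conf", "w+");

    if (fp == NULL) {
        perror("/etc/resolv.conf");
        exit(1);
    }
    fputs(contents, fp);
    fclose(fp);
    sleep(1);
}

int main(void)
{
    printf("Making resolv.conf empty\n");
    write_resolv_conf("");
    check_google();

    printf("Writing nameserver option to resolv.conf\n");
    write_resolv_conf("nameserver 192.168.1.254\n");
    check_google();

    printf("Making resolv.conf empty\n");
    write_resolv_conf("");
    check_google();

    printf("Writing incorrect nameserver option to resolv.conf\n");
    write_resolv_conf("nameserver 192.168.5.4\n");
    check_google();

    return 0;
}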
