
Re: glibc's getaddrinfo() sort order



Anthony Towns writes ("Re: glibc's getaddrinfo() sort order"):
> Stability is useful for any case where the servers hosting a particular
> might be out of sync with each other; eg, if stability could be assumed
> we'd have fewer errors where an invocation of "apt-get update" chooses one
> mirror, and a subsequent "apt-get upgrade" chooses a different server
> that hasn't finished syncing. Hopefully "apt-get" isn't considered
> "too pathological to live"...

If an application needs to get (or prefer) the same address each time,
it is trivial to have it save the answer from the DNS lookup
(preferably, honouring the timeout).
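
As a rough sketch of what "save the answer" could look like in an
application (this is not apt's code; the class name PinnedResolver and
the fixed `lifetime` are made up for illustration, and since
getaddrinfo does not report the DNS TTL the lifetime is an assumed
stand-in for it):

```python
import socket
import time


class PinnedResolver:
    """Remember the first address chosen for a host until it expires.

    Hypothetical helper: `lifetime` is an assumed fixed value in
    seconds, used because getaddrinfo() does not expose the record TTL.
    """

    def __init__(self, lifetime=300.0, lookup=socket.getaddrinfo,
                 clock=time.monotonic):
        self._lifetime = lifetime
        self._lookup = lookup      # injectable, so it can be replaced in tests
        self._clock = clock
        self._cache = {}           # (host, port) -> (expiry, sockaddr)

    def resolve(self, host, port):
        key = (host, port)
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and entry[0] > now:
            # Still fresh: hand back the same address as last time.
            return entry[1]
        # Expired or never seen: ask the resolver and pin its first answer.
        results = self._lookup(host, port, type=socket.SOCK_STREAM)
        sockaddr = results[0][4]
        self._cache[key] = (now + self._lifetime, sockaddr)
        return sockaddr
```

An "update" followed by an "upgrade" inside the lifetime would then
talk to the same address, whatever order the resolver returns on the
second lookup.  Keeping the answer across separate processes would
need the cache written to disk, which this sketch leaves out.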

Note that _when apt started doing rule 9 it broke our own ftp site_ !

> > One of the existing use cases that breaks is round-robin DNS.  
> 
> Round-robin DNS isn't broken; the expectation of (approximately) equal
> load-distribution across all servers in a round-robin is broken.

DNS administrators currently configure multiple addresses in order to
achieve equal load distribution.

> Better routing has less direct benefits to the client, probably limited
> to slightly better ping times, with a small chance of somewhat cheaper
> bandwidth costs. For the people providing the service, it lets you make
> better assumptions as to load balancing -- you can expect the servers
> based in a particular area to be serving a load proportional to the
> number of users in that area, rather than having the load fairly evenly
> distributed globally. Of course, there are other ways of doing this
> that don't rely on how the client's resolver is implemented. Of course,
> if the routing is worse, those turn into drawbacks instead of benefits.

This "number of users in that area" idea is _not_ what DNS
administrators currently mean when they configure multiple IPv4
addresses.

And of course that's not what rule 9 does, as Steve has explained.

> Even without the possibility of applications like apt-get benefiting
> from stability of results, I don't think we've done anywhere near enough of
> a review to be declaring that there aren't any benefits to rule 9.

You're asking us to prove a negative.

And, this is the wrong analysis in any case.

Regardless of whether rule 9 for IPv4 would have some theoretical (or
even practical) benefits, it is _not the current standard_.  The
current standard, as expected by DNS administrators who actually
configure multiple addresses, is the behaviour of gethostbyname.

AJ: You keep dodging this point.  Can you please explain why if
getaddrinfo should do rule 9, gethostbyname shouldn't ?  Or to look at
it another way, why whether an application applies rule 9 for IPv4
should depend on whether that application has been updated to allow it
to support IPv6 ?

> As far as I can see, for rule 9 to be fundamentally misguided and
> broken, the concept of providing a stable answer, or a better than random
> ordering, would need to be harmful. If they're beneficial, even in some
> cases, then we've got a problem in the details of the specification,
> not a fundamental issue.

Yes, providing a stable answer _is_ harmful, precisely because the
whole point of round robin DNS is to _not_ provide a stable answer:
that is what allows a server operator to spread load effectively (and
brings various fringe benefits).
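
To make the contrast concrete, here is a small sketch of the
longest-matching-prefix comparison that RFC 3484 rule 9 describes,
next to a random pick standing in for round robin.  It is an
illustration only, not glibc's implementation: the function names are
invented, the host is assumed to have a single IPv4 source address,
and the addresses come from the documentation ranges.

```python
import ipaddress
import random


def common_prefix_len(a, b):
    """Number of leading bits shared by two IPv4 addresses."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


def rule9_order(source, candidates):
    """Order candidates by longest matching prefix with `source`.

    sorted() is stable, so candidates with equal prefix length keep
    the order in which they were supplied.
    """
    return sorted(candidates,
                  key=lambda dest: -common_prefix_len(source, dest))


def round_robin_pick(candidates, rng=random):
    """Stand-in for round robin: any candidate may come first."""
    return rng.choice(candidates)


if __name__ == "__main__":
    source = "192.0.2.10"
    servers = ["198.51.100.7", "192.0.2.200", "203.0.113.5"]
    # Whatever order the DNS answer arrives in, the first choice under
    # the prefix rule is the same for this source address.
    for _ in range(3):
        random.shuffle(servers)
        print(rule9_order(source, servers)[0], round_robin_pick(servers))
```

Every client whose source address shares the longest prefix with one
particular server ends up preferring that server, so the spread the
operator expected from publishing several addresses does not happen.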

> Uh, round-robin DNS isn't a guarantee that any individual client will
> get different or randomised results -- and the argument that round-robin
> won't break anything that relies on rule 9 goes the other way too.

There aren't any Internet protocols where server operators have
deployed multiple IPv4 addresses relying on rule 9.  You can see that
quite easily, because with such a deployment any application that
happens to use gethostbyname rather than getaddrinfo wouldn't work.

> Further, having getaddrinfo() behave differently for IPv4 and IPv6
> isn't completely helpful in making Debian support IPv6 -- if we change
> a program from gethostbyname() to getaddrinfo() under the assumption
> they behave the same way and that's fine (for IPv4), but getaddrinfo()
> for the particular app for IPv6 requires extra randomisation or the
> addition of fail-over code to work sensibly, we're not done.

The difference is that it is alleged (by rule 9 proponents) that DNS
operators who configure multiple IPv6 addresses ought to do that only
when they want rule 9.

As Steve and I have explained, we don't think rule 9 makes sense for
IPv6 either.

However, there is only a very small base of IPv6 use at the moment,
and of course clients which can do IPv6 have necessarily already been
converted to use getaddrinfo, many implementations of which have done
rule 9 for some time.  I.e., using rule 9 is more common for IPv6 and
DNS administrators therefore already can't publish multiple AAAA RRs
and expect to get the benefits of the DNS round robin.

Or to put it another way: the IMO irrefutable argument I'm laying out
against rule 9 for IPv4 is that _existing DNS configurations_ where
multiple IPv4 addresses are published expect that those addresses will
be treated by client systems the same way they have been for decades:
according to round robin DNS.

This argument doesn't apply to IPv6.  So that's why I think it is
impossible to argue coherently that rule 9 is correct for IPv4, but at
least conceivable for IPv6.

> It's sound to a point, but we haven't established an understanding of
> what common behaviour on today's internet actually is, we haven't done any
> review of how getaddrinfo() is being used outside of our own experiences,
> and we haven't made any examination of how we can achieve the apparent
> goals underlying the proposed standard without the drawbacks that are
> concerning us.

COMMON BEHAVIOUR ON TODAY'S INTERNET IS THAT IMPLEMENTED BY
GETHOSTBYNAME.

How many times do I have to explain this ?  getaddrinfo is the
REPLACEMENT FOR GETHOSTBYNAME.  It is not an interface which
applications choose because they want different address sorting
behaviour.  It is the interface applications MUST USE TO SUPPORT IPV6.

Changing applications to use getaddrinfo instead of gethostbyname is
done BECAUSE THOSE APPLICATIONS ARE BEING UPDATED TO SUPPORT IPV6.

Updating an application to support IPv6 should not change the way it
treats DNS RRsets containing multiple IPv4 addresses.  Obviously.

Ian.


