
Re: glibc's getaddrinfo() sort order



Kurt Roeckx writes ("Re: glibc's getaddrinfo() sort order"):
> It's atleast in the spirit of the rfc to prefer one that's on the local
> network.  It might be the intention of rule 9, but then rule 9 isn't
> very well written.

I agree that applying RFC3484 section 6 rule 9 to IPv4 addresses is a
mistake and that therefore we should change the default in Debian
accordingly.  I would encourage Kurt to take this matter up with the
relevant IETF working group.


Others have already written about problems involving NAT.  I agree
with this argument (although I don't approve of NAT and it galls me to
use some braindamage involving NAT as an argument for anything).

However there is another argument I would like to make:

A host using getaddrinfo configured to apply rule 9 to IPv4 addresses
will behave quite differently to a host using gethostbyname.  I think
that this change in behaviour is unwarranted.  Whether an application
uses gethostbyname or getaddrinfo is an implementation detail (related
closely to whether that particular application's source code has been
modified to try to support IPv6) and this should not change the
behaviour.

Presently when connecting to a service offering only IPv4 addresses,
most hosts will use gethostbyname and use the addresses offered in
round-robin DNS order.  That is to say, the meaning (pre-RFC3484, and
current de-facto) of a DNS RRset containing several IP addresses is
that the addresses should be tried `uniformly at random' by callers,
as done by the nameserver round-robin RRset rotation algorithm.

RFC3484 section 6 rule 9 applied to IPv4 appears to be an attempt to
change that meaning.  This interpretation of rule 9 for IPv4 as an
attempt to change the meaning of existing deployed DNS RRsets is
supported by the fact that proponents of rule 9 for IPv4 claim that it
will fix existing problems, as in
    http://udrepper.livejournal.com/16116.html.
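
To make the difference concrete, here is a minimal sketch rather than glibc's actual code.  It assumes a single source address for every destination and compares plain 32-bit IPv4 addresses; the function names are hypothetical and the addresses come from the documentation ranges.  It simulates a nameserver rotating a three-record RRset and tallies which address a caller tries first, with and without a rule 9 style re-sort.

```python
import ipaddress
from collections import Counter, deque


def common_prefix_len(a, b):
    """Count the leading bits shared by two IPv4 addresses."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


def rule9_order(source, candidates):
    """Sort by longest common prefix with `source`; ties keep DNS order."""
    return sorted(candidates, key=lambda d: -common_prefix_len(source, d))


def first_choices(rrset, source=None):
    """Simulate one lookup per rotation of the RRset, tallying the first pick.

    With source=None the caller uses the DNS order as-is (gethostbyname
    style); otherwise it re-sorts every answer with rule9_order.
    """
    answers = deque(rrset)
    picks = Counter()
    for _ in range(len(rrset)):
        order = list(answers)
        if source is not None:
            order = rule9_order(source, order)
        picks[order[0]] += 1
        answers.rotate(-1)  # the nameserver rotates the RRset per response
    return picks


rrset = ["203.0.113.9", "198.51.100.7", "192.0.2.1"]
print(first_choices(rrset))                  # each address is picked once
print(first_choices(rrset, "192.168.1.10"))  # 192.0.2.1 is picked every time
```

In this sketch a NATted host at 192.168.1.10 always lands on 192.0.2.1, merely because the two addresses share their first eight bits, and the rotation done by the nameserver no longer has any effect.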

However, it is obviously wrongheaded to attempt to change the defined
meaning of all existing multi-record A RRsets.  On the existing
Internet, zone administrators use multi-record A RRsets in the
knowledge that those RRsets will be used by callers in an
evenly-distributed round-robin fashion as currently implemented by
bind and gethostbyname.

This meaning for multiple A records had been established for well over
a decade by the time RFC3484 was written, and in the intervening years it
has continued to be dominant.  New systems, and systems newly modified
to support IPv6, should continue to interpret existing A RRsets in the
same way as before.

A few cursory web searches show that this new behaviour of getaddrinfo
is indeed causing trouble as applications are converted to IPv6 and
the change in behaviour with IPv4 is found to be undesirable.


Finally, I would like to preemptively address the line "but this is an
RFC and we must do what it says".  There are two responses:

The most obvious one is that RFC3484 is merely a Proposed Standard.  At
this stage of the standardisation process one can expect to find
errors, mistaken deviations from existing practice, and so on.
(The IETF standardisation process has been broken so that documents
often get stuck in this state; but that doesn't mean that we should
treat draft documents as if they were gospel, let alone documents that
aren't even drafts.)

The second is a more general point: if a standards document tells us
to do something which is wrong, then we should not do it.  Obviously
we should think fairly hard before making the decision to go against a
standard, but our job is to do the right thing and standards documents
are there to help us, not to constrain us.  I think my argument above
about the existing meaning of multiple A records is irrefutable.


> I already suggested that maybe rule 9 should be limited to the common
> prefix length of the netmask you're using.  An other option is that you
> extend rule 2 to have the same behaviour with ipv4, and that 10/8,
> 172.16/12 and 192.168/16 should be considered organization-local.

Replacing rule 9 with something more limited based on local network
interfaces (i.e., prefer what appear to be locally-attached addresses)
would be fine.  Or a default based on routing metrics would be fine
too.  (Although I think these may be too much work to do in
getaddrinfo.)
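
As a rough illustration of the more limited rule, the following sketch moves candidates that fall inside a locally attached network to the front and otherwise leaves the DNS order alone.  The function name is hypothetical, and the list of local networks is passed in by hand; how a real resolver would obtain it from the interface configuration is not shown.

```python
import ipaddress


def prefer_local(candidates, local_networks):
    """Put locally attached addresses first; keep DNS order otherwise."""
    nets = [ipaddress.ip_network(n) for n in local_networks]

    def is_remote(addr):
        ip = ipaddress.ip_address(addr)
        return not any(ip in net for net in nets)

    # sorted() is stable, so addresses within each group keep their
    # original (round-robin) order; False sorts before True.
    return sorted(candidates, key=is_remote)


answers = ["203.0.113.9", "192.168.1.20", "198.51.100.7"]
print(prefer_local(answers, ["192.168.1.0/24"]))
```

When none of the candidates is on a local network, the result is simply the order the nameserver supplied, so existing round-robin RRsets keep their meaning.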

The problem occurs when we start ranking IPv4 addresses of foreign
systems about whose topology we have no special knowledge.

Ranking RFC1918 addresses ahead of others is not entirely a safe thing
to do because people sometimes foolishly publish RFC1918 addresses for
public services and expect callers to skip those addresses somehow.
But at least it wouldn't break people who weren't already doing wrong
things.
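
A sketch of the scope-based variant, assuming the three RFC1918 blocks are treated as organization-local and everything else as global (the scope labels and function names are illustrative only):

```python
import ipaddress

RFC1918 = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def scope(addr):
    """Classify an IPv4 address as organization-local or global."""
    ip = ipaddress.ip_address(addr)
    return "organization-local" if any(ip in n for n in RFC1918) else "global"


def prefer_matching_scope(source, candidates):
    """Put destinations whose scope matches the source first (stable sort)."""
    src = scope(source)
    return sorted(candidates, key=lambda d: scope(d) != src)


answers = ["203.0.113.9", "10.1.2.3", "198.51.100.7"]
print(prefer_matching_scope("192.168.1.10", answers))  # private source
print(prefer_matching_scope("203.0.113.50", answers))  # public source
```

With a private source address the published 10.1.2.3 record is tried first, which is exactly the case that hurts people who publish RFC1918 addresses for public services; the global addresses keep their DNS order among themselves.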


Ian.


