
Re: Call for Votes (getaddrinfo)



Ian Jackson writes ("Re: Call for Votes (getaddrinfo)"):
> Thus X wins and the resolution between -8<- above has been passed,
> overruling the maintainer.

I think we should send our rationales, including dissents, to the bug
report.  I've collated the opinions that people attached to their
votes and the result is below.

I've included the assenting views in the order they were emailed, as
that seems to make them easiest to make sense of.

There was one dissent, which I put at the end.  If there were several
they too should probably have come in chronological order.  It makes
most sense to put dissents after assents as they're more likely to be
responsive to assents than vice versa, and also to avoid confusing 
readers with a conflicting view at the start of the text.

Normally I think we should just send a collation like the one below
straight to the bug report and other interested parties along with the
actual decision text, but since this is a new process for dealing with
rationales I thought people might want a chance to comment.

I think we should take the rationale text attached to votes as
definitive, and not look through the whole mail thread.

If a TC member wishes to amend or augment their rationale after voting
they should do so very explicitly.

Ian.


----------------------------------------
  
Ian Jackson, assenting:

  Introduction

  1. We have been asked to rule on the application of RFC3484 section 6
     rule 9 by the resolver in glibc.

  2. Rule 9 requires a host to sort addresses according to the length
     of the initial prefix common with the host's own address, when
     deciding which of a peer's addresses to try in which order.  Thus
     eg, a host 172.18.45.11 would prefer to make a connection to
     172.18.45.6 rather than to 172.31.80.8.

  3. This has been implemented in glibc upstream by having the DNS
     resolver sort addresses before passing them to the application via
     getaddrinfo.
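
  To make the example in paragraph 2 concrete, here is a minimal sketch
  of the rule 9 comparison.  It is an illustration only and not the
  glibc code: the function names are hypothetical, and it assumes a
  host with a single IPv4 source address, whereas RFC3484 compares each
  destination against the source address that would be used to reach it.

```python
import ipaddress


def common_prefix_len(a: str, b: str) -> int:
    """Return how many leading bits two IPv4 addresses share (0..32)."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    # bit_length() gives the position of the highest differing bit.
    return 32 - diff.bit_length()


def rule9_order(own_address: str, peer_addresses: list[str]) -> list[str]:
    """Sort peer addresses so the longest common prefix comes first.

    sorted() is stable, so addresses with equal prefix length keep the
    order in which they were supplied.
    """
    return sorted(
        peer_addresses,
        key=lambda peer: common_prefix_len(own_address, peer),
        reverse=True,
    )


if __name__ == "__main__":
    own = "172.18.45.11"
    peers = ["172.31.80.8", "172.18.45.6"]
    for peer in peers:
        print(peer, common_prefix_len(own, peer))
    print(rule9_order(own, peers))
```

  With the addresses from paragraph 2, 172.18.45.6 shares 28 leading
  bits with the host and 172.31.80.8 shares 12, so 172.18.45.6 is
  tried first regardless of the order the nameserver supplied.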

  Background and history

  4. Prior to the publication and implementation of RFC3484, and prior
     to the introduction of getaddrinfo, most hosts would use an
     implementation of gethostbyname to find IPv4 addresses to use for
     a peer, given its hostname.  gethostbyname has almost universally
     returned the addresses in the order supplied by whatever DNS
     nameserver it was using.

  5. In 1993, the then-ubiquitous nameserver implementation BIND was
     modified to implement a feature known as `DNS Round Robin'.  This
     does not need to be explained in detail, but the intended and
     actual effect was that clients would be provided addresses (and
     other records) in a deliberately varying order, so that in the
     aggregate clients' choice of address to use would be distributed
     uniformly across the published addresses.

  6. Between then and the recent implementation of rule 9 by some
     hosts, DNS round robin became universally deployed.  It has been
     implemented by other nameservers and has become a de facto
     standard at least for the interpretation of multiple IPv4
     addresses in the global DNS.

  IPv6 transition

  7. The primary use of getaddrinfo is to replace gethostbyname when an
     application is converted to support IPv6.  gethostbyname cannot be
     sensibly used to support IPv6; while there are other interfaces
     that can be used instead, the routine practice has been to make
     certain very consistent sets of changes to applications, which
     include replacing the use of gethostbyname by getaddrinfo.

  8. gethostbyname in current glibc does not implement rule 9.  The
     effect therefore is that whether a particular host follows rule 9
     for a particular protocol depends mainly on whether that
     particular version of the application in question has been updated
     in the host's operating system to support IPv6.  (As well as, of
     course, whether the operating system's getaddrinfo uses rule 9.)

  9. There are no known applications which specifically desire the
     rule 9 behaviour; we know of no case where an application uses
     getaddrinfo specifically to get rule 9.

  10. There is therefore no rational reason for the difference
     between the behaviour of gethostbyname and getaddrinfo, other than
     perhaps implementation convenience.
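
  For readers unfamiliar with the two interfaces compared in paragraphs
  7 to 10, the following sketch calls both through Python's socket
  module, which wraps the C library functions of the same names.  The
  helper names are hypothetical, and the numeric address is used only
  so that the example does not depend on any nameserver.

```python
import socket


def addresses_via_gethostbyname(hostname: str) -> list[str]:
    """IPv4 only: the legacy interface returns a flat list of addresses."""
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return addresses


def addresses_via_getaddrinfo(hostname: str,
                              family: int = socket.AF_UNSPEC) -> list[str]:
    """IPv4 and IPv6: addresses in the order the C library returns them."""
    results = socket.getaddrinfo(hostname, None, family=family,
                                 type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual address for both AF_INET and AF_INET6.
    return [sockaddr[0] for _fam, _type, _proto, _canon, sockaddr in results]


if __name__ == "__main__":
    print(addresses_via_gethostbyname("127.0.0.1"))
    print(addresses_via_getaddrinfo("127.0.0.1"))
```

  For a name with several IPv4 addresses, a C library that applies
  rule 9 in getaddrinfo but not in gethostbyname may return the second
  list in a different order from the first, which is the inconsistency
  paragraph 8 describes.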

  Compatibility and benefits

  11. Rule 9 is incompatible with the DNS Round Robin.  Prior to rule
     9, a system administrator would publish multiple addresses in the
     intent and expectation of getting roughly equal client load on
     each address.

  12. When Debian's apt changed its behaviour to follow rule 9,
     it broke ftp.us.debian.org because the load suddenly became very
     unbalanced.  Thus this incompatibility causes actual operational
     problems.

  13. We know of no situations where multiple IPv4 addresses on the
     global Internet are published with the intent and expectation that
     rule 9 will be followed by client systems.

  14. The nature of the IPv4 address space structure suggests that rule
     9 is not in practice useful for IPv4 on the global Internet.
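
  The effect described in paragraphs 11 and 12 can be sketched with a
  toy model.  Everything here is an assumption made for illustration:
  the client and server addresses are invented (reusing the private
  addresses from paragraph 2), the nameserver is modelled as rotating
  its answer by one record per query, and each client connects to the
  first address of whatever list it ends up with.

```python
import ipaddress
from collections import Counter


def common_prefix_len(a: str, b: str) -> int:
    """Return how many leading bits two IPv4 addresses share (0..32)."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


def rotate(records: list[str], query_number: int) -> list[str]:
    """Model a round-robin nameserver: each answer starts one record later."""
    k = query_number % len(records)
    return records[k:] + records[:k]


def simulate(clients: list[str], servers: list[str]) -> tuple[Counter, Counter]:
    """Count which server each client picks under both behaviours."""
    as_supplied = Counter()  # client uses the first address as supplied
    rule9 = Counter()        # client prefers the longest common prefix
    for query_number, client in enumerate(clients):
        answer = rotate(servers, query_number)
        as_supplied[answer[0]] += 1
        # max() keeps the earliest entry on ties, as a stable sort would.
        rule9[max(answer, key=lambda s: common_prefix_len(client, s))] += 1
    return as_supplied, rule9


SERVERS = ["172.18.45.6", "172.31.80.8"]
CLIENTS = ["172.18.45.11", "172.18.0.1", "172.16.0.1",
           "172.20.0.1", "172.19.0.1", "172.30.0.1"]

if __name__ == "__main__":
    as_supplied, rule9 = simulate(CLIENTS, SERVERS)
    print("as supplied:", dict(as_supplied))
    print("rule 9:     ", dict(rule9))
```

  In this model the as-supplied behaviour sends three clients to each
  server, while rule 9 sends five of the six clients to 172.18.45.6,
  because the outcome then depends only on where the clients' own
  addresses happen to fall.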

  History and status of RFC3484

  15. RFC3484 and rule 9 form part of a document set published as part
     of early IPv6 work.

  16. At the time of publication of RFC3484, the intended IPv6 addressing
     architecture had a significantly different shape.  3484 and rule 9
     appear to form part of a set of behaviours which go alongside
     rapid renumbering, which has now fallen out of favour.

  17. There is no evidence that the authors of RFC3484, which is
     specifically headed as an IPv6 document, considered specifically
     the behaviour for IPv4 or realised that the specification
     conflicted with the widely-used DNS Round Robin.

  18. RFC3484 was a product of IPv6 (ie networking) working groups, not
     DNS working groups.

  Standards

  18. The purpose of standards is interoperability.  Where following a
     standard makes us less interoperable we should not follow the
     standard.  Debian is entitled to deviate from standards, including
     published documents, if we consider it appropriate to do so.

  19. We should of course consider carefully before going against a
     published document.  However, when the situation is clear, we
     should not be overly reluctant to do so.  In cases where de jure
     and de facto standards disagree, we must make a judgement which we
     prefer based on all of the circumstances.

  20. In any case RFC3484 is currently `Proposed Standard', the
     earliest and least mature stage of the standards track, so it
     can be expected to have rough edges.

  Conclusions

  21. Rule 9 is not the standard behaviour for IPv4, RFC3484
     notwithstanding.  Round Robin is the de facto standard behaviour
     (despite not having been officially standardised), and there can
     be little justification for making such a radical change at this
     stage.

  22. RFC3484 is therefore in error when it applies rule 9 to IPv4.
     Not using rule 9 for IPv4 is unquestionably preferable.

  23. It appears that RFC3484 is also unhelpful for IPv6.  However,
     since there is no existing de-facto standard for IPv6, this
     conclusion is arguable.

  24. Therefore I would insist on traditional DNS Round Robin, rather
     than rule 9, for IPv4; but I would only recommend against rule 9
     in the case of IPv6.

  25. It is clear that the IETF needs to revisit this issue and I would
     formally recommend to them that they do so.

  Backporting to current stable

  26. In my opinion this change should be backported to current
     stable.  However, this decision does not need to be taken now.  We
     can wait for experience with the change in unstable and testing,
     which will help convince doubters that there is no compatibility
     problem.

  27. I encourage the submitter and other interested parties to pursue
     getting this changed in a stable update, and to bring the matter
     back to the Technical Committee if necessary to achieve this.

  Responsibility of the Technical Committee to decide

  28. One committee member has insisted on the presence of `leave the
     choice up to the maintainer' on the ballot (option M).  My
     understanding of the meaning of this wording is that if that
     option wins we refuse to make a decision on the matter and also
     refuse to deal with it any more.  Ie, this option is equivalent to
     Further Discussion except that the committee will not discuss or
     vote any more but instead considers the matter closed.

  29. I do not consider it appropriate for the committee to decline to
     issue a ruling.  Once a matter has reached us it is for us to make
     a decision and we should not abdicate that responsibility.  If the
     committee disagrees with the maintainer, but not sufficiently
     overwhelmingly so as to be able to overrule the maintainer, we
     should nevertheless issue a ruling clearly stating that we
     disagree.  In this particular case the committee does seem
     to have a sufficient majority to overrule, if we can only get the
     mechanics of voting working properly.

  30. It has also been suggested that we should not overrule the
     maintainer unless we consider the bug release-critical.  This is
     an abdication of the responsibility of the committee.  In
     particular, whether or not to overrule the maintainer should
     depend primarily on how _clear_ it is that the maintainer is
     wrong, rather than on how _serious_ the consequences are.  The
     constitution's supermajority condition gives effect to the
     requirement for high confidence in a decision to overrule, and of
     course individual committee members will want to be sure of their
     ground in such a case.

  31. Therefore I reject the suggestions that we should not decide the
     matter, or that we should not overrule without concluding that the
     problem is release-critical.


----------------------------------------

Andreas Barth, assenting:

  Rationale is mostly known already - Rule 9 doesn't make sense in the
  IPv4 world (as we have discussed) and breaks current behaviour (for more
  detailed analysis, see e.g. Ian's mail I'm responding to).

  I'm not arguing about backporting to stable or not right now, as I'm
  biased a bit on that :) and would like to see this question be handled
  in the usual SRM accept process.

  However, using the usual rules, chances are pretty good to get that done
  once the fix has reached testing.


----------------------------------------

Manoj Srivastava, assenting:

  As I have mentioned before, I think we should be deciding an issue
  purely on its merits; and how egregious the error is should not
  count towards determining what the correct solution is.  If our
  deliberations conclude that a maintainer is incorrect, well, that is
  what we concluded. Everyone makes mistakes.


----------------------------------------

Anthony Towns, dissenting:

  Again, if we don't think this bug is severe enough to need to be fixed
  in stable (and thus qualifies as RC), I don't think we should be overruling
  the maintainer.

  If Josip's correct in saying that this is screwing over the Debian
  apt round-robin hosts, it seems like we should be saying this is RC, but
  nobody seemed willing to do that when I brought it up earlier.

  >  4. Prior to the publication and implementation of RFC3484, and prior
  >     to the introduction of getaddrinfo, most hosts would use an
  >     implementation of gethostbyname to find IPv4 addresses to use for
  >     a peer, given its hostname.  gethostbyname has almost universally
  >     returned the addresses in the order supplied by whatever DNS
  >     nameserver it was using.

  getaddrinfo() also almost universally behaved that way until very
  recently.

  >  IPv6 transition
  >  7. The primary use of getaddrinfo is to replace gethostbyname when an
  >     application is converted to support IPv6.  

  I would say the primary use of getaddrinfo is to resolve a domain name
  in a useful way. I don't think replacing gethostbyname is relevant --
  if it behaves differently to gethostbyname that's a win if it's more
  useful and a loss if it's less useful; it's not always a loss merely
  because it's different.

  >  9. There are no known applications which specifically desire the
  >     rule 9 behaviour; we know of no case where an application uses
  >     getaddrinfo specifically to get rule 9.

  RFC3484 specifically allows rule 9 to be overridden if the implementation
  has a better process, so it's not reasonable for an application to rely
  on rule 9, afaics.

  >  10. There is therefore no rational reason for the difference
  >     between the behaviour of gethostbyname and getaddrinfo, other than
  >     perhaps implementation convenience.

  Consistency between IPv4 and IPv6 behaviours seems a perfectly rational
  desire, even if it doesn't warrant the cost of changing the application
  behaviour.

  >  11. Rule 9 is incompatible with the DNS Round Robin.  

  It's perfectly compatible, it just overrides it.

  >  12. When Debian's apt changed its behaviour to follow rule 9,
  >     it broke ftp.us.debian.org because the load suddenly became very
  >     unbalanced.  Thus this incompatibility causes actual operational
  >     problems.

  I've seen no evidence that that actually happened. There's some
  hearsay from Josip ("I'm told that this bug also broke round-robin DNS
  functionality for ftp.us.debian.org/http.us.debian.org"), but that's it.

  >  Standards
  >  18. The purpose of standards is interoperability.  Where following a
  >     standard makes us less interoperable we should not follow the
  >     standard.  Debian is entitled to deviate from standards, including
  >     published documents, if we consider it appropriate to do so.

  This doesn't affect interoperability either way, though.

  It changes the impact of Debian systems on services provided by
  round-robin hosts (ie, to possibly impact some servers more than
  others, depending on the distribution of clients, rather than
  doing equal balancing), and it results in changed expectations of
  users/developers/admins as to how host resolution on round-robin addresses
  will work.

  >  23. It appears that RFC3484 is also unhelpful for IPv6.  However,
  >     since there is no existing de-facto standard for IPv6, this
  >     conclusion is arguable.

  RFC3484 is relied upon by other IPv6 drafts/standards in order to choose
  the correct class of address for a service (a roaming address versus
  a static one, a site-local address versus a global one, etc). Some of
  those can be dealt with by earlier rules (particularly site-local versus
  global), but that leaves many RFCs that do rely on the rule for IPv6.

  >  Backporting to current stable
  >  26. In my opinion this change should be backported to current
  >     stable.  However, this decision does not need to be taken now.  We

  I think this should be the maintainers' call.

  The call I think we should be making is whether this is an issue that
  needs to be corrected in stable, whether by the patch we've seen, or by
  some other means. If that fix doesn't happen immediately, but waits for
  further testing in unstable and lenny, that's fine -- it'll be waiting
  for the next point release in any case.

  Again, if we don't think this is sufficiently serious to need to be
  fixed in stable, afaics that means we're ignoring the impact of Debian
  machines on round-robin services as an important consideration --
  including ftp/http.us.d.o and security.d.o.

  >  28. One committee member has insisted on the presence of `leave the
  >     choice up to the maintainer' on the ballot (option M).  My
  >     understanding of the meaning of this wording is that if that
  >     option wins we refuse to make a decision on the matter and also
  >     refuse to deal with it any more.  Ie, this option is equivalent to
  >     Further Discussion except that the committee will not discuss or
  >     vote any more but instead considers the matter closed.

  >  29. I do not consider it appropriate for the committee to decline to
  >     issue a ruling.  Once a matter has reached us it is for us to make
  >     a decision and we should not abdicate that responsibility.

  We should *always* decline to make a decision unless we have clear
  evidence that it's *necessary* for us to step in. That is to say, it
  must be *important* that the issue be resolved, and that the maintainer
  cannot already be resolving it.

  I can't see any way in which this issue can be important enough to
  be resolved without it also being important enough to resolve for our
  current stable release too, which afaics would make it by definition
  release critical. I don't understand why everyone seems to be passing
  on declaring it RC or important enough to need fixing in stable.

  >     If the
  >     committee disagrees with the maintainer, but not sufficiently
  >     overwhelmingly so as to be able to overrule the maintainer,

  If the issue isn't particularly important or the maintainer is already
  handling it satisfactorily, the ctte shouldn't be spending its time
  agreeing or disagreeing with the maintainer. I'd say 412976 would be an
  example of that.

  >     In this particular case the committee does seem
  >     to have a sufficient majority to overrule, if we can only get the
  >     mechanics of voting working properly.

  Overruling the maintainer should be an absolute last resort, not something
  we do anytime we see something that 75% of us happen to disagree with.

  >  30. It has also been suggested that we should not overrule the
  >     maintainer unless we consider the bug release-critical.  This is
  >     an abdication of the responsibility of the committee.

  It's the responsibility of the technical committee to determine which bugs
  are important enough to warrant our attention to disputes about them. Not
  determining whether this bug is release-critical, which is to say warrants
  an update to stable if it applies to packages in stable, is abdicating our
  responsibility to evaluate the importance of this issue afaics.

  Maybe you're in effect saying:

	  - the use of rule9 for a function resolving IPv4 addresses is
	    RC in glibc if many applications in Debian use that function for
	    IPv4 addresses

	  - there are/will be many applications using getaddrinfo() for IPv4
	    addresses in lenny, so this is RC for lenny and must be resolved

	  - there aren't many applications using getaddrinfo() for IPv4
	    addresses in etch, so this does not need to be resolved
	    (though it'd be nice if it were)

  I've been assuming there are already sufficient getaddrinfo apps in etch
  that this is relevant for etch if it's relevant for lenny, but maybe
  people disagree with that?

  I could endorse that, though I was under the impression that apt in etch
  used getaddrinfo for IPv4 resolution (which would seem sufficient Debian
  apps in etch to me to make the issue equally relevant to etch).

  >     In
  >     particular, whether or not to overrule the maintainer should
  >     depend primarily on how _clear_ it is that the maintainer is
  >     wrong, rather than on how _serious_ the consequences are.

  I strongly disagree with this. Maintainers get things wrong very
  frequently; it's their job to fix these things, not the technical
  committee's. If the issue isn't both important to Debian *and* being
  mishandled/ignored by the maintainer, it's not in our purview.

  Establishing a policy or practice where the only bar to us overruling a
  maintainer is that someone reassign a bug to us and 75% of those of us
  who can be bothered voting disagree with the maintainer is a terrible
  idea, IMO.

  As a consequence, I'll continue to rank below further discussion any
  attempt to overrule the maintainer that doesn't (IMO) clearly and
  consistently establish why the issue is important to Debian.

  And again, the only way I can see this issue being important to Debian
  is the overall effect of many Debian machines accessing round-robin
  services and failing to do so in a balanced way. But afaics, if we are
  using that as our basis, it applies equally to stable and unstable,
  and warrants being treated as a release critical issue. If we're not
  willing to take that issue that seriously, I don't see any aspects of
  this problem that are important enough to warrant tech-ctte resolution.


-- 
Ian Jackson, at home.           Local/personal: ijackson@chiark.greenend.org.uk
ian@davenant.greenend.org.uk       http://www.chiark.greenend.org.uk/~ijackson/
Problems mailing me ?  Send postmaster@chiark the bounce (bypasses the blocks).


