RFC 2553 bind semantics harms the way to AF independence

To: debian-ipv6@lists.debian.org
Subject: RFC 2553 bind semantics harms the way to AF independence
From: horape@tinuviel.compendium.net.ar
Date: Fri, 22 Jun 2001 18:05:38 -0300
Message-id: <20010622180538.A16244@tinuviel.compendium.net.ar>
¡Hola!

I've sent the following to the ipngwg and ngtrans mailing lists, based
on the discussions we had here some months ago. If you have any
comments, please join the discussions on those lists.

The objective is to let Linux be RFC compliant while still letting us
work independently of the AF.

				HoraPe




          RFC 2553 bind semantics harms the way to AF independence
                                      
                              Horacio J. Peña
   
   RFC 2553 enforces the IPv4 mapped on IPv6 model for bind(2). This has
   had some very useful short term results, but it badly harms the way
   to AF independence, a goal that in my opinion we should try to reach.
   
                                 Main premise
                                       
   This paper is based on the premise that it's better to write AF
   independent programs than IPv6 centric ones.
   
   The basis for this is that we don't believe IPv6 is the
   cure-everything protocol, and that at some time in the future
   (probably the far future) there will be a new transition from IPv6 to
   some other protocol. We should do our best so that when that happens,
   those who have to make that transition work can do it as easily as
   possible.
   
   We're experiencing the IPv4 to IPv6 transition, and it's painful.
   There's too much work to do, and the porting of the applications is
   responsible for much of that pain.
   
   Yes, porting one application is not so hard. But when you have so
   many, there is a problem. How many times have we heard that IPv6
   adoption is so slow because there is no support in the clients? How
   much easier would getting that support have been if no change to the
   applications had been needed? I believe that the answer is ``a lot
   easier''. Porting the applications to the AF independent way will
   make future transitions much easier. I believe that's desirable.
   
                            What does RFC 2553 say
                                       
   Note: I'm talking about ``RFC 2553'' but meaning ``RFC 2553 and
   successors'', so I'll quote the rfc2553bis-03 draft, not the RFC.
   
     Because of the importance of providing IPv4 compatibility in the
     API, these extensions are explicitly designed to operate on
     machines that provide complete support for both IPv4 and IPv6. A
     subset of this API could probably be designed for operation on
     systems that support only IPv6. However, this is not addressed in
     this memo.
     
   (from ``2. Design Considerations'')
   
   I.e., RFC 2553 applies to dual stack hosts.
   
     Applications may use PF_INET6 sockets to open TCP connections to
     IPv4 nodes, or send UDP packets to IPv4 nodes, by simply encoding
     the destination's IPv4 address as an IPv4-mapped IPv6 address, and
     passing that address, within a sockaddr_in6 structure, in the
     connect() or sendto() call. When applications use PF_INET6 sockets
     to accept TCP connections from IPv4 nodes, or receive UDP packets
     from IPv4 nodes, the system returns the peer's address to the
     application in the accept(), recvfrom(), or getpeername() call
     using a sockaddr_in6 structure encoded this way.
     
   (from ``3.7 Compatibility with IPv4 Nodes'')
   
     5.3 IPV6_V6ONLY option for AF_INET6 Sockets
     
     This socket option restricts AF_INET6 sockets to IPv6
     communications only. As stated in section <3.7 Compatibility with
     IPv4 Nodes>, AF_INET6 sockets may be used for both IPv4 and IPv6
     communications. Some applications may want to restrict their use of
     an AF_INET6 socket to IPv6 communications only. For these
     applications the IPV6_V6ONLY socket option is defined. When this
     option is turned on, the socket can be used to send and receive
     IPv6 packets only. This is an IPPROTO_IPV6 level option. This
     option takes an int value. This is a boolean option. By default
     this option is turned off.
     
   This implies that when binding an INET6 socket to a port (without
   specifying an address to bind to) it will hear the IPv4 requests too
   unless the IPV6_V6ONLY option is set.
   
   That is done using the IPv4-mapped IPv6 addresses.
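   
   As a rough illustration of the two mechanisms quoted above, the
   following sketch builds an IPv4-mapped address by hand, tests whether
   a peer seen on a PF_INET6 socket is really an IPv4 node, and toggles
   IPV6_V6ONLY. The function names make_v4mapped, peer_is_ipv4 and
   set_v6only are made up for this example; only the macros and socket
   options come from the API.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Build the IPv4-mapped IPv6 address ::ffff:a.b.c.d from an IPv4 address.
   (Hypothetical helper name, used only in this sketch.) */
void make_v4mapped(const struct in_addr *v4, struct in6_addr *out)
{
    assert(sizeof(v4->s_addr) == 4);
    memset(out, 0, sizeof(*out));               /* bytes 0..9 stay zero    */
    out->s6_addr[10] = 0xff;                    /* bytes 10..11 are 0xffff */
    out->s6_addr[11] = 0xff;
    memcpy(&out->s6_addr[12], &v4->s_addr, 4);  /* IPv4 part, network order */
}

/* Return 1 if a peer address returned by accept(), recvfrom() or
   getpeername() on a PF_INET6 socket belongs to an IPv4 node. */
int peer_is_ipv4(const struct sockaddr_in6 *peer)
{
    return IN6_IS_ADDR_V4MAPPED(&peer->sin6_addr) ? 1 : 0;
}

/* Turn IPV6_V6ONLY on (1) or off (0) for an AF_INET6 socket.
   Returns 0 on success, -1 on error, like setsockopt() itself. */
int set_v6only(int fd, int on)
{
    return setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &on, sizeof(on));
}
```

   A server that must treat IPv4 peers differently would call
   peer_is_ipv4() on the address filled in by accept().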
   
                            How to program a server
                                       
   Let me digress a bit now. I'll show how a server is programmed in IPv4
   only programs, in IPv6 centric ones, and in the AF independent way, so
   the rest of this paper can be understood.
   
IPv4 server

/* Needs <sys/socket.h>, <netinet/in.h>, <arpa/inet.h> and <string.h>. */
int listenfd, connfd;
struct sockaddr_in cliaddr, servaddr;
socklen_t clilen;

listenfd = socket(AF_INET, SOCK_STREAM, 0);

if(listenfd < 0)
   die();

/* bind to the IPv4 wildcard address, TCP port 1000 */
memset(&servaddr, 0, sizeof(servaddr));
servaddr.sin_family = AF_INET;
servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
servaddr.sin_port = htons(1000);

if(bind(listenfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) != 0)
   die();

if(listen(listenfd, 10) != 0)
   die();

/* clilen is a value-result argument: it must hold the buffer size on entry */
clilen = sizeof(cliaddr);
connfd = accept(listenfd, (struct sockaddr *) &cliaddr, &clilen);

if(connfd < 0)
   die();

/* do something with connfd */

   This works only with IPv4 connections: if anyone tries to connect to
   TCP port 1000 on that box over IPv6, the connection will fail.
   
IPv6 centric server

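/* Needs <sys/socket.h>, <netinet/in.h>, <arpa/inet.h> and <string.h>. */
int listenfd, connfd;
struct sockaddr_in6 cliaddr, servaddr;
socklen_t clilen;

listenfd = socket(AF_INET6, SOCK_STREAM, 0);

if(listenfd < 0)
   die();

/* bind to the IPv6 wildcard address, TCP port 1000 */
memset(&servaddr, 0, sizeof(servaddr));
servaddr.sin6_family = AF_INET6;
servaddr.sin6_addr = in6addr_any;
servaddr.sin6_port = htons(1000);

if(bind(listenfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) != 0)
   die();

if(listen(listenfd, 10) != 0)
   die();

/* clilen is a value-result argument: it must hold the buffer size on entry */
clilen = sizeof(cliaddr);
connfd = accept(listenfd, (struct sockaddr *) &cliaddr, &clilen);

if(connfd < 0)
   die();

/* do something with connfd */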

   Almost no changes from the IPv4 only server, and it accepts both IPv4
   and IPv6 connections. But it is not going to work on an OS with IPv6
   support compiled out (not even for IPv4).
   
AF independent server

/* MAX_AF (array size) and MAX() (larger of two ints) are assumed to be
   defined elsewhere. Needs <netdb.h> and <sys/select.h> as well. */
int listenfds[MAX_AF], connfd = -1;
struct addrinfo hints, *res, *ressave;
struct sockaddr_storage ss;
socklen_t sslen;
int n, i, m;
fd_set fdset;

/* ask for every wildcard address we could listen on, whatever the AF */
memset(&hints, 0, sizeof(hints));
hints.ai_flags = AI_PASSIVE;
hints.ai_family = AF_UNSPEC;
hints.ai_socktype = SOCK_STREAM;

if(getaddrinfo(NULL, "1000", &hints, &res) != 0)
   die();

ressave = res;

/* one listening socket per address returned */
for(n = 0; (n < MAX_AF) && res ; res = res->ai_next) {
        listenfds[n] = socket(res->ai_family, res->ai_socktype,
                    res->ai_protocol);
        if(listenfds[n] < 0)
           continue; /* libc supports protocols that the kernel doesn't */

        if(bind(listenfds[n], res->ai_addr, res->ai_addrlen) != 0)
           die();

        if(listen(listenfds[n], 10) != 0)
           die();
        n++;
        }

freeaddrinfo(ressave);

if(n == 0)
   die(); /* no usable address family at all */

/* wait until any of the listening sockets has a pending connection */
m = 0;
FD_ZERO(&fdset);
for(i = 0; i < n; i++) {
        FD_SET(listenfds[i], &fdset);
        m = MAX(listenfds[i]+1,m);
        }

if(select(m, &fdset, NULL, NULL, NULL) < 0)
   die();

/* accept from the first socket that is ready */
for(i = 0; i < n; i++) {
        if(FD_ISSET(listenfds[i], &fdset)) {
                sslen = sizeof(ss);
                connfd = accept(listenfds[i], (struct sockaddr*) &ss, &sslen);
                break;
                }
        }

if(connfd < 0)
   die();

/* do something with connfd */

   Much harder...
   
A comment

   But that's not all.
   
   Neither the IPv6 centric nor the AF independent way works as cleanly
   as I presented them. The IPv6 centric way dies on an OS where IPv6
   support is not compiled in, so when programming the IPv6 centric way
   you should check whether the socket call fails and then fall back to
   working as a pure IPv4 server (i.e., duplication of code).
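   
   A minimal sketch of that fallback, to make the duplication visible.
   The names open_listener and bind_and_listen are invented for this
   example, and falling back on a bind or listen failure as well (not
   only on a socket failure) is an assumption of the sketch.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind fd to sa and start listening. On failure, close fd and return -1. */
int bind_and_listen(int fd, const struct sockaddr *sa, socklen_t len)
{
    assert(fd >= 0);
    if (bind(fd, sa, len) != 0 || listen(fd, 10) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* IPv6 centric listener with an IPv4 fallback.
   Returns a listening descriptor, or -1 if neither family works. */
int open_listener(uint16_t port)
{
    struct sockaddr_in6 a6;
    struct sockaddr_in a4;
    int fd;

    /* First attempt: the IPv6 centric path. */
    fd = socket(AF_INET6, SOCK_STREAM, 0);
    if (fd >= 0) {
        memset(&a6, 0, sizeof(a6));
        a6.sin6_family = AF_INET6;
        a6.sin6_addr = in6addr_any;
        a6.sin6_port = htons(port);
        fd = bind_and_listen(fd, (struct sockaddr *)&a6, sizeof(a6));
        if (fd >= 0)
            return fd;
    }

    /* Fallback: the same steps again, written out for plain IPv4. */
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    memset(&a4, 0, sizeof(a4));
    a4.sin_family = AF_INET;
    a4.sin_addr.s_addr = htonl(INADDR_ANY);
    a4.sin_port = htons(port);
    return bind_and_listen(fd, (struct sockaddr *)&a4, sizeof(a4));
}
```

   Calling open_listener(1000) would mirror the servers shown earlier.
   Every further address family would need yet another copy of the
   second half.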
   
   And about the AF independent way... I'll talk about the problems it
   has in the following sections of this paper.
   
   But even if both ways worked as well as the previous sections would
   make you believe, the AF independent way, while harder to do, is a
   once in a lifetime change, while the IPv6 centric way would have to
   be modified if at any time in the future you want to handle anything
   that cannot be mapped to IPv6.
   
   End of the digression. Let's go back to RFC 2553.
   
                           RFC 2553 implementations
                                       
   We classify the RFC 2553 implementations by how they implement the
   bind semantics.
   
The non compliant ones

   These systems consider IPv4 and IPv6 as different protocols, so they
   don't let the IPv4 mapping to IPv6 work.
   
   The AF independent way works great. The IPv6 centric way works only
   for IPv6 connections and the INET6 sockets never catch an IPv4
   connection.
   
   OpenBSD, NetBSD (by default) and MSR stack for Windows are some of the
   non compliant implementations.
   
The buggy ones

   Warning: I talk about ``bugginess'' only with regard to the issue at
   hand; the systems I qualify as ``buggy'' are the best I've worked
   with. And I like the ``buggy'' stacks better than the ``correct''
   ones, where trying to work without depending on the AF is really
   hard.
   
   Modern IPv4 stacks consider INADDR_ANY as meaning ``every address'',
   and not ``default for addresses not otherwise bound'', so they don't
   allow binding to a specific address when the wildcard address is
   bound on the same port. That is to avoid letting applications
   ``steal'' connections from other ones. In the same way, it shouldn't
   be allowed to ``steal'' the IPv4 connections from the IPv6 wildcard.
   Allowing that should be considered a bug.
   
   These stacks allow exactly that. It lets both the IPv6 centric way
   and the AF independent way work, but it is a bug.
   
   FreeBSD, NetBSD (optionally) and BSDI have that buggy behaviour.
   Probably most proprietary implementations do too.
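   
   The IPv4 rule used as the reference point above (no bind to a
   specific address while the wildcard holds the port) can be observed
   with a small experiment like the following. It is only a sketch: the
   function name is invented, neither socket sets SO_REUSEADDR, and the
   result depends on the stack it runs on.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Listen on INADDR_ANY:port, then try to bind 127.0.0.1 on the same port.
   Returns 1 if the second bind is refused with EADDRINUSE, 0 if it is
   allowed, and -1 if the experiment could not be set up. */
int specific_bind_is_refused(void)
{
    struct sockaddr_in any, lo;
    socklen_t len = sizeof(any);
    int result = -1;
    int wild = socket(AF_INET, SOCK_STREAM, 0);
    int spec = socket(AF_INET, SOCK_STREAM, 0);

    if (wild < 0 || spec < 0)
        goto out;

    memset(&any, 0, sizeof(any));
    any.sin_family = AF_INET;
    any.sin_addr.s_addr = htonl(INADDR_ANY);
    any.sin_port = 0;                    /* let the kernel pick a free port */
    if (bind(wild, (struct sockaddr *)&any, sizeof(any)) != 0 ||
        listen(wild, 10) != 0 ||
        getsockname(wild, (struct sockaddr *)&any, &len) != 0)
        goto out;
    assert(len == sizeof(any));

    lo = any;                            /* same family and same port */
    lo.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(spec, (struct sockaddr *)&lo, sizeof(lo)) == 0)
        result = 0;                      /* the "steal" was allowed */
    else if (errno == EADDRINUSE)
        result = 1;                      /* refused, as described above */

out:
    if (wild >= 0)
        close(wild);
    if (spec >= 0)
        close(spec);
    return result;
}
```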
   
Compliant, non-buggy, but unworkable

   Warning: I'm not saying that Linux has no bugs. I'm just talking
   about bind semantics in this paper.
   
   That's Linux. Linux complies with the RFC by letting the IPv6 sockets
   catch IPv4 connections, and does not have the bug mentioned before,
   but it is impossible to work with in an AF independent fashion.
   
   When doing the socket/bind/listen loop, the IPv4 bind call will fail
   because there is already an IPv6 socket bound to the IPv6 wildcard
   address. So you would have to ignore bind errors, or croak only if
   none of the bind calls worked; but if a new protocol with no mapping
   to the others comes to exist, that amounts to ignoring its bind
   errors too. Ignoring these errors is a Bad Thing.
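   
   The failing step of that loop can be isolated in a few lines. This is
   a sketch with an invented function name. It turns IPV6_V6ONLY off
   explicitly so that the socket follows the draft's default even on a
   system configured otherwise. Which value it returns tells you which
   of the classes above the stack belongs to.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Listen on the IPv6 wildcard, then try to bind the IPv4 wildcard on the
   same port. Returns 1 if the IPv4 bind is refused with EADDRINUSE, 0 if
   it is allowed, and -1 if the experiment could not be set up (for
   example, no IPv6 support, or IPV6_V6ONLY cannot be turned off). */
int v4_bind_after_v6_wildcard(void)
{
    struct sockaddr_in6 a6;
    struct sockaddr_in a4;
    socklen_t len = sizeof(a6);
    int off = 0, result = -1;
    int fd6 = socket(AF_INET6, SOCK_STREAM, 0);
    int fd4 = socket(AF_INET, SOCK_STREAM, 0);

    if (fd6 < 0 || fd4 < 0)
        goto out;

    /* Make the draft's default explicit: this socket also takes IPv4. */
    if (setsockopt(fd6, IPPROTO_IPV6, IPV6_V6ONLY, &off, sizeof(off)) != 0)
        goto out;

    memset(&a6, 0, sizeof(a6));
    a6.sin6_family = AF_INET6;
    a6.sin6_addr = in6addr_any;
    a6.sin6_port = 0;                    /* let the kernel pick a free port */
    if (bind(fd6, (struct sockaddr *)&a6, sizeof(a6)) != 0 ||
        listen(fd6, 10) != 0 ||
        getsockname(fd6, (struct sockaddr *)&a6, &len) != 0)
        goto out;
    assert(len == sizeof(a6));

    memset(&a4, 0, sizeof(a4));
    a4.sin_family = AF_INET;
    a4.sin_addr.s_addr = htonl(INADDR_ANY);
    a4.sin_port = a6.sin6_port;          /* same port, already network order */
    if (bind(fd4, (struct sockaddr *)&a4, sizeof(a4)) == 0)
        result = 0;                      /* both wildcards coexist */
    else if (errno == EADDRINUSE)
        result = 1;                      /* the case described for Linux */

out:
    if (fd6 >= 0)
        close(fd6);
    if (fd4 >= 0)
        close(fd4);
    return result;
}
```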
   
Conclusion: there are no good implementations of RFC 2553

   If my classification is not exhaustive and there is a non-buggy, fully
   compliant implementation of the RFC 2553 bind semantics that doesn't
   cause problems for programs written in an AF independent way, I'd be
   very pleased to know about it and learn how it has avoided all those
   problems. But I believe that the cause is that the RFC is not very
   good on that point and needs more work there.
   
   Until then I can just suggest several possible ways of solving this
   problem.
   
                              Possible solutions
                                       
   Any of the following possible solutions is good enough for me. My
   objective is to be able to write portable, AF-independent programs
   without having to add special cases for IPv6 or any other protocol
   (I'm an application programmer, I shouldn't care about what is
   running the network), something I cannot do now because of the way
   Linux implements bind.
   
   I believe IPv4 mapped addresses will be deprecated sooner or later
   because IPv4 itself will be deprecated. And I believe that maybe now
   is the time to start that deprecation, not by disallowing them right
   now, but by allowing the existence of systems where the IPv4 and IPv6
   stacks are isolated (like OpenBSD and Windows). That's what I call
   for, but any of the other possible solutions will be enough for me.
   
Deprecate IPv4 mapped addresses

   Maybe the time for IPv4 mapped addresses is over. Maybe they were a
   good mechanism to get things rolling, but it's time to grow up and
   leave them behind.
   
   Itojun has mentioned in the ipv6-transition-abuse draft many other
   problems that the IPv4 mapped addresses have.
   
   But there is too much work already done in the IPv6 centric way, so
   maybe it isn't prudent to throw it all away at once.
   
Deprecate IPV6_V6ONLY, add IPV6_ACCEPTV4MAPPED option

   Then the IPv6 sockets would have to be explicitly allowed to accept
   IPv4 connections. The programs that use the IPv6 centric way would
   have to be modified a bit, but the buggy implementations and the
   unworkable one could be corrected without losing features. Just
   making IPV6_V6ONLY default to on would have the same result.
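   
   IPV6_ACCEPTV4MAPPED is only a proposal and does not exist, but the
   effect of ``IPV6_V6ONLY on'' can be sketched with today's API by
   setting the option by hand inside the socket/bind/listen loop. The
   name listen_all is invented for this example. The AF_INET6 test in
   the loop is exactly the special case that an on-by-default option
   would make unnecessary.

```c
#define _POSIX_C_SOURCE 200112L  /* for getaddrinfo() under strict compilers */
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open one listening socket per wildcard address returned for service.
   Stores up to maxfds descriptors in fds and returns how many were
   opened, or -1 on a real error (bind errors are never ignored). */
int listen_all(const char *service, int *fds, int maxfds)
{
    struct addrinfo hints, *res, *ai;
    int n = 0, on = 1;

    assert(fds != NULL && maxfds > 0);
    memset(&hints, 0, sizeof(hints));
    hints.ai_flags = AI_PASSIVE;
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(NULL, service, &hints, &res) != 0)
        return -1;

    for (ai = res; ai != NULL && n < maxfds; ai = ai->ai_next) {
        int fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;           /* libc knows a family the kernel doesn't */

        /* The one special case: keep IPv6 sockets from catching IPv4. */
        if (ai->ai_family == AF_INET6 &&
            setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &on, sizeof(on)) != 0) {
            close(fd);
            continue;
        }

        if (bind(fd, ai->ai_addr, ai->ai_addrlen) != 0 ||
            listen(fd, 10) != 0) {
            close(fd);          /* undo everything and report the error */
            while (n > 0)
                close(fds[--n]);
            freeaddrinfo(res);
            return -1;
        }
        fds[n++] = fd;
    }

    freeaddrinfo(res);
    return n;
}
```

   Calling listen_all("1000", listenfds, MAX_AF) would replace the loop
   of the AF independent server shown earlier. The select/accept part
   stays as it was.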
   
More magic to getaddrinfo

   Take the Linux approach as the good one (it's the only compliant and
   non buggy one; again, talking just about the issue at hand, I won't
   judge the general bugginess of any stack here), and add a bit more
   (yet more!) magic to getaddrinfo so that it only returns the INET
   wildcard sockaddr when the kernel has no support for IPv6. Then the
   buggy stacks could be corrected with no loss of features and the
   unworkable one would become workable.
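   
   Until getaddrinfo does this by itself, the proposed rule could be
   approximated in the application, roughly as below. Both function
   names are invented for this sketch, and it assumes a stack where the
   IPv6 wildcard socket also receives IPv4 (the Linux behaviour). On a
   stack without the mapping it would silently lose IPv4.

```c
#define _POSIX_C_SOURCE 200112L  /* for struct addrinfo under strict compilers */
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

/* Return 1 if the kernel can create AF_INET6 sockets, 0 otherwise. */
int kernel_has_ipv6(void)
{
    int fd = socket(AF_INET6, SOCK_STREAM, 0);

    if (fd < 0)
        return 0;
    close(fd);
    return 1;
}

/* Decide whether one AI_PASSIVE result should be skipped: the INET
   wildcard is left out whenever the kernel supports IPv6, because the
   IPv6 wildcard socket is assumed to cover IPv4 as well. */
int skip_passive_result(const struct addrinfo *ai, int have_ipv6)
{
    assert(ai != NULL);
    return (have_ipv6 && ai->ai_family == AF_INET) ? 1 : 0;
}
```

   Inside the socket/bind/listen loop this becomes a single test,
   ``if(skip_passive_result(res, have_ipv6)) continue;'', with
   have_ipv6 computed once by kernel_has_ipv6() before the loop.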
   
Add a provision for double stack implementations

   RFC 2553 targets the dual stack systems, where there is one stack that
   implements both IPv4 and IPv6 protocols.
   
   If the RFC had a short note saying that systems with two isolated
   stacks are allowed, and that the IPv4 to IPv6 mapping may be absent
   on these systems, the non compliant implementations would become
   compliant, and we would have some implementations that are compliant,
   non buggy and easy to work with AF-independently.
   

					HoraPe
---
Horacio J. Peña
horape@compendium.com.ar
horape@uninet.edu
bofh@puntoar.net.ar
horape@hcdn.gov.ar