
Bug#966459: linux: traffic class socket options (both IPv4/IPv6) inconsistent with docs/standards



[The previous message is archived at <https://bugs.debian.org/966459>.]

On Tue, 2020-07-28 at 20:31 +0200, Thorsten Glaser wrote:
> Package: src:linux
> Version: 5.7.6-1
> Severity: normal
> Tags: upstream
> X-Debbugs-Cc: tg@mirbsd.de
> 
> I’m using setsockopt to set the traffic class on sending and receive
> it in control messages on receiving, for both IPv4 and IPv6.
> 
> The relevant documentation is the ip(7) manpage and, because the ipv6(7)
> manpage doesn’t contain it, RFC3542.

ip(7) also doesn't document IP_PKTOPTIONS.

[...]
> Same in net/ipv4/ip_sockglue.c…
> 
>                         int tos = inet->rcv_tos;
>                         put_cmsg(&msg, SOL_IP, IP_TOS, sizeof(tos), &tos);
> … in one place, but…
> 
>         put_cmsg(msg, SOL_IP, IP_TOS, 1, &ip_hdr(skb)->tos);
> 
> … in ip_cmsg_recv_tos(), yielding inconsistent results for IPv4(!).

Those are two different APIs though: recvmsg() for datagram sockets, vs
getsockopt(... IP_PKTOPTIONS ...) for stream sockets.  They obviously
ought to be consistent, but mistakes happen.
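To illustrate why the size difference matters, here is a minimal
user-space sketch (not kernel code). It assumes the IP_PKTOPTIONS path
copies a native int into the payload, as in the first snippet quoted
above, and shows what a reader sees if it looks only at the first byte.
The function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: build a payload the way the quoted
 * getsockopt() path does (a native int), then return only its first
 * byte, as a reader expecting a 1-byte payload would. */
unsigned char first_byte_of_int_payload(int tos)
{
    unsigned char payload[sizeof(int)];

    assert(tos >= 0 && tos <= 0xff);    /* TOS is a single octet */
    memcpy(payload, &tos, sizeof(tos));
    return payload[0];
}

/* Returns 1 on a little-endian host, 0 otherwise. */
int host_is_little_endian(void)
{
    const unsigned int probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}
```

On a little-endian host the first byte happens to hold the TOS value;
on a big-endian host it is one of the zero high-order bytes.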

[...]
> tl;dr: Receiving traffic class values from IP traffic is broken on
> big endian platforms.

More precisely, some user-space that uses getsockopt(... IP_PKTOPTIONS
...) for stream sockets might be broken on big-endian platforms.

I searched for 'cmsg_type.*IP_TOS' on codesearch.debian.net, and found
only two instances where it was used in conjunction with IP_PKTOPTIONS.

libzorpll reads only the first byte (so is broken on big-endian):
https://sources.debian.org/src/libzorpll/7.0.1.0%7Ealpha1-1.1/src/io.cc/#L239

squid reads an int and then truncates it to a byte (so is fine):
https://sources.debian.org/src/squid/4.12-1/src/ip/QosConfig.cc/#L41
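For comparison, a reader that tolerates both payload sizes could look
roughly like the sketch below. This is an assumption about how one
might write it, not code taken from squid or libzorpll, and
tos_from_payload() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: decode an IP_TOS control message payload that
 * may be either a native int or a single byte.  Returns the TOS value
 * (0..255), or -1 if the payload is empty. */
int tos_from_payload(const void *data, size_t len)
{
    assert(data != NULL);

    if (len >= sizeof(int)) {
        /* int-sized payload: read the whole int, keep the low octet */
        int value;

        memcpy(&value, data, sizeof(value));
        return value & 0xff;
    }
    if (len >= 1) {
        /* shorter payload: treat it as the 1-byte form */
        unsigned char value;

        memcpy(&value, data, 1);
        return value;
    }
    return -1;
}
```

When walking real control messages, data would be CMSG_DATA(cmsg) and
len would be cmsg->cmsg_len - CMSG_LEN(0).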

> I place the following suggestion for discussion, to achieve maximum
> portability: put 4 bytes into the CMSG for both IPv4 and IPv6, where
> the first and fourth byte are, identically, traffic class, second and
> third 0.
[...]

I see no point in changing the IPv6 behaviour: it seems to be
consistent with itself and with the standard, so only risks breaking
user-space that works today.

As for IPv4, changing the format of the IP_TOS field in the
IP_PKTOPTIONS value looks like it would work for the two users found in
Debian.

But you should know that the highest priority for Linux API
compatibility is to avoid breaking currently working user-space.  That
means that ugly and inconsistent APIs won't get fixed if fixing them
would cause a regression for the programs people actually use.  If the
API never worked like it was supposed to on some architectures, that's
not a regression, and is lower priority.

Ben.

-- 
Ben Hutchings
It is easier to write an incorrect program
than to understand a correct one.
