
Bug#768478: linux-image-3.16 (wheezy-backports and jessie): outbound TCP throughput drops to zero for several drivers



Package: src:linux
Version: 3.16
Severity: important
Tags: patch

Dear Kernel team,

There is a bug with TCP in kernel 3.16 described as:

"Some drivers are unable to perform TX completions in a bound time.
They instead call skb_orphan()

Problem is skb_fclone_busy() has to detect this case, otherwise
we block TCP retransmits and can freeze unlucky tcp sessions on
mostly idle hosts."

The bug was first reported privately, and we are following up with this BTS submission. A Google engineer has already submitted a fix upstream: https://patchwork.ozlabs.org/patch/405110/

This bug is likely to surface in userland, affects several drivers, and is sender-side only:

# git grep -n skb_orphan -- drivers/net
drivers/net/ethernet/chelsio/cxgb3/sge.c:1313:          skb_orphan(skb);
drivers/net/ethernet/chelsio/cxgb4/sge.c:1167:          skb_orphan(skb);
drivers/net/ethernet/chelsio/cxgb4vf/sge.c:1337:                skb_orphan(skb);
drivers/net/ethernet/sun/niu.c:6674:            skb_orphan(skb);
drivers/net/loopback.c:77:      skb_orphan(skb);
drivers/net/tun.c:789:  if (unlikely(skb_orphan_frags(skb, GFP_ATOMIC)))
drivers/net/tun.c:800:  skb_orphan(skb);
drivers/net/virtio_net.c:938:   skb_orphan(skb);
drivers/net/wireless/ath/wil6210/txrx.c:532:    skb_orphan(skb);
drivers/net/wireless/brcm80211/brcmfmac/msgbuf.c:718:           skb_orphan(skb);
drivers/net/wireless/libertas/tx.c:156:         skb_orphan(skb);
drivers/net/wireless/mac80211_hwsim.c:992:      skb_orphan(skb);

The Google engineer also states that the backported patch for a 3.16 or 3.17 kernel is much simpler:

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 4e4932b5079b..a8794367cd20 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2082,7 +2082,8 @@ static bool skb_still_in_host_queue(const struct sock *sk,
        const struct sk_buff *fclone = skb + 1;
 
        if (unlikely(skb->fclone == SKB_FCLONE_ORIG &&
-                    fclone->fclone == SKB_FCLONE_CLONE)) {
+                    fclone->fclone == SKB_FCLONE_CLONE &&
+                    fclone->sk == sk)) {
                NET_INC_STATS_BH(sock_net(sk),
                                 LINUX_MIB_TCPSPURIOUS_RTX_HOSTQUEUES);
                return true;
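
To illustrate what the extra condition changes, here is a minimal userspace sketch, not kernel code. The names `struct buf`, `FCLONE_*`, `model_orphan()` and the two `still_in_host_queue_*()` helpers are hypothetical stand-ins for `struct sk_buff`, `SKB_FCLONE_*`, `skb_orphan()` and `skb_still_in_host_queue()`. It assumes only that orphaning clears the clone's socket pointer while leaving its fclone state untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for SKB_FCLONE_UNAVAILABLE/ORIG/CLONE. */
enum { FCLONE_UNAVAILABLE, FCLONE_ORIG, FCLONE_CLONE };

struct sock { int id; };

/* Simplified model of an sk_buff: only the two fields the check reads. */
struct buf {
    int fclone;             /* fast-clone state */
    const struct sock *sk;  /* owning socket, NULL once orphaned */
};

/* Model of the check before the patch: an orphaned clone still looks queued. */
static bool still_in_host_queue_old(const struct sock *sk, const struct buf *skb)
{
    const struct buf *fclone = skb + 1;  /* the clone sits right after the original */

    (void)sk;
    return skb->fclone == FCLONE_ORIG && fclone->fclone == FCLONE_CLONE;
}

/* Model of the check after the patch: the clone must still belong to sk. */
static bool still_in_host_queue_new(const struct sock *sk, const struct buf *skb)
{
    const struct buf *fclone = skb + 1;

    return skb->fclone == FCLONE_ORIG &&
           fclone->fclone == FCLONE_CLONE &&
           fclone->sk == sk;
}

/* Model of a driver orphaning the clone: socket ownership is dropped. */
static void model_orphan(struct buf *b)
{
    b->sk = NULL;
}

/* Walk through the scenario from the report using the model above. */
static void check_model(void)
{
    struct sock owner = { 1 };
    struct buf pair[2] = {
        { FCLONE_ORIG, &owner },   /* original kept in the write queue */
        { FCLONE_CLONE, &owner },  /* fast clone handed to the driver */
    };

    model_orphan(&pair[1]);
    assert(still_in_host_queue_old(&owner, pair));   /* retransmit blocked */
    assert(!still_in_host_queue_new(&owner, pair));  /* retransmit allowed */
}
```

In this model, once the driver has orphaned the clone, the old check keeps reporting the buffer as still in the host queue, which is the condition that blocks retransmits; the patched check does not.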

I understand the timing is very bad, but ideally this should be fixed in Jessie now rather than in a later backports update.

Thank you,
Eric
