Bug#666386: igb + bnx2 + ifenslave + brctl + vconfig = largely broken

To: submit@bugs.debian.org
Subject: Bug#666386: igb + bnx2 + ifenslave + brctl + vconfig = largely broken
From: Josip Rodin <joy@debbugs.entuzijast.net>
Date: Fri, 30 Mar 2012 12:08:54 +0200
Message-id: <20120330100854.GA19259@entuzijast.net>
Reply-to: Josip Rodin <joy@debbugs.entuzijast.net>, 666386@bugs.debian.org
Package: linux-image-2.6.32-5-xen-amd64
Version: 2.6.32-41

Hi,

The machine is a new IBM x3550 M3, with this network hardware:

% lspci | grep Ethernet
0b:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
0b:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
1a:00.0 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
1a:00.1 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)

One port of each brand (eth0 and eth2) has a working cable plugged into a
working Ethernet switch that's set up to serve VLAN ID 54 as the native
(untagged) VLAN and VLAN ID 2 trunked (tagged), among others.

The devices are:

lrwxrwxrwx 1 root root 0 Mar 19 15:42 /sys/class/net/eth0 -> ../../devices/pci0000:00/0000:00:07.0/0000:1a:00.0/net/eth0/
lrwxrwxrwx 1 root root 0 Mar 19 15:42 /sys/class/net/eth2 -> ../../devices/pci0000:00/0000:00:01.0/0000:0b:00.0/net/eth2/

So, if I read that right, eth0 is Intel, and eth2 is Broadcom.
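
The interface-to-driver mapping can be double-checked without reading PCI
addresses by hand. The following is a minimal sketch, assuming the usual
sysfs layout in which /sys/class/net/<dev>/device/driver is a symlink to the
bound driver; the function name nic_driver and the SYSFS_NET override are
hypothetical helpers introduced for illustration only.

```shell
# Print the kernel driver bound to a network interface.
# SYSFS_NET is an illustrative override; it defaults to the real sysfs tree.
nic_driver() {
    dev=$1
    link="${SYSFS_NET:-/sys/class/net}/$dev/device/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver name, e.g. .../drivers/igb
        basename "$(readlink "$link")"
    else
        echo "unknown"
    fi
}

# Usage on the machine in question:
#   for dev in eth0 eth2; do echo "$dev: $(nic_driver "$dev")"; done
```

If the reading above is right, this should report igb for eth0 and bnx2 for
eth2; "ethtool -i <dev>" gives the same information.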

The desired network setup is, in interfaces(5) format:

iface bond54 inet manual
  slaves eth0 eth2
  bond_mode active-backup
  bond_miimon 100

iface xenbr54 inet static
  bridge-ports bond54
  bridge-fd 0
  address 192.168.54.2
  netmask 255.255.255.0

iface vlan2 inet manual
  vlan-raw-device xenbr54

iface xenbr2 inet static
  bridge-ports vlan2
  bridge-fd 0
  address 213.202.97.156
  netmask 255.255.255.240
  gateway 213.202.97.145
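
For reference, the stack described by that configuration can be approximated
with manual commands. This is only a sketch: it assumes the bonding sysfs
interface, bridge-utils, the vlan package and iproute are available, and it
is not necessarily the exact sequence the ifupdown hooks run. The function
name stack_commands is hypothetical; it prints the commands instead of
running them, so they can be reviewed first.

```shell
# Print (not execute) an approximate manual equivalent of the
# interfaces(5) stanzas: bond54 -> xenbr54 -> vlan2 -> xenbr2.
stack_commands() {
    cat <<'EOF'
echo +bond54 > /sys/class/net/bonding_masters
echo active-backup > /sys/class/net/bond54/bonding/mode
echo 100 > /sys/class/net/bond54/bonding/miimon
ip link set bond54 up
ifenslave bond54 eth0 eth2
brctl addbr xenbr54
brctl setfd xenbr54 0
brctl addif xenbr54 bond54
ip addr add 192.168.54.2/24 dev xenbr54
ip link set xenbr54 up
vconfig set_name_type VLAN_PLUS_VID_NO_PAD
vconfig add xenbr54 2
ip link set vlan2 up
brctl addbr xenbr2
brctl setfd xenbr2 0
brctl addif xenbr2 vlan2
ip addr add 213.202.97.156/28 dev xenbr2
ip link set xenbr2 up
ip route add default via 213.202.97.145
EOF
}

# Usage: stack_commands            (review the list)
#        stack_commands | sudo sh  (apply it on a scratch machine)
```

The point of the sketch is the layering: the tagged VLAN 2 device sits on top
of the bridge xenbr54, not directly on the bond.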

This used to work for me elsewhere; on this machine, however, it's broken as
follows:

Everything starts up fine, and the machine is perfectly usable (albeit I
only used SSH) over the xenbr54 interface.

However, over the xenbr2 interface, all the small network packets pass, such
as ICMP, or the bringup and teardown of HTTP connections, but as soon as I
try to actually GET something non-trivial over a seemingly established HTTP
connection, the machine pretends it doesn't see that incoming traffic.

Like this:

% wget -O /dev/null http://ftp.hr.debian.org/debian/ls-lR.gz
--2012-03-30 11:15:23--  http://ftp.hr.debian.org/debian/ls-lR.gz
Resolving ftp.hr.debian.org... 161.53.160.11, 2001:b68:ff:1::11
Connecting to ftp.hr.debian.org|161.53.160.11|:80... connected.
HTTP request sent, awaiting response...

In parallel, the trace shows:

% sudo tshark -n -i xenbr2
  0.000000 213.202.97.156 -> 161.53.160.11 TCP 51657 > 80 [SYN] Seq=0 Win=5840 Len=0 MSS=1460 TSV=232632046 TSER=0 WS=1
  0.001797 161.53.160.11 -> 213.202.97.156 TCP 80 > 51657 [SYN, ACK] Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=643552423 TSER=232632046 WS=8
  0.001816 213.202.97.156 -> 161.53.160.11 TCP 51657 > 80 [ACK] Seq=1 Ack=1 Win=5840 Len=0 TSV=232632046 TSER=643552423
  0.001906 213.202.97.156 -> 161.53.160.11 HTTP GET /debian/ls-lR.gz HTTP/1.0
  0.003625 161.53.160.11 -> 213.202.97.156 TCP 80 > 51657 [ACK] Seq=1 Ack=131 Win=6912 Len=0 TSV=643552423 TSER=232632046

And then it sits there. The server machine (which I happen to have control
over) says:

  0.000000 213.202.97.156 -> 161.53.160.11 TCP 51660 > 80 [SYN] Seq=0 Win=5840 Len=0 MSS=1460 TSV=232668023 TSER=0 WS=1
  0.000023 161.53.160.11 -> 213.202.97.156 TCP 80 > 51660 [SYN, ACK] Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=643588400 TSER=232668023 WS=8
  0.003117 213.202.97.156 -> 161.53.160.11 TCP 51660 > 80 [ACK] Seq=1 Ack=1 Win=5840 Len=0 TSV=232668024 TSER=643588400
  0.003125 213.202.97.156 -> 161.53.160.11 HTTP GET /debian/ls-lR.gz HTTP/1.0
  0.003145 161.53.160.11 -> 213.202.97.156 TCP 80 > 51660 [ACK] Seq=1 Ack=131 Win=6912 Len=0 TSV=643588401 TSER=232668024
  0.003480 161.53.160.11 -> 213.202.97.156 TCP [TCP segment of a reassembled PDU]
  0.003500 161.53.160.11 -> 213.202.97.156 TCP [TCP segment of a reassembled PDU]
  0.204965 161.53.160.11 -> 213.202.97.156 TCP [TCP Retransmission] [TCP segment of a reassembled PDU]
  0.613959 161.53.160.11 -> 213.202.97.156 TCP [TCP Retransmission] [TCP segment of a reassembled PDU]
  1.428964 161.53.160.11 -> 213.202.97.156 TCP [TCP Retransmission] [TCP segment of a reassembled PDU]
  3.061959 161.53.160.11 -> 213.202.97.156 TCP [TCP Retransmission] [TCP segment of a reassembled PDU]
  6.329958 161.53.160.11 -> 213.202.97.156 TCP [TCP Retransmission] [TCP segment of a reassembled PDU]
 12.853960 161.53.160.11 -> 213.202.97.156 TCP [TCP Retransmission] [TCP segment of a reassembled PDU]

And then I Ctrl+C that wget, and the traces show:

(on the client)
  8.017451 213.202.97.156 -> 161.53.160.11 TCP 51664 > 80 [FIN, ACK] Seq=131 Ack=1 Win=5840 Len=0 TSV=232696067 TSER=643614440
  8.057740 161.53.160.11 -> 213.202.97.156 TCP [TCP Previous segment lost] 80 > 51664 [ACK] Seq=4345 Ack=132 Win=6912 Len=0 TSV=643616454 TSER=232696067

(on the server)
  8.017218 213.202.97.156 -> 161.53.160.11 TCP 51664 > 80 [FIN, ACK] Seq=131 Ack=1 Win=5840 Len=0 TSV=232696067 TSER=643614440
  8.055647 161.53.160.11 -> 213.202.97.156 TCP 80 > 51664 [ACK] Seq=4345 Ack=132 Win=6912 Len=0 TSV=643616454 TSER=232696067
 10.778888 161.53.160.11 -> 213.202.97.156 TCP [TCP segment of a reassembled PDU]
 12.850888 161.53.160.11 -> 213.202.97.156 TCP [TCP Retransmission] [TCP segment of a reassembled PDU]
 25.906890 161.53.160.11 -> 213.202.97.156 TCP [TCP Retransmission] [TCP segment of a reassembled PDU]

That server isn't broken. The same thing happens when I initiate an SSH
connection to a random other machine - it works as far as getting a shell,
but if I run mutt -y, which increases the amount of data going through, it
dies just as well.
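
Since the failure seems to depend on how much data is in flight, one way to
narrow it down is to probe with packets of increasing size. The following is
a minimal sketch, assuming iputils ping (-M do sets Don't Fragment, -W is
the reply timeout in seconds); the function names and the list of sizes are
illustrative and not part of the tests described in this report.

```shell
# ICMP echo payload that yields an IPv4 packet of exactly $1 bytes
# (20-byte IPv4 header without options + 8-byte ICMP header).
icmp_payload_for() {
    echo $(( $1 - 28 ))
}

# Send one echo request per requested IP packet size and report the result.
probe_sizes() {
    host=$1
    shift
    for size in "$@"; do
        if ping -c 1 -W 2 -M do -s "$(icmp_payload_for "$size")" "$host" \
                >/dev/null 2>&1; then
            echo "$size ok"
        else
            echo "$size LOST"
        fi
    done
}

# Usage: probe_sizes 161.53.160.11 100 576 1000 1400 1496 1500
```

If only the largest sizes are lost, that would point at full-size frames on
the tagged VLAN rather than at TCP itself.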

I also have another couple of much older machines plugged into the same
switch at the client side, using native VLAN 2, and they're working just fine.

Then I thought, maybe it's this switch that doesn't do VLANs properly, and
it's killing my traffic.

So I started disassembling this complex setup on the machine, and got this:

* I tried to remove xenbr2 and move the L3 setup onto vlan2 - the setup came
  up, but showed the same failure symptoms as above

* I tried to remove xenbr54 and move the VLAN setup onto bond54 - and that
  made everything work just fine.

So it looks like the bridging component is the trigger for screwing things
up. But since I simply can't lose the bridging because of Xen, I went
further and tried to fiddle with things a bit more:

% sudo ifenslave -d bond54 eth0

% wget -O /dev/null http://ftp.hr.debian.org/debian/ls-lR.gz
--2012-03-30 11:46:19--  http://ftp.hr.debian.org/debian/ls-lR.gz
Resolving ftp.hr.debian.org... 161.53.160.11, 2001:b68:ff:1::11
Connecting to ftp.hr.debian.org|161.53.160.11|:80... [hangs] ^C

Now it can't even connect. Let's put it back in:

% sudo ifenslave bond54 eth0

% wget -O /dev/null http://ftp.hr.debian.org/debian/ls-lR.gz
--2012-03-30 11:46:48--  http://ftp.hr.debian.org/debian/ls-lR.gz
Resolving ftp.hr.debian.org... 161.53.160.11, 2001:b68:ff:1::11
Connecting to ftp.hr.debian.org|161.53.160.11|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6533601 (6.2M) [application/x-gzip]
[...all good...]

So a random enslaving back-and-forth makes it work?

Let's see if there's any difference in hardware:

% sudo ifenslave -d bond54 eth2

% wget -O /dev/null http://ftp.hr.debian.org/debian/ls-lR.gz
--2012-03-30 11:48:40--  http://ftp.hr.debian.org/debian/ls-lR.gz
Resolving ftp.hr.debian.org... 161.53.160.11, 2001:b68:ff:1::11
Connecting to ftp.hr.debian.org|161.53.160.11|:80... connected.
HTTP request sent, awaiting response... [hangs] ^C

% sudo ifenslave bond54 eth2

% wget -O /dev/null http://ftp.hr.debian.org/debian/ls-lR.gz
--2012-03-30 11:51:13--  http://ftp.hr.debian.org/debian/ls-lR.gz
Resolving ftp.hr.debian.org... 161.53.160.11, 2001:b68:ff:1::11
Connecting to ftp.hr.debian.org|161.53.160.11|:80... connected.
HTTP request sent, awaiting response... [hangs] ^C

There it's consistent at least.

Let's try the -c option while both are enslaved:

% sudo ifenslave -c bond54 eth0

% wget -O /dev/null http://ftp.hr.debian.org/debian/ls-lR.gz
--2012-03-30 11:51:52--  http://ftp.hr.debian.org/debian/ls-lR.gz
Resolving ftp.hr.debian.org... 161.53.160.11, 2001:b68:ff:1::11
Connecting to ftp.hr.debian.org|161.53.160.11|:80... connected.
HTTP request sent, awaiting response... ^C

% sudo ifenslave -c bond54 eth2

% wget -O /dev/null http://ftp.hr.debian.org/debian/ls-lR.gz
--2012-03-30 11:51:56--  http://ftp.hr.debian.org/debian/ls-lR.gz
Resolving ftp.hr.debian.org... 161.53.160.11, 2001:b68:ff:1::11
Connecting to ftp.hr.debian.org|161.53.160.11|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6533601 (6.2M) [application/x-gzip]
[...all good...]

% sudo ifenslave -c bond54 eth0

% wget -O /dev/null http://ftp.hr.debian.org/debian/ls-lR.gz
--2012-03-30 11:52:04--  http://ftp.hr.debian.org/debian/ls-lR.gz
Resolving ftp.hr.debian.org... 161.53.160.11, 2001:b68:ff:1::11
Connecting to ftp.hr.debian.org|161.53.160.11|:80... connected.
HTTP request sent, awaiting response... ^C

% sudo ifenslave -c bond54 eth2

% wget -O /dev/null http://ftp.hr.debian.org/debian/ls-lR.gz
--2012-03-30 11:52:11--  http://ftp.hr.debian.org/debian/ls-lR.gz
Resolving ftp.hr.debian.org... 161.53.160.11, 2001:b68:ff:1::11
Connecting to ftp.hr.debian.org|161.53.160.11|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6533601 (6.2M) [application/x-gzip]
[...all good...]

So I'm guessing there's something wrong with one of these device drivers,
or the enslaving infrastructure, or both.

Since all combinations with active Intel drivers are broken,
and the only working combinations are ones where Intel is disabled,
it looks like the igb driver could be the one doing something wrong,
given these circumstances.
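
To tie each result to the slave that was actually carrying traffic, the
active slave can be read back from the bonding driver before every download.
The following is a minimal sketch, assuming the bonding sysfs attribute
/sys/class/net/<bond>/bonding/active_slave is present; the function names
and the BOND_SYSFS override are hypothetical, and the wget timeout values
are arbitrary.

```shell
# Print the currently active slave of a bond.
# BOND_SYSFS is an illustrative override; it defaults to the real sysfs tree.
active_slave() {
    cat "${BOND_SYSFS:-/sys/class/net}/$1/bonding/active_slave"
}

# Make the given slave active, confirm the switch, then run the download
# test with a timeout so that a hang ends on its own.
switch_and_test() {
    sudo ifenslave -c bond54 "$1"
    echo "active slave: $(active_slave bond54)"
    wget -T 15 -t 1 -O /dev/null http://ftp.hr.debian.org/debian/ls-lR.gz
}

# Usage: switch_and_test eth0; switch_and_test eth2
```

The same information is also shown as "Currently Active Slave" in
/proc/net/bonding/bond54.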

For the record, I also tried the same final test after having done:
	for i in filter nat mangle; do sudo iptables -t $i -F; done
just to make sure nothing fishy was going on there - the results were
the same, netfilter isn't interfering.

Please fix this. TIA.

-- 
     2. That which causes joy or happiness.