Bug#657309: linux image 3.1 panics on brctl addif bond to bridge
Package: linux-image-3.1.0-1-amd64
Version: 3.1.8-2
Hello,
After upgrading to linux-image-3.1.0-1-amd64, we can no longer brctl
addif bond interfaces (active-backup) to a bridge: every attempt ends in
a kernel panic. This behaviour is not observed with 2.6.32-5-amd64, but
it is also observed with the 3.1 kernel from backports.
/etc/network/interfaces:
auto eth0
iface eth0 inet manual
    up ifconfig $IFACE mtu 9000 || true
    up echo 0 > /proc/sys/net/ipv6/conf/$IFACE/autoconf
auto eth1
iface eth1 inet manual
    up ifconfig $IFACE mtu 9000 || true
    up echo 0 > /proc/sys/net/ipv6/conf/$IFACE/autoconf
auto eth2
iface eth2 inet manual
    up ifconfig $IFACE mtu 9000 || true
    up echo 0 > /proc/sys/net/ipv6/conf/$IFACE/autoconf
auto eth3
iface eth3 inet manual
    up ifconfig $IFACE mtu 9000 || true
    up echo 0 > /proc/sys/net/ipv6/conf/$IFACE/autoconf
# The primary network interface
auto bond0
iface bond0 inet static
    address 10.1.5.123
    netmask 255.255.255.192
    broadcast 10.1.5.127
    gateway 10.1.5.65
    mtu 1500
    bond-mode active-backup
    primary eth0
    bond-miimon 100
    slaves eth0 eth1 eth2 eth3
    up echo 0 > /proc/sys/net/ipv6/conf/$IFACE/autoconf
auto prv
iface prv inet manual
    up prv-net-helper up bond0 1 100 2999
    down prv-net-helper down bond0 1 100 2999
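The configuration above requests an MTU of 9000 on eth0-eth3 and an MTU
of 1500 on bond0. To see which values are actually in effect, something
like the following could be used. This is only a sketch: the function
name show_mtus and its optional second argument are hypothetical, and it
assumes the usual sysfs files (/sys/class/net/<iface>/mtu and
/sys/class/net/<bond>/bonding/slaves) are present.

```bash
#!/bin/bash
# Print the MTU currently in effect for a bond and for each of its slaves.
# show_mtus and the optional sysfs root argument are hypothetical helpers,
# not part of the setup described in this report.
show_mtus() {
    local bond=$1 root=${2:-/sys/class/net}
    local slave
    # MTU of the bond device itself
    echo "$bond $(cat "$root/$bond/mtu")"
    # The bonding driver lists its slaves space-separated in this file
    for slave in $(cat "$root/$bond/bonding/slaves"); do
        echo "$slave $(cat "$root/$slave/mtu")"
    done
}

# On the affected machine this would be called as:
#   show_mtus bond0
```

Each output line is an interface name followed by its MTU.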
/usr/sbin/prv-net-helper:
#!/bin/bash
function usage {
    echo "Usage: $0 <mode> <parent interface> <prv min> <prv max> <offset>"
    exit 1
}
if [ $# -ne 5 ]; then
    usage
fi
function up {
    iface=$1
    prv_min=$2
    prv_max=$3
    offset=$4
    echo "Adding VLANs $2 - $3"
    vconfig set_name_type DEV_PLUS_VID_NO_PAD
    for prv in $(seq $prv_min $prv_max); do
        vlan=$(($prv+$offset))
        bridge=prv$prv
        vconfig add $iface $vlan
        ifconfig $iface.$vlan up
        brctl addbr $bridge
        brctl setfd $bridge 0
        brctl addif $bridge $iface.$vlan
        ifconfig $bridge up
        sleep 3
    done
}
function down {
    iface=$1
    prv_min=$2
    prv_max=$3
    offset=$4
    echo "Removing VLANs $2 - $3"
    for prv in $(seq $prv_min $prv_max); do
        vlan=$(($prv+$offset))
        bridge=prv$prv
        (
            ifconfig $bridge down
            brctl delif $bridge $iface.$vlan # dev_plus_vid
            vconfig rem $iface.$vlan
            brctl delbr $bridge
        ) 2>/dev/null
    done
}
mode=$1; shift
if [ "$mode" = "up" ]; then
    up "$@"
elif [ "$mode" = "down" ]; then
    down "$@"
else
    usage
fi
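The naming scheme used by the helper can be previewed without creating
any interfaces. The following is only a sketch: the function name
preview is hypothetical and not part of the script above. It repeats the
arithmetic from up (VLAN ID = private network number + offset) and
prints the resulting bridge and VLAN device names.

```bash
#!/bin/bash
# Dry run: print "<bridge> <vlan device>" for each private network, using
# the same arithmetic as up() in prv-net-helper. Nothing is configured.
# "preview" is a hypothetical name, not part of the original helper.
preview() {
    local iface=$1 prv_min=$2 prv_max=$3 offset=$4
    local prv vlan
    for prv in $(seq "$prv_min" "$prv_max"); do
        vlan=$((prv + offset))
        echo "prv$prv $iface.$vlan"
    done
}

# Same arguments as the "up" line in /etc/network/interfaces,
# limited to the first three private networks.
preview bond0 1 3 2999
```

With these arguments prv1 maps to bond0.3000 and prv2 to bond0.3001.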
After the script runs, we should have bridges prv1 to prv100, each one
containing a different bond0.VLAN interface.
For example:
# brctl show
bridge name     bridge id               STP enabled     interfaces
prv1            8000.001517cff668       no              bond0.3000
Instead, we get a kernel panic on "brctl addif $bridge $iface.$vlan".
Backtrace:
rados0-01 login: [ 586.287504] device bond0.3001 entered promiscuous mode
[ 586.293343] device bond0 entered promiscuous mode
[ 586.298691] device eth1 entered promiscuous mode
[ 588.195088] skb_over_panic: text:ffffffffa009fa8e len:2048 put:2048
head:ffff880626066000 data:ffff880626066040 tail:0x840 end:0x640 dev:eth1
[ 588.209409] ------------[ cut here ]------------
[ 588.214651] kernel BUG at
/build/buildd-linux-2.6_3.1.8-2-amd64-XPJTbL/linux-2.6-3.1.8/debian/build/source_amd64_none/net/core/skbuff.c:128!
[ 588.228851] invalid opcode: 0000 [#1] SMP
[ 588.233650] CPU 0
[ 588.235758] Modules linked in: 8021q garp bridge stp drbd lru_cache
cn nfnetlink_queue nfnetlink kvm_intel kvm ip6table_raw ip6t_REJECT
nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6table_mangle
ip6_tables xt_NOTRACK iptable_raw ipt_REJECT xt_pkttype
nf_conntrack_ipv4 nf_defrag_ipv4 xt_state iptable_filter xt_tcpudp
xt_NFQUEUE iptable_mangle ip_tables x_tables ext4 jbd2 crc16
ipmi_devintf ipmi_si nf_conntrack ipmi_poweroff ipmi_msghandler mptctl
bonding psmouse ohci_hcd ioatdma i2c_i801 i7core_edac edac_core i2c_core
snd_pcm snd_timer snd soundcore snd_page_alloc joydev evdev tpm_tis
processor ac acpi_power_meter tpm pcspkr tpm_bios button container
power_supply thermal_sys ext3 jbd mbcache dm_mod sd_mod crc_t10dif
usbhid hid sg sr_mod cdrom ata_generic uhci_hcd mptsas mptscsih ata_piix
mptbase libata scsi_transport_sas ehci_hcd usbcore igb e1000e scsi_mod
dca [last unloaded: scsi_wait_scan]
[ 588.331722]
[ 588.333469] Pid: 0, comm: swapper Not tainted 3.1.0-1-amd64 #1
FUJITSU PRIMERGY RX200 S5 /D2786
[ 588.347093] RIP: 0010:[<ffffffff81267df5>] [<ffffffff81267df5>]
skb_put+0x78/0x82
[ 588.355736] RSP: 0018:ffff88063fc03d70 EFLAGS: 00010282
[ 588.361755] RAX: 0000000000000097 RBX: ffff880c25360200 RCX:
0000000000000dc7
[ 588.369813] RDX: 0000000000000000 RSI: 0000000000000046 RDI:
0000000000000246
[ 588.377871] RBP: ffff880625d9fe80 R08: 0000000000000000 R09:
0000000000000000
[ 588.385930] R10: 0000000000000001 R11: 0000000000000000 R12:
ffffc9000775e4c8
[ 588.393988] R13: ffffc9000775e4a0 R14: ffff880c25be3850 R15:
ffff880c25be3840
[ 588.402047] FS: 0000000000000000(0000) GS:ffff88063fc00000(0000)
knlGS:0000000000000000
[ 588.411199] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 588.417704] CR2: 0000000000b29008 CR3: 0000000001605000 CR4:
00000000000006f0
[ 588.425760] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 588.433808] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[ 588.441866] Process swapper (pid: 0, threadinfo ffffffff81600000,
task ffffffff8160d020)
[ 588.451017] Stack:
[ 588.453346] 0000000000000840 0000000000000640 ffff880c267a6000
ffff880c25360200
[ 588.461996] ffff880625d9fe80 ffffffffa009fa8e ffff880c257cd0c4
0000000000000044
[ 588.470647] ffff88063fc03e38 ffffffffa0054eef ffffffff81601fd8
000000008108e8b1
[ 588.479296] Call Trace:
[ 588.482111] <IRQ>
[ 588.484613] [<ffffffffa009fa8e>] ? igb_poll+0x44c/0x9d1 [igb]
[ 588.491220] [<ffffffffa0054eef>] ? e1000_clean_rx_irq+0x257/0x291
[e1000e]
[ 588.499089] [<ffffffffa00554a1>] ? e1000_clean+0x1f7/0x208 [e1000e]
[ 588.506274] [<ffffffff81271694>] ? net_rx_action+0xa1/0x1af
[ 588.512684] [<ffffffff8104ad58>] ? __do_softirq+0xb9/0x177
[ 588.518997] [<ffffffff8102333c>] ? __setup_APIC_LVTT+0x4a/0x66
[ 588.525696] [<ffffffff8133506c>] ? call_softirq+0x1c/0x30
[ 588.531914] [<ffffffff8100f845>] ? do_softirq+0x3c/0x7b
[ 588.537934] [<ffffffff8104afc0>] ? irq_exit+0x3c/0x9a
[ 588.543760] [<ffffffff8100f575>] ? do_IRQ+0x82/0x98
[ 588.549392] [<ffffffff8132e16e>] ? common_interrupt+0x6e/0x6e
[ 588.555992] <EOI>
[ 588.558483] [<ffffffff811d586f>] ? intel_idle+0xd4/0xf9
[ 588.564503] [<ffffffff811d584e>] ? intel_idle+0xb3/0xf9
[ 588.570523] [<ffffffff81251e5a>] ? cpuidle_idle_call+0xf0/0x175
[ 588.577319] [<ffffffff8100d250>] ? cpu_idle+0x9c/0xe0
[ 588.583146] [<ffffffff816a6b4e>] ? start_kernel+0x3bd/0x3c8
[ 588.589553] [<ffffffff816a6140>] ? early_idt_handlers+0x140/0x140
[ 588.596543] [<ffffffff816a63c4>] ? x86_64_start_kernel+0x104/0x111
[ 588.603628] Code: 8b 57 68 48 89 44 24 10 8b 87 d0 00 00 00 48 89 44
24 08 8b bf cc 00 00 00 31 c0 48 89 3c 24 48 c7 c7 79 80 50 8 58 fc 0b
00 <0f> 0b 4c 01 c0 48 83 c4 28 c3 41 57 41 56 41 55 41 54 41 89 d4
[ 588.629346] RIP [<ffffffff81267df5>] skb_put+0x78/0x82
[ 588.635327] RSP <ffff88063fc03d70>
[ 588.639558] ---[ end trace 83fa0875c297a122 ]---
[ 588.644923] Kernel panic - not syncing: Fatal exception in interrupt
[ 588.652217] Pid: 0, comm: swapper Tainted: G D 3.1.0-1-amd64 #1
[ 588.660003] Call Trace:
[ 588.662930] <IRQ> [<ffffffff8132793d>] ? panic+0x95/0x1a5
[ 588.669429] [<ffffffff8132eecb>] ? oops_end+0xa9/0xb6
[ 588.675328] [<ffffffff8100e8c0>] ? do_invalid_op+0x87/0x91
[ 588.681755] [<ffffffff81267df5>] ? skb_put+0x78/0x82
[ 588.687580] [<ffffffff810464d1>] ? vprintk+0x39e/0x3d9
[ 588.693682] [<ffffffff81334deb>] ? invalid_op+0x1b/0x20
[ 588.699777] [<ffffffff81267df5>] ? skb_put+0x78/0x82
[ 588.705627] [<ffffffffa009fa8e>] ? igb_poll+0x44c/0x9d1 [igb]
[ 588.712334] [<ffffffffa0054eef>] ? e1000_clean_rx_irq+0x257/0x291
[e1000e]
[ 588.720319] [<ffffffffa00554a1>] ? e1000_clean+0x1f7/0x208 [e1000e]
[ 588.727611] [<ffffffff81271694>] ? net_rx_action+0xa1/0x1af
[ 588.734136] [<ffffffff8104ad58>] ? __do_softirq+0xb9/0x177
[ 588.740563] [<ffffffff8102333c>] ? __setup_APIC_LVTT+0x4a/0x66
[ 588.747376] [<ffffffff8133506c>] ? call_softirq+0x1c/0x30
[ 588.753712] [<ffffffff8100f845>] ? do_softirq+0x3c/0x7b
[ 588.759852] [<ffffffff8104afc0>] ? irq_exit+0x3c/0x9a
[ 588.765792] [<ffffffff8100f575>] ? do_IRQ+0x82/0x98
[ 588.771538] [<ffffffff8132e16e>] ? common_interrupt+0x6e/0x6e
[ 588.778258] <EOI> [<ffffffff811d586f>] ? intel_idle+0xd4/0xf9
[ 588.785140] [<ffffffff811d584e>] ? intel_idle+0xb3/0xf9
[ 588.791332] [<ffffffff81251e5a>] ? cpuidle_idle_call+0xf0/0x175
[ 588.798245] [<ffffffff8100d250>] ? cpu_idle+0x9c/0xe0
[ 588.804189] [<ffffffff816a6b4e>] ? start_kernel+0x3bd/0x3c8
[ 588.810721] [<ffffffff816a6140>] ? early_idt_handlers+0x140/0x140
[ 588.817826] [<ffffffff816a63c4>] ? x86_64_start_kernel+0x104/0x111
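The numbers on the skb_over_panic line can be checked by hand. The
sketch below only restates values from that line and does not identify
the cause. It assumes that tail and end are offsets relative to head,
which is how they appear to be printed on this 64-bit kernel; the
function name skb_overrun is hypothetical.

```bash
#!/bin/bash
# Recompute the buffer positions reported by skb_over_panic.
# Assumption: tail and end are offsets from head. skb_overrun is a
# hypothetical helper name.
skb_overrun() {
    local head=$1 data=$2 put=$3 tail=$4 end=$5
    # Compare only the low 32 bits of the two pointers (the upper bits are
    # identical here) to stay clear of signed 64-bit overflow in the shell.
    local headroom=$(( 0x${data: -8} - 0x${head: -8} ))
    echo "headroom=$headroom"
    echo "tail_expected=$(( headroom + put ))"
    echo "tail_reported=$(( tail ))"
    echo "end=$(( end ))"
    echo "overrun=$(( tail - end ))"
}

# head, data, put, tail and end as printed in the trace
skb_overrun ffff880626066000 ffff880626066040 2048 0x840 0x640
```

For the values in the trace this gives a headroom of 64 bytes, a tail of
2112 (0x840) and an end of 1600 (0x640), so the 2048-byte put ran 512
bytes past the end of the buffer.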
If there is no bond0, just a plain Ethernet interface, everything works
as expected.
The cards are Intel Corporation PRO/1000 PT Dual Port Server Adapters
using the e1000e driver (lspci -n gives 8086:105e (rev 06)).
The machine runs Debian squeeze; only the kernel comes from wheezy or
from backports (we tried both).
Thanks in advance,
Costas Drogos