Bug#466404: RTNL: assertion failed at net/ipv4/devinet.c when using bonding
We can reproduce this bug - or something remarkably similar - on a
running (as opposed to newly booting) system of a different architecture
from the original reporter's.
Our kernel is linux-image-2.6.18-6-686, version 2.6.18.dfsg.1-18etch1, and
we get the following dmesg output:
RTNL: assertion failed at net/ipv4/devinet.c (985)
[<c0261e46>] inetdev_event+0x3e/0x286
[<c02809da>] _spin_lock_bh+0xb/0x18
[<c02809d7>] _spin_lock_bh+0x8/0x18
[<c023f0b0>] rt_run_flush+0x68/0x8f
[<c01284c3>] notifier_call_chain+0x19/0x32
[<c0228541>] dev_set_mac_address+0x46/0x4b
[<f8a2ad7b>] alb_set_slave_mac_addr+0x5a/0x7f [bonding]
[<f8a2b16e>] alb_swap_mac_addr+0x99/0x138 [bonding]
[<f8a269f1>] bond_change_active_slave+0x187/0x27f [bonding]
[<f8a27411>] bond_select_active_slave+0x8d/0xbb [bonding]
[<f8a286a1>] bond_mii_monitor+0x363/0x3a8 [bonding]
[<f8a2833e>] bond_mii_monitor+0x0/0x3a8 [bonding]
[<c0125607>] run_timer_softirq+0xfb/0x151
[<c0121838>] __do_softirq+0x5a/0xbb
[<c01218cf>] do_softirq+0x36/0x3a
[<c0103747>] apic_timer_interrupt+0x1f/0x24
[<c0101a5a>] default_idle+0x0/0x59
[<c0101a8b>] default_idle+0x31/0x59
[<c0101b52>] cpu_idle+0x9f/0xb9
[<c03196fd>] start_kernel+0x379/0x380
We see this when we run "ifconfig <slave> down" or "ethtool -s <slave>
speed <non-default>" on one of the slaves of a bond running in mode 6
(balance-alb). It does not happen in mode 0 (balance-rr).
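For anyone wanting to check the same trigger on their own hardware, the steps above can be sketched roughly as follows. This is only a sketch: the helper names count_rtnl_assertions and check_slave_down are ours (hypothetical), it must be run as root, and it assumes the kernel ring buffer has not wrapped between the two dmesg reads.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any package.

# Count RTNL assertion messages in kernel log text supplied on stdin.
count_rtnl_assertions() {
    # grep -c exits non-zero when the count is 0, so mask the status.
    grep -c 'RTNL: assertion failed at net/ipv4/devinet.c' || true
}

# Take one slave down and report how many new assertions were logged.
# Run as root; the argument is a slave name such as eth2.
check_slave_down() {
    slave="$1"
    before=$(dmesg | count_rtnl_assertions)
    /sbin/ifconfig "$slave" down
    sleep 2   # allow several miimon intervals (miimon=100 in our config)
    after=$(dmesg | count_rtnl_assertions)
    echo "new RTNL assertions: $((after - before))"
}
```

Calling "check_slave_down eth2" takes eth2 down and prints the number of assertion messages logged since the first dmesg read. The slave has to be brought back up by hand afterwards.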
Each of our bonded interfaces has two e1000 slaves, which are on
different physical cards on the same PCI bus. Each card has two NICs,
but no bond uses two NICs on the same card. The relevant PCI bus looks
like this:
02:08.0 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 01)
02:08.1 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 01)
02:09.0 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 01)
02:09.1 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 01)
02:0a.0 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 01)
02:0a.1 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 01)
The modprobe config looks like this:
alias bond0 bonding
options bond0 mode=balance-alb miimon=100 max_bonds=8
alias bond1 bonding
options bond1 mode=balance-alb miimon=100 max_bonds=8
The interfaces config looks like this:

auto bond0
iface bond0 inet static
    address 192.168.1.1
    netmask 255.255.255.0
    pre-up /sbin/modprobe bond0
    pre-up /sbin/ifconfig bond0 up
    pre-up /sbin/ifenslave bond0 eth2 eth4
    down /sbin/ifenslave -d bond0 eth2 eth4

auto bond1
iface bond1 inet static
    address 192.168.2.1
    netmask 255.255.255.0
    pre-up /sbin/modprobe bond1
    pre-up /sbin/ifconfig bond1 up
    pre-up /sbin/ifenslave bond1 eth5 eth7
    down /sbin/ifenslave -d bond1 eth5 eth7
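
Since the assertion shows up in mode 6 but not in mode 0, it may be worth confirming which mode a bond is really running in once it is up. A minimal sketch, assuming the bonding driver on the kernel in question provides /proc/net/bonding/bond0 with a "Bonding Mode:" line; the helper name bond_mode is hypothetical:

```shell
#!/bin/sh
# Print the mode string from bonding status text supplied on stdin.
# bond_mode is a hypothetical helper name used only for illustration.
bond_mode() {
    sed -n 's/^Bonding Mode: //p'
}

# Example (not run here): bond_mode < /proc/net/bonding/bond0
```
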
I've tried to find out whether this is a known problem upstream and, if so,
whether it's fixed in a later kernel version. I've not found an answer yet.
--
Duncan Gibb, Technical Architect
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk/
Tel: +44 870 608 0063 || +44 7977 441 515