
Bug#466404: RTNL: assertion failed at net/ipv4/devinet.c when using bonding



We can reproduce this bug, or something remarkably similar, on a
running (as opposed to newly booting) system of a different architecture
from the original reporter's.

Our kernel is linux-image-2.6.18-6-686 version 2.6.18.dfsg.1-18etch1 and
we get the following dmesg output:

RTNL: assertion failed at net/ipv4/devinet.c (985)
 [<c0261e46>] inetdev_event+0x3e/0x286
 [<c02809da>] _spin_lock_bh+0xb/0x18
 [<c02809d7>] _spin_lock_bh+0x8/0x18
 [<c023f0b0>] rt_run_flush+0x68/0x8f
 [<c01284c3>] notifier_call_chain+0x19/0x32
 [<c0228541>] dev_set_mac_address+0x46/0x4b
 [<f8a2ad7b>] alb_set_slave_mac_addr+0x5a/0x7f [bonding]
 [<f8a2b16e>] alb_swap_mac_addr+0x99/0x138 [bonding]
 [<f8a269f1>] bond_change_active_slave+0x187/0x27f [bonding]
 [<f8a27411>] bond_select_active_slave+0x8d/0xbb [bonding]
 [<f8a286a1>] bond_mii_monitor+0x363/0x3a8 [bonding]
 [<f8a2833e>] bond_mii_monitor+0x0/0x3a8 [bonding]
 [<c0125607>] run_timer_softirq+0xfb/0x151
 [<c0121838>] __do_softirq+0x5a/0xbb
 [<c01218cf>] do_softirq+0x36/0x3a
 [<c0103747>] apic_timer_interrupt+0x1f/0x24
 [<c0101a5a>] default_idle+0x0/0x59
 [<c0101a8b>] default_idle+0x31/0x59
 [<c0101b52>] cpu_idle+0x9f/0xb9
 [<c03196fd>] start_kernel+0x379/0x380

This appears when we run "ifconfig <slave> down" or "ethtool -s <slave>
speed <non-default>" on one of the slaves of a bond running in mode 6
(balance-alb).  It does not happen in mode 0 (balance-rr).
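For anyone comparing traces, here is a rough Python sketch that splits a trace like the one above into frames and picks out the ones from the bonding module.  The function names and the "via_softirq" flag are my own, purely for illustration; the flag only records that softirq frames appear in the trace and is not a confirmed diagnosis.  The sample is an excerpt of the trace quoted in this mail.

```python
import re

# One frame per line, e.g. " [<f8a2ad7b>] alb_set_slave_mac_addr+0x5a/0x7f [bonding]".
# The trailing "[module]" is absent for symbols built into the kernel image.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_trace(text):
    """Return one dict per stack frame found in the dmesg text, in order."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append({
                "symbol": match.group("symbol"),
                "module": match.group("module"),  # None for built-in symbols
                "offset": int(match.group("offset"), 16),
            })
    return frames


def summarize(frames):
    """Hypothetical helper: list bonding frames and note any softirq frames."""
    symbols = [frame["symbol"] for frame in frames]
    return {
        "bonding_symbols": [f["symbol"] for f in frames if f["module"] == "bonding"],
        "via_softirq": any(s in ("__do_softirq", "do_softirq") for s in symbols),
    }


# Excerpt of the trace from this report.
SAMPLE_TRACE = """\
RTNL: assertion failed at net/ipv4/devinet.c (985)
 [<c0261e46>] inetdev_event+0x3e/0x286
 [<c01284c3>] notifier_call_chain+0x19/0x32
 [<c0228541>] dev_set_mac_address+0x46/0x4b
 [<f8a2ad7b>] alb_set_slave_mac_addr+0x5a/0x7f [bonding]
 [<f8a286a1>] bond_mii_monitor+0x363/0x3a8 [bonding]
 [<c0125607>] run_timer_softirq+0xfb/0x151
 [<c0121838>] __do_softirq+0x5a/0xbb
"""

if __name__ == "__main__":
    print(summarize(parse_trace(SAMPLE_TRACE)))
```

Feeding it the full trace instead of the excerpt works the same way; the header line with the assertion message is simply skipped because it does not look like a frame.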

Each of our bonded interfaces has two e1000 slaves, which are on
different physical cards on the same PCI bus.  Each card has two NICs
but no bond uses two NICs on the same card.  The relevant PCI bus looks
like this:

02:08.0 Ethernet controller: Intel Corporation 82546EB
        Gigabit Ethernet Controller (Copper) (rev 01)
02:08.1 Ethernet controller: Intel Corporation 82546EB
        Gigabit Ethernet Controller (Copper) (rev 01)
02:09.0 Ethernet controller: Intel Corporation 82546EB
        Gigabit Ethernet Controller (Copper) (rev 01)
02:09.1 Ethernet controller: Intel Corporation 82546EB
        Gigabit Ethernet Controller (Copper) (rev 01)
02:0a.0 Ethernet controller: Intel Corporation 82546EB
        Gigabit Ethernet Controller (Copper) (rev 01)
02:0a.1 Ethernet controller: Intel Corporation 82546EB
        Gigabit Ethernet Controller (Copper) (rev 01)


The modprobe config looks like this:

alias bond0 bonding
options bond0 mode=balance-alb miimon=100 max_bonds=8

alias bond1 bonding
options bond1 mode=balance-alb miimon=100 max_bonds=8


The interfaces config looks like this:

auto bond0
iface bond0 inet static
        address 192.168.1.1
        netmask 255.255.255.0
        pre-up /sbin/modprobe bond0
        pre-up /sbin/ifconfig bond0 up
        pre-up /sbin/ifenslave bond0 eth2 eth4
        down /sbin/ifenslave -d bond0 eth2 eth4

auto bond1
iface bond1 inet static
        address 192.168.2.1
        netmask 255.255.255.0
        pre-up /sbin/modprobe bond1
        pre-up /sbin/ifconfig bond1 up
        pre-up /sbin/ifenslave bond1 eth5 eth7
        down /sbin/ifenslave -d bond1 eth5 eth7
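As a sanity check on the layout above, a small Python sketch can pull the slave names out of the "pre-up ... ifenslave" lines and confirm that no slave is listed under two bonds.  The helper names are hypothetical, and it only understands the simple stanza shape used in this mail, not the full interfaces(5) syntax.

```python
def bond_slaves(interfaces_text):
    """Map each iface name to the slaves on its 'pre-up .../ifenslave' lines."""
    mapping = {}
    current = None
    for raw in interfaces_text.splitlines():
        words = raw.split()
        if not words:
            continue
        if words[0] == "iface":
            current = words[1]
            mapping.setdefault(current, [])
        elif (current and words[0] == "pre-up" and len(words) > 3
              and words[1].endswith("ifenslave")):
            # words: pre-up /sbin/ifenslave <bond> <slave> [<slave> ...]
            mapping[current].extend(words[3:])
    return mapping


def shared_slaves(mapping):
    """Return the set of slaves that appear under more than one bond."""
    owner = {}
    shared = set()
    for bond, slaves in mapping.items():
        for slave in slaves:
            if slave in owner and owner[slave] != bond:
                shared.add(slave)
            owner.setdefault(slave, bond)
    return shared


# The interfaces config from this report.
CONFIG = """\
auto bond0
iface bond0 inet static
        address 192.168.1.1
        netmask 255.255.255.0
        pre-up /sbin/modprobe bond0
        pre-up /sbin/ifconfig bond0 up
        pre-up /sbin/ifenslave bond0 eth2 eth4
        down /sbin/ifenslave -d bond0 eth2 eth4

auto bond1
iface bond1 inet static
        address 192.168.2.1
        netmask 255.255.255.0
        pre-up /sbin/modprobe bond1
        pre-up /sbin/ifconfig bond1 up
        pre-up /sbin/ifenslave bond1 eth5 eth7
        down /sbin/ifenslave -d bond1 eth5 eth7
"""

if __name__ == "__main__":
    slaves = bond_slaves(CONFIG)
    print(slaves)
    print("shared:", shared_slaves(slaves))
```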


I've tried to find out whether this is a known problem upstream and, if so,
whether it's fixed in a later kernel version.  I've not found an answer yet.

-- 
Duncan Gibb, Technical Architect
Sirius Corporation - The Open Source Experts
http://www.siriusit.co.uk/
Tel: +44 870 608 0063 || +44 7977 441 515