
VXLAN remote MAC flapping between IPs



Hi all,

I have several Debian servers running multicast VXLAN for a qemu/kvm
cluster in a fully routed mini-datacenter.
Each server has multiple links to multiple top-of-rack switches,
configured as ECMP point-to-point links with /30s.
Additionally, each server runs OSPF via FRR and thus has routes
to every other /30 in this cluster.
Each server also has a lo:1 address set, but I don't think that's
coming into play here...

I've been seeing my dmesg on all members getting spammed with messages
like this:

[169833.510533] vxlan_1gig1: 46:65:af:13:b7:d1 migrated from 172.16.150.114 to 172.16.150.218
[169833.511121] vxlan_1gig1: 46:65:af:13:b7:d1 migrated from 172.16.150.218 to 172.16.150.114

I'm seeing this across all vxlan interfaces on each server.
The MAC that's flapping is the remote address of the vxlan_1gig1
interface on one of the other peers, but .218 and .114 are both 10gig
addresses on that same remote machine. That doesn't make sense to me:
they're connected to different switches, and that vxlan is bound to a
1gig interface...

From the documentation, I thought that kernel vxlan interfaces were
tied to a specific interface.
However, .114 and .218 are both on 10gig switches, whereas
this vxlan interface is bound to one of the 1gig links.

It looks like the kernel is seeing better-cost routes via the 10gig
interfaces, and then ignoring the device parameter?
There's no packet loss going on, and performance is unaffected. I'd
just like to know more about why this is happening.
I always have syslog and my graylog server, but both the ring buffer
and the systemd journal are basically useless because of this constant
flapping.
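
In case it helps anyone reproduce or quantify this, here is a rough
sketch of how the flood could be collapsed into counts per
interface/MAC/IP pair. The function name summarize_vxlan_flaps is made
up for illustration, and the parsing assumes the message format of the
two dmesg lines quoted above; it has not been checked against other
kernel versions.

```shell
#!/bin/sh
# Hypothetical helper: summarize vxlan "migrated from" kernel messages.
# Reads log text on stdin and prints one line per
# (interface, MAC, old IP -> new IP) combination, prefixed by a count.
# Assumes the message format "<iface>: <mac> migrated from <ip> to <ip>".
summarize_vxlan_flaps() {
    awk '
        / migrated from / {
            # Locate the word "migrated" so that a variable-width
            # timestamp prefix does not shift the field positions.
            for (i = 3; i <= NF; i++) {
                if ($i == "migrated" && $(i+1) == "from" && $(i+3) == "to") {
                    iface = $(i-2)
                    sub(/:$/, "", iface)   # drop the trailing colon
                    key = iface " " $(i-1) " " $(i+2) " -> " $(i+4)
                    count[key]++
                }
            }
        }
        END { for (k in count) print count[k], k }
    ' | sort -k1,1nr -k2
}

# Example usage (not run here, may need root):
#   dmesg | summarize_vxlan_flaps
```

This only condenses the noise; it does not explain why the kernel
learns the MAC behind the 10gig addresses in the first place.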

This is how I'm building the vxlan interfaces in
/etc/network/interfaces (identical config across all servers, just
the IPs changed).
I know I'm doing things less than ideally, but am I missing something
big here? Any tips for a better implementation?

auto lo
iface lo inet loopback

auto lo:1
iface lo:1 inet static
address 10.15.30.22
netmask 255.255.255.255

allow-hotplug 1gig1
iface 1gig1 inet static
mtu 1500
address 172.16.150.154
netmask 255.255.255.252
# 1gig1

allow-hotplug 10gig1
iface 10gig1 inet static
mtu 1500
address 172.16.150.82
netmask 255.255.255.252
# 10gig1

allow-hotplug 10gig2
iface 10gig2 inet static
mtu 1500
address 172.16.150.86
netmask 255.255.255.252
# 10gig2

auto vxlan_1gig1
iface vxlan_1gig1 inet manual
up exec `ip link add vxlan_1gig1 type vxlan id 250 group 239.1.250.10 dstport 4789 ttl 2 dev 1gig1; ip link set vxlan_1gig1 up; ip addr add 172.16.250.2/24 dev vxlan_1gig1`
down exec `ip link set vxlan_1gig1 down; ip link del vxlan_1gig1`
# vxlan via 1gig1 - primary corosync network

auto vxlan_10gig1
iface vxlan_10gig1 inet manual
up exec `ip link add vxlan_10gig1 type vxlan id 251 group 239.1.251.10 dstport 4789 ttl 2 dev 10gig1; ip link set vxlan_10gig1 up; ip addr add 172.16.251.2/24 dev vxlan_10gig1`
down exec `ip link set vxlan_10gig1 down; ip link del vxlan_10gig1`
# vxlan via 10gig1 - secondary corosync network, vm migration network

auto vxlan_10gig2
iface vxlan_10gig2 inet manual
up exec `ip link add vxlan_10gig2 type vxlan id 252 group 239.1.252.10 dstport 4789 ttl 2 dev 10gig2; ip link set vxlan_10gig2 up; brctl addif vmbr0 vxlan_10gig2`
down exec `ip link set vxlan_10gig2 down; ip link del vxlan_10gig2`
# vxlan via 10gig2 - vm lan

auto vmbr0
iface vmbr0 inet static
address 10.15.40.2
netmask 255.255.255.0
bridge_ports vxlan_10gig2
bridge_stp off
bridge_fd 0
# vm network

I'd appreciate any help on this or ideas for further
troubleshooting/improvements. I'm deeply confused at this point, but
that probably just means I have something set up wrong.

Thanks,
-ed

