
Bug#596635: linux-image-2.6.32-5-xen-amd64: Xen vif bridge failure / netfront smartpoll bugfix



Package: linux-2.6
Version: 2.6.32-21
Severity: normal

Hello, I have been experiencing random bridge failures with Xen domUs.

My environment is all Debian squeeze, Xen 4.0.1~rc6, PV (not HVM),
generic network setup (network-bridge/vif-bridge scripts).

Randomly, maybe after 10 to 60 minutes of uptime, a domU or two will
fall victim to bridge failure.  There's no syslog/dmesg output.  The
only sign of the problem can be seen through network stats on dom0
(the domU vifX.X interfaces have huge TX drops), and 'brctl showmacs'
output is missing the MAC addresses for the domUs that have failed.
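
For anyone trying to confirm the same symptom on dom0, here is a minimal
Python sketch of the two checks described above.  It is illustrative only:
the function names and the "vif" prefix default are my own choices, and
the MAC check simply scans the text of 'brctl showmacs' for anything that
looks like a MAC address rather than relying on exact column positions.

```python
import re

# Matches a colon-separated MAC address anywhere in a line of text.
MAC_RE = re.compile(r"\b(?:[0-9a-f]{2}:){5}[0-9a-f]{2}\b", re.IGNORECASE)


def tx_drops(proc_net_dev_text, prefix="vif"):
    """Return {interface: TX drop count} for interfaces starting with prefix.

    Expects the contents of /proc/net/dev as a single string.
    """
    drops = {}
    for line in proc_net_dev_text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, _, counters = line.partition(":")
        name = name.strip()
        fields = counters.split()
        if not name.startswith(prefix) or len(fields) < 16:
            continue
        # Fields 0-7 are receive counters and 8-15 are transmit counters;
        # the transmit "drop" column is therefore field 11.
        drops[name] = int(fields[11])
    return drops


def missing_macs(showmacs_text, expected_macs):
    """Return the expected MACs (lower-cased, sorted) absent from the output.

    showmacs_text is the captured output of 'brctl showmacs <bridge>';
    expected_macs is a list the caller maintains (hypothetical input).
    """
    seen = {m.lower() for m in MAC_RE.findall(showmacs_text)}
    return sorted(m.lower() for m in expected_macs if m.lower() not in seen)
```

Feed tx_drops() the contents of /proc/net/dev read on dom0, and
missing_macs() the output of 'brctl showmacs' for the bridge together
with the MAC addresses configured for the domUs.  A vif whose drop
counter keeps climbing while its domU's MAC is reported missing matches
the failure described here.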

The issue has been identified and fixed in this xen-devel mailing list
thread: http://thread.gmane.org/gmane.comp.emulators.xen.devel/88590

I applied Dongxiao Xu's changes to drivers/net/xen-netfront.c, taken
from Jeremy Fitzhardinge's git repository, to the linux-2.6 package.
The patched kernel has been tested and has proven to be stable for the
last few days.  I have attached this patch to this bug report.
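
The attached patch adds a 'use_smartpoll' module parameter to
xen_netfront with mode 0600, so on a patched domU it should be visible
to root under /sys/module/xen_netfront/parameters/.  A small sketch for
checking its value follows; the function name and the overridable base
directory are hypothetical conveniences, not part of the patch.

```python
from pathlib import Path


def smartpoll_setting(sys_module_dir="/sys/module"):
    """Return use_smartpoll as an int, or None if the parameter is absent.

    None is what an unpatched kernel (or an unloaded module) would give.
    The file is mode 0600, so a non-root caller gets PermissionError,
    which is deliberately left to propagate.
    """
    param = (Path(sys_module_dir) / "xen_netfront"
             / "parameters" / "use_smartpoll")
    try:
        return int(param.read_text().strip())
    except FileNotFoundError:
        return None
```

As far as I can tell from the patch, the value is only consulted when
the device talks to the backend and connects, so changing it at runtime
would presumably not affect an interface that is already connected.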

BTW the data reportbug collected about the kernel below probably
isn't very interesting, though it does come from a domU running the
original/unpatched Debian kernel.

-- Package-specific info:
** Version:
Linux version 2.6.32-5-xen-amd64 (Debian 2.6.32-21) (ben@decadent.org.uk) (gcc version 4.3.5 (Debian 4.3.5-2) ) #1 SMP Wed Aug 25 16:02:22 UTC 2010

** Command line:
root=/dev/xvda1 ro 

** Not tainted

** Kernel log:
[    0.024015]   alloc kstat_irqs on node 0
[    0.024019]   alloc irq_desc for 534 on node 0
[    0.024020]   alloc kstat_irqs on node 0
[    0.004000] Initializing CPU#1
[    0.004000] CPU: L1 I cache: 32K, L1 D cache: 32K
[    0.004000] CPU: L2 cache: 256K
[    0.004000] CPU: L3 cache: 8192K
[    0.004000] CPU 1/0x6 -> Node 0
[    0.004000] CPU: Unsupported number of siblings 16
[    0.024252] Brought up 2 CPUs
[    0.024328] CPU0 attaching sched-domain:
[    0.024333]  domain 0: span 0-1 level CPU
[    0.024337]   groups: 0 1
[    0.024346] CPU1 attaching sched-domain:
[    0.024349]  domain 0: span 0-1 level CPU
[    0.024352]   groups: 1 0
[    0.024495] devtmpfs: initialized
[    0.028697] Grant table initialized
[    0.028697] regulator: core version 0.5
[    0.028697] NET: Registered protocol family 16
[    0.028697]   alloc irq_desc for 533 on node 0
[    0.028697]   alloc kstat_irqs on node 0
[    0.028717] PCI: setting up Xen PCI frontend stub
[    0.029288] bio: create slab <bio-0> at 0
[    0.029288] ACPI: Interpreter disabled.
[    0.029288] xen_balloon: Initialising balloon driver with page order 0.
[    0.029288] vgaarb: loaded
[    0.029288] PCI: System does not support PCI
[    0.029288] PCI: System does not support PCI
[    0.029288] Switching to clocksource xen
[    0.029538] pnp: PnP ACPI: disabled
[    0.030125] NET: Registered protocol family 2
[    0.030243] IP route cache hash table entries: 2048 (order: 2, 16384 bytes)
[    0.030601] TCP established hash table entries: 8192 (order: 5, 131072 bytes)
[    0.030672] TCP bind hash table entries: 8192 (order: 5, 131072 bytes)
[    0.030704] TCP: Hash tables configured (established 8192 bind 8192)
[    0.030711] TCP reno registered
[    0.030781] NET: Registered protocol family 1
[    0.030840] Unpacking initramfs...
[    0.034064] Freeing initrd memory: 4904k freed
[    0.037392] platform rtc_cmos: registered platform RTC device (no PNP device found)
[    0.037743] audit: initializing netlink socket (disabled)
[    0.037762] type=2000 audit(1284332221.698:1): initialized
[    0.042365] HugeTLB registered 2 MB page size, pre-allocated 0 pages
[    0.044493] VFS: Disk quotas dquot_6.5.2
[    0.044566] Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[    0.044665] msgmni has been set to 488
[    0.045230] alg: No test for stdrng (krng)
[    0.045363] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
[    0.045378] io scheduler noop registered
[    0.045388] io scheduler anticipatory registered
[    0.045397] io scheduler deadline registered
[    0.045447] io scheduler cfq registered (default)
[    0.055871] registering netback
[    0.057776]   alloc irq_desc for 532 on node 0
[    0.057781]   alloc kstat_irqs on node 0
[    0.058167] Linux agpgart interface v0.103
[    0.058211] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[    0.058466] input: Macintosh mouse button emulation as /devices/virtual/input/input0
[    0.058528] PNP: No PS/2 controller found. Probing ports directly.
[    0.059385] i8042.c: No controller found.
[    0.059509] mice: PS/2 mouse device common for all mice
[    0.059672] rtc_cmos rtc_cmos: rtc core: registered rtc_cmos as rtc0
[    0.059744] cpuidle: using governor ladder
[    0.059750] cpuidle: using governor menu
[    0.059759] No iBFT detected.
[    0.060177] TCP cubic registered
[    0.060389] NET: Registered protocol family 10
[    0.061099] lo: Disabled Privacy Extensions
[    0.061473] Mobile IPv6
[    0.061480] NET: Registered protocol family 17
[    0.061622] PM: Resume from disk failed.
[    0.061637] registered taskstats version 1
[    0.064007] XENBUS: Device with no driver: device/vbd/51713
[    0.064007] XENBUS: Device with no driver: device/vbd/51714
[    0.064007] XENBUS: Device with no driver: device/vif/0
[    0.064007] XENBUS: Device with no driver: device/console/0
[    0.064007] /build/buildd-linux-2.6_2.6.32-21-amd64-bEMv9E/linux-2.6-2.6.32/debian/build/source_amd64_xen/drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
[    0.064007] Initalizing network drop monitor service
[    0.153824] Freeing unused kernel memory: 592k freed
[    0.153971] Write protecting the kernel read-only data: 4320k
[    0.200844] udev: starting version 160
[    0.242532]   alloc irq_desc for 531 on node 0
[    0.242535]   alloc kstat_irqs on node 0
[    0.249938]   alloc irq_desc for 530 on node 0
[    0.249941]   alloc kstat_irqs on node 0
[    0.262922] blkfront: xvda1: barriers enabled
[    0.282053] blkfront: xvda2: barriers enabled
[    0.531739] kjournald starting.  Commit interval 5 seconds
[    0.531769] EXT3-fs: mounted filesystem with ordered data mode.
[    0.764289] udev: starting version 160
[    0.855268] Initialising Xen virtual ethernet driver.
[    0.856665]   alloc irq_desc for 529 on node 0
[    0.856667]   alloc kstat_irqs on node 0
[    0.881754] input: PC Speaker as /devices/platform/pcspkr/input/input1
[    0.886677] Error: Driver 'pcspkr' is already registered, aborting...
[    1.073785] Adding 1048568k swap on /dev/xvda2.  Priority:-1 extents:1 across:1048568k SS
[    1.129525] EXT3 FS on xvda1, internal journal
[    2.229159] ip_tables: (C) 2000-2006 Netfilter Core Team
[   12.048118] eth0: no IPv6 routers present

** Model information
not available

** Loaded modules:
Module                  Size  Used by
xt_multiport            2267  1 
iptable_filter          2258  1 
ip_tables              13899  1 iptable_filter
x_tables               12845  2 xt_multiport,ip_tables
snd_pcsp                6579  0 
snd_pcm                60519  1 snd_pcsp
snd_timer              15582  1 snd_pcm
evdev                   7352  0 
xen_netfront           16073  0 
snd                    46446  3 snd_pcsp,snd_pcm,snd_timer
soundcore               4598  1 snd
snd_page_alloc          6249  1 snd_pcm
ext3                  106502  1 
jbd                    37085  1 ext3
mbcache                 5050  1 ext3
xen_blkfront            9435  2 

** Network interface configuration:

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
  address 192.168.1.30
  netmask 255.255.255.0
  gateway 192.168.1.1

** Network status:
*** IP interfaces and addresses:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:16:3e:00:00:0d brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.30/24 brd 192.168.1.255 scope global eth0
    inet6 fe80::216:3eff:fe00:d/64 scope link 
       valid_lft forever preferred_lft forever

*** Device statistics:
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
  eth0: 1286892     677    0    0    0     0          0         0    96982     560    0    0    0     0       0          0

*** Protocol statistics:
Ip:
    651 total packets received
    0 forwarded
    0 incoming packets discarded
    618 incoming packets delivered
    544 requests sent out
Icmp:
    0 ICMP messages received
    0 input ICMP message failed.
    ICMP input histogram:
    0 ICMP messages sent
    0 ICMP messages failed
    ICMP output histogram:
Tcp:
    13 active connections openings
    2 passive connection openings
    0 failed connection attempts
    0 connection resets received
    6 connections established
    587 segments received
    515 segments send out
    0 segments retransmited
    0 bad segments received.
    2 resets sent
Udp:
    29 packets received
    0 packets to unknown port received.
    0 packet receive errors
    29 packets sent
UdpLite:
TcpExt:
    4 TCP sockets finished time wait in fast timer
    21 delayed acks sent
    170 packets directly queued to recvmsg prequeue.
    899 bytes directly received in process context from prequeue
    233 packet headers predicted
    56 packets header predicted and directly queued to user
    77 acknowledgments not containing data payload received
    148 predicted acknowledgments
IpExt:
    InBcastPkts: 2
    InOctets: 1286054
    OutOctets: 88478
    InBcastOctets: 463

*** Device features:
eth0: 0x50003
lo: 0x13865

** PCI devices:
not available

** USB devices:
not available


-- System Information:
Debian Release: squeeze/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.32-5-xen-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8) (ignored: LC_ALL set to en_US.UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages linux-image-2.6.32-5-xen-amd64 depends on:
ii  debconf [debconf-2.0]         1.5.35     Debian configuration management sy
ii  initramfs-tools               0.98.2     tools for generating an initramfs
ii  linux-base                    2.6.32-21  Linux image base package
ii  module-init-tools             3.12-1     tools for managing Linux kernel mo

Versions of packages linux-image-2.6.32-5-xen-amd64 recommends:
pn  firmware-linux-free           <none>     (no description available)

Versions of packages linux-image-2.6.32-5-xen-amd64 suggests:
pn  grub                          <none>     (no description available)
pn  linux-doc-2.6.32              <none>     (no description available)

Versions of packages linux-image-2.6.32-5-xen-amd64 is related to:
pn  firmware-bnx2                 <none>     (no description available)
pn  firmware-bnx2x                <none>     (no description available)
pn  firmware-ipw2x00              <none>     (no description available)
pn  firmware-ivtv                 <none>     (no description available)
pn  firmware-iwlwifi              <none>     (no description available)
pn  firmware-linux                <none>     (no description available)
pn  firmware-linux-nonfree        <none>     (no description available)
pn  firmware-qlogic               <none>     (no description available)
pn  firmware-ralink               <none>     (no description available)
pn  xen-hypervisor                <none>     (no description available)

-- debconf information:
  linux-image-2.6.32-5-xen-amd64/postinst/depmod-error-initrd-2.6.32-5-xen-amd64: false
  linux-image-2.6.32-5-xen-amd64/postinst/ignoring-do-bootloader-2.6.32-5-xen-amd64:
  linux-image-2.6.32-5-xen-amd64/prerm/removing-running-kernel-2.6.32-5-xen-amd64: true
  linux-image-2.6.32-5-xen-amd64/postinst/missing-firmware-2.6.32-5-xen-amd64:

-- 
Gerald Turner  Email: gturner@unzane.com  JID: gturner@jabber.unzane.com
GPG: 0xFA8CD6D5  21D9 B2E8 7FE7 F19E 5F7D  4D0C 3FA0 810F FA8C D6D5
diff -aNur linux-2.6-2.6.32.orig/debian/patches/features/all/xen/netfront-smartpoll-param.patch linux-2.6-2.6.32/debian/patches/features/all/xen/netfront-smartpoll-param.patch
--- linux-2.6-2.6.32.orig/debian/patches/features/all/xen/netfront-smartpoll-param.patch	1969-12-31 16:00:00.000000000 -0800
+++ linux-2.6-2.6.32/debian/patches/features/all/xen/netfront-smartpoll-param.patch	2010-09-11 10:26:16.000000000 -0700
@@ -0,0 +1,101 @@
+$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git
+$ git checkout xen/netfront
+$ git diff 5473680bdedb7a62e641970119e6e9381a8d80f4..3b966565a89659f938a4fd662c8475f0c00e0606
+
+diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
+index e894dd2..23b9e4d 100644
+--- a/drivers/net/xen-netfront.c
++++ b/drivers/net/xen-netfront.c
+@@ -53,6 +53,10 @@
+ 
+ static const struct ethtool_ops xennet_ethtool_ops;
+ 
++static int use_smartpoll = 1;
++module_param(use_smartpoll, int, 0600);
++MODULE_PARM_DESC (use_smartpoll, "Use smartpoll mechanism if available");
++
+ struct netfront_cb {
+ 	struct page *page;
+ 	unsigned offset;
+@@ -77,8 +81,8 @@ struct netfront_smart_poll {
+ 
+ #define GRANT_INVALID_REF	0
+ 
+-#define NET_TX_RING_SIZE __RING_SIZE((struct xen_netif_tx_sring *)0, PAGE_SIZE)
+-#define NET_RX_RING_SIZE __RING_SIZE((struct xen_netif_rx_sring *)0, PAGE_SIZE)
++#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
++#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
+ #define TX_MAX_TARGET min_t(int, NET_RX_RING_SIZE, 256)
+ 
+ struct netfront_info {
+@@ -1397,10 +1401,15 @@ static irqreturn_t xennet_interrupt(int irq, void *dev_id)
+ 			napi_schedule(&np->napi);
+ 	}
+ 
+-	if (np->smart_poll.feature_smart_poll)
+-		hrtimer_start(&np->smart_poll.timer,
+-			ktime_set(0, NANO_SECOND/np->smart_poll.smart_poll_freq),
+-			HRTIMER_MODE_REL);
++	if (np->smart_poll.feature_smart_poll) {
++		if ( hrtimer_start(&np->smart_poll.timer,
++			ktime_set(0,NANO_SECOND/np->smart_poll.smart_poll_freq),
++			HRTIMER_MODE_REL) ) {
++			printk(KERN_DEBUG "Failed to start hrtimer,"
++					"use interrupt mode for this packet\n");
++			np->rx.sring->private.netif.smartpoll_active = 0;
++		}
++	}
+ 
+ 	spin_unlock_irqrestore(&np->tx_lock, flags);
+ 
+@@ -1538,7 +1547,7 @@ again:
+ 		goto abort_transaction;
+ 	}
+ 
+-	err = xenbus_printf(xbt, dev->nodename, "feature-smart-poll", "%d", 1);
++	err = xenbus_printf(xbt, dev->nodename, "feature-smart-poll", "%d", use_smartpoll);
+ 	if (err) {
+ 		message = "writing feature-smart-poll";
+ 		goto abort_transaction;
+@@ -1631,11 +1640,14 @@ static int xennet_connect(struct net_device *dev)
+ 		return -ENODEV;
+ 	}
+ 
+-	err = xenbus_scanf(XBT_NIL, np->xbdev->otherend,
+-			   "feature-smart-poll", "%u",
+-			   &np->smart_poll.feature_smart_poll);
+-	if (err != 1)
+-		np->smart_poll.feature_smart_poll = 0;
++	np->smart_poll.feature_smart_poll = 0;
++	if (use_smartpoll) {
++		err = xenbus_scanf(XBT_NIL, np->xbdev->otherend,
++				   "feature-smart-poll", "%u",
++				   &np->smart_poll.feature_smart_poll);
++		if (err != 1)
++			np->smart_poll.feature_smart_poll = 0;
++	}
+ 
+ 	if (np->smart_poll.feature_smart_poll) {
+ 		hrtimer_init(&np->smart_poll.timer, CLOCK_MONOTONIC,
+diff --git a/include/xen/interface/io/ring.h b/include/xen/interface/io/ring.h
+index 7b301fa..c9ba846 100644
+--- a/include/xen/interface/io/ring.h
++++ b/include/xen/interface/io/ring.h
+@@ -24,8 +24,15 @@ typedef unsigned int RING_IDX;
+  * A ring contains as many entries as will fit, rounded down to the nearest
+  * power of two (so we can mask with (size-1) to loop around).
+  */
+-#define __RING_SIZE(_s, _sz) \
+-    (__RD32(((_sz) - (long)&(_s)->ring + (long)(_s)) / sizeof((_s)->ring[0])))
++#define __CONST_RING_SIZE(_s, _sz)				\
++	(__RD32(((_sz) - offsetof(struct _s##_sring, ring)) /	\
++		sizeof(((struct _s##_sring *)0)->ring[0])))
++
++/*
++ * The same for passing in an actual pointer instead of a name tag.
++ */
++#define __RING_SIZE(_s, _sz)						\
++	(__RD32(((_sz) - (long)&(_s)->ring + (long)(_s)) / sizeof((_s)->ring[0])))
+ 
+ /*
+  * Macros to make the correct C datatypes for a new kind of ring.
diff -aNur linux-2.6-2.6.32.orig/debian/patches/series/21-extra linux-2.6-2.6.32/debian/patches/series/21-extra
--- linux-2.6-2.6.32.orig/debian/patches/series/21-extra	2010-09-12 15:49:48.000000000 -0700
+++ linux-2.6-2.6.32/debian/patches/series/21-extra	2010-09-11 10:54:10.000000000 -0700
@@ -16,4 +16,5 @@
 + features/all/xen/pvhvm/0016-xen-pvhvm-rename-xen_emul_unplug-ignore-to-unnnec.patch featureset=xen
 + features/all/xen/pvhvm/0017-xen-pvhvm-make-it-clearer-that-XEN_UNPLUG_-define.patch featureset=xen
 + features/all/xen/pvops.patch featureset=xen
++ features/all/xen/netfront-smartpoll-param.patch featureset=xen
 + features/all/xen/revert-stack-guard.patch featureset=xen
diff -aNur linux-2.6-2.6.32.orig/debian/patches/series/21-extra~ linux-2.6-2.6.32/debian/patches/series/21-extra~
--- linux-2.6-2.6.32.orig/debian/patches/series/21-extra~	1969-12-31 16:00:00.000000000 -0800
+++ linux-2.6-2.6.32/debian/patches/series/21-extra~	2010-09-11 10:30:27.000000000 -0700
@@ -0,0 +1,19 @@
++ features/all/xen/pvhvm/0001-xen-Add-support-for-HVM-hypercalls.patch featureset=xen
++ features/all/xen/pvhvm/0002-x86-early-PV-on-HVM-features-initialization.patch featureset=xen
++ features/all/xen/pvhvm/0003-x86-xen-event-channels-delivery-on-HVM.patch featureset=xen
++ features/all/xen/pvhvm/0004-xen-Xen-PCI-platform-device-driver.patch featureset=xen
++ features/all/xen/pvhvm/0005-xen-Add-suspend-resume-support-for-PV-on-HVM-guests.patch featureset=xen
++ features/all/xen/pvhvm/0006-xen-Fix-find_unbound_irq-in-presence-of-ioapic-irqs.patch featureset=xen
++ features/all/xen/pvhvm/0007-x86-Use-xen_vcpuop_clockevent-xen_clocksource-and.patch featureset=xen
++ features/all/xen/pvhvm/0008-x86-Unplug-emulated-disks-and-nics.patch featureset=xen
++ features/all/xen/pvhvm/0009-x86-Call-HVMOP_pagetable_dying-on-exit_mmap.patch featureset=xen
++ features/all/xen/pvhvm/0010-xenfs-enable-for-HVM-domains-too.patch featureset=xen
++ features/all/xen/pvhvm/0011-support-multiple-.discard.-sections-to-avoid-sectio.patch featureset=xen
++ features/all/xen/pvhvm/0012-blkfront-do-not-create-a-PV-cdrom-device-if-xen_hvm.patch featureset=xen
++ features/all/xen/pvhvm/0013-Introduce-CONFIG_XEN_PVHVM-compile-option.patch featureset=xen
++ features/all/xen/pvhvm/0014-pvops-do-not-notify-callers-from-register_xenstore_.patch featureset=xen
++ features/all/xen/pvhvm/0015-xen-pvhvm-allow-user-to-request-no-emulated-device.patch featureset=xen
++ features/all/xen/pvhvm/0016-xen-pvhvm-rename-xen_emul_unplug-ignore-to-unnnec.patch featureset=xen
++ features/all/xen/pvhvm/0017-xen-pvhvm-make-it-clearer-that-XEN_UNPLUG_-define.patch featureset=xen
++ features/all/xen/pvops.patch featureset=xen
++ features/all/xen/revert-stack-guard.patch featureset=xen
