
Bug#398251: Oopses in domUs with pass-through PCI devices



Package: linux-image-2.6.18-2-xen-vserver-686
Version: 2.6.18-5

I use Debian's 2.6.18-5 kernel sources to build my own xen-vserver kernel image 
for my Sarge XEN/vserver system. 
The problem is the following: domUs with pass-through PCI devices seem to have 
trouble with interrupt handling, as this dmesg excerpt shows (from the domU 
named 'trustedvserver'):

irq 16: nobody cared (try booting with the "irqpoll" option)
 [<c014f4aa>] __report_bad_irq+0x2a/0x90
 [<c014f5e7>] note_interrupt+0xb7/0xe0
 [<c014ebc8>] __do_IRQ+0x138/0x150
 [<c0106ce8>] do_IRQ+0x48/0xa0
 [<c0257ee3>] evtchn_do_upcall+0x93/0x110
 [<c0105285>] hypervisor_callback+0x3d/0x48
 [<c01099f7>] raw_safe_halt+0x27/0x60
 [<c0102ad3>] xen_idle+0x23/0x40
 [<c0102b73>] cpu_idle+0x83/0xd0
 [<c0406a0f>] start_kernel+0x1cf/0x230
 [<c0406300>] unknown_bootoption+0x0/0x1e0
handlers:
[<c02aead0>] (usb_hcd_irq+0x0/0x70)
Disabling IRQ #16

(This is the onboard USB controller with a USB sound adapter attached to it.)
The bug also seems to kill the kernel NFS server: it stops responding after 
such a kernel oops, and the domU has to be restarted. 


This second dmesg excerpt is from another domU called 'router' with two NICs 
(a 3Com 10/100 (ethdsl) and an Intel e1000 (eth0)):

NETDEV WATCHDOG: eth0: transmit timed out
e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex
NETDEV WATCHDOG: ethdsl: transmit timed out
ethdsl: transmit timed out, tx_status 00 status e601.
  diagnostics: net 0ccc media 8880 dma 0000003a fifo 0000
ethdsl: Interrupt posted but not delivered -- IRQ blocked by another device?
  Flags; bus-master 1, dirty 16894(14) current 16894(14)
  Transmit list 00000000 vs. c43d7ac0.
  0: @c43d7200  length 80000024 status 00010024
  1: @c43d72a0  length 80000024 status 00010024
  2: @c43d7340  length 80000024 status 00010024
  3: @c43d73e0  length 80000024 status 00010024
  4: @c43d7480  length 80000024 status 00010024
  5: @c43d7520  length 80000024 status 00010024
  6: @c43d75c0  length 80000024 status 00010024
  7: @c43d7660  length 80000024 status 00010024
  8: @c43d7700  length 80000024 status 00010024
  9: @c43d77a0  length 80000020 status 00010020
  10: @c43d7840  length 80000020 status 00010020
  11: @c43d78e0  length 80000020 status 00010020
  12: @c43d7980  length 80000020 status 80010020
  13: @c43d7a20  length 80000020 status 80010020
  14: @c43d7ac0  length 80000034 status 00010034
  15: @c43d7b60  length 80000024 status 00010024
ethdsl: Resetting the Tx ring pointer.


A third domU (called 'dmzvserver') without any physical devices does not 
suffer from these problems. 
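
To compare domUs quickly, the two symptoms shown above can be pulled out of 
saved dmesg output with a short script. This is only a sketch: the function 
name scan_dmesg and the result keys are made up for illustration, and the 
patterns match only the exact message wording seen in the excerpts above.

```python
import re

# Patterns taken from the message wording in the dmesg excerpts above.
IRQ_DISABLED = re.compile(r"Disabling IRQ #(\d+)")
TX_TIMEOUT = re.compile(r"NETDEV WATCHDOG: (\S+): transmit timed out")


def scan_dmesg(text):
    """Return the IRQs the kernel disabled and the NICs that hit a Tx timeout.

    `text` is the full dmesg output of one domU as a single string.
    """
    return {
        "disabled_irqs": sorted({int(n) for n in IRQ_DISABLED.findall(text)}),
        "tx_timeouts": sorted(set(TX_TIMEOUT.findall(text))),
    }


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): dmesg > router.log; python scan.py < router.log
    print(scan_dmesg(sys.stdin.read()))
```

An empty result for both keys would match what I see on 'dmzvserver', while 
the two domUs with pass-through devices should report at least one entry.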


I'm using XEN 3.0.3 (no Debian packages, the stock XEN release tarball).
When booting the domUs with my previous 2.6.16.27 kernel (for XEN 3.0.2) there 
is no problem, although I have to use kernel 2.6.18.2 for dom0 (XEN 3.0.3 
doesn't want to boot dom0 with my 2.6.16.27). 
I also saw this problem with the 2.6.18-3/2.6.18-4 sources with the patch 
mentioned in #397281 applied. 

More details, more complete dmesg output and various other config files can be 
found here: http://markus.schuster.name/2.6.18.2-xen-vserver/

Regards, 
Markus Schuster
