
Bug#857183: linux-source-3.16: kernel panic while running under KVM - self-detected stall - v9fs_vfs in stack



Package: linux-source-3.16
Version: 3.16.7-ckt20-1+deb8u4
Severity: important
Tags: d-i upstream

Dear Maintainer,

We are running this system under KVM. The KVM host is running:
  3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u2 

The client is running:
  3.16.0-4-686-pae #1 SMP Debian 3.16.7-ckt20-1+deb8u4

We use a 9p filesystem to transfer some data from the KVM host to the client. The
transfer is triggered by cron (on both the KVM host and the client), and the panics occur
around the scheduled time. The KVM host writes some files; the client reads them.
The panics occur almost exclusively on the busiest production systems, and they stop
when I disable the 9p filesystem (i.e. don't mount it on the client). This strongly
points to the 9p filesystem as the cause.
Unfortunately, I cannot reproduce the problem in a test environment :-(
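
For anyone who wants to attempt a reproduction: the trace below shows two CPUs inside the stat64 -> v9fs_vfs_getattr_dotl path at the same time, so one possible approach is to stat() files on the 9p mount from several threads while the host keeps writing into the shared directory. The following is only a minimal sketch under that assumption. It has not been shown to trigger the stall, and the names stat_loop and stress_stat as well as the mount point /mnt/9p are hypothetical.

```python
import os
import threading
import time


def stat_loop(directory, stop_at, counts, idx):
    """Repeatedly stat() every entry in `directory` until the deadline passes."""
    done = 0
    while time.monotonic() < stop_at:
        for name in os.listdir(directory):
            try:
                os.stat(os.path.join(directory, name))
                done += 1
            except FileNotFoundError:
                # The writer may remove or rename a file between listdir() and stat().
                pass
    counts[idx] = done


def stress_stat(directory, workers=4, seconds=1.0):
    """Run `workers` threads of stat_loop for `seconds`; return the total number of stat() calls."""
    stop_at = time.monotonic() + seconds
    counts = [0] * workers
    threads = [
        threading.Thread(target=stat_loop, args=(directory, stop_at, counts, i))
        for i in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(counts)
```

On the client this could be started around the cron time with something like stress_stat("/mnt/9p", workers=4, seconds=60.0), where /mnt/9p stands in for the real mount point. One worker per guest CPU is a guess based on the four CPUs visible in the trace, not a known requirement.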

The kernel trace:

Mar  2 18:31:24 xxxx kernel: [90898.008004] INFO: rcu_sched self-detected stall on CPU { 2}  (t=5250 jiffies g=3946477 c=3946476 q=19488)
Mar  2 18:31:24 xxxx kernel: [90898.008004] sending NMI to all CPUs:
Mar  2 18:31:24 xxxx kernel: [90898.008004] NMI backtrace for cpu 2
Mar  2 18:31:24 xxxx kernel: [90898.008004] EIP: 0060:[<c103ffe5>] EFLAGS: 00010046 CPU: 2
Mar  2 18:31:24 xxxx kernel: [90898.008004] EIP is at default_send_IPI_mask_logical+0x85/0xc0
Mar  2 18:31:24 xxxx kernel: [90898.008004] EAX: 00000c00 EBX: 0f000000 ECX: fffff000 EDX: fffff000
Mar  2 18:31:24 xxxx kernel: [90898.008004] ESI: 00000002 EDI: 00000006 EBP: e855fd64 ESP: e855fd54
Mar  2 18:31:24 xxxx kernel: [90898.008004]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Mar  2 18:31:24 xxxx kernel: [90898.008004] CR0: 8005003b CR2: 09a022d8 CR3: 1ace1000 CR4: 000006f0
Mar  2 18:31:24 xxxx kernel: [90898.008004] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
Mar  2 18:31:24 xxxx kernel: [90898.008004] DR6: fffe0ff0 DR7: 00000400
Mar  2 18:31:24 xxxx kernel: [90898.008004]  00000800 c1614f80 c1614f80 f75d2620 e855fd78 c10402ec c15483d6 c157a0b6
Mar  2 18:31:24 xxxx kernel: [90898.008004]  c1614f80 e855fdc0 c10ac8cd c1557904 00001482 003c37ed 003c37ec 00004c20
Mar  2 18:31:24 xxxx kernel: [90898.008004]  00000000 e855fdc0 c1084750 00000001 c166a5ec c1614f80 f75d2620 00000002
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<c10402ec>] ? arch_trigger_all_cpu_backtrace+0x5c/0xd0
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<c10ac8cd>] ? rcu_check_callbacks+0x38d/0x5b0
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<c1084750>] ? account_process_tick+0x60/0x130
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<c1062d4c>] ? update_process_times+0x3c/0x60
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<c10b7856>] ? tick_sched_handle.isra.13+0x26/0x60
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<c10b78c7>] ? tick_sched_timer+0x37/0x70
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<c1075d58>] ? __remove_hrtimer+0x38/0x90
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<c107666d>] ? __run_hrtimer+0x6d/0x190
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<c10b7890>] ? tick_sched_handle.isra.13+0x60/0x60
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<c1076de8>] ? hrtimer_interrupt+0x1e8/0x2a0
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<c108960e>] ? check_preempt_wakeup+0x12e/0x1a0
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<c103e34f>] ? local_apic_timer_interrupt+0x2f/0x60
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<c147f843>] ? smp_apic_timer_interrupt+0x33/0x50
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<c147ef7c>] ? apic_timer_interrupt+0x34/0x3c
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<c11000e0>] ? event_enable_trigger_func+0x170/0x290
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<c117034d>] ? generic_fillattr+0x9d/0xa0
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<f816dc08>] ? v9fs_vfs_getattr_dotl+0x58/0x90 [9p]
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<f816dbb0>] ? v9fs_vfs_mkdir_dotl+0x1e0/0x1e0 [9p]
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<c1170373>] ? vfs_getattr_nosec+0x23/0x40
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<c1170469>] ? vfs_fstatat+0x59/0x90
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<c1170e46>] ? SyS_stat64+0x26/0x40
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<c1169e8a>] ? filp_close+0x4a/0x70
Mar  2 18:31:24 xxxx kernel: [90898.008004]  [<c147e709>] ? syscall_call+0x7/0x7
Mar  2 18:31:24 xxxx kernel: [90898.011759] NMI backtrace for cpu 3
Mar  2 18:31:24 xxxx kernel: [90898.012003] EIP: 0060:[<c117034d>] EFLAGS: 00000202 CPU: 3
Mar  2 18:31:24 xxxx kernel: [90898.012003] EIP is at generic_fillattr+0x9d/0xa0
Mar  2 18:31:24 xxxx kernel: [90898.012003] EAX: f37171d8 EBX: 0004200b ECX: 00000000 EDX: f0533f5c
Mar  2 18:31:24 xxxx kernel: [90898.012003] ESI: 00001bbd EDI: f0533f5c EBP: f0533efc ESP: f0533ef4
Mar  2 18:31:24 xxxx kernel: [90898.012003]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Mar  2 18:31:24 xxxx kernel: [90898.012003] CR0: 8005003b CR2: 09a022d8 CR3: 325c1000 CR4: 000006f0
Mar  2 18:31:24 xxxx kernel: [90898.012003] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
Mar  2 18:31:24 xxxx kernel: [90898.012003] DR6: fffe0ff0 DR7: 00000400
Mar  2 18:31:24 xxxx kernel: [90898.012003]  d33bc880 ed2d2100 f0533f10 f816dc08 f816dbb0 d33bc880 f37171d8 f0533f24
Mar  2 18:31:24 xxxx kernel: [90898.012003]  c1170373 f0533f3c 00000001 00000000 f0533f50 c1170469 f0533f3c f0533f5c
Mar  2 18:31:24 xxxx kernel: [90898.012003]  080f9547 ffffff9c f28e6a10 d33bc880 bffa6ac0 09a15508 09a03be8 f0533fac
Mar  2 18:31:24 xxxx kernel: [90898.012003]  [<f816dc08>] ? v9fs_vfs_getattr_dotl+0x58/0x90 [9p]
Mar  2 18:31:24 xxxx kernel: [90898.012003]  [<f816dbb0>] ? v9fs_vfs_mkdir_dotl+0x1e0/0x1e0 [9p]
Mar  2 18:31:24 xxxx kernel: [90898.012003]  [<c1170373>] ? vfs_getattr_nosec+0x23/0x40
Mar  2 18:31:24 xxxx kernel: [90898.012003]  [<c1170469>] ? vfs_fstatat+0x59/0x90
Mar  2 18:31:24 xxxx kernel: [90898.012003]  [<c1170e46>] ? SyS_stat64+0x26/0x40
Mar  2 18:31:24 xxxx kernel: [90898.012003]  [<c1169e8a>] ? filp_close+0x4a/0x70
Mar  2 18:31:24 xxxx kernel: [90898.012003]  [<c147e709>] ? syscall_call+0x7/0x7
Mar  2 18:31:24 xxxx kernel: [90898.010212] NMI backtrace for cpu 1
Mar  2 18:31:24 xxxx kernel: [90898.012003] INFO: rcu_sched self-detected stall on CPU { 3}  (t=5250 jiffies g=3946477 c=3946476 q=19488)
Mar  2 18:31:24 xxxx kernel: [90898.010212] EIP: 0060:[<c1049a12>] EFLAGS: 00200246 CPU: 1
Mar  2 18:31:24 xxxx kernel: [90898.010212] EIP is at native_safe_halt+0x2/0x10
Mar  2 18:31:24 xxxx kernel: [90898.010212] EAX: 00000000 EBX: f3910000 ECX: 00000000 EDX: f3910000
Mar  2 18:31:24 xxxx kernel: [90898.010212] ESI: ffffffed EDI: 00000001 EBP: f3911f50 ESP: f3911f40
Mar  2 18:31:24 xxxx kernel: [90898.010212]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Mar  2 18:31:24 xxxx kernel: [90898.010212] CR0: 8005003b CR2: 09e40c08 CR3: 2dc50000 CR4: 000006f0
Mar  2 18:31:24 xxxx kernel: [90898.010212] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
Mar  2 18:31:24 xxxx kernel: [90898.010212] DR6: fffe0ff0 DR7: 00000400
Mar  2 18:31:24 xxxx kernel: [90898.010212]  c10180dc f3910000 ffffffed 00000000 f3911f58 c101887e f3911f90 c1090c53
Mar  2 18:31:24 xxxx kernel: [90898.010212]  f3911fec 00000000 f3911fec 00200282 f3911fec 00000000 00000000 a4d09de9
Mar  2 18:31:24 xxxx kernel: [90898.010212]  043f7133 98ca892d 00000000 00000000 f3911fb4 c103c527 00000000 00000000
Mar  2 18:31:24 xxxx kernel: [90898.010212]  [<c10180dc>] ? default_idle+0x1c/0xa0
Mar  2 18:31:24 xxxx kernel: [90898.010212]  [<c101887e>] ? arch_cpu_idle+0xe/0x10
Mar  2 18:31:24 xxxx kernel: [90898.010212]  [<c1090c53>] ? cpu_startup_entry+0x303/0x3b0
Mar  2 18:31:24 xxxx kernel: [90898.010212]  [<c103c527>] ? start_secondary+0x207/0x2e0
Mar  2 18:31:24 xxxx kernel: [90898.012003] NMI backtrace for cpu 0
Mar  2 18:31:24 xxxx kernel: [90898.012003] EIP: 0060:[<c1049a12>] EFLAGS: 00200246 CPU: 0
Mar  2 18:31:24 xxxx kernel: [90898.012003] EIP is at native_safe_halt+0x2/0x10
Mar  2 18:31:24 xxxx kernel: [90898.012003] EAX: 00000000 EBX: c15ee000 ECX: 00000000 EDX: c15ee000
Mar  2 18:31:24 xxxx kernel: [90898.012003] ESI: ffffffed EDI: 00000000 EBP: c15eff98 ESP: c15eff88
Mar  2 18:31:24 xxxx kernel: [90898.012003]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Mar  2 18:31:24 xxxx kernel: [90898.012003] CR0: 8005003b CR2: 0975d050 CR3: 19e55000 CR4: 000006f0
Mar  2 18:31:24 xxxx kernel: [90898.012003] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
Mar  2 18:31:24 xxxx kernel: [90898.012003] DR6: fffe0ff0 DR7: 00000400
Mar  2 18:31:24 xxxx kernel: [90898.012003]  c10180dc c15ee000 ffffffed c171f800 c15effa0 c101887e c15effd8 c1090c53
Mar  2 18:31:24 xxxx kernel: [90898.012003]  c15effec c15effbc c15effec 00000000 c15effec 00000000 00000000 0410eae1
Mar  2 18:31:24 xxxx kernel: [90898.012003]  1b158d91 c16c23a0 00099800 c171f800 01bf6003 c1671bf9 0000007e ffffffff
Mar  2 18:31:24 xxxx kernel: [90898.012003]  [<c10180dc>] ? default_idle+0x1c/0xa0
Mar  2 18:31:24 xxxx kernel: [90898.012003]  [<c101887e>] ? arch_cpu_idle+0xe/0x10
Mar  2 18:31:24 xxxx kernel: [90898.012003]  [<c1090c53>] ? cpu_startup_entry+0x303/0x3b0
Mar  2 18:31:24 xxxx kernel: [90898.012003]  [<c1671bf9>] ? start_kernel+0x3dd/0x3e2
Mar  2 18:31:24 xxxx kernel: [90898.012003]  [<c1671625>] ? set_init_arg+0x45/0x45
Mar  2 18:31:24 xxxx kernel: [90898.008004] INFO: NMI handler (arch_trigger_all_cpu_backtrace_handler) took too long to run: 15.261 msecs
Mar  2 18:31:50 xxxx kernel: [90924.056004] EIP: 0060:[<c117034d>] EFLAGS: 00000202 CPU: 2
Mar  2 18:31:50 xxxx kernel: [90924.056004] EIP is at generic_fillattr+0x9d/0xa0
Mar  2 18:31:50 xxxx kernel: [90924.056004] EAX: f37171d8 EBX: 0004200b ECX: 00000000 EDX: e855ff5c
Mar  2 18:31:50 xxxx kernel: [90924.056004] ESI: 00001bbd EDI: e855ff5c EBP: e855fefc ESP: e855fef4
Mar  2 18:31:50 xxxx kernel: [90924.056004]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Mar  2 18:31:50 xxxx kernel: [90924.056004] CR0: 8005003b CR2: 09a022d8 CR3: 1ace1000 CR4: 000006f0
Mar  2 18:31:50 xxxx kernel: [90924.056004] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
Mar  2 18:31:50 xxxx kernel: [90924.056004] DR6: fffe0ff0 DR7: 00000400
Mar  2 18:31:50 xxxx kernel: [90924.056004]  d33bc880 eec3aa00 e855ff10 f816dc08 f816dbb0 d33bc880 f37171d8 e855ff24
Mar  2 18:31:50 xxxx kernel: [90924.056004]  c1170373 e855ff3c 00000001 00000000 e855ff50 c1170469 e855ff3c e855ff5c
Mar  2 18:31:50 xxxx kernel: [90924.056004]  080f9547 ffffff9c f28e6a10 d33bc880 bffa6ac0 09a15508 09a173c8 e855ffac
Mar  2 18:31:50 xxxx kernel: [90924.056004]  [<f816dc08>] ? v9fs_vfs_getattr_dotl+0x58/0x90 [9p]
Mar  2 18:31:50 xxxx kernel: [90924.056004]  [<f816dbb0>] ? v9fs_vfs_mkdir_dotl+0x1e0/0x1e0 [9p]
Mar  2 18:31:50 xxxx kernel: [90924.056004]  [<c1170373>] ? vfs_getattr_nosec+0x23/0x40
Mar  2 18:31:50 xxxx kernel: [90924.056004]  [<c1170469>] ? vfs_fstatat+0x59/0x90
Mar  2 18:31:50 xxxx kernel: [90924.056004]  [<c1170e46>] ? SyS_stat64+0x26/0x40
Mar  2 18:31:50 xxxx kernel: [90924.056004]  [<c1169e8a>] ? filp_close+0x4a/0x70
Mar  2 18:31:50 xxxx kernel: [90924.056004]  [<c147e709>] ? syscall_call+0x7/0x7
Mar  2 18:31:50 xxxx kernel: [90924.072004] EIP: 0060:[<c117034d>] EFLAGS: 00000202 CPU: 3
Mar  2 18:31:50 xxxx kernel: [90924.072004] EIP is at generic_fillattr+0x9d/0xa0
Mar  2 18:31:50 xxxx kernel: [90924.072004] EAX: f37171d8 EBX: 0004200b ECX: 00000000 EDX: f0533f5c
   ... this keeps going....
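
Since the output repeats, a short summary may be easier to compare across systems than the full log. The sketch below counts which CPUs reported a stall and which function each "EIP is at" line names. It assumes the syslog line format shown above; the function name summarize and the two regular expressions are illustrative and have not been checked against other kernel versions.

```python
import re
from collections import Counter

# Matches e.g. "self-detected stall on CPU { 2}"
STALL_RE = re.compile(r"self-detected stall on CPU \{\s*(\d+)\}")
# Matches e.g. "EIP is at generic_fillattr+0x9d/0xa0" and captures the symbol name
EIP_RE = re.compile(r"EIP is at (\S+?)\+0x")


def summarize(lines):
    """Return (sorted list of stalled CPU numbers, Counter of EIP symbol names)."""
    stalled = set()
    eip_symbols = Counter()
    for line in lines:
        m = STALL_RE.search(line)
        if m:
            stalled.add(int(m.group(1)))
        m = EIP_RE.search(line)
        if m:
            eip_symbols[m.group(1)] += 1
    return sorted(stalled), eip_symbols
```

It can be fed any iterable of log lines, for example an open file object for the saved kernel log.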

-- System Information:
Debian Release: A bit of a mix between 7 and 8
  APT prefers stable
Architecture: i386 client, amd64 host

