
Bug#516479: linux-image-2.6.26-1-xen-amd64: kernel panic in xen_spin_wait on multicore dom0 under high load, not interrupt-safe?



Package: linux-image-2.6.26-1-xen-amd64
Version: 2.6.26-13
Severity: important


Hi folks,

A few weeks ago we installed Xen as Dom0 on three different
servers running Lenny. Everything works fine, but one server
(with two quad-core CPUs) sometimes rebooted unexpectedly without any
messages in the logs! I found out that the reboot occurs while my backup
script is running. The script copies some LVs from the LVM of
another server to this one with netcat (90 GB in total) and then
writes the images to the local disk.
I have tested this situation at least 20 times and the kernel panic
ALWAYS occurs, but the time between the start of the script and
the panic varied randomly between 5 min and 60 min (almost
at the end of the script). Once I caught the kernel panic with a
digital camera [3].
During further tests over the last weeks, I saw only one additional
unexpected reboot without any logs, perhaps caused by the same problem.
I think the chance of that is small. But I cannot use my Xen Dom0 for
production DomUs if it can crash under high (interrupt) load!
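
For anyone who wants to generate a similar load without the original backup
script, the receiving side can be approximated as below. This is only a rough
sketch and not the script used in this report: the function names, the port
number 9000 and the 1 MiB chunk size are assumptions, and the code simply
stands in for a netcat listener whose output is redirected to a file.

```python
import socket

CHUNK = 1 << 20  # 1 MiB per read (arbitrary assumption)


def make_listener(host="0.0.0.0", port=9000):
    """Create a listening TCP socket (port 9000 is a hypothetical choice)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(1)
    return sock


def receive_image(listen_sock, out_path, chunk=CHUNK):
    """Accept one connection and stream everything it sends into out_path.

    Returns the number of bytes written.
    """
    conn, _addr = listen_sock.accept()
    total = 0
    with conn, open(out_path, "wb") as out:
        while True:
            data = conn.recv(chunk)
            if not data:  # sender closed the connection
                break
            out.write(data)
            total += len(data)
    return total
```

The sending server would then push the LV contents to this port, for example
by piping the device into netcat; that half is not shown here.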

I looked at the code, but I cannot determine whether it is interrupt-safe.
Still, I suspect a problem of this kind. Some discussion about improving
this code can be found in the thread at [1].

After some research I found a workaround: setting dom0-cpus=1. With this
configuration the backup script runs to completion and the machine has never
restarted unexpectedly without a log entry. I think this substantiates the
suspicion that it is an interrupt problem.
But this workaround is not practical either, because there is another
known problem: I can no longer shut down the DomUs; they stay in the
state "---s--" (see [2]).
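
For readers who want to try the workaround: assuming the setting meant above
is xend's dom0-cpus option (the report does not name the file, so the path
below is an assumption), the entry would look roughly like this. xend
presumably has to be restarted before the change takes effect.

```
# /etc/xen/xend-config.sxp (assumed location)
# 0 lets dom0 use all available CPUs; 1 restricts dom0 to a single CPU
(dom0-cpus 1)
```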

Greetings,
David Leuenberger

[1] http://lists.xensource.com/archives/html/xen-devel/2008-08/msg00669.html
[2] http://www.nabble.com/Domain-status-after-shutdown-command%3A----s---td15565767.html#a16098048
[3] The local screen after the kernel panic (unfortunately, the last 3 columns were
not shown on the screen)
[  270.844212]  [<ffffffff8026547e>] ? generic_file_unbuffered_write+0x1c0/0x63c....
[  270.844212]  [<ffffffff80231401>] ? current_fs_time+0x1e/0x24
[  270.844212]  [<ffffffff80265c39>] ? __generic_file_aio_write_nolock+0x33f/0...
a9
[  270.844212]  [<ffffffff802a1a03>] ? mnt_drop_write+0x23/0x118
[  270.844212]  [<ffffffff80265d04>] ? generic_file_aio_write+0x61/0xc1
[  270.844212]  [<ffffffffa01772fe>] ? :ext3:ext3_file_write+0x16/0x94
[  270.844212]  [<ffffffff8028a127>] ? do_sync_write+0xc9/0x10c
[  270.844212]  [<ffffffff8020e7b4>] ? get_nsec_offset+0x9/0x2c
[  270.844212]  [<ffffffff8023f691>] ? autoremove_wake_function+0x0/0x2e
[  270.844212]  [<ffffffff8042482b>] ? thread_return+0x3e/0xdb
[  270.844212]  [<ffffffff8028a8d1>] ? vfs_write+0xad/0x156
[  270.844212]  [<ffffffff8028ae73>] ? sys_write+0x45/0x6e
[  270.844212]  [<ffffffff8020b528>] ? system_call+0x68/0x6d
[  270.844212]  [<ffffffff8020b4c0>] ? system_call+0x0/0x6d
[  270.844212]
[  270.844212]
[  270.844212] Code: 30 fa 58 80 4c 39 2c 08 75 04 0f 0b eb fe 48 c7 c0 40 fa...
 80 eb 1f 65 48 8b 04 25 10 00 00 00 66 f7 80 44 e0 ff ff 00 ff 75 04 <0f> 0b...
 fe 48 c7 c0 30 fa 58 80 48 8d 1c 08 48 83 3b 00 74 04
[  270.860183] RIP  [<ffffffff8037fc9c>] xen_spin_wait+0x90/0x139
[  270.860183]  RSP <ffffffff80595e68>
[  270.860183] ---[ end trace ccfa2c4ba9fb97dd ]---
[  270.860183] Kernel panic - not syncing: Aiee, killing interrupt handler!


-- System Information:
Debian Release: 5.0
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.26-1-xen-amd64 (SMP w/1 CPU core)
Locale: LANG=de_CH.UTF-8, LC_CTYPE=de_CH.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages linux-image-2.6.26-1-xen-amd64 depends on:
ii  initramfs-tools               0.92o      tools for generating an initramfs
ii  linux-modules-2.6.26-1-xen-am 2.6.26-13  Linux 2.6.26 modules on AMD64

linux-image-2.6.26-1-xen-amd64 recommends no packages.

Versions of packages linux-image-2.6.26-1-xen-amd64 suggests:
ii  grub                       0.97-47lenny2 GRand Unified Bootloader (Legacy v
pn  linux-doc-2.6.26           <none>        (no description available)

-- no debconf information


