Bug#516479: linux-image-2.6.26-1-xen-amd64: kernel panic in xen_spin_wait on multicore dom0 under high load, not interrupt-safe?
Package: linux-image-2.6.26-1-xen-amd64
Version: 2.6.26-13
Severity: important
Hi folks,
some weeks ago we installed Xen as Dom0 on three different
servers running Lenny. Everything works fine, but one server
(with 2 quad-core CPUs) sometimes rebooted unexpectedly without any
messages in the logs! I found out that the reboot occurs while my backup
script is running. The script copies some LVs from LVM on
another server to this one with netcat (90 GB in total) and then
writes the images to the local disk.
I have tested this situation at least 20 times and the kernel panic
ALWAYS occurs, but the time between the start of the script and
the panic varied randomly between 5 min and 60 min (almost
at the end of the script). One time I caught the kernel panic with a
digital camera [3].
During further tests over the last weeks, I had only one additional case of
an unexpected reboot without any logs, perhaps the same problem. I think the
chance is minimal. But I cannot use my Xen Dom0 for production DomUs if it
can/will crash under high (interrupt) load!
I looked at the code, but I cannot determine whether it is interrupt-safe;
I suspect a problem of this kind. Some discussion about improving
this code can be found in the thread here [1].
After some research I found a workaround: setting dom0-cpus=1. With this
config, the backup script runs through and the machine never restarted
unexpectedly without logs. I think that substantiates the suspicion that it
is an interrupt problem.
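For anyone who wants to try the same workaround, a minimal sketch of the
setting, assuming the stock xend configuration file at
/etc/xen/xend-config.sxp (the path and the need to restart xend afterwards
are assumptions about a default Lenny Xen install, not taken from the
affected machine):

```
# /etc/xen/xend-config.sxp (assumed location)
# 0 lets dom0 use all available CPUs; 1 restricts dom0 to a single CPU.
(dom0-cpus 1)
```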
But this workaround is no longer practical, because there is another
known problem: I cannot shut down the DomUs anymore; they stay in the
state "---s--" (see here [2]).
Greetings,
David Leuenberger
[1] http://lists.xensource.com/archives/html/xen-devel/2008-08/msg00669.html
[2] http://www.nabble.com/Domain-status-after-shutdown-command%3A----s---td15565767.html#a16098048
[3] The local screen after the kernel panic (unfortunately, the last 3 columns
were not shown on the screen)
[ 270.844212] [<ffffffff8026547e>] ? generic_file_unbuffered_write+0x1c0/0x63c....
[ 270.844212] [<ffffffff80231401>] ? current_fs_time+0x1e/0x24
[ 270.844212] [<ffffffff80265c39>] ? __generic_file_aio_write_nolock+0x33f/0...
a9
[ 270.844212] [<ffffffff802a1a03>] ? mnt_drop_write+0x23/0x118
[ 270.844212] [<ffffffff80265d04>] ? generic_file_aio_write+0x61/0xc1
[ 270.844212] [<ffffffffa01772fe>] ? :ext3:ext3_file_write+0x16/0x94
[ 270.844212] [<ffffffff8028a127>] ? do_sync_write+0xc9/0x10c
[ 270.844212] [<ffffffff8020e7b4>] ? get_nsec_offset+0x9/0x2c
[ 270.844212] [<ffffffff8023f691>] ? autoremove_wake_function+0x0/0x2e
[ 270.844212] [<ffffffff8042482b>] ? thread_return+0x3e/0xdb
[ 270.844212] [<ffffffff8028a8d1>] ? vfs_write+0xad/0x156
[ 270.844212] [<ffffffff8028ae73>] ? sys_write+0x45/0x6e
[ 270.844212] [<ffffffff8020b528>] ? system_call+0x68/0x6d
[ 270.844212] [<ffffffff8020b4c0>] ? system_call+0x0/0x6d
[ 270.844212]
[ 270.844212]
[ 270.844212] Code: 30 fa 58 80 4c 39 2c 08 75 04 0f 0b eb fe 48 c7 c0 40 fa...
80 eb 1f 65 48 8b 04 25 10 00 00 00 66 f7 80 44 e0 ff ff 00 ff 75 04 <0f> 0b...
fe 48 c7 c0 30 fa 58 80 48 8d 1c 08 48 83 3b 00 74 04
[ 270.860183] RIP [<ffffffff8037fc9c>] xen_spin_wait+0x90/0x139
[ 270.860183] RSP <ffffffff80595e68>
[ 270.860183] ---[ end trace ccfa2c4ba9fb97dd ]---
[ 270.860183] Kernel panic - not syncing: Aiee, killing interrupt handler!
-- System Information:
Debian Release: 5.0
APT prefers stable
APT policy: (500, 'stable')
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.26-1-xen-amd64 (SMP w/1 CPU core)
Locale: LANG=de_CH.UTF-8, LC_CTYPE=de_CH.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Versions of packages linux-image-2.6.26-1-xen-amd64 depends on:
ii initramfs-tools 0.92o tools for generating an initramfs
ii linux-modules-2.6.26-1-xen-am 2.6.26-13 Linux 2.6.26 modules on AMD64
linux-image-2.6.26-1-xen-amd64 recommends no packages.
Versions of packages linux-image-2.6.26-1-xen-amd64 suggests:
ii grub 0.97-47lenny2 GRand Unified Bootloader (Legacy v
pn linux-doc-2.6.26 <none> (no description available)
-- no debconf information