Bug#691665: [squeeze] Under CentOS 6.3 as a host OS Debian kernel hangs as a guest OS (KVM-QEMU)
found 691665 linux-2.6/3.3.4-1~experimental.1
fixed 691665 linux-2.6/3.4.1-1~experimental.1
quit
Zoltan Frombach wrote:
> I still experience the same problem with all 3.3.x kernels ( up to
> linux-image-3.3.0-trunk-amd64 amd64 3.3.4-1~experimental.1 )
>
> The problem is first resolved in this kernel: linux-image-3.4-trunk-amd64
> amd64 3.4.1-1~experimental.1
That was fast.
| $ git log --oneline --no-merges --grep=KVM v3.3.4..v3.4.1 -- arch/x86 | wc -l
| 55
Please test the attached patch against a 3.2.y-based kernel, for example
using the following instructions:
0. prerequisites
apt-get install git build-essential
1. get the kernel history, if you don't already have it
git clone \
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
2. fetch point releases
cd linux
git remote add stable \
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
git fetch stable
3. configure, build, test
git checkout stable/linux-3.2.y
cp /boot/config-$(uname -r) .config; # current configuration
scripts/config --disable DEBUG_INFO
make localmodconfig; # optional: minimize configuration
make deb-pkg; # optionally with -j<num> for parallel build
dpkg -i ../<name of package>; # as root
reboot
Hopefully it reproduces the bug, so
4. try the patch
cd linux
git am -3sc /path/to/the/patch
make deb-pkg; # maybe with -j4
dpkg -i ../<name of package>; # as root
reboot
I don't expect this patch to work, but it's a place to start.
If you find yourself with nothing to do before someone looks more
carefully at the range with the fix, here's how to bisect to narrow it
further. This might be a little confusing: here, a "good" kernel is
one exhibiting the bug and a "bad" one is a fixed one, so the "first
bad commit" is the fix.
4. tell git about experiments so far
git bisect start -- '*kvm*'
git bisect bad v3.4.1; # boots ok
git bisect good v3.3.4; # hangs
git checks out a revision halfway between to test, so
5. test
make deb-pkg; # maybe with -j4
dpkg -i ../<name of package>; # as root
reboot
git bisect bad; # if it boots ok
git bisect good; # if it hangs in a similar way
git bisect skip; # if some other bug makes it hard to test
6. repeat step 5 until bored or until it shows the "first bad commit"
7. at any step, if the gitk package is installed you can run "git
bisect visualize" to watch the range with the fix narrowing,
or "git bisect log" to summarize the tests you've already done
and allow someone else to pick up where you left off
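
To make the "pick up where you left off" part of step 7 concrete, here is a minimal sketch of saving a bisect log and replaying it. It runs in a throwaway repository rather than the kernel tree; the 16 dummy commits, the temporary directory, and the bisect.log file name are all assumptions for illustration. In the real case the log would be mailed to the bug and replayed in the other person's linux clone.

```shell
#!/bin/sh
# Toy demonstration: save a bisect session with "git bisect log" and
# resume it with "git bisect replay". Everything here is a stand-in
# for the kernel tree; only the git commands themselves are real.
set -eu

work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git config user.email "tester@example.invalid"
git config user.name "Tester"

# Sixteen dummy commits to bisect over.
i=1
while [ "$i" -le 16 ]; do
    echo "$i" > counter
    git add counter
    git commit -q -m "commit $i"
    i=$((i + 1))
done
first=$(git rev-list --max-parents=0 HEAD)

# Start a session and record one test result.
git bisect start >/dev/null
git bisect bad HEAD >/dev/null
git bisect good "$first" >/dev/null
git bisect good >/dev/null
before=$(git rev-parse HEAD)

# Save the session, abandon it, then resume from the saved log.
git bisect log > "$work/bisect.log"
git bisect reset >/dev/null 2>&1
git bisect replay "$work/bisect.log" >/dev/null
after=$(git rev-parse HEAD)

echo "before: $before"
echo "after:  $after"
```

After the replay, git should have checked out the same revision that was waiting to be tested before the reset, so no test results are lost.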
Sometimes this process is called a "reverse bisect", since we are
trying to find the patch that introduces a fix rather than one that
introduces a regression. If you are curious, you can read more about
the ordinary kind of "git bisect" at [1].
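
If the inverted meaning of "good" and "bad" is too confusing, newer versions of git (2.7 and later, as far as I know, so not the git shipped with squeeze) let you choose your own terms with --term-old and --term-new. Below is a minimal sketch in a throwaway repository. The terms "broken" and "fixed", the ten dummy commits, and the "state" file that stands in for "does the guest hang?" are all made up for illustration. In the kernel case each test is still a build, install, and reboot done by hand.

```shell
#!/bin/sh
# Toy demonstration of a "reverse bisect" with custom terms.
# "broken" takes the role of "good" (old behaviour) and
# "fixed" takes the role of "bad" (new behaviour).
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "tester@example.invalid"
git config user.name "Tester"

# Ten dummy commits; the hypothetical fix lands in commit 7.
i=1
while [ "$i" -le 10 ]; do
    if [ "$i" -ge 7 ]; then
        echo fixed > state
    else
        echo broken > state
    fi
    echo "$i" > counter
    git add state counter
    git commit -q -m "commit $i"
    i=$((i + 1))
done
first=$(git rev-list --max-parents=0 HEAD)

git bisect start --term-old=broken --term-new=fixed >/dev/null
git bisect fixed HEAD >/dev/null
git bisect broken "$first" >/dev/null

# "git bisect run" treats exit status 0 as the old term ("broken")
# and 1..127 (except 125) as the new term ("fixed"). grep exits 0
# while the file still says "broken".
git bisect run sh -c 'grep -q broken state' >/dev/null

# The ref named after the new term points at the first fixed commit.
fix_subject=$(git log -1 --format=%s refs/bisect/fixed)
echo "first fixed commit: $fix_subject"

git bisect reset >/dev/null 2>&1
```

With these terms the final message from git reads "is the first fixed commit", which matches what we are actually looking for.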
Hope that helps,
Jonathan
[1] http://www.kernel.org/pub/software/scm/git/docs/git-bisect-lk2009.html
see also http://kernel-handbook.alioth.debian.org/ch-bugs.html#s9.2.1
From: Gleb Natapov <gleb@redhat.com>
Date: Wed, 2 May 2012 15:04:02 +0300
Subject: KVM: Do not take reference to mm during async #PF
commit 62c49cc976af84cb0ffcb5ec07ee88da1a94e222 upstream.
It turned to be totally unneeded. The reason the code was introduced is
so that KVM can prefault swapped in page, but prefault can fail even
if mm is pinned since page table can change anyway. KVM handles this
situation correctly though and does not inject spurious page faults.
Fixes:
"INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected" warning while
running LTP inside a KVM guest using the recent -next kernel.
Reported-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
arch/x86/kernel/kvm.c | 9 +--------
1 file changed, 1 insertion(+), 8 deletions(-)
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index a9c2116001d6..a516c9bbc7e0 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -91,7 +91,6 @@ struct kvm_task_sleep_node {
u32 token;
int cpu;
bool halted;
- struct mm_struct *mm;
};
static struct kvm_task_sleep_head {
@@ -138,9 +137,7 @@ void kvm_async_pf_task_wait(u32 token)
n.token = token;
n.cpu = smp_processor_id();
- n.mm = current->active_mm;
n.halted = idle || preempt_count() > 1;
- atomic_inc(&n.mm->mm_count);
init_waitqueue_head(&n.wq);
hlist_add_head(&n.link, &b->list);
spin_unlock(&b->lock);
@@ -173,9 +170,6 @@ EXPORT_SYMBOL_GPL(kvm_async_pf_task_wait);
static void apf_task_wake_one(struct kvm_task_sleep_node *n)
{
hlist_del_init(&n->link);
- if (!n->mm)
- return;
- mmdrop(n->mm);
if (n->halted)
smp_send_reschedule(n->cpu);
else if (waitqueue_active(&n->wq))
@@ -219,7 +213,7 @@ again:
* async PF was not yet handled.
* Add dummy entry for the token.
*/
- n = kmalloc(sizeof(*n), GFP_ATOMIC);
+ n = kzalloc(sizeof(*n), GFP_ATOMIC);
if (!n) {
/*
* Allocation failed! Busy wait while other cpu
@@ -231,7 +225,6 @@ again:
}
n->token = token;
n->cpu = smp_processor_id();
- n->mm = NULL;
init_waitqueue_head(&n->wq);
hlist_add_head(&n->link, &b->list);
} else
--
1.8.0