
Bug#770518: Minimize chance of failures and kernel panics on Hyper-V



Package:  linux-image-3.16-0.bpo.3-amd64
Version:  3.16.5-1~bpo70+1


On Hyper-V, guests with multiple vCPUs can hit a kernel panic at boot time while hv_vmbus is initializing.  This issue has already been fixed upstream by the following two commits:

--------------------------------------------------
From b29ef3546aecb253a5552b198cef23750d56e1e4 Mon Sep 17 00:00:00 2001
From: "K. Y. Srinivasan" <kys@microsoft.com>
Date: Thu, 28 Aug 2014 18:29:52 -0700
Subject: Drivers: hv: vmbus: Cleanup hv_post_message()

Minimize failures in this function by pre-allocating the buffer
for posting messages. The hypercall for posting the message can fail
for a number of reasons: ....

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Cc: <stable@vger.kernel.org>
Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
--------------------------------------------------
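
To illustrate the idea behind the first commit (pre-allocating the buffer so that the posting path itself cannot fail for lack of memory), here is a minimal userspace sketch in C.  It is not the upstream patch: struct msg_poster, poster_init(), poster_post(), poster_destroy() and MSG_BUF_SIZE are hypothetical names chosen for illustration, and the hypercall itself is only marked by a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MSG_BUF_SIZE 256  /* hypothetical size, chosen for illustration */

/* Hypothetical poster: owns one staging buffer allocated once, up front. */
struct msg_poster {
    unsigned char *buf;   /* pre-allocated staging buffer */
    size_t last_len;      /* length of the most recently staged payload */
};

/* Setup path: the only place where an allocation failure can occur. */
int poster_init(struct msg_poster *p)
{
    p->buf = calloc(1, MSG_BUF_SIZE);
    p->last_len = 0;
    return p->buf ? 0 : -1;
}

/*
 * Posting path: performs no allocation, so it cannot fail for lack of
 * memory. Returns 0 on success, -1 if the payload does not fit.
 */
int poster_post(struct msg_poster *p, const void *payload, size_t len)
{
    if (len > MSG_BUF_SIZE)
        return -1;

    memset(p->buf, 0, MSG_BUF_SIZE);   /* clear data from the previous post */
    if (len > 0)
        memcpy(p->buf, payload, len);  /* stage the new payload */
    p->last_len = len;

    /* Real code would hand p->buf to the hypervisor at this point. */
    return 0;
}

/* Teardown path: release the buffer once, when the poster goes away. */
void poster_destroy(struct msg_poster *p)
{
    free(p->buf);
    p->buf = NULL;
}
```

In this sketch the only failures left in poster_post() are ones the caller can reason about (an oversized payload), which is the general benefit of moving the allocation into the setup path.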

--------------------------------------------------
commit 2115b5617adf2eecca49e78f3810f359ddc5c396
Author: K. Y. Srinivasan <kys@microsoft.com>
Date:   Thu Aug 28 18:29:53 2014 -0700

    Drivers: hv: vmbus: Properly protect calls to smp_processor_id()

    Disable preemption when sampling current processor ID when preemption
    is otherwise possible.

    Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
    Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
--------------------------------------------------
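
The rule behind the second commit can be modelled in userspace as follows.  In the kernel, get_cpu() disables preemption and returns the current processor ID, and put_cpu() re-enables preemption; a processor ID sampled with preemption enabled can be stale by the time it is used, because the task may have been moved to another CPU in between.  The model_* functions and variables below are hypothetical stand-ins written only for this sketch.  They are not the kernel's implementations and not the upstream patch.

```c
#include <assert.h>

/* Userspace stand-ins for illustration only; NOT kernel code. */
static int model_preempt_count;   /* > 0 means "preemption disabled" */
static int model_current_cpu;     /* CPU the modelled task is running on */
static int model_unsafe_samples;  /* samples taken with preemption enabled */

/* Models smp_processor_id(): records samples taken without protection. */
int model_smp_processor_id(void)
{
    if (model_preempt_count == 0)
        model_unsafe_samples++;
    return model_current_cpu;
}

/* Models get_cpu(): disable preemption, then sample the CPU ID. */
int model_get_cpu(void)
{
    model_preempt_count++;
    return model_smp_processor_id();
}

/* Models put_cpu(): re-enable preemption. */
void model_put_cpu(void)
{
    assert(model_preempt_count > 0);
    model_preempt_count--;
}

/*
 * Models the scheduler moving the task to another CPU. Migration is
 * refused while preemption is disabled. Returns 1 if the task moved.
 */
int model_try_migrate(int new_cpu)
{
    if (model_preempt_count > 0)
        return 0;
    model_current_cpu = new_cpu;
    return 1;
}
```

Between model_get_cpu() and model_put_cpu() the sampled ID stays valid because model_try_migrate() is refused; a bare model_smp_processor_id() call gives no such guarantee, and the model counts it as unsafe.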


Here is the LKML thread for these two commits: https://lkml.org/lkml/2014/8/28/648

Here is a sample log of the kernel panic:

[....] Waiting for /dev to be fully populated...[    7.661109] hv_utils: Registering HyperV Utility Driver
[    7.662844] hv_vmbus: registering driver hv_util
[    7.664647] psmouse serio1: alps: Unknown ALPS touchpad: E7=12 00 64, EC=12 00 64
[    7.671615] general protection fault: 0000 [#1] SMP 
[    7.674445] Modules linked in: acpi_cpufreq(-) hv_utils processor thermal_sys button psmouse serio_raw evdev joydev i2c_piix4 pcspkr hyperv_fb i2c_core ext4 crc16 mbcache jbd2 sd_mod crc_t10dif crct10dif_common hid_generic hv_netvsc hid_hyperv hv_storvsc hid sg sr_mod cdrom ata_generic floppy ata_piix hv_vmbus libata scsi_mod
[    7.674445] CPU: 0 PID: 6 Comm: kworker/u128:0 Not tainted 3.16-0.bpo.2-amd64 #1 Debian 3.16.3-2~bpo70+1
[    7.674445] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  05/23/2012
[    7.674445] task: ffff8800db748050 ti: ffff8800db760000 task.ti: ffff8800db760000
[    7.674445] RIP: 0010:[<ffffffffa00329df>]  [<ffffffffa00329df>] vmbus_on_event+0xdf/0x1e0 [hv_vmbus]
[    7.674445] RSP: 0000:ffff8800df403eb8  EFLAGS: 00010202
[    7.674445] RAX: dead000000100100 RBX: 0000000000000009 RCX: ffff8800db763fd8
[    7.674445] RDX: ffffffffa003c1f8 RSI: 0000000000000000 RDI: 0000000000000000
[    7.674445] RBP: 0000000000000009 R08: ffff8800db760000 R09: 0000000000000000
[    7.674445] R10: 0000000000000080 R11: 0000000000000001 R12: ffff8800372e9200
[    7.674445] R13: 0000000000000000 R14: 0000000000000040 R15: dead0000000ffed8
[    7.674445] FS:  00007fc0bc0a87a0(0000) GS:ffff8800df400000(0000) knlGS:0000000000000000
[    7.674445] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[    7.674445] CR2: 00000000000000b8 CR3: 00000000d8034000 CR4: 00000000000006f0
[    7.674445] Stack:
[    7.674445]  ffff880036bf7000 0000000000000246 0000000000000000 0000000000000000
[    7.674445]  ffff880037232148 0000000000000100 0000000000000040 0000000000000007
[    7.674445]  00000000ffffffff ffffffff8107120f 0000000000000200 ffffffff8180b0b0
[    7.674445] Call Trace:
[    7.674445]  <IRQ> 
[    7.674445]  [<ffffffff8107120f>] ? tasklet_action+0xbf/0xd0
[    7.674445]  [<ffffffff81070b6e>] ? __do_softirq+0xde/0x2e0
[    7.674445]  [<ffffffff81070fc6>] ? irq_exit+0x86/0xb0
[    7.674445]  [<ffffffff8104566d>] ? hyperv_vector_handler+0x3d/0x50
[    7.674445]  [<ffffffff8154832d>] ? hyperv_callback_vector+0x6d/0x80
[    7.674445]  <EOI> 
[    7.674445]  [<ffffffff810a3037>] ? pick_next_entity+0x87/0x140
[    7.674445]  [<ffffffff81097c69>] ? finish_task_switch+0x49/0xf0
[    7.674445]  [<ffffffff8154267e>] ? __schedule+0x2de/0x770
[    7.674445]  [<ffffffff81087b48>] ? worker_thread+0x188/0x540
[    7.674445]  [<ffffffff810879c0>] ? create_and_start_worker+0x60/0x60
[    7.674445]  [<ffffffff8108e481>] ? kthread+0xc1/0xe0
[    7.674445]  [<ffffffff8108e3c0>] ? flush_kthread_worker+0xb0/0xb0
[    7.674445]  [<ffffffff815463bc>] ? ret_from_fork+0x7c/0xb0
[    7.674445]  [<ffffffff8108e3c0>] ? flush_kthread_worker+0xb0/0xb0
[    7.674445] Code: 39 c2 4c 8d b8 d8 fd ff ff 75 20 e9 ec 00 00 00 0f 1f 40 00 49 8b 87 28 02 00 00 48 39 c2 4c 8d b8 d8 fd ff ff 0f 84 d1 00 00 00 <3b> a8 cc fe ff ff 75 e1 4d 85 ff 0f 84 c0 00 00 00 49 8b 87 c0 
[    7.674445] RIP  [<ffffffffa00329df>] vmbus_on_event+0xdf/0x1e0 [hv_vmbus]
[    7.674445]  RSP <ffff8800df403eb8>
[    7.781398] ---[ end trace a317ee768385c8a2 ]---
[    7.782656] Kernel panic - not syncing: Fatal exception in interrupt
[    7.783721] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
[    7.785630] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
[    7.786653] random: nonblocking pool is initialized

