
Bug#1035779: linux-image-5.10.0-22: kvm/qemu kernel null pointer dereference, VM doesn't start



On Tue, May 09, 2023 at 09:36:45PM +0200, Salvatore Bonaccorso wrote:
> Control: tags -1 + moreinfo
> 
> Hi Jared,
> 
> On Mon, May 08, 2023 at 11:50:21PM -0600, Jared Epp wrote:
> > Package: src:linux
> > Version: 5.10.178-3
> > Severity: normal
> > X-Debbugs-Cc: jaredepp@pm.me
> > 
> > Dear Maintainer,
> > 
> > After I updated my Debian 11 host kernel to 5.10.0-22, my VM guest
> > (Windows 10 using KVM / qemu / libvirt) no longer boots and there's
> > a kernel null pointer dereference along with a call trace, etc. in
> > the system log. If I reboot and choose 5.10.0-21 in grub, the VM
> > works as expected and there's no error in the log.
> > 
> > Below, reportbug included part of the kernel log, but it missed part
> > of the problem, so I pasted that in; I hope that's okay. If you need
> > any other information, let me know.
> > 
> > Thanks
> > 
> > Jared Epp
> > 
> > -- Package-specific info:
> > ** Version:
> > Linux version 5.10.0-22-amd64 (debian-kernel@lists.debian.org) (gcc-10 (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP Debian 5.10.178-3 (2023-04-22)
> > 
> > ** Command line:
> > BOOT_IMAGE=/vmlinuz-5.10.0-22-amd64 root=/dev/mapper/panthro--vg-root ro quiet mem_sleep_default=s2idle default_hugepagesz=1G hugepages=8
> > 
> > ** Tainted: D (128)
> >  * kernel died recently, i.e. there was an OOPS or BUG
> > 
> > ** Kernel log:
> > [   51.576266] BUG: kernel NULL pointer dereference, address: 0000000000000000
> > [   51.576269] #PF: supervisor read access in kernel mode
> > [   51.576270] #PF: error_code(0x0000) - not-present page
> > [   51.576271] PGD 0 P4D 0 
> > [   51.576273] Oops: 0000 [#1] SMP NOPTI
> > [   51.576275] CPU: 6 PID: 2209 Comm: CPU 0/KVM Not tainted 5.10.0-22-amd64 #1 Debian 5.10.178-3
> > [   51.576276] Hardware name: ASUS System Product Name/CROSSHAIR VI HERO, BIOS 8701 02/08/2023
> > [   51.576280] RIP: 0010:find_first_bit+0x19/0x40
> > [   51.576281] Code: 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc cc cc cc 49 89 f0 48 85 f6 74 28 31 c0 eb 0d 48 83 c0 40 48 83 c7 08 4c 39 c0 73 17 <48> 8b 17 48 85 d2 74 eb f3 48 0f bc d2 48 01 d0 49 39 c0 4c 0f 47
> > [   51.576282] RSP: 0018:ffffa99ac3a23a30 EFLAGS: 00010246
> > [   51.576283] RAX: 0000000000000000 RBX: ffffa99ac38a5000 RCX: 0000000000000000
> > [   51.576283] RDX: 0000000000000000 RSI: 0000000000000120 RDI: 0000000000000000
> > [   51.576284] RBP: 0000000000000000 R08: 0000000000000120 R09: ffff94e2c1ae72a8
> > [   51.576284] R10: 000000000000000f R11: 0000000000000000 R12: ffff94e2c1ae72a8
> > [   51.576285] R13: 0000000000000323 R14: 0000000000000003 R15: 0000000000000006
> > [   51.576286] FS:  0000000000000000(0053) GS:ffff94e89e980000(002b) knlGS:fffff8033f006000
> > [   51.576286] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   51.576287] CR2: 0000000000000000 CR3: 000000018e4ee000 CR4: 0000000000750ee0
> > [   51.576287] PKRU: 55555554
> > [   51.576288] Call Trace:
> > [   51.576307]  kvm_make_vcpus_request_mask+0x38/0xf0 [kvm]
> > [   51.576319]  kvm_hv_flush_tlb+0x147/0x370 [kvm]
> > [   51.576328]  ? kvm_page_track_is_active+0x12/0x50 [kvm]
> > [   51.576336]  ? make_spte+0x146/0x260 [kvm]
> > [   51.576344]  ? mmu_spte_update+0x11/0x1c0 [kvm]
> > [   51.576351]  ? set_spte+0xee/0x140 [kvm]
> > [   51.576358]  ? mmu_set_spte+0x327/0x4a0 [kvm]
> > [   51.576365]  ? kvm_release_pfn_clean+0x22/0x40 [kvm]
> > [   51.576372]  ? direct_page_fault+0x223/0xa20 [kvm]
> > [   51.576374]  ? svm_get_segment+0x18/0x100 [kvm_amd]
> > [   51.576382]  ? kvm_get_cs_db_l_bits+0x35/0x70 [kvm]
> > [   51.576383]  ? svm_get_segment+0x18/0x100 [kvm_amd]
> > [   51.576390]  ? kvm_get_cs_db_l_bits+0x35/0x70 [kvm]
> > [   51.576398]  kvm_hv_hypercall+0x176/0x580 [kvm]
> > [   51.576401]  ? get_cpu_vendor+0x40/0xa0
> > [   51.576403]  ? native_load_tr_desc+0x67/0x70
> > [   51.576411]  kvm_arch_vcpu_ioctl_run+0xbe8/0x1740 [kvm]
> > [   51.576419]  kvm_vcpu_ioctl+0x21e/0x5b0 [kvm]
> > [   51.576422]  __x64_sys_ioctl+0x8b/0xc0
> > [   51.576424]  do_syscall_64+0x33/0x80
> > [   51.576426]  entry_SYSCALL_64_after_hwframe+0x61/0xc6
> > [   51.576428] RIP: 0033:0x7fad816f2237
> > [   51.576429] Code: 00 00 00 48 8b 05 59 cc 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 29 cc 0d 00 f7 d8 64 89 01 48
> > [   51.576429] RSP: 002b:00007fad7ce65508 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > [   51.576430] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007fad816f2237
> > [   51.576431] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000001c
> > [   51.576431] RBP: 000055a3e17511c0 R08: 000055a3df109848 R09: 000055a3df5335c0
> > [   51.576432] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> > [   51.576432] R13: 000055a3df54fbc0 R14: 00007fad7ce657c0 R15: 0000000000802000
> > [   51.576434] Modules linked in: xt_nat veth nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink xfrm_user xfrm_algo br_netfilter vhost_net vhost vhost_iotlb tap tun bridge stp llc overlay ip6t_REJECT nf_reject_ipv6 xt_hl ip6_tables ip6t_rt ipt_REJECT nf_reject_ipv4 xt_multiport nft_limit snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel nls_ascii snd_intel_dspcfg nls_cp437 soundwire_intel vfat soundwire_generic_allocation fat snd_soc_core snd_compress soundwire_cadence snd_hda_codec edac_mce_amd xt_limit xt_addrtype kvm_amd snd_hda_core xt_tcpudp snd_hwdep eeepc_wmi kvm soundwire_bus xpad xt_conntrack cdc_acm joydev asus_wmi ff_memless snd_pcm nf_conntrack battery sparse_keymap snd_timer nf_defrag_ipv6 rfkill nf_defrag_ipv4 irqbypass nft_compat snd video rapl efi_pstore wmi_bmof pcspkr ccp soundcore k10temp sp5100_tco nft_counter watchdog sg tpm_crb tpm_tis tpm_tis_core tpm rng_core acpi_cpufreq evdev nf_tables libcrc32c nfnetlink msr fuse
> > [   51.576464]  configfs efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic dm_crypt dm_mod hid_logitech_hidpp hid_logitech_dj amdgpu sr_mod cdrom gpu_sched sd_mod hid_generic ttm crc32_pclmul crc32c_intel usbhid hid drm_kms_helper ahci cec libahci ghash_clmulni_intel xhci_pci libata xhci_hcd drm nvme aesni_intel mxm_wmi igb libaes usbcore crypto_simd nvme_core cryptd scsi_mod glue_helper i2c_piix4 dca ptp pps_core t10_pi i2c_algo_bit crc_t10dif crct10dif_generic usb_common crct10dif_pclmul crct10dif_common wmi gpio_amdpt gpio_generic button
> > [   51.576484] CR2: 0000000000000000
> > [   51.576485] ---[ end trace acfac62cc884c67c ]---
> > [   51.668091] pstore: crypto_comp_compress failed, ret = -22!
> > [   51.682455] br-b8df22c12cd5: port 4(vethe67c4df) entered blocking state
> > [   51.682459] br-b8df22c12cd5: port 4(vethe67c4df) entered disabled state
> > [   51.682501] device vethe67c4df entered promiscuous mode
> > [   51.689861] br-b8df22c12cd5: port 4(vethe67c4df) entered blocking state
> > [   51.689863] br-b8df22c12cd5: port 4(vethe67c4df) entered forwarding state
> > [   51.696372] RIP: 0010:find_first_bit+0x19/0x40
> > [   51.696374] Code: 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc cc cc cc 49 89 f0 48 85 f6 74 28 31 c0 eb 0d 48 83 c0 40 48 83 c7 08 4c 39 c0 73 17 <48> 8b 17 48 85 d2 74 eb f3 48 0f bc d2 48 01 d0 49 39 c0 4c 0f 47
> > [   51.696376] RSP: 0018:ffffa99ac3a23a30 EFLAGS: 00010246
> > [   51.696378] RAX: 0000000000000000 RBX: ffffa99ac38a5000 RCX: 0000000000000000
> > [   51.696379] RDX: 0000000000000000 RSI: 0000000000000120 RDI: 0000000000000000
> > [   51.696380] RBP: 0000000000000000 R08: 0000000000000120 R09: ffff94e2c1ae72a8
> > [   51.696380] R10: 000000000000000f R11: 0000000000000000 R12: ffff94e2c1ae72a8
> > [   51.696381] R13: 0000000000000323 R14: 0000000000000003 R15: 0000000000000006
> > [   51.696383] FS:  0000000000000000(0053) GS:ffff94e89e980000(002b) knlGS:fffff8033f006000
> > [   51.696384] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   51.696384] CR2: 0000000000000000 CR3: 000000018e4ee000 CR4: 0000000000750ee0
> > [   51.696385] PKRU: 55555554
> > [   51.700146] br-924f74569f8a: port 4(veth5e7362a) entered blocking state
> > [   51.700151] br-924f74569f8a: port 4(veth5e7362a) entered disabled state
> > [   51.700200] device veth5e7362a entered promiscuous mode
> > [   51.700257] br-924f74569f8a: port 4(veth5e7362a) entered blocking state
> > [   51.700259] br-924f74569f8a: port 4(veth5e7362a) entered forwarding state
> > [   51.787480] eth0: renamed from veth28831ed
> > [   51.831676] br-b8df22c12cd5: port 4(vethe67c4df) entered disabled state
> > [   51.831721] br-924f74569f8a: port 4(veth5e7362a) entered disabled state
> > [   51.831741] IPv6: ADDRCONF(NETDEV_CHANGE): veth25ca080: link becomes ready
> > [   51.831758] br-924f74569f8a: port 2(veth25ca080) entered blocking state
> > [   51.831759] br-924f74569f8a: port 2(veth25ca080) entered forwarding state
> > [   51.832280] br-b8df22c12cd5: port 5(vethc8c90d8) entered blocking state
> > [   51.832282] br-b8df22c12cd5: port 5(vethc8c90d8) entered disabled state
> > [   51.832329] device vethc8c90d8 entered promiscuous mode
> > [   51.832383] br-b8df22c12cd5: port 5(vethc8c90d8) entered blocking state
> > [   51.832385] br-b8df22c12cd5: port 5(vethc8c90d8) entered forwarding state
> > [   51.832416] br-924f74569f8a: port 5(vethfbf8266) entered blocking state
> > [   51.832418] br-924f74569f8a: port 5(vethfbf8266) entered disabled state
> > [   51.832452] device vethfbf8266 entered promiscuous mode
> > [   51.832503] br-924f74569f8a: port 5(vethfbf8266) entered blocking state
> > [   51.832504] br-924f74569f8a: port 5(vethfbf8266) entered forwarding state
> > [   51.955355] eth0: renamed from vethec69e8f
> > [   51.999437] eth0: renamed from veth30923c8
> > [   52.043965] br-b8df22c12cd5: port 5(vethc8c90d8) entered disabled state
> > [   52.044034] br-924f74569f8a: port 5(vethfbf8266) entered disabled state
> > [   52.044064] IPv6: ADDRCONF(NETDEV_CHANGE): veth1661ccb: link becomes ready
> > [   52.044086] br-924f74569f8a: port 1(veth1661ccb) entered blocking state
> > [   52.044088] br-924f74569f8a: port 1(veth1661ccb) entered forwarding state
> > [   52.044108] IPv6: ADDRCONF(NETDEV_CHANGE): veth0ebab0a: link becomes ready
> > [   52.044125] br-b8df22c12cd5: port 1(veth0ebab0a) entered blocking state
> > [   52.044127] br-b8df22c12cd5: port 1(veth0ebab0a) entered forwarding state
> > [   52.044539] br-924f74569f8a: port 6(veth5372881) entered blocking state
> > [   52.044542] br-924f74569f8a: port 6(veth5372881) entered disabled state
> > [   52.044586] device veth5372881 entered promiscuous mode
> > [   52.044644] br-924f74569f8a: port 6(veth5372881) entered blocking state
> > [   52.044646] br-924f74569f8a: port 6(veth5372881) entered forwarding state
> > [   52.057025] br-924f74569f8a: port 7(veth29a4d93) entered blocking state
> > [   52.057028] br-924f74569f8a: port 7(veth29a4d93) entered disabled state
> > [   52.057108] device veth29a4d93 entered promiscuous mode
> > [   52.057175] br-924f74569f8a: port 7(veth29a4d93) entered blocking state
> > [   52.057176] br-924f74569f8a: port 7(veth29a4d93) entered forwarding state
> > [   52.231474] eth0: renamed from veth9d75af1
> > [   52.255847] br-924f74569f8a: port 6(veth5372881) entered disabled state
> > [   52.255889] br-924f74569f8a: port 7(veth29a4d93) entered disabled state
> > [   52.255906] IPv6: ADDRCONF(NETDEV_CHANGE): veth5e7362a: link becomes ready
> > [   52.255928] br-924f74569f8a: port 4(veth5e7362a) entered blocking state
> > [   52.255929] br-924f74569f8a: port 4(veth5e7362a) entered forwarding state
> > [   52.347639] eth0: renamed from veth0803979
> > [   52.419482] eth0: renamed from veth7d2c7ef
> > [   52.508188] eth1: renamed from vethee5bda2
> > [   52.567361] IPv6: ADDRCONF(NETDEV_CHANGE): veth74b84d1: link becomes ready
> > [   52.567383] br-b8df22c12cd5: port 3(veth74b84d1) entered blocking state
> > [   52.567385] br-b8df22c12cd5: port 3(veth74b84d1) entered forwarding state
> > [   52.567531] IPv6: ADDRCONF(NETDEV_CHANGE): veth226976d: link becomes ready
> > [   52.567550] br-b8df22c12cd5: port 2(veth226976d) entered blocking state
> > [   52.567551] br-b8df22c12cd5: port 2(veth226976d) entered forwarding state
> > [   52.567730] IPv6: ADDRCONF(NETDEV_CHANGE): veth5372881: link becomes ready
> > [   52.567751] br-924f74569f8a: port 6(veth5372881) entered blocking state
> > [   52.567752] br-924f74569f8a: port 6(veth5372881) entered forwarding state
> > [   52.615463] eth1: renamed from vethef7b9c6
> > [   52.643620] IPv6: ADDRCONF(NETDEV_CHANGE): vethfbf8266: link becomes ready
> > [   52.643649] br-924f74569f8a: port 5(vethfbf8266) entered blocking state
> > [   52.643652] br-924f74569f8a: port 5(vethfbf8266) entered forwarding state
> > [   52.690903] eth0: renamed from veth3feaf4a
> > [   52.723412] IPv6: ADDRCONF(NETDEV_CHANGE): vethc8c90d8: link becomes ready
> > [   52.723440] br-b8df22c12cd5: port 5(vethc8c90d8) entered blocking state
> > [   52.723442] br-b8df22c12cd5: port 5(vethc8c90d8) entered forwarding state
> > [   52.847525] eth1: renamed from vethd1e1e47
> > [   52.877625] IPv6: ADDRCONF(NETDEV_CHANGE): veth888282e: link becomes ready
> > [   52.877656] br-924f74569f8a: port 3(veth888282e) entered blocking state
> > [   52.877660] br-924f74569f8a: port 3(veth888282e) entered forwarding state
> > [   52.877676] eth0: renamed from veth374d91f
> > [   52.939600] IPv6: ADDRCONF(NETDEV_CHANGE): vethe67c4df: link becomes ready
> > [   52.939638] br-b8df22c12cd5: port 4(vethe67c4df) entered blocking state
> > [   52.939641] br-b8df22c12cd5: port 4(vethe67c4df) entered forwarding state
> > [   52.945134] eth1: renamed from veth583ed90
> > [   52.971727] IPv6: ADDRCONF(NETDEV_CHANGE): veth29a4d93: link becomes ready
> > [   52.971776] br-924f74569f8a: port 7(veth29a4d93) entered blocking state
> > [   52.971778] br-924f74569f8a: port 7(veth29a4d93) entered forwarding state
> > [   60.145487] logitech-hidpp-device 0003:046D:4031.0007: HID++ 2.0 device connected.
> > [   63.847187] bridge0: port 1(vnet0) entered learning state
> > [   65.643192] bridge0: port 2(enp5s0) entered learning state
> > [   67.654690] kauditd_printk_skb: 50 callbacks suppressed
> > [   67.654691] audit: type=1400 audit(1683610301.229:62): apparmor="DENIED" operation="capable" profile="libvirtd" pid=1754 comm="prio-rpc-worker" capability=17  capname="sys_rawio"
> > [   78.951195] bridge0: port 1(vnet0) entered forwarding state
> > [   78.951211] bridge0: topology change detected, propagating
> > [   80.743422] bridge0: port 2(enp5s0) entered forwarding state
> > [   80.743435] bridge0: topology change detected, propagating
> 
> This sounds similar to the
> https://forum.proxmox.com/threads/with-latest-5-15-104-1-pve-windows-server-vm-freeze-stuck.125294/
> issue. Would you be able to verify two things:
> 
> Check how the Windows VM is configured and whether you pass the
> '+hv-tlbflush' flag.
> 
> Additionally, would the attached patch make the issue go away?

Now with patch attached.

Regards,
Salvatore
From b7f8d59d71742cf2b0553c042560790532bda41a Mon Sep 17 00:00:00 2001
From: Vitaly Kuznetsov <vkuznets@redhat.com>
Date: Fri, 3 Sep 2021 09:51:36 +0200
Subject: [PATCH] KVM: x86: hyper-v: Avoid calling
 kvm_make_vcpus_request_mask() with vcpu_mask==NULL

In preparation to making kvm_make_vcpus_request_mask() use for_each_set_bit()
switch kvm_hv_flush_tlb() to calling kvm_make_all_cpus_request() for 'all cpus'
case.

Note: kvm_make_all_cpus_request() (unlike kvm_make_vcpus_request_mask())
currently dynamically allocates cpumask on each call and this is suboptimal.
Both kvm_make_all_cpus_request() and kvm_make_vcpus_request_mask() are
going to be switched to using pre-allocated per-cpu masks.

Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210903075141.403071-4-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/hyperv.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 09ec1cda2d68..e03e320847cd 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1562,16 +1562,19 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
 
 	cpumask_clear(&hv_vcpu->tlb_flush);
 
-	vcpu_mask = all_cpus ? NULL :
-		sparse_set_to_vcpu_mask(kvm, sparse_banks, valid_bank_mask,
-					vp_bitmap, vcpu_bitmap);
-
 	/*
 	 * vcpu->arch.cr3 may not be up-to-date for running vCPUs so we can't
 	 * analyze it here, flush TLB regardless of the specified address space.
 	 */
-	kvm_make_vcpus_request_mask(kvm, KVM_REQ_TLB_FLUSH_GUEST,
-				    NULL, vcpu_mask, &hv_vcpu->tlb_flush);
+	if (all_cpus) {
+		kvm_make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH_GUEST);
+	} else {
+		vcpu_mask = sparse_set_to_vcpu_mask(kvm, sparse_banks, valid_bank_mask,
+						    vp_bitmap, vcpu_bitmap);
+
+		kvm_make_vcpus_request_mask(kvm, KVM_REQ_TLB_FLUSH_GUEST,
+					    NULL, vcpu_mask, &hv_vcpu->tlb_flush);
+	}
 
 ret_success:
 	/* We always do full TLB flush, set rep_done = rep_cnt. */
-- 
2.40.1
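
For readers following the trace: the oops shows find_first_bit() faulting with
RDI = 0 and CR2 = 0 when called from kvm_make_vcpus_request_mask(), which is
consistent with the 'all cpus' path in kvm_hv_flush_tlb() passing
vcpu_mask == NULL. The userspace C sketch below models only the caller-side
branch that the patch introduces. Every *_model name and the requests_sent
counter are hypothetical stand-ins rather than kernel API, MODEL_MAX_VCPUS is
assumed from RSI = 0x120 in the register dump, and the NULL guard that returns
-1 exists only so the sketch can show the failure case without crashing.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: bitmap size taken from RSI=0x120 (288) in the oops. */
#define MODEL_MAX_VCPUS 288
#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MODEL_LONGS \
	((MODEL_MAX_VCPUS + MODEL_BITS_PER_LONG - 1) / MODEL_BITS_PER_LONG)

/* Hypothetical counter standing in for "a request was made on this vCPU". */
size_t requests_sent;

/* Return true if the given bit is set in the bitmap. */
bool test_bit_model(const unsigned long *mask, size_t bit)
{
	return (mask[bit / MODEL_BITS_PER_LONG] >>
		(bit % MODEL_BITS_PER_LONG)) & 1UL;
}

/*
 * Models a callee that walks the set bits of the mask and therefore needs
 * a non-NULL mask. The sketch reports -1 for NULL; the kernel function in
 * the trace has no such guard and dereferences the pointer instead.
 */
int request_mask_model(const unsigned long *mask, size_t nr_vcpus)
{
	assert(nr_vcpus <= MODEL_MAX_VCPUS);
	if (mask == NULL)
		return -1;
	for (size_t i = 0; i < nr_vcpus; i++)
		if (test_bit_model(mask, i))
			requests_sent++;
	return 0;
}

/* Models the "all vCPUs" helper, which needs no mask at all. */
void request_all_model(size_t nr_vcpus)
{
	assert(nr_vcpus <= MODEL_MAX_VCPUS);
	requests_sent += nr_vcpus;
}

/*
 * Caller shaped like the patched code: branch on all_cpus first, and only
 * hand a mask to the mask-walking callee when one was actually built.
 */
int flush_tlb_model(bool all_cpus, const unsigned long *mask, size_t nr_vcpus)
{
	if (all_cpus) {
		request_all_model(nr_vcpus);
		return 0;
	}
	return request_mask_model(mask, nr_vcpus);
}
```

Calling flush_tlb_model(true, NULL, n) never reaches the mask-walking callee,
whereas the pre-patch shape corresponds to calling request_mask_model(NULL, n)
directly. Whether this is the exact cause on 5.10.178-3 still needs the
confirmation requested above.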

