Bug#1012100: linux-image-5.17-1: KVM LIBVIRT fails to start, slow disk access, and a kernel thread goes wild on Intel Xeon X3430



Hi Adrian,

On Monday, 30 May 2022 13:14:26 CEST Adrian Kieß wrote:
> yes, it works with the kernel 5.16.0-6, but disk access is still slow.

OK, but that issue was already happening before 5.17, so it is not a new problem.
Do you still have an older kernel installed which does NOT have this slow 
disk access issue? If it happens on all kernel versions, then a hardware issue 
becomes much more likely to be the real culprit.
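
If it helps to put a number on "slow" when comparing kernels, here is a minimal 
sketch that times a sequential read of one file. The function name 
read_throughput_mib_s and the 1 MiB block size are my own choices, not anything 
from your report. Note that the page cache will inflate the result if the file 
was read recently, so pick a large file that has not been touched since boot 
(or drop the caches first).

```python
import sys
import time


def read_throughput_mib_s(path, block_size=1024 * 1024):
    """Read `path` sequentially and return the throughput in MiB/s."""
    total = 0
    start = time.perf_counter()
    # buffering=0 avoids Python's own buffer; the kernel page cache still applies.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very small files.
    return (total / (1024 * 1024)) / max(elapsed, 1e-9)


if __name__ == "__main__":
    if len(sys.argv) > 1:
        print(f"{read_throughput_mib_s(sys.argv[1]):.1f} MiB/s")
```

Running it on the same file under each kernel (after a fresh boot) should give 
roughly comparable numbers.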

> For example, virt-manager/viewer sometimes needs a minute to connect to
> the KVM instances on localhost. But not all applications are this slow;
> for example the E-Mail client Sylpheed starts as fast as before and is
> operating at fast speed.

In your initial report I noticed the following:
> Network:
>   Device-1: Broadcom NetXtreme BCM5723 Gigabit Ethernet PCIe driver: tg3
>   IF: eth0 state: up speed: 1000 Mbps duplex: full mac: 64:31:50:d3:c0:f8
>   IF-ID-1: br0 state: up speed: 1000 Mbps duplex: unknown
>     mac: fe:40:ab:83:94:4a
>   IF-ID-2: vnet0 state: unknown speed: 10 Mbps duplex: full
>     mac: fe:54:00:c2:24:94
>   IF-ID-3: vnet1 state: unknown speed: 10 Mbps duplex: full
>     mac: fe:54:00:bf:35:8b
>   IF-ID-4: vnet2 state: unknown speed: 10 Mbps duplex: full
>     mac: fe:54:00:25:b0:8b
>   IF-ID-5: vnet3 state: unknown speed: 10 Mbps duplex: full
>     mac: fe:54:00:4a:c8:69

Is it normal/expected that IF-ID-[2-5] report "state: unknown" and 
"speed: 10 Mbps"? If not, that may be worth looking into.
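
To double-check those values independently of the tool that produced the 
listing, the kernel exposes them under /sys/class/net/<interface>/operstate and 
/sys/class/net/<interface>/speed. A small sketch that prints both for every 
interface (link_info is just a name I made up for this):

```python
from pathlib import Path


def _read(path):
    """Return the stripped file content, or "n/a" if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        # Some interfaces do not report a speed at all (e.g. loopback).
        return "n/a"


def link_info(sys_class_net="/sys/class/net"):
    """Return {interface: (operstate, speed)} as reported by sysfs."""
    info = {}
    for iface in sorted(Path(sys_class_net).iterdir()):
        info[iface.name] = (_read(iface / "operstate"), _read(iface / "speed"))
    return info


if __name__ == "__main__":
    for name, (state, speed) in link_info().items():
        print(f"{name}: state={state} speed={speed}")
```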

> I assume there is also another bug now in the system, not only due to
> the new kernel. There is also another bug in GDM3, which I also
> reported: Loading GDM3 after bootup and logging in as normal user is
> also very, very slow.

That sounds like the moment when lots of files/data are read from disk to 
initialize the session, which does point to a disk issue.
But if the initial boot isn't terribly slow as well, that would be odd.
Or is /home mounted from another disk?
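
One way to answer the /home question is to look up which device backs a given 
path in /proc/mounts. A rough sketch follows; backing_mount is a hypothetical 
helper, and it ignores escaped characters (such as spaces) in mount points:

```python
def backing_mount(path, mounts_text):
    """Return (device, mount_point) for the longest mount point containing path."""
    best = None
    target = path.rstrip("/") + "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        device, mount_point = fields[0], fields[1]
        # Compare with a trailing slash so "/home" does not match "/homework".
        prefix = mount_point.rstrip("/") + "/"
        if target.startswith(prefix):
            # ">=" lets a later mount on the same mount point win.
            if best is None or len(mount_point) >= len(best[1]):
                best = (device, mount_point)
    return best


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        mounts = f.read()
    for p in ("/", "/home"):
        print(p, "->", backing_mount(p, mounts))
```

If "/" and "/home" come back with different devices, then the boot and the 
login session are reading from different disks.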

> As you suggested, I installed the kernel 5.17.11 from Debian/unstable
> and booted into this kernel.
> 
> virt-manager and my KVM VM instances do work again, but one VM instance
> failed to load after bootup. I restarted the VM instance, and it is now
> also operating fine.

Good, that sounds like major progress :) It looks to me that the KVM problem 
is now (mostly?) fixed.

> When opening the virt-viewer instance from virt-manager, connecting to
> the VM is still very slow with kernel 5.17.11. Something must be wrong
> I/O wise.

That all still points to a disk problem.

> I attached the dmesg output, you requested, as TXT file to this E-mail.

A couple of things I noticed in the dmesg output:
1) "[Firmware Warn]: HEST: Duplicated hardware error source ID: 9."
https://lkml.org/lkml/2011/6/27/370 seems relevant here, as it provided the 
clearer warning, and it also points out that this *is* considered a firmware bug.
I noticed your BIOS is from 2011. Is there a newer version available? If so, 
it may be worth trying that out to see if that improves things.

2) Several ACPI-related warnings.
I have no idea whether anything should be done about those, or what.

3) "kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL does not work properly. Using 
workaround" and "kvm: KVM_SET_TSS_ADDR need to be called before entering vcpu"
That suggests there are still KVM-related issues (just non-fatal, or at least 
less fatal). There have been other bug reports related to KVM.

4) BUG: kernel NULL pointer dereference, address: 000000000000000b
That's never good. The dmesg output also contains a Call Trace and several 
mentions of KVM, so it looks like there's still something not right about it.
I have no idea how to interpret those Call (or Stack) Traces, so hopefully 
someone else chimes in who is familiar with that.
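
To make it easier to compare dmesg output between kernels, the four kinds of 
lines above can be filtered out of a saved dmesg file with something like the 
sketch below. The regular expressions are my own guesses at what is worth 
matching, not a complete list, and classify_dmesg is a made-up name.

```python
import re
import sys

# Assumed patterns, one per point above; adjust as needed.
PATTERNS = {
    "firmware": re.compile(r"\[Firmware (Warn|Bug)\]"),
    "acpi": re.compile(r"\bACPI\b.*\b(Warning|Error)\b"),
    "kvm": re.compile(r"\bkvm\b", re.IGNORECASE),
    "oops": re.compile(r"BUG:|Call Trace:"),
}


def classify_dmesg(text):
    """Return {category: [matching lines]} for the given dmesg text."""
    hits = {name: [] for name in PATTERNS}
    for line in text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[name].append(line)
    return hits


if __name__ == "__main__":
    if len(sys.argv) > 1:
        with open(sys.argv[1], errors="replace") as f:
            for name, lines in classify_dmesg(f.read()).items():
                print(f"== {name}: {len(lines)} line(s)")
                for line in lines:
                    print("  " + line)
```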

Cheers,
  Diederik
