
Bug#1020787: linux-image-5.19.0-2-amd64: After updating to 5.19 kernel the VMs are started without XSAVE CPU flags



On Tuesday, 27.09.2022 at 01:39 +0200, Diederik de Haas wrote:
> Which version of Xen are you using?
> 
It's the current state of Debian sid:


dpkg -l | grep xen
ii  grub-xen-bin                         2.06-4                            amd64        GRand Unified Bootloader, version 2 (Xen modules)
ii  grub-xen-host                        2.06-4                            amd64        GRand Unified Bootloader, version 2 (Xen host version)
ii  libxencall1:amd64                    4.16.2-1                          amd64        Xen runtime library - libxencall
ii  libxendevicemodel1:amd64             4.16.2-1                          amd64        Xen runtime libraries - libxendevicemodel
ii  libxenevtchn1:amd64                  4.16.2-1                          amd64        Xen runtime libraries - libxenevtchn
ii  libxenforeignmemory1:amd64           4.16.2-1                          amd64        Xen runtime libraries - libxenforeignmemory
ii  libxengnttab1:amd64                  4.16.2-1                          amd64        Xen runtime libraries - libxengnttab
ii  libxenhypfs1:amd64                   4.16.2-1                          amd64        Xen runtime library - libxenhypfs
ii  libxenmisc4.16:amd64                 4.16.2-1                          amd64        Xen runtime libraries - miscellaneous, versioned ABI
ii  libxenstore4:amd64                   4.16.2-1                          amd64        Xen runtime libraries - libxenstore
ii  libxentoolcore1:amd64                4.16.2-1                          amd64        Xen runtime libraries - libxentoolcore
ii  libxentoollog1:amd64                 4.16.2-1                          amd64        Xen runtime libraries - libxentoollog
ii  qemu-system-xen                      1:7.1+dfsg-2                      amd64        QEMU full system emulation (Xen helper package)
ii  xen-hypervisor-4.16-amd64            4.16.2-1                          amd64        Xen Hypervisor on AMD64
ii  xen-hypervisor-common                4.16.2-1                          amd64        Xen Hypervisor - common files
ii  xen-system-amd64                     4.16.2-1                          amd64        Xen System on AMD64 (metapackage)
ii  xen-tools                            4.9.1-1                           all          Tools to manage Xen virtual servers
ii  xen-utils-4.16                       4.16.2-1                          amd64        Xen administrative tools
ii  xen-utils-common                     4.16.2-1                          amd64        Xen administrative tools - common files
ii  xenstore-utils                       4.16.2-1                          amd64        Xenstore command line utilities for Xen
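
For completeness, the version reported by the running hypervisor itself can be checked with something like this (as root on dom0):

xl info | grep -E 'xen_(version|extra)'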


> 
> Is this all about the dom0 kernel or is it all/some about using 5.19 as 
> domU kernel? Are the issues happening on dom0 or inside domU?
> 
It seems to happen on both, dom0 and domU. It also seems to affect everything that uses GnuTLS, which presumably evaluates the CPU flags, as far as I understood the linked ticket.
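
To pin down exactly which flags disappear, the flag list can be captured under each kernel and then diffed, roughly like this (the capture file names are just placeholders):

# run once under each booted kernel, then compare the two captures
grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2 | tr ' ' '\n' | sort -u > /tmp/flags-$(uname -r)
diff /tmp/flags-5.18.0-4-amd64 /tmp/flags-5.19.0-2-amd64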

> If the issues happen inside (a) domU, can you share a (minimal) domU 
> configuration file so it becomes easier to replicate?
> 
Sure. (Currently the old 5.18 kernel is configured.)

name="mail"
on_xend_stop="shutdown"
memory=12288
maxmem=12288
vcpus=4
cpus="2-5"
kernel="/etc/xen/vm/boot/vmlinuz-5.18.0-4-amd64"
ramdisk="/etc/xen/vm/boot/initrd.img-5.18.0-4-amd64"
root="/dev/xvda1"
disk=[ '/dev/mapper/mail,,xvda1' ]
vif=[ 'mac=00:16:3e:00:00:05, bridge=xenbr0, vifname=mail.0', 'mac=00:16:3e:00:ff:05, bridge=xenbrlo, vifname=mail.lo' ]
extra="lockd.nlm_tcpport=61053 lockd.nlm_udpport=61053 ipv6.disable=1 net.ifnames=0 xen_blkfront.max_queues=3"


> > And indeed there is some difference in /proc/cpuinfo:
> > The flags for "fma xsave avx2 bmi2 xsaveopt xsavec xgetbv1 md_clear" are
> > missing, which might result in gnutls failures.
> 
> In kernel 5.19 the following commits were added under ``arch/x86/kernel/fpu/``:
> 
> b91c0922bf1ed15b67a6faa404bc64e3ed532ec2 x86/fpu: Cleanup variable shadowing
> 8ad7e8f696951f192c6629a0cbda9ac94c773159 x86/fpu/xsave: Support XSAVEC in the kernel
> f5c0b4f30416c670408a77be94703d04d22b57df x86/prctl: Remove pointless task argument
> 
> Of these, the first 2 seem like possible candidates that caused the issue.
> https://kernel-team.pages.debian.net/kernel-handbook/ch-common-tasks.html#s4.2.2
> describes a way to apply a simple patch to a kernel.
> What you could try is creating a patch that reverts one of the
> earlier-mentioned commits and using that with 'test-patches'.
> 
OK, that sounds like a reasonable test. I will report the results in my next mail.
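
For reference, this is roughly what I plan to run, following the handbook section linked above (the path to the mainline git checkout is only a placeholder):

# grab the Debian kernel source (needs deb-src entries for sid)
apt-get source linux
cd linux-5.19.*

# create a patch that reverts the XSAVEC commit, taken from a mainline git
# checkout; 'git diff <sha> <sha>^' produces the reverse of that commit
git -C /path/to/linux-git diff 8ad7e8f696951f192c6629a0cbda9ac94c773159 \
    8ad7e8f696951f192c6629a0cbda9ac94c773159^ > ../revert-xsavec.patch

# build test kernel packages with the revert applied
bash debian/bin/test-patches ../revert-xsavec.patch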

> It's probably also useful to know what CPU(s) are in the machine (dom0).
> 
This is the output of one entry in /proc/cpuinfo (with kernel 5.18):

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 165
model name      : Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz
stepping        : 5
microcode       : 0xe2
cpu MHz         : 2903.996
cache size      : 16384 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu de tsc msr pae mce cx8 apic sep mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht syscall nx rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid tsc_known_freq pni pclmulqdq
monitor est ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase bmi1 avx2 bmi2 erms rdseed
adx clflushopt xsaveopt xsavec xgetbv1 md_clear arch_capabilities
bugs            : spectre_v1 spectre_v2 spec_store_bypass swapgs itlb_multihit srbds mmio_stale_data retbleed eibrs_pbrsb
bogomips        : 5807.99
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

> On Monday, 26 September 2022 20:31:17 CEST Ps Ps wrote:
> > On the Xen hypervisor I just found these logs:
> 
> So this is on dom0? In which log file did you find it?
> 
Yes, it's dom0.

> Generally: be as specific as you can be and describe *exactly* what you did
> and the exact results (if any). Also try to make it as easy as possible for
> others to reproduce what you're experiencing.
> 
> FTR: I did not see this issue on my dom0 (Xen 4.16.2-1; kernel 5.19.11-1):
> 
> root@dom0:~# dmesg
> [    0.000000] Linux version 5.19.0-2-amd64 (debian-kernel@lists.debian.org) (gcc-11 (Debian 11.3.0-6) 11.3.0, GNU ld (GNU Binutils for Debian) 2.38.90.20220713) #1 SMP PREEMPT_DYNAMIC Debian
> 5.19.11-1 (2022-09-24)
> [    0.000000] Command line: placeholder root=UUID=8008723b-668f-43f6-b432-8c56ed53f48a ro quiet net.ifnames=0
> [    0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
> [    0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
> [    0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
> [    0.000000] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
> [    0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
> [    0.000000] signal: max sigframe size: 1776
> [    0.000000] Released 0 page(s)
> 
> root@dom0:~# grep flag /proc/cpuinfo | uniq
> flags           : fpu de tsc msr pae mce cx8 apic sep mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht syscall nx rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid tsc_known_freq pni pclmulqdq
> monitor est ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault ssbd ibrs ibpb stibp fsgsbase bmi1 avx2 bmi2 erms rtm rdseed adx
> xsaveopt md_clear
> 
> 
> Also found this patch which should make the error msg more informative ...
> https://lore.kernel.org/all/20220810221909.12768-1-andrew.cooper3@citrix.com/
> Even though I haven't experienced it (yet?), the language of this patch
> seems to indicate you're not alone with it.
I will add this patch too; maybe it provides some more information.
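
If that patch targets the kernel tree (I assume so since it was posted on lore, but it might equally be meant for the xen package), the rough idea is to feed it to test-patches together with the revert from above:

# appending /raw to the lore URL yields the plain mbox containing the patch
wget -O ../xsave-errmsg.patch \
    'https://lore.kernel.org/all/20220810221909.12768-1-andrew.cooper3@citrix.com/raw'
# rebuild the test kernel packages with both changes, as in the previous step
bash debian/bin/test-patches ../revert-xsavec.patch ../xsave-errmsg.patch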

Thanks for your hints!

Regards

Patrick

