
Bug#991967: #991967: Simply ACPI powerdown/reset issue?




On 9/20/21 12:27 AM, Elliott Mitchell wrote:
On Sun, Sep 19, 2021 at 01:05:56AM -0400, Chuck Zmudzinski wrote:
xen hypervisor version: 4.14.2+25-gb6a8c4f72d-2, amd64

linux kernel version: 5.10.46-4 (the current amd64 kernel
for bullseye)

Boot system: EFI, not using secure boot, booting xen
hypervisor and dom0 bullseye with grub-efi package for
bullseye, and it boots the xen-4.14-amd64.gz file, not
the xen-4.14-amd64.efi file.
I also tested a buster dom0 with the 4.19 series kernel
on the xen-4.14 hypervisor from bullseye and saw the
problem, but I did not see the problem with either
a buster (linux 4.19) or bullseye (linux 5.10) dom0 on
the xen-4.11 hypervisor, so I think the problem is
with the Debian version of the xen-4.14 hypervisor,
not with src:linux.
You're referencing several software versions which are mismatches for
#991967.  #991967 was observed with Xen 4.11 and Linux kernel 4.19.194-3,
but not Linux kernel 4.19.181.

The fact that it correlates with a Linux kernel update points rather
strongly to the Linux kernel.  I could believe the situation is partially
the fault of both, though.

I don't see it with Xen-4.11 and Linux kernel 4.19.194-3 which is
the current default dom0 configuration on Debian buster, but I
do see it with Debian's version of Xen-4.14 and either Linux
kernel 4.19.194-3 from buster or Linux kernel 5.10.46-4 from
bullseye as the dom0. So I only saw it with the update of the
Xen hypervisor from 4.11 to 4.14. Of course you have different
hardware and a different acpi implementation which is also likely
to be a factor that determines whether or not the dom0 poweroff
bug manifests itself.


I suspect the following patch is the culprit for problems
shutting down on the amd64 architecture:

0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
This patch does affect amd64 ACPI code, and is probably causing
the problem on my amd64 system: my build of the xen-4.14
hypervisor without this patch fixed the problem.
Of the ones listed, that is the only one which has any overlap with x86
code.  The next reproduction step is `apt-get source xen && cd xen-*/ &&
patch -p1 -R < debian/patches/0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
&& dpkg-buildpackage -b`.  Then test with that build to confirm that this
patch is what does it.
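
The revert-and-rebuild step can be sketched as a small shell script. This
is only a sketch under assumptions: that the patch sits under
debian/patches/ in the unpacked source tree, that the tree was unpacked
with the Debian patches already applied, and that no later patch in the
series touches the same hunks (otherwise the reverse apply will fail). The
function names patch_path and revert_and_build are made up for
illustration.

```shell
#!/bin/sh
# Sketch: reverse one Debian patch in an unpacked source tree, then rebuild.
# Assumption: the patch lives under debian/patches/ and is already applied.

PATCH_NAME="0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch"

# Print the path of a patch inside an unpacked Debian source tree.
# (Hypothetical helper, not part of any Debian tool.)
patch_path() {
    printf '%s/debian/patches/%s\n' "$1" "$2"
}

# Reverse-apply PATCH_NAME in the given tree and build binary packages.
# (Hypothetical helper, not part of any Debian tool.)
revert_and_build() {
    src_dir=$1
    patch_file=$(patch_path "$src_dir" "$PATCH_NAME")
    if [ ! -f "$patch_file" ]; then
        echo "missing: $patch_file" >&2
        return 1
    fi
    (
        cd "$src_dir" || exit 1
        # Dry run first, so a failed reversal leaves the tree untouched.
        patch -p1 -R --dry-run < "debian/patches/$PATCH_NAME" || exit 1
        patch -p1 -R < "debian/patches/$PATCH_NAME" || exit 1
        # Binary-only build, unsigned.
        dpkg-buildpackage -b -uc -us
    )
}
```

After `apt-get source xen`, this would be called as
`revert_and_build <unpacked-dir>`, where the directory is whatever
apt-get unpacked. If the dry run fails, the patch does not reverse
cleanly against the fully patched tree and the series needs to be
adjusted by hand.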

Thing is, that delta is rather small.  I don't have a simulator, but it
seems too small to be the culprit.

I did try to remove this single patch from the xen build using
quilt, but quilt was not happy when it tried to apply the
subsequent arm patch, so I just removed all the subsequent
arm patches to keep quilt happy with my modified xen
src tree. I will try it now, though.

If it is this small a delta that is causing the problem
on x86/amd64, then maybe we can come up with a workaround
in src:xen that is acceptable for both arm and x86/amd64.


I think this bug should be re-classified as a bug in src:xen.
There could be a separate bug in src:xen, but that is not #991967.

I also would inquire with the Debian Xen Team about why they
are backporting patches from the upstream xen unstable
branch into Debian's 4.14 package that is currently shipping
on Debian stable (bullseye). IMHO, the aforementioned
patches that are not in the stable 4.14 branch upstream
should not be included in the xen package for Debian stable.
They were requested because someone trying to get Xen working on a device
needed them.  Rather a lot of bugfix or very small standalone feature
patches get cherry-picked.


Presently I haven't been convinced this is a Xen bug (though it does
affect Xen installations).

Any chance you've got the tools to build and try a 5.5.0 or 5.10.0 Linux
kernel?  I'm suspecting something got incorrectly backported on the Linux
side (alternatively the Xen project seems a bit poor at keeping needed
patches in Linux).



Yes, I recently built and tested a slightly modified Debian
bullseye kernel to test a fix for #983357:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=983357

If you have a patch for Debian's 5.10 bullseye kernel that
might fix the dom0 poweroff bug I am seeing on bullseye with
Debian's current Xen 4.14, I am willing to try it out on my
system as an alternative to the fix I found in src:xen, which
unfortunately removes arm patches that are needed by some
devices.

