
Bug#943373: marked as done (linux-image-4.19.0-5-amd64: file system corruption in virtual machines)



Your message dated Mon, 5 Aug 2024 15:41:52 +0200
with message-id <ZrDWoCn_p92xuwg_@eldamar.lan>
and subject line Re: Bug#943373: additional information
has caused the Debian Bug report #943373,
regarding linux-image-4.19.0-5-amd64: file system corruption in virtual machines
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
943373: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=943373
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: src:linux
Version: 4.19.37-5+deb10u1

Dear Maintainer,

   * What led up to the situation?

   The upgrade of a Xen host from Stretch to Buster.


   * What was the outcome of this action?
   After the upgrade from Stretch to Buster of the host (dom0), several domUs developed
   file system errors leading to file system corruption, causing the domUs to remount in read-only mode.

   Buster ships Xen 4.11, which supports PVH(v2) mode, while Stretch ships Xen 4.8, which does not.
   After the upgrade, the supported Debian Linux domUs were switched from PV mode to PVH mode.
   Xen PVH mode requires at least a Linux 4.11 domU kernel.
   This requirement was satisfied by the stretch-backports linux-image-4.19.0-0.bpo.5-amd64 kernel.
   The domUs were configured to use drbd on lvm2 storage on an HP Smart Array P410i controller using the
   'hpsa' driver.
   The domUs were configured with a swap file.
   The domUs were started with the default Xen PV(H) configuration using a kernel provided by the hypervisor, e.g.:
   type = 'pvh'
   kernel      = '/boot/vmlinuz-4.19.0-0.bpo.5-amd64'
   ramdisk     = '/boot/initrd.img-4.19.0-0.bpo.5-amd64'
   extra       = 'elevator=noop noresume'
   root        = '/dev/xvda2 ro'
   disk        = 'drbd:volume,xvda2,w'
   The file system corruption developed in a matter of days after starting the domUs in PVH mode.
   Output of /sys/block/drbd[n]/queue/scheduler (dom0): none
   Output of /sys/module/scsi_mod/parameters/use_blk_mq (dom0): Y
   The issue occurred only in Stretch domUs that were using the stretch-backports
   linux-image-4.19.0-0.bpo.5-amd64 kernel (as opposed to the standard stretch 4.9 kernel)
   and only in PVH mode.
   No information specific to the issue was found in the dom0 or domU log files, nor in the dom0 kernel log.
   Information from the domU kernel log (dmesg) was not preserved.
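
   The two dom0 values quoted above can be collected with a short script when
   comparing hosts. The following is only a minimal sketch in Python: the
   function names (active_scheduler, read_sysfs, report) are hypothetical and
   not part of any Debian or Xen tool, and it assumes the sysfs paths named in
   this report exist on the host being checked.

```python
from pathlib import Path
from typing import Optional


def active_scheduler(text: str) -> Optional[str]:
    """Return the active scheduler from the contents of a queue/scheduler file.

    The kernel marks the active entry with brackets, e.g. "[none] mq-deadline".
    A device with no selectable scheduler shows a bare "none", as in this report.
    """
    tokens = text.split()
    for token in tokens:
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    # No bracketed entry: a single bare token is the only possible choice.
    return tokens[0] if len(tokens) == 1 else None


def read_sysfs(path: str) -> Optional[str]:
    """Read a sysfs attribute; return None if it is absent or unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def report(block_dir: str = "/sys/block") -> dict:
    """Collect the active scheduler of each drbd* device and scsi_mod's use_blk_mq."""
    result = {}
    base = Path(block_dir)
    if base.is_dir():
        for dev in sorted(base.glob("drbd*")):
            raw = read_sysfs(str(dev / "queue" / "scheduler"))
            result[dev.name] = active_scheduler(raw) if raw else None
    # Path taken from this report; it may not exist on newer kernels.
    result["use_blk_mq"] = read_sysfs("/sys/module/scsi_mod/parameters/use_blk_mq")
    return result
```

   Calling report() in dom0 returns a dictionary keyed by device name; a value
   of None means the attribute could not be read on that host.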


   * What outcome did you expect instead?

   The expected outcome had been that after upgrading the Stretch host
   to Buster, the domUs would run without issues, as before.


   * What exactly did you do (or not do) that was effective (or ineffective)?

   After reverting the domUs to the 4.9 kernel and PV mode, the issue did not recur on the same upgraded
   (Buster) machine, even in domUs configured with a swap file.
   As an additional precaution, some domUs were reconfigured to use a swap partition instead of a swap file.

   The configuration with drbd on lvm2 volumes was maintained, as it had been in use without issue for a number of years
   on this machine and similar machines.


-- Package-specific info:
** Version:
Linux version 4.19.0-5-amd64 (debian-kernel@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-6)) #1 SMP Debian 4.19.37-5+deb10u1 (2019-07-19)

** Command line:
placeholder root=/dev/mapper/host-vg-root ro console=tty0 console=ttyS1,115200n8 quiet

** Not tainted

** Model information
sys_vendor: HP
product_name: ProLiant DL360 G7
product_version:
chassis_vendor: HP
chassis_version:
bios_vendor: HP
bios_version: P68

** PCI devices:
05:00.0 RAID bus controller [0104]: Hewlett-Packard Company Smart Array G6 controllers [103c:323a] (rev 01)
        Subsystem: Hewlett-Packard Company Smart Array P410i [103c:3245]
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 28
        Region 0: Memory at fbe00000 (64-bit, non-prefetchable) [size=2M]
        Region 2: Memory at fbdf0000 (64-bit, non-prefetchable) [size=4K]
        Region 4: I/O ports at 4000 [size=256]
        [virtual] Expansion ROM at fbd00000 [disabled] [size=512K]
        Capabilities: <access denied>
        Kernel driver in use: hpsa
        Kernel modules: hpsa

-- System Information:
Debian Release: 10.0
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 4.19.0-5-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US:en (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Nick


--- End Message ---
--- Begin Message ---
Hi,

On Wed, Jul 31, 2024 at 03:20:06PM +0000, Veonax wrote:
> The likely root cause of this issue has been identified as a
> hardware problem: for those interested, a logical drive built from
> HDDs of two different brands. After the situation was remediated,
> the file system corruption did not reoccur.

Thanks for reporting back; closing the bug in this case.

Regards,
Salvatore

--- End Message ---
