
Bug#810121: marked as done (linux: KVM guests randomly get I/O errors on VirtIO based devices)



Your message dated Fri, 20 Jan 2017 13:04:29 +0000
with message-id <E1cUYrl-000IYw-B3@fasolo.debian.org>
and subject line Bug#648208: fixed in os-prober 1.72
has caused the Debian Bug report #648208,
regarding linux: KVM guests randomly get I/O errors on VirtIO based devices
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
648208: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=648208
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Source: linux
Version: 3.16.7-ckt11-1+deb8u5
Severity: important

Hi kernel maintainers,

We've been seeing a strange bug in KVM guests hosted by a Debian jessie box (running 3.16.7-ckt11-1+deb8u5).

Basically, we are getting random VirtIO errors inside our guests, resulting in kernel messages like this:

[4735406.568235] blk_update_request: I/O error, dev vda, sector 142339584
[4735406.572008] EXT4-fs warning (device dm-0): ext4_end_bio:317: I/O error -5 writing to inode 1184437 (offset 0 size 208896 starting block 17729472)
[4735406.572008] Buffer I/O error on device dm-0, logical block 17729472
[ ... ]
[4735406.572008] Buffer I/O error on device dm-0, logical block 17729481
[4735406.643486] blk_update_request: I/O error, dev vda, sector 142356480
[ ... ]
[4735406.748456] blk_update_request: I/O error, dev vda, sector 38587480
[4735411.020309] Buffer I/O error on dev dm-0, logical block 12640808, lost sync page write
[4735411.055184] Aborting journal on device dm-0-8.
[4735411.056148] Buffer I/O error on dev dm-0, logical block 12615680, lost sync page write
[4735411.057626] JBD2: Error -5 detected when updating journal superblock for dm-0-8.
[4735411.057936] Buffer I/O error on dev dm-0, logical block 0, lost sync page write
[4735411.057946] EXT4-fs error (device dm-0): ext4_journal_check_start:56: Detected aborted journal
[4735411.057948] EXT4-fs (dm-0): Remounting filesystem read-only
[4735411.057949] EXT4-fs (dm-0): previous I/O error to superblock detected

(From an Ubuntu 15.04 guest, EXT4 on LVM2)

Or,

Jan 06 03:39:11 titanium kernel: end_request: I/O error, dev vda, sector 1592467904
Jan 06 03:39:11 titanium kernel: EXT4-fs warning (device vda3): ext4_end_bio:317: I/O error -5 writing to inode 31169653 (offset 0 size 0 starting block 199058492)
Jan 06 03:39:11 titanium kernel: Buffer I/O error on device vda3, logical block 198899256
[...]
Jan 06 03:39:12 titanium kernel: Aborting journal on device vda3-8.
Jan 06 03:39:12 titanium kernel: Buffer I/O error on device vda3, logical block 99647488

(From a Debian jessie guest, EXT4 directly on a VirtIO-based block device)
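
When comparing incidents across guests, it can help to pull the failing
device and sector numbers out of the kernel log. The following is only a
minimal sketch, not part of the original report: the function name
failed_sectors and the regular expression are assumptions for illustration,
and they only cover the two message forms quoted above
("blk_update_request" and "end_request").

```python
import re

# Assumed pattern: matches only the two block-layer messages quoted above, e.g.
#   "blk_update_request: I/O error, dev vda, sector 142339584"
#   "end_request: I/O error, dev vda, sector 1592467904"
IO_ERROR_RE = re.compile(
    r"(?:blk_update_request|end_request): I/O error, "
    r"dev (?P<dev>[^,\s]+), sector (?P<sector>\d+)"
)


def failed_sectors(lines):
    """Return a dict mapping device name to a sorted list of failing sectors."""
    found = {}
    for line in lines:
        match = IO_ERROR_RE.search(line)
        if match:
            found.setdefault(match.group("dev"), set()).add(int(match.group("sector")))
    return {dev: sorted(sectors) for dev, sectors in found.items()}


# Sample lines copied from the two excerpts above.
sample = [
    "[4735406.568235] blk_update_request: I/O error, dev vda, sector 142339584",
    "[4735406.572008] Buffer I/O error on device dm-0, logical block 17729472",
    "[4735406.643486] blk_update_request: I/O error, dev vda, sector 142356480",
    "Jan 06 03:39:11 titanium kernel: end_request: I/O error, dev vda, sector 1592467904",
]

print(failed_sectors(sample))
```

"Buffer I/O error" lines are deliberately ignored here, since they report
filesystem-level logical blocks rather than sectors on the VirtIO device.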

When this happens, it affects multiple guests on the host at the same time.
Normally the errors are severe enough that the guests end up with a r/o file
system, but we've seen a few guests survive with a non-fatal I/O error. The
host's dmesg shows nothing of interest.

We've seen this happen with quite heterogeneous guests:

- Debian 6, 7 and 8 (Debian kernels 2.6.32, 3.2 and 3.16)
- Ubuntu 14.09 and 15.04 (Ubuntu kernels)
- 32 bit and 64 bit installs.

In short, we haven't seen a clear characteristic in any guest, other than the
affected ones being those with some sustained I/O load (build machines,
cgit servers, PostgreSQL RDBMSs...). Most of the time, machines that just sit
there doing nothing with their disks are not affected.

The host is a stock Debian jessie install that manages libvirt-based QEMU
guests. All the guests have their block devices using virtio drivers, some of
them on spinning media behind an LSI RAID controller (previously a 3ware card,
which we replaced because we were very suspicious of it, but we are getting
the same results), and some of them on PCIe SSD storage. We have three other
hosts with a similar setup, except that they run Debian wheezy (and honestly
we're not too keen on upgrading them yet, just in case); none of them has ever
shown this kind of problem.

We've been seeing this since last summer, and haven't found a pattern that
tells us where these I/O error bugs are coming from. Google isn't revealing
other people with a similar problem, and we're finding that quite surprising as
our setup is quite basic.

Thanks,
Jordi

--- End Message ---
--- Begin Message ---
Source: os-prober
Source-Version: 1.72

We believe that the bug you reported is fixed in the latest version of
os-prober, which is due to be installed in the Debian FTP archive.

A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to 648208@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Colin Watson <cjwatson@debian.org> (supplier of updated os-prober package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@ftp-master.debian.org)


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Format: 1.8
Date: Fri, 20 Jan 2017 12:44:34 +0000
Source: os-prober
Binary: os-prober-udeb os-prober
Architecture: source
Version: 1.72
Distribution: unstable
Urgency: medium
Maintainer: Debian Install System Team <debian-boot@lists.debian.org>
Changed-By: Colin Watson <cjwatson@debian.org>
Description:
 os-prober  - utility to detect other OSes on a set of drives
 os-prober-udeb - utility to detect other OSes on a set of drives (udeb)
Closes: 648208 674561 698598 698733 701814 776275 784709 787418 794409 801631 803155
Changes:
 os-prober (1.72) unstable; urgency=medium
 .
   * Improve logging of mounting and setting partitions to ro/rw (thanks, Ivo
     De Decker).
   * Use a read-only device-mapper entry if possible rather than setting the
     underlying device to read-only (thanks, Ivo De Decker; closes: #701814).
     Note that this introduces a dependency on dmsetup on Linux
     architectures.
   * Remove the "blockdev --setro" code path entirely, since the read-only
     device-mapper arrangement supersedes it and should be safer (closes:
     #648208).
   * Make os-prober-udeb depend on grub-mount-udeb on all Linux and kFreeBSD
     architectures, now that it's available on them all (thanks, James
     Cowgill; closes: #776275).
   * Make os-prober depend on grub-common on Linux and kFreeBSD, in order
     that grub-mount is consistently available.
   * Fix detection of /usr/ partition as a GNU/Linux root partition when
     /lib* directories are moved to /usr/ completely (thanks, Hedayat
     Vatankhah; closes: #698733).
   * Make the yaboot parser more tolerant about the syntax of "append"
     options (thanks, Hedayat Vatankhah; closes: #674561).
   * Disable debugging if OS_PROBER_DISABLE_DEBUG is set (thanks, Hedayat
     Vatankhah; closes: #698598).
   * Replace basename/dirname with shell string processing (thanks, Hedayat
     Vatankhah; part of #694668).
   * Call dmraid only once (thanks, Jeff Mahoney).
   * Fix typos in README (thanks, Nyav; closes: #803155).
   * Add os-release support (based loosely on a patch by Török Edwin; closes:
     #794409).
   * Add Devuan detection (thanks, David Hare; closes: #801631).
   * Work harder to avoid trying to mount extended partitions (thanks,
     Philippe Coval; closes: #784709).
   * Drop " (loader)" suffixes on Microsoft operating systems (thanks, Chris
     Lamb; closes: #787418).
Checksums-Sha1:
 8438248e4fbc40749b85e37c16cf433db51b3cff 1816 os-prober_1.72.dsc
 c7b3b51719328b7cf0258561f4ee9c08f9a21d79 26452 os-prober_1.72.tar.xz
Checksums-Sha256:
 f79975ddbd06ed371c3f27a781d423b2d88af6cc04896946f8df03fa42915f70 1816 os-prober_1.72.dsc
 13ed24f78e83f0c49e11635891458f067e6c8348be210adda46629dd3b7b627a 26452 os-prober_1.72.tar.xz
Files:
 90ced5506beca6dc949347ab30f6e174 1816 debian-installer optional os-prober_1.72.dsc
 16ce39ea58a684f102dd7aff1779cbf5 26452 debian-installer optional os-prober_1.72.tar.xz

-----BEGIN PGP SIGNATURE-----
Comment: Colin Watson <cjwatson@debian.org> -- Debian developer

iQIzBAEBCAAdFiEErApP8SYRtvzPAcEROTWH2X2GUAsFAliCBlMACgkQOTWH2X2G
UAuXBw/+J+SAw+8es7Ca2xByEwor4P28IqkoU5uiBZ2zYo4xs727tAo6CVX0yw7Z
ctRSJhOpKF1/FtiIUNq+6E6ZKKRzqEder4S7fRI92e7FXbW0yn5unUz84EvVrcI1
5V/rhEe3y8IrXEgVWMJoY+Vg5/sQWRMHap7HamC0Zj+9zlGmjZkc01gJRr0hTvw/
ApCUhZncVSlqXb7zgjU74HVX3+YHzJaOsZY4XyhVjmEnWlY8U0pNffCnzuxJDTuY
gMDqC9AQmKyA3bhmzgi5DSh+5r6j2nwvYV9a3+xgJZvu0mv6Ng+UNd3p+uS+7BFU
8KdPy3Cnd0Xv/DkMKYdU5sOoFzsDwXF0sxJtlBpJyMFt9ZAnpRz7oxIS7D76HzYI
msUVsvnBLaldkdTKpn9BlhxGVCH1iIlwN6w3aCW/fBNjuA9VBJyEdzzhCuaAfXZC
fYwryq2gXgeowwdvH6rQcu04C7AHWiXulsl9yV1be8OlbRk8yc+CF08Ma7ABAweI
ToYXEupEfdWEkKyQPaGtgoowUquX6cmYKeKBn5E6N2AVgU+yT10JbWeg6VLzlkKe
p3UD6+iFKjfdKCjM9iy9aoyGjDGCy35lqb824R+nAZsGsJmkEMgRFgTMqYtgCOkU
Jtp0pdbat3CXTQn50nbICF77jOzcFM4p2sO3yVOI0sFOa0mitMY=
=lgmu
-----END PGP SIGNATURE-----

--- End Message ---
