Bug#1116358: [regression] Regression from 90bfb28d5fa8 ("io_uring/rw: drop -EOPNOTSUPP check in __io_complete_rw_common()"): LVM snapshots causing I/O errors in KVM guest with aio=io_uring set
On 10/13/25 11:47 AM, Salvatore Bonaccorso wrote:
> Hi Jens,
>
> Kevin Lumik reported in Debian the following issue (not exactly a
> minimal reproducer, but bisection results follow):
>
> On Fri, Sep 26, 2025 at 10:34:22AM +0300, Kevin Lumik wrote:
>
>> Dear Maintainer,
>>
>> After upgrading from Debian Bookworm to Trixie, an issue has surfaced within KVM guests when creating an LVM snapshot
>> of a guest's volume. When an LVM snapshot is taken from the host, the guest starts to get I/O errors. The issue seems to only
>> appear when aio=io_uring is specified in the KVM drive parameters, and it also seems to resolve when downgrading the kernel
>> package to 6.1. The issue is also not reproducible when using aio=native.
>>
>> KVM args for the drive:
>> "-drive id=drive-virtio0,format=raw,file=/dev/dom/vps_testsql,cache=none,aio=io_uring,index=0,media=disk,if=virtio"
>>
>> An IO workload is being created in the VM using "fio --randrepeat=1 --ioengine=io_uring --direct=1 --gtod_reduce=1
>> --name=randwrite --filename=/root/test.bin --bs=4k --iodepth=64 --runtime=60 --numjobs=32 --readwrite=randwrite --size=1G
>> --rwmixread=75 --group_reporting"
>>
>> After running "lvcreate -s /dev/dom/vps_testsql -n test -L 1G" on the host we can observe fio erroring out:
>>
>> ...
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=46977024, buflen=4096
>> fio: pid=7367, err=5/file:io_u.c:1876, func=io_u error, error=Input/output error
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=261505024, buflen=4096
>> fio: pid=7374, err=5/file:io_u.c:1876, func=io_u error, error=Input/output error
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=973840384, buflen=4096
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=9637888, buflen=4096
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=159965184, buflen=4096
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=857505792, buflen=4096
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=90787840, buflen=4096
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=26427392, buflen=4096
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=955621376, buflen=4096
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=96169984, buflen=4096
>> fio: pid=7372, err=5/file:io_u.c:1876, func=io_u error, error=Input/output error
>> fio: pid=7362, err=5/file:io_u.c:1876, func=io_u error, error=Input/output error
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=203702272, buflen=4096
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=814649344, buflen=4096
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=91467776, buflen=4096
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=948256768, buflen=4096
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=105295872, buflen=4096
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=75247616, buflen=4096
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=1062293504, buflen=4096
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=111955968, buflen=4096
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=942563328, buflen=4096
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=117354496, buflen=4096
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=1050402816, buflen=4096
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=125419520, buflen=4096
>> fio: io_u error on file /root/test.bin: Input/output error: write offset=129044480, buflen=4096
>> fio: pid=7369, err=5/file:io_u.c:1876, func=io_u error, error=Input/output error
>> fio: pid=7373, err=5/file:io_u.c:1876, func=io_u error, error=Input/output error
>> fio: pid=7348, err=5/file:io_u.c:1876, func=io_u error, error=Input/output error
>> fio: pid=7375, err=5/file:io_u.c:1876, func=io_u error, error=Input/output error
>> fio: pid=7358, err=5/file:io_u.c:1876, func=io_u error, error=Input/output error
>> fio: pid=7359, err=5/file:io_u.c:1876, func=io_u error, error=Input/output error
>>
>> randwrite: (groupid=0, jobs=32): err= 5 (file:io_u.c:1876, func=io_u error, error=Input/output error): pid=7344: Thu Sep 25 16:28:30 2025
>> write: IOPS=194k, BW=758MiB/s (795MB/s)(3976MiB/5244msec); 0 zone resets
>> bw ( KiB/s): min=657104, max=1009816, per=100.00%, avg=785345.80, stdev=4331.73, samples=320
>> iops : min=164276, max=252454, avg=196336.40, stdev=1082.93, samples=320
>> cpu : usr=0.33%, sys=1.07%, ctx=18912, majf=0, minf=455
>> IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=99.8%
>> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>> complete : 0=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
>> issued rwts: total=0,1019955,0,0 short=0,0,0,0 dropped=0,0,0,0
>> latency : target=0, window=0, percentile=100.00%, depth=64
>>
>> Run status group 0 (all jobs):
>> WRITE: bw=758MiB/s (795MB/s), 758MiB/s-758MiB/s (795MB/s-795MB/s), io=3976MiB (4169MB), run=5244-5244msec
>>
>> Disk stats (read/write):
>> vda: ios=1/1001361, merge=0/2, ticks=0/10314531, in_queue=10314549, util=97.82%
>> Bus error
>>
>> ---
>>
>> And the error is also visible in the kernel error log of the VM:
>> [ 361.962970] I/O error, dev vda, sector 83277192 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2
>> [ 361.963945] I/O error, dev vda, sector 83067480 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2
>> [ 361.964489] I/O error, dev vda, sector 82881208 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2
>> [ 361.964499] I/O error, dev vda, sector 83031976 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2
>> [ 361.964501] I/O error, dev vda, sector 83089832 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2
>> [ 361.964503] I/O error, dev vda, sector 83156184 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2
>> [ 361.964522] I/O error, dev vda, sector 83183384 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2
>> [ 361.964524] I/O error, dev vda, sector 83310704 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2
>> [ 361.964525] I/O error, dev vda, sector 83311144 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2
>> [ 361.964532] I/O error, dev vda, sector 83315272 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2
>>
>> The host does not generate any errors; they only seem to occur within the VM.
>
> I asked Kevin whether a bisection was possible, and the following
> result was found (https://bugs.debian.org/1116358#41):
>
> I've identified the first bad commit using git bisect:
>
> 90bfb28d5fa8127a113a140c9791ea0b40ab156a is the first bad commit
> commit 90bfb28d5fa8127a113a140c9791ea0b40ab156a
> Author: Jens Axboe <axboe@kernel.dk>
> Date: Tue Sep 10 08:57:04 2024 -0600
>
> io_uring/rw: drop -EOPNOTSUPP check in __io_complete_rw_common()
>
> A recent change ensured that the necessary -EOPNOTSUPP -> -EAGAIN
> transformation happens inline on both the reader and writer side,
> and hence there's no need to check for both of these anymore on
> the completion handler side.
>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>
> io_uring/rw.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
>
> Here is the git bisect log as well:
>
> git bisect start
> # status: waiting for both good and bad commits
> # good: [98f7e32f20d28ec452afb208f9cffc08448a2652] Linux 6.11
> git bisect good 98f7e32f20d28ec452afb208f9cffc08448a2652
> # status: waiting for bad commit, 1 good commit known
> # bad: [59b723cd2adbac2a34fc8e12c74ae26ae45bf230] Linux 6.12-rc6
> git bisect bad 59b723cd2adbac2a34fc8e12c74ae26ae45bf230
> # bad: [de848da12f752170c2ebe114804a985314fd5a6a] Merge tag 'drm-next-2024-09-19' of https://gitlab.freedesktop.org/drm/kernel
> git bisect bad de848da12f752170c2ebe114804a985314fd5a6a
> # bad: [7b17f5ebd5fc5e9275eaa5af3d0771f2a7b01bbf] Merge tag 'soc-dt-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
> git bisect bad 7b17f5ebd5fc5e9275eaa5af3d0771f2a7b01bbf
> # good: [64dd3b6a79f0907d36de481b0f15fab323a53e5a] Merge tag 'for-linus-non-x86' of git://git.kernel.org/pub/scm/virt/kvm/kvm
> git bisect good 64dd3b6a79f0907d36de481b0f15fab323a53e5a
> # bad: [daa394f0f9d3cb002c72e2d3db99972e2ee42862] Merge tag 'core-debugobjects-2024-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect bad daa394f0f9d3cb002c72e2d3db99972e2ee42862
> # good: [effdcd5275ed645f6e0f8e8ce690b97795722197] Merge tag 'affs-for-6.12-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
> git bisect good effdcd5275ed645f6e0f8e8ce690b97795722197
> # bad: [26bb0d3f38a764b743a3ad5c8b6e5b5044d7ceb4] Merge tag 'for-6.12/block-20240913' of git://git.kernel.dk/linux
> git bisect bad 26bb0d3f38a764b743a3ad5c8b6e5b5044d7ceb4
> # bad: [3a4d319a8fb5a9bbdf5b31ef32841eb286b1dcc2] Merge tag 'for-6.12/io_uring-20240913' of git://git.kernel.dk/linux
> git bisect bad 3a4d319a8fb5a9bbdf5b31ef32841eb286b1dcc2
> # good: [df2825e98507d10cb037a308087ecd7cb3f6688d] btrfs: always pass readahead state to defrag
> git bisect good df2825e98507d10cb037a308087ecd7cb3f6688d
> # good: [69a3a0a45a2f72412c2ba31761cc9193bb746fef] Merge tag 'erofs-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
> git bisect good 69a3a0a45a2f72412c2ba31761cc9193bb746fef
> # good: [ecd5c9b29643f383d39320e30d21b8615bd893da] io_uring/kbuf: add io_kbuf_commit() helper
> git bisect good ecd5c9b29643f383d39320e30d21b8615bd893da
> # good: [f011c9cf04c06f16b24f583d313d3c012e589e50] io_uring/sqpoll: do not allow pinning outside of cpuset
> git bisect good f011c9cf04c06f16b24f583d313d3c012e589e50
> # bad: [84eacf177faa605853c58e5b1c0d9544b88c16fd] io_uring/io-wq: inherit cpuset of cgroup in io worker
> git bisect bad 84eacf177faa605853c58e5b1c0d9544b88c16fd
> # bad: [90bfb28d5fa8127a113a140c9791ea0b40ab156a] io_uring/rw: drop -EOPNOTSUPP check in __io_complete_rw_common()
> git bisect bad 90bfb28d5fa8127a113a140c9791ea0b40ab156a
> # good: [c0a9d496e0fece67db777bd48550376cf2960c47] io_uring/rw: treat -EOPNOTSUPP for IOCB_NOWAIT like -EAGAIN
> git bisect good c0a9d496e0fece67db777bd48550376cf2960c47
> # first bad commit: [90bfb28d5fa8127a113a140c9791ea0b40ab156a] io_uring/rw: drop -EOPNOTSUPP check in __io_complete_rw_common()
>
> #regzbot introduced: 90bfb28d5fa8127a113a140c9791ea0b40ab156a
> #regzbot link: https://bugs.debian.org/1116358
>
> Does this ring any bell?

Thanks to both of you, this is very useful! I guess the original commit
is mistaken, there's still a bubbling up of EOPNOTSUPP there. I'd say
let's just revert that commit. I will do that upstream and have it bubble
down to the stable kernels too.
I can always look into this later and reintroduce it, if need be.
--
Jens Axboe
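
For readers following the thread, the decision at stake can be sketched
in a few lines of C. This is only an illustration of the idea described
in the commit messages quoted above, not the kernel's code: the function
name completion_needs_retry() and its check_eopnotsupp flag are
hypothetical, and the flag merely models whether the completion side
still treats -EOPNOTSUPP like -EAGAIN (as before 90bfb28d5fa8) or not
(as after it).

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper, NOT a kernel function.
 *
 * Decide whether the result of a non-blocking I/O attempt should be
 * retried instead of being reported to the submitter as a failure.
 *
 * res:              completion result (negative errno or bytes done)
 * check_eopnotsupp: assumption for illustration; true models a
 *                   completion handler that still checks -EOPNOTSUPP,
 *                   false models one where that check was dropped.
 */
bool completion_needs_retry(long res, bool check_eopnotsupp)
{
    /* -EAGAIN always means "try again". */
    if (res == -EAGAIN)
        return true;

    /* -EOPNOTSUPP is only retried if the handler still checks for it. */
    if (check_eopnotsupp && res == -EOPNOTSUPP)
        return true;

    /* Anything else (success or another error) is passed through. */
    return false;
}
```

With check_eopnotsupp set to false, a -EOPNOTSUPP that still reaches the
completion side is passed through as an error rather than retried, which
is the kind of "bubbling up" mentioned in the reply above. How that
result ends up as an I/O error in the guest is not covered by this
sketch.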