blktests failures with v6.17 kernel
Hi all,
I ran the latest blktests (git hash: 8c610b5bd81b) with the v6.17 kernel and
observed the three failures listed below. Compared with the previous report for
the v6.17-rc1 kernel [1], no new failures were found.
[1] https://lore.kernel.org/linux-block/suhzith2uj75uiprq4m3cglvr7qwm3d7gi4tmjeohlxl6fcmv3@zu6zym6nmvun/
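For reference, a sketch of how such a run can be invoked. This is an
illustration, not the exact setup used for this report: it assumes a checked-out
blktests tree, root privileges, nvme-cli, and the nvmet/nvme-loop/transport
modules available; the config file contents are an assumption based on the
transports named below.

```shell
# Hypothetical reproduction sketch (assumes root, nvme-cli, and the
# relevant nvme/nvmet kernel modules; not the literal setup of this report).
cd blktests
git checkout 8c610b5bd81b        # blktests revision used for this report

# Select the transport via the blktests config file.
echo 'nvme_trtype=tcp' > config
./check nvme/005 nvme/063        # tcp transport cases

echo 'nvme_trtype=fc' > config
./check nvme/041 nvme/061        # fc transport cases
```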
List of failures
================
#1: nvme/005,063 (tcp transport)
#2: nvme/041 (fc transport)
#3: nvme/061 (fc transport)
Failure description
===================
#1: nvme/005,063 (tcp transport)
The test case nvme/063 fails for tcp transport due to a lockdep WARN involving
the three locks q->q_usage_counter, q->elevator_lock and set->srcu. Refer to
the report for the v6.16-rc1 kernel [2].
[2] https://lore.kernel.org/linux-block/4fdm37so3o4xricdgfosgmohn63aa7wj3ua4e5vpihoamwg3ui@fq42f5q5t5ic/
I also noticed that when the kernel config option NVME_MULTIPATH is enabled,
the same lockdep WARN is observed with nvme/005 [3].
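For completeness, the kind of config change involved can be sketched as below.
This is a generic illustration using the in-tree scripts/config helper; the
actual build commands used for this report may differ.

```shell
# Enable NVMe multipath in the kernel config and rebuild (sketch).
# With CONFIG_NVME_MULTIPATH=y, nvme/005 hits the same lockdep splat
# as nvme/063 in my runs.
./scripts/config --enable CONFIG_NVME_MULTIPATH
make olddefconfig
make -j"$(nproc)"
```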
#2: nvme/041 (fc transport)
The test case nvme/041 fails for fc transport. Refer to the report for the
v6.12 kernel [4].
[4] https://lore.kernel.org/linux-nvme/6crydkodszx5vq4ieox3jjpwkxtu7mhbohypy24awlo5w7f4k6@to3dcng24rd4/
#3: nvme/061 (fc transport)
The test case nvme/061 sometimes fails for fc transport due to a WARN and
the refcount message "refcount_t: underflow; use-after-free." Refer to the
report for the v6.15 kernel [5]. Daniel provided a fix for this failure [6],
which will be included in the v6.18-rc1 kernel.
[5] https://lore.kernel.org/linux-block/2xsfqvnntjx5iiir7wghhebmnugmpfluv6ef22mghojgk6gilr@mvjscqxroqqk/
[6] https://lore.kernel.org/linux-nvme/20250902-fix-nvmet-fc-v3-2-1ae1ecb798d8@kernel.org/
[3]
[ 49.880160] [ T1102] run blktests nvme/005 at 2025-10-02 15:52:41
[ 49.955684] [ T1486] loop0: detected capacity change from 0 to 2097152
[ 49.973977] [ T1489] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[ 50.000216] [ T1493] nvmet_tcp: enabling port 0 (10.0.2.15:4420)
[ 50.094320] [ T113] nvmet: Created nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[ 50.101675] [ T1500] nvme nvme5: creating 4 I/O queues.
[ 50.105800] [ T1500] nvme nvme5: mapped 4/0/0 default/read/poll queues.
[ 50.109259] [ T1500] nvme nvme5: new ctrl: NQN "blktests-subsystem-1", addr 10.0.2.15:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
[ 50.424485] [ T169] nvmet: Created nvm controller 2 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[ 50.430147] [ T462] nvme nvme5: creating 4 I/O queues.
[ 50.448180] [ T462] nvme nvme5: mapped 4/0/0 default/read/poll queues.
[ 50.500881] [ T1548] nvme nvme5: Removing ctrl: NQN "blktests-subsystem-1"
[ 50.536145] [ T1548] ======================================================
[ 50.537233] [ T1548] WARNING: possible circular locking dependency detected
[ 50.538374] [ T1548] 6.17.0 #367 Not tainted
[ 50.539332] [ T1548] ------------------------------------------------------
[ 50.540471] [ T1548] nvme/1548 is trying to acquire lock:
[ 50.541497] [ T1548] ffff888104f45f90 (set->srcu){.+.+}-{0:0}, at: __synchronize_srcu+0x21/0x240
[ 50.542804] [ T1548]
but task is already holding lock:
[ 50.544410] [ T1548] ffff88813b235e38 (&q->elevator_lock){+.+.}-{4:4}, at: elevator_change+0x12c/0x510
[ 50.545719] [ T1548]
which lock already depends on the new lock.
[ 50.548023] [ T1548]
the existing dependency chain (in reverse order) is:
[ 50.549707] [ T1548]
-> #4 (&q->elevator_lock){+.+.}-{4:4}:
[ 50.551288] [ T1548] __mutex_lock+0x1b2/0x1c30
[ 50.552175] [ T1548] elevator_change+0x12c/0x510
[ 50.553072] [ T1548] elv_iosched_store+0x271/0x2e0
[ 50.553968] [ T1548] queue_attr_store+0x235/0x360
[ 50.554843] [ T1548] kernfs_fop_write_iter+0x3d6/0x5e0
[ 50.555732] [ T1548] vfs_write+0x523/0xf80
[ 50.556547] [ T1548] ksys_write+0xfb/0x200
[ 50.557386] [ T1548] do_syscall_64+0x94/0x400
[ 50.558238] [ T1548] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 50.559171] [ T1548]
-> #3 (&q->q_usage_counter(io)){++++}-{0:0}:
[ 50.560670] [ T1548] blk_alloc_queue+0x5bf/0x710
[ 50.561469] [ T1548] blk_mq_alloc_queue+0x13f/0x250
[ 50.562326] [ T1548] scsi_alloc_sdev+0x843/0xc60
[ 50.563160] [ T1548] scsi_probe_and_add_lun+0x473/0xbc0
[ 50.564013] [ T1548] __scsi_add_device+0x1be/0x1f0
[ 50.564828] [ T1548] ata_scsi_scan_host+0x139/0x3a0
[ 50.565632] [ T1548] async_run_entry_fn+0x93/0x540
[ 50.566451] [ T1548] process_one_work+0x868/0x14b0
[ 50.567279] [ T1548] worker_thread+0x5ee/0xfd0
[ 50.568059] [ T1548] kthread+0x3af/0x770
[ 50.568796] [ T1548] ret_from_fork+0x3e5/0x510
[ 50.569562] [ T1548] ret_from_fork_asm+0x1a/0x30
[ 50.570337] [ T1548]
-> #2 (fs_reclaim){+.+.}-{0:0}:
[ 50.571673] [ T1548] fs_reclaim_acquire+0xd5/0x120
[ 50.572407] [ T1548] kmem_cache_alloc_node_noprof+0x55/0x430
[ 50.573254] [ T1548] __alloc_skb+0x1e9/0x2e0
[ 50.573972] [ T1548] tcp_send_active_reset+0x81/0x750
[ 50.574690] [ T1548] tcp_disconnect+0x1430/0x1c70
[ 50.575342] [ T1548] __tcp_close+0x74b/0xe50
[ 50.576061] [ T1548] tcp_close+0x1f/0xb0
[ 50.576727] [ T1548] inet_release+0x100/0x230
[ 50.577355] [ T1548] __sock_release+0xb0/0x270
[ 50.578075] [ T1548] sock_close+0x14/0x20
[ 50.578731] [ T1548] __fput+0x36e/0xac0
[ 50.579311] [ T1548] delayed_fput+0x6a/0x90
[ 50.580001] [ T1548] process_one_work+0x868/0x14b0
[ 50.580704] [ T1548] worker_thread+0x5ee/0xfd0
[ 50.581323] [ T1548] kthread+0x3af/0x770
[ 50.581980] [ T1548] ret_from_fork+0x3e5/0x510
[ 50.582602] [ T1548] ret_from_fork_asm+0x1a/0x30
[ 50.583245] [ T1548]
-> #1 (sk_lock-AF_INET-NVME){+.+.}-{0:0}:
[ 50.584369] [ T1548] lock_sock_nested+0x32/0xf0
[ 50.585061] [ T1548] tcp_sendmsg+0x1c/0x50
[ 50.585697] [ T1548] sock_sendmsg+0x2ef/0x410
[ 50.586281] [ T1548] nvme_tcp_try_send_cmd_pdu+0x57f/0xbc0 [nvme_tcp]
[ 50.587101] [ T1548] nvme_tcp_try_send+0x1b3/0x9c0 [nvme_tcp]
[ 50.587847] [ T1548] nvme_tcp_queue_rq+0xf77/0x1970 [nvme_tcp]
[ 50.588564] [ T1548] blk_mq_dispatch_rq_list+0x39b/0x21d0
[ 50.589271] [ T1548] __blk_mq_sched_dispatch_requests+0x1dd/0x14f0
[ 50.590076] [ T1548] blk_mq_sched_dispatch_requests+0xa8/0x150
[ 50.590823] [ T1548] blk_mq_run_hw_queue+0x1c9/0x520
[ 50.591460] [ T1548] blk_execute_rq+0x166/0x380
[ 50.592118] [ T1548] __nvme_submit_sync_cmd+0x104/0x320 [nvme_core]
[ 50.592930] [ T1548] nvmf_connect_io_queue+0x1c6/0x2f0 [nvme_fabrics]
[ 50.593731] [ T1548] nvme_tcp_start_queue+0x813/0xbd0 [nvme_tcp]
[ 50.594449] [ T1548] nvme_tcp_setup_ctrl.cold+0x6fb/0xcbf [nvme_tcp]
[ 50.595274] [ T1548] nvme_tcp_create_ctrl+0x835/0xb90 [nvme_tcp]
[ 50.596079] [ T1548] nvmf_dev_write+0x3e3/0x800 [nvme_fabrics]
[ 50.596846] [ T1548] vfs_write+0x1cc/0xf80
[ 50.597426] [ T1548] ksys_write+0xfb/0x200
[ 50.598095] [ T1548] do_syscall_64+0x94/0x400
[ 50.598750] [ T1548] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 50.599445] [ T1548]
-> #0 (set->srcu){.+.+}-{0:0}:
[ 50.600512] [ T1548] __lock_acquire+0x14a7/0x2290
[ 50.601213] [ T1548] lock_sync+0xb8/0x120
[ 50.601854] [ T1548] __synchronize_srcu+0xa0/0x240
[ 50.602486] [ T1548] elevator_switch+0x2c4/0x660
[ 50.603193] [ T1548] elevator_change+0x2de/0x510
[ 50.603876] [ T1548] elevator_set_none+0xa0/0xf0
[ 50.604496] [ T1548] blk_unregister_queue+0x13f/0x2b0
[ 50.605258] [ T1548] __del_gendisk+0x263/0x9e0
[ 50.605926] [ T1548] del_gendisk+0x102/0x190
[ 50.606524] [ T1548] nvme_ns_remove+0x32a/0x900 [nvme_core]
[ 50.607296] [ T1548] nvme_remove_namespaces+0x263/0x3b0 [nvme_core]
[ 50.608129] [ T1548] nvme_do_delete_ctrl+0xf5/0x160 [nvme_core]
[ 50.608916] [ T1548] nvme_delete_ctrl_sync.cold+0x8/0xd [nvme_core]
[ 50.609721] [ T1548] nvme_sysfs_delete+0x96/0xc0 [nvme_core]
[ 50.610433] [ T1548] kernfs_fop_write_iter+0x3d6/0x5e0
[ 50.611186] [ T1548] vfs_write+0x523/0xf80
[ 50.611839] [ T1548] ksys_write+0xfb/0x200
[ 50.612431] [ T1548] do_syscall_64+0x94/0x400
[ 50.613125] [ T1548] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 50.613891] [ T1548]
other info that might help us debug this:
[ 50.615475] [ T1548] Chain exists of:
set->srcu --> &q->q_usage_counter(io) --> &q->elevator_lock
[ 50.617333] [ T1548] Possible unsafe locking scenario:
[ 50.618497] [ T1548] CPU0 CPU1
[ 50.619230] [ T1548] ---- ----
[ 50.619921] [ T1548] lock(&q->elevator_lock);
[ 50.620510] [ T1548] lock(&q->q_usage_counter(io));
[ 50.621353] [ T1548] lock(&q->elevator_lock);
[ 50.622196] [ T1548] sync(set->srcu);
[ 50.622763] [ T1548]
*** DEADLOCK ***
[ 50.624172] [ T1548] 5 locks held by nvme/1548:
[ 50.624809] [ T1548] #0: ffff88813163c428 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0xfb/0x200
[ 50.625783] [ T1548] #1: ffff88813a344488 (&of->mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0x257/0x5e0
[ 50.626823] [ T1548] #2: ffff88814fbf9698 (kn->active#138){++++}-{0:0}, at: sysfs_remove_file_self+0x61/0xb0
[ 50.627887] [ T1548] #3: ffff888108600190 (&set->update_nr_hwq_lock){++++}-{4:4}, at: del_gendisk+0xfa/0x190
[ 50.628938] [ T1548] #4: ffff88813b235e38 (&q->elevator_lock){+.+.}-{4:4}, at: elevator_change+0x12c/0x510
[ 50.630026] [ T1548]
stack backtrace:
[ 50.631053] [ T1548] CPU: 3 UID: 0 PID: 1548 Comm: nvme Not tainted 6.17.0 #367 PREEMPT(voluntary)
[ 50.631056] [ T1548] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-4.fc42 04/01/2014
[ 50.631060] [ T1548] Call Trace:
[ 50.631061] [ T1548] <TASK>
[ 50.631063] [ T1548] dump_stack_lvl+0x6a/0x90
[ 50.631067] [ T1548] print_circular_bug.cold+0x185/0x1d0
[ 50.631071] [ T1548] check_noncircular+0x14a/0x170
[ 50.631076] [ T1548] __lock_acquire+0x14a7/0x2290
[ 50.631078] [ T1548] ? save_trace+0x53/0x360
[ 50.631082] [ T1548] lock_sync+0xb8/0x120
[ 50.631083] [ T1548] ? __synchronize_srcu+0x21/0x240
[ 50.631086] [ T1548] ? __synchronize_srcu+0x21/0x240
[ 50.631088] [ T1548] __synchronize_srcu+0xa0/0x240
[ 50.631090] [ T1548] ? __pfx___synchronize_srcu+0x10/0x10
[ 50.631094] [ T1548] ? kvm_clock_get_cycles+0x14/0x30
[ 50.631097] [ T1548] ? ktime_get_mono_fast_ns+0x82/0x380
[ 50.631099] [ T1548] ? lockdep_hardirqs_on_prepare+0xce/0x1b0
[ 50.631102] [ T1548] elevator_switch+0x2c4/0x660
[ 50.631106] [ T1548] elevator_change+0x2de/0x510
[ 50.631109] [ T1548] elevator_set_none+0xa0/0xf0
[ 50.631112] [ T1548] ? __pfx_elevator_set_none+0x10/0x10
[ 50.631114] [ T1548] ? kernfs_put.part.0+0x12d/0x480
[ 50.631117] [ T1548] ? kobject_put+0x5a/0x4a0
[ 50.631120] [ T1548] blk_unregister_queue+0x13f/0x2b0
[ 50.631123] [ T1548] __del_gendisk+0x263/0x9e0
[ 50.631126] [ T1548] ? down_read+0x13b/0x480
[ 50.631129] [ T1548] ? __pfx___del_gendisk+0x10/0x10
[ 50.631130] [ T1548] ? __pfx_down_read+0x10/0x10
[ 50.631133] [ T1548] ? up_write+0x1c8/0x520
[ 50.631136] [ T1548] del_gendisk+0x102/0x190
[ 50.631138] [ T1548] nvme_ns_remove+0x32a/0x900 [nvme_core]
[ 50.631158] [ T1548] nvme_remove_namespaces+0x263/0x3b0 [nvme_core]
[ 50.631178] [ T1548] ? __pfx_nvme_remove_namespaces+0x10/0x10 [nvme_core]
[ 50.631197] [ T1548] nvme_do_delete_ctrl+0xf5/0x160 [nvme_core]
[ 50.631216] [ T1548] nvme_delete_ctrl_sync.cold+0x8/0xd [nvme_core]
[ 50.631235] [ T1548] nvme_sysfs_delete+0x96/0xc0 [nvme_core]
[ 50.631254] [ T1548] ? __pfx_sysfs_kf_write+0x10/0x10
[ 50.631256] [ T1548] kernfs_fop_write_iter+0x3d6/0x5e0
[ 50.631259] [ T1548] ? __pfx_kernfs_fop_write_iter+0x10/0x10
[ 50.631261] [ T1548] vfs_write+0x523/0xf80
[ 50.631264] [ T1548] ? __pfx_vfs_write+0x10/0x10
[ 50.631266] [ T1548] ? fput_close_sync+0xda/0x1b0
[ 50.631269] [ T1548] ? do_raw_spin_unlock+0x55/0x230
[ 50.631272] [ T1548] ? lockdep_hardirqs_on+0x88/0x130
[ 50.631274] [ T1548] ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 50.631276] [ T1548] ? do_syscall_64+0x180/0x400
[ 50.631279] [ T1548] ksys_write+0xfb/0x200
[ 50.631281] [ T1548] ? __pfx_ksys_write+0x10/0x10
[ 50.631283] [ T1548] ? ksys_read+0xfb/0x200
[ 50.631285] [ T1548] ? __pfx_ksys_read+0x10/0x10
[ 50.631288] [ T1548] do_syscall_64+0x94/0x400
[ 50.631290] [ T1548] ? lockdep_hardirqs_on+0x88/0x130
[ 50.631292] [ T1548] ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 50.631293] [ T1548] ? do_syscall_64+0x180/0x400
[ 50.631295] [ T1548] ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 50.631296] [ T1548] ? do_syscall_64+0x180/0x400
[ 50.631299] [ T1548] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 50.631301] [ T1548] RIP: 0033:0x7fd435ff877e
[ 50.631304] [ T1548] Code: 4d 89 d8 e8 d4 bc 00 00 4c 8b 5d f8 41 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 11 c9 c3 0f 1f 80 00 00 00 00 48 8b 45 10 0f 05 <c9> c3 83 e2 39 83 fa 08 75 e7 e8 13 ff ff ff 0f 1f 00 f3 0f 1e fa
[ 50.631308] [ T1548] RSP: 002b:00007fffcb25ded0 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 50.631313] [ T1548] RAX: ffffffffffffffda RBX: 00007fd4361c28e0 RCX: 00007fd435ff877e
[ 50.631314] [ T1548] RDX: 0000000000000001 RSI: 00007fd4361c28e0 RDI: 0000000000000003
[ 50.631316] [ T1548] RBP: 00007fffcb25dee0 R08: 0000000000000000 R09: 0000000000000000
[ 50.631317] [ T1548] R10: 0000000000000000 R11: 0000000000000202 R12: 000000003d3687f0
[ 50.631318] [ T1548] R13: 000000003d36a510 R14: 000000003d368610 R15: 0000000000000000
[ 50.631322] [ T1548] </TASK>