WARNING: possible circular locking dependency detected in nbd
On Wed, 18 Aug 2021 09:10:49 +0200 Sven Schnelle wrote:
> Hi,
>
> i'm seeing the lockdep splat below in CI. I think this is because
Thanks for reporting it.
> nbd_open is called with disk->open_mutex held, and acquires
> nbd_index_mutex. However, nbd_put() first takes the nbd_index_lock,
> and calls del_gendisk, which locks disk->open_mutex, so the order is
> reversed.
Right. See the diff below.
>
> WARNING: possible circular locking dependency detected
> 5.14.0-20210816.rc5.git0.04a03f7da6c2.300.fc34.s390x+debug #1 Not tainted
> ------------------------------------------------------
> modprobe/17864 is trying to acquire lock:
> 00000001dea24d28 (&disk->open_mutex){+.+.}-{3:3}, at: del_gendisk+0x64/0x210
>
> but task is already holding lock:
> 000003ff805fd6e8 (nbd_index_mutex){+.+.}-{3:3}, at: refcount_dec_and_mutex_lock+0x7e/0x110
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
> -> #1 (nbd_index_mutex){+.+.}-{3:3}:
> validate_chain+0x9ca/0xde8
> __lock_acquire+0x64c/0xc40
> lock_acquire.part.0+0xec/0x258
> lock_acquire+0xb0/0x200
> __mutex_lock+0xa2/0x8d8
> mutex_lock_nested+0x32/0x40
> nbd_open+0x30/0x248 [nbd]
> blkdev_get_whole+0x38/0x128
> blkdev_get_by_dev+0xcc/0x400
> blkdev_open+0x7a/0xd8
> do_dentry_open+0x19e/0x390
> do_open+0x2e0/0x458
> path_openat+0xec/0x2a8
> do_filp_open+0x90/0x130
> do_sys_openat2+0xa8/0x168
> do_sys_open+0x62/0x90
> __do_syscall+0x1c2/0x1f0
> system_call+0x78/0xa0
>
> -> #0 (&disk->open_mutex){+.+.}-{3:3}:
> check_noncircular+0x168/0x188
> check_prev_add+0xe0/0xed8
> validate_chain+0x9ca/0xde8
> __lock_acquire+0x64c/0xc40
> lock_acquire.part.0+0xec/0x258
> lock_acquire+0xb0/0x200
> __mutex_lock+0xa2/0x8d8
> mutex_lock_nested+0x32/0x40
> del_gendisk+0x64/0x210
> nbd_put.part.0+0x46/0x98 [nbd]
> nbd_cleanup+0xde/0x118 [nbd]
> __do_sys_delete_module+0x19a/0x2a8
> __do_syscall+0x1c2/0x1f0
> system_call+0x78/0xa0
>
> other info that might help us debug this:
>
> Possible unsafe locking scenario:
>
> CPU0 CPU1
> ---- ----
> lock(nbd_index_mutex);
> lock(&disk->open_mutex);
> lock(nbd_index_mutex);
> lock(&disk->open_mutex);
>
> *** DEADLOCK ***
To fix it, remove nbd from the idr first, then delete the disk without
holding nbd_index_mutex.
For discussion only.
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -2443,6 +2443,7 @@ static int nbd_exit_cb(int id, void *ptr
 	struct nbd_device *nbd = ptr;
 	list_add_tail(&nbd->list, list);
+	idr_remove(&nbd_index_idr, nbd->index);
 	return 0;
 }
@@ -2462,7 +2463,7 @@ static void __exit nbd_cleanup(void)
 		list_del_init(&nbd->list);
 		if (refcount_read(&nbd->refs) != 1)
 			printk(KERN_ERR "nbd: possibly leaking a device\n");
-		nbd_put(nbd);
+		nbd_dev_remove(nbd);
 	}
 	idr_destroy(&nbd_index_idr);
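
The ordering the diff relies on can also be tried in userspace. The following is only a rough sketch with pthread mutexes, not the nbd code: index_mutex, struct dev, dev_index[], dev_register(), dev_open() and dev_remove() are all hypothetical stand-ins for nbd_index_mutex, the nbd device with its disk->open_mutex, nbd_index_idr and the open/remove paths. The point is that the removal path unlinks the device while holding the index lock, drops that lock, and only then takes the open lock, so the index lock is never held while acquiring the open lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_DEVS 4

/* Hypothetical stand-in for nbd_index_mutex; protects dev_index[]. */
static pthread_mutex_t index_mutex = PTHREAD_MUTEX_INITIALIZER;

struct dev {
	pthread_mutex_t open_mutex;	/* stand-in for disk->open_mutex */
	int deleted;			/* set once teardown has run */
};

/* Hypothetical stand-in for nbd_index_idr. */
static struct dev *dev_index[MAX_DEVS];

/* Add d to the index under index_mutex. Returns 0, or -1 if id is taken. */
static int dev_register(struct dev *d, int id)
{
	int ret = -1;

	assert(id >= 0 && id < MAX_DEVS);
	pthread_mutex_init(&d->open_mutex, NULL);
	d->deleted = 0;

	pthread_mutex_lock(&index_mutex);
	if (!dev_index[id]) {
		dev_index[id] = d;
		ret = 0;
	}
	pthread_mutex_unlock(&index_mutex);
	return ret;
}

/*
 * Open path: open_mutex first, then index_mutex, the same order as in
 * the report's chain #1. Returns 1 if d is still reachable via the index.
 */
static int dev_open(struct dev *d, int id)
{
	int found;

	assert(id >= 0 && id < MAX_DEVS);
	pthread_mutex_lock(&d->open_mutex);
	pthread_mutex_lock(&index_mutex);
	found = (dev_index[id] == d);
	pthread_mutex_unlock(&index_mutex);
	pthread_mutex_unlock(&d->open_mutex);
	return found;
}

/*
 * Removal path: unlink under index_mutex, release it, and only then
 * take open_mutex for teardown. Returns the removed device or NULL.
 */
static struct dev *dev_remove(int id)
{
	struct dev *d;

	assert(id >= 0 && id < MAX_DEVS);
	pthread_mutex_lock(&index_mutex);
	d = dev_index[id];
	dev_index[id] = NULL;
	pthread_mutex_unlock(&index_mutex);

	if (d) {
		pthread_mutex_lock(&d->open_mutex);
		d->deleted = 1;
		pthread_mutex_unlock(&d->open_mutex);
	}
	return d;
}
```

In this sketch an open racing with removal can at worst find the index slot empty; neither path ever waits for the second lock while holding the one the other path needs.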