
WARNING: possible circular locking dependency detected in nbd



On Wed, 18 Aug 2021 09:10:49 +0200 Sven Schnelle wrote:
> Hi,
> 
> I'm seeing the lockdep splat below in CI. I think this is because

Thanks for reporting it.

> nbd_open is called with disk->open_mutex held, and acquires
> nbd_index_mutex. However, nbd_put() first takes nbd_index_mutex,
> and calls del_gendisk, which locks disk->open_mutex, so the order is
> reversed.

Right. See the diff below.
> 
> WARNING: possible circular locking dependency detected
> 5.14.0-20210816.rc5.git0.04a03f7da6c2.300.fc34.s390x+debug #1 Not tainted
> ------------------------------------------------------
> modprobe/17864 is trying to acquire lock:
> 00000001dea24d28 (&disk->open_mutex){+.+.}-{3:3}, at: del_gendisk+0x64/0x210
> 
> but task is already holding lock:
> 000003ff805fd6e8 (nbd_index_mutex){+.+.}-{3:3}, at: refcount_dec_and_mutex_lock+0x7e/0x110
> 
> which lock already depends on the new lock.
> 
> the existing dependency chain (in reverse order) is:
> -> #1 (nbd_index_mutex){+.+.}-{3:3}:
>        validate_chain+0x9ca/0xde8
>        __lock_acquire+0x64c/0xc40
>        lock_acquire.part.0+0xec/0x258
>        lock_acquire+0xb0/0x200
>        __mutex_lock+0xa2/0x8d8
>        mutex_lock_nested+0x32/0x40
>        nbd_open+0x30/0x248 [nbd]
>        blkdev_get_whole+0x38/0x128
>        blkdev_get_by_dev+0xcc/0x400
>        blkdev_open+0x7a/0xd8
>        do_dentry_open+0x19e/0x390
>        do_open+0x2e0/0x458
>        path_openat+0xec/0x2a8
>        do_filp_open+0x90/0x130
>        do_sys_openat2+0xa8/0x168
>        do_sys_open+0x62/0x90
>        __do_syscall+0x1c2/0x1f0
>        system_call+0x78/0xa0
> 
> -> #0 (&disk->open_mutex){+.+.}-{3:3}:
>        check_noncircular+0x168/0x188
>        check_prev_add+0xe0/0xed8
>        validate_chain+0x9ca/0xde8
>        __lock_acquire+0x64c/0xc40
>        lock_acquire.part.0+0xec/0x258
>        lock_acquire+0xb0/0x200
>        __mutex_lock+0xa2/0x8d8
>        mutex_lock_nested+0x32/0x40
>        del_gendisk+0x64/0x210
>        nbd_put.part.0+0x46/0x98 [nbd]
>        nbd_cleanup+0xde/0x118 [nbd]
>        __do_sys_delete_module+0x19a/0x2a8
>        __do_syscall+0x1c2/0x1f0
>        system_call+0x78/0xa0
> 
> other info that might help us debug this:
> 
>  Possible unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(nbd_index_mutex);
>                                lock(&disk->open_mutex);
>                                lock(nbd_index_mutex);
>   lock(&disk->open_mutex);
> 
>  *** DEADLOCK ***

To fix it, remove nbd from the idr first, then delete the disk without
nbd_index_mutex held.

For discussion only.

--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -2443,6 +2443,7 @@ static int nbd_exit_cb(int id, void *ptr
 	struct nbd_device *nbd = ptr;
 
 	list_add_tail(&nbd->list, list);
+	idr_remove(&nbd_index_idr, nbd->index);
 	return 0;
 }
 
@@ -2462,7 +2463,7 @@ static void __exit nbd_cleanup(void)
 		list_del_init(&nbd->list);
 		if (refcount_read(&nbd->refs) != 1)
 			printk(KERN_ERR "nbd: possibly leaking a device\n");
-		nbd_put(nbd);
+		nbd_dev_remove(nbd);
 	}
 
 	idr_destroy(&nbd_index_idr);
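
For anyone who wants to see the ordering idea outside the kernel, here is a
minimal userspace sketch with pthread mutexes. It only loosely mirrors the
shape of the diff above. Every name in it (struct dev, index_table,
index_mutex, dev_open, dev_teardown_all, run_sketch) is a hypothetical
stand-in, not the nbd code. The open path nests open_mutex -> index_mutex,
while teardown unpublishes under index_mutex and takes open_mutex only after
dropping it, so the reverse nesting never occurs.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_DEVS 4

/* Hypothetical stand-in for struct nbd_device plus its gendisk. */
struct dev {
	int index;
	int deleted;
	pthread_mutex_t open_mutex;	/* plays the role of disk->open_mutex */
};

/* Plays the role of nbd_index_mutex guarding nbd_index_idr. */
static pthread_mutex_t index_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct dev *index_table[MAX_DEVS];

static void dev_register(struct dev *d)
{
	pthread_mutex_lock(&index_mutex);
	index_table[d->index] = d;
	pthread_mutex_unlock(&index_mutex);
}

/* Open path: open_mutex first, then index_mutex. Returns 1 if still published. */
static int dev_open(struct dev *d)
{
	int found;

	pthread_mutex_lock(&d->open_mutex);
	pthread_mutex_lock(&index_mutex);
	found = (index_table[d->index] == d);
	pthread_mutex_unlock(&index_mutex);
	pthread_mutex_unlock(&d->open_mutex);
	return found;
}

/* Teardown: unpublish under index_mutex, delete only after dropping it. */
static int dev_teardown_all(void)
{
	struct dev *victims[MAX_DEVS];
	int n = 0, i;

	pthread_mutex_lock(&index_mutex);
	for (i = 0; i < MAX_DEVS; i++) {
		if (index_table[i]) {
			victims[n++] = index_table[i];
			index_table[i] = NULL;
		}
	}
	pthread_mutex_unlock(&index_mutex);

	/* index_mutex is not held here, so the two locks never nest in this path. */
	for (i = 0; i < n; i++) {
		pthread_mutex_lock(&victims[i]->open_mutex);
		victims[i]->deleted = 1;
		pthread_mutex_unlock(&victims[i]->open_mutex);
	}
	return n;
}

/* Registers two devices, opens one, tears everything down; returns count removed. */
static int run_sketch(void)
{
	static struct dev a = { .index = 0, .open_mutex = PTHREAD_MUTEX_INITIALIZER };
	static struct dev b = { .index = 2, .open_mutex = PTHREAD_MUTEX_INITIALIZER };
	int removed;

	dev_register(&a);
	dev_register(&b);
	assert(dev_open(&a) == 1);	/* still published before teardown */
	removed = dev_teardown_all();
	assert(a.deleted && b.deleted);
	assert(dev_open(&a) == 0);	/* no longer found after unpublishing */
	return removed;
}
```

Calling run_sketch() from a small main() exercises both paths. The sketch
says nothing about refcounting or about opens racing with module unload,
which the real driver still has to handle.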

