[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: [PATCH] nbd: fix shutdown and recv work deadlock



Josef and Jens,

Ignore this patch. It could also deadlock but in a different way, and it
looks like there are other possible issues with races and refcounts. I
will send some new patches.


On 12/02/2019 03:51 PM, Mike Christie wrote:
> This fixes a regression added with:
> 
> commit e9e006f5fcf2bab59149cb38a48a4817c1b538b4
> Author: Mike Christie <mchristi@redhat.com>
> Date:   Sun Aug 4 14:10:06 2019 -0500
> 
>     nbd: fix max number of supported devs
> 
> where we can deadlock during device shutdown. The problem will occur if
> userpsace has done a NBD_CLEAR_SOCK call, then does close() before the
> recv_work work has done its nbd_config_put() call. If recv_work does the
> last call then it will do destroy_workqueue which will then be stuck
> waiting for the work we are running from.
> 
> This fixes the issue by having nbd_start_device_ioctl flush the work
> queue on both the failure and success cases and has a refcount on the
> nbd_device while it is flushing the work queue.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Mike Christie <mchristi@redhat.com>
> ---
>  drivers/block/nbd.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 57532465fb83..f8597d2fb365 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -1293,13 +1293,15 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *b
>  
>  	if (max_part)
>  		bdev->bd_invalidated = 1;
> +
> +	refcount_inc(&nbd->config_refs);
>  	mutex_unlock(&nbd->config_lock);
>  	ret = wait_event_interruptible(config->recv_wq,
>  					 atomic_read(&config->recv_threads) == 0);
> -	if (ret) {
> +	if (ret)
>  		sock_shutdown(nbd);
> -		flush_workqueue(nbd->recv_workq);
> -	}
> +	flush_workqueue(nbd->recv_workq);
> +
>  	mutex_lock(&nbd->config_lock);
>  	nbd_bdev_reset(bdev);
>  	/* user requested, ignore socket errors */
> @@ -1307,6 +1309,7 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *b
>  		ret = 0;
>  	if (test_bit(NBD_RT_TIMEDOUT, &config->runtime_flags))
>  		ret = -ETIMEDOUT;
> +	nbd_config_put(nbd);
>  	return ret;
>  }
>  
> 


Reply to: