[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: [PATCH] nbd: restore default timeout when setting it to zero



On 8/25/20 7:51 PM, Hou Pu wrote:
> 
> 
> On 2020/8/26 1:29 上午, Josef Bacik wrote:
>> On 8/25/20 4:27 AM, Hou Pu wrote:
>>>
>>>
>>> On 2020/8/24 10:02 PM, Josef Bacik wrote:
>>>> On 8/23/20 11:23 PM, Hou Pu wrote:
>>>>>
>>>>>
>>>>> On 2020/8/21 9:57 PM, Josef Bacik wrote:
>>>>>> On 8/21/20 3:21 AM, Hou Pu wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 2020/8/21 3:03 AM, Josef Bacik wrote:
>>>>>>>> On 8/10/20 8:00 AM, Hou Pu wrote:
>>>>>>>>> If we configured io timeout of nbd0 to 100s. Later after we
>>>>>>>>> finished using it, we configured nbd0 again and set the io
>>>>>>>>> timeout to 0. We expect it would timeout after 30 seconds
>>>>>>>>> and keep retry. But in fact we could not change the timeout
>>>>>>>>> when we set it to 0. the timeout is still the original 100s.
>>>>>>>>>
>>>>>>>>> So change the timeout to default 30s when we set it to zero.
>>>>>>>>> It also behaves same as commit 2da22da57348 ("nbd: fix zero
>>>>>>>>> cmd timeout handling v2").
>>>>>>>>>
>>>>>>>>> It becomes more important if we were reconfigure a nbd device
>>>>>>>>> and the io timeout it set to zero. Because it could take 30s
>>>>>>>>> to detect the new socket and thus io could be completed more
>>>>>>>>> quickly compared to 100s.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Hou Pu <houpu@bytedance.com>
>>>>>>>>> ---
>>>>>>>>>   drivers/block/nbd.c | 2 ++
>>>>>>>>>   1 file changed, 2 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>>>>>>>>> index ce7e9f223b20..bc9dc1f847e1 100644
>>>>>>>>> --- a/drivers/block/nbd.c
>>>>>>>>> +++ b/drivers/block/nbd.c
>>>>>>>>> @@ -1360,6 +1360,8 @@ static void nbd_set_cmd_timeout(struct 
>>>>>>>>> nbd_device *nbd, u64 timeout)
>>>>>>>>>       nbd->tag_set.timeout = timeout * HZ;
>>>>>>>>>       if (timeout)
>>>>>>>>>           blk_queue_rq_timeout(nbd->disk->queue, timeout * HZ);
>>>>>>>>> +    else
>>>>>>>>> +        blk_queue_rq_timeout(nbd->disk->queue, 30 * HZ);
>>>>>>>>>   }
>>>>>>>>>   /* Must be called with config_lock held */
>>>>>>>>>
>>>>>>>>
>>>>>>>> What about the tag_set.timeout?  Thanks,
>>>>>>>
>>>>>>> I think user space could set io timeout to 0, thus we set 
>>>>>>> tag_set.timeout = 0 here and also we should tell the block layer
>>>>>>> to restore 30s timeout in case it is not. tag_set.timeout == 0
>>>>>>> imply 30s io timeout and retrying after timeout.
>>>>>>>
>>>>>>> (Sorry, I am not sure if I understand your question here. Could
>>>>>>> you explain a little more if needed?)
>>>>>>>
>>>>>>
>>>>>> I misunderstood what I was using the tagset timeout for.  We don't 
>>>>>> want this here, if we're dropping a config for an nbd device and we 
>>>>>> want to reset it to defaults then we need to add this to 
>>>>>> nbd_config_put().  Thanks,
>>>>>
>>>>> AFAIK If we killed a nbd server, then restarted it and reconfigured
>>>>> the nbd socket, I think we might not reconfigure IO timeout to 0 since
>>>>> nbd_config_put() is not called in such case. So could we still
>>>>> restore default timeout here. Or am I missing something?
>>>>>
>>>>
>>>> If you kill the NBD server then the config is going to be dropped and 
>>>> need to be reconfigured, so nbd_config_put() will definitely be 
>>>> called. The only case it wouldn't be is if you are using the netlink 
>>>> interface, in which case the device is going to keep all of its 
>>>> original settings. Are you not seeing the final nbd_config_put() 
>>>> being done when you kill the nbd server?  That seems like a bug if 
>>>> not, and that should be fixed, and then this timeout thing going in 
>>>> there will fix your issue.  Thanks,
>>>
>>> I was using the netlink interface. So I could use the reconnect
>>> feature to update the nbd server without impacting the user of
>>> nbd device.
>>>
>>> I did not see the final nbd_config_put() when I killed the nbd server.
>>> After I killed the nbd server, the recv_work() put 1 config_ref.
>>> Another ref count is still held by nbd_genl_connect(). I thought it
>>> was as expected.
>>>
>>> Beside in nbd_genl_reconfigure(), it is checked nbd->config_refs should
>>> not be zero by:
>>>          if (!refcount_inc_not_zero(&nbd->config_refs)) {
>>>                  dev_err(nbd_to_dev(nbd),
>>>                          "not configured, cannot reconfigure\n");
>>>                  nbd_put(nbd);
>>>                  return -EINVAL;
>>>          }
>>> So AFAIK this behavior is as expected.
>>
>> Ahh ok I see what you're getting at.  Ok I agree, you can add
>>
>> Reviewed-by: Josef Bacik <josef@toxicpanda.com>
> 
> Thanks for your review,

Applied, thanks.

-- 
Jens Axboe


Reply to: