Re: [PATCH] nbd: restore default timeout when setting it to zero
On 8/25/20 7:51 PM, Hou Pu wrote:
>
>
> On 2020/8/26 1:29 上午, Josef Bacik wrote:
>> On 8/25/20 4:27 AM, Hou Pu wrote:
>>>
>>>
>>> On 2020/8/24 10:02 PM, Josef Bacik wrote:
>>>> On 8/23/20 11:23 PM, Hou Pu wrote:
>>>>>
>>>>>
>>>>> On 2020/8/21 9:57 PM, Josef Bacik wrote:
>>>>>> On 8/21/20 3:21 AM, Hou Pu wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 2020/8/21 3:03 AM, Josef Bacik wrote:
>>>>>>>> On 8/10/20 8:00 AM, Hou Pu wrote:
>>>>>>>>> If we configured io timeout of nbd0 to 100s. Later after we
>>>>>>>>> finished using it, we configured nbd0 again and set the io
>>>>>>>>> timeout to 0. We expect it would timeout after 30 seconds
>>>>>>>>> and keep retry. But in fact we could not change the timeout
>>>>>>>>> when we set it to 0. the timeout is still the original 100s.
>>>>>>>>>
>>>>>>>>> So change the timeout to default 30s when we set it to zero.
>>>>>>>>> It also behaves same as commit 2da22da57348 ("nbd: fix zero
>>>>>>>>> cmd timeout handling v2").
>>>>>>>>>
>>>>>>>>> It becomes more important if we were reconfigure a nbd device
>>>>>>>>> and the io timeout it set to zero. Because it could take 30s
>>>>>>>>> to detect the new socket and thus io could be completed more
>>>>>>>>> quickly compared to 100s.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Hou Pu <houpu@bytedance.com>
>>>>>>>>> ---
>>>>>>>>> drivers/block/nbd.c | 2 ++
>>>>>>>>> 1 file changed, 2 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>>>>>>>>> index ce7e9f223b20..bc9dc1f847e1 100644
>>>>>>>>> --- a/drivers/block/nbd.c
>>>>>>>>> +++ b/drivers/block/nbd.c
>>>>>>>>> @@ -1360,6 +1360,8 @@ static void nbd_set_cmd_timeout(struct
>>>>>>>>> nbd_device *nbd, u64 timeout)
>>>>>>>>> nbd->tag_set.timeout = timeout * HZ;
>>>>>>>>> if (timeout)
>>>>>>>>> blk_queue_rq_timeout(nbd->disk->queue, timeout * HZ);
>>>>>>>>> + else
>>>>>>>>> + blk_queue_rq_timeout(nbd->disk->queue, 30 * HZ);
>>>>>>>>> }
>>>>>>>>> /* Must be called with config_lock held */
>>>>>>>>>
>>>>>>>>
>>>>>>>> What about the tag_set.timeout? Thanks,
>>>>>>>
>>>>>>> I think user space could set io timeout to 0, thus we set
>>>>>>> tag_set.timeout = 0 here and also we should tell the block layer
>>>>>>> to restore 30s timeout in case it is not. tag_set.timeout == 0
>>>>>>> imply 30s io timeout and retrying after timeout.
>>>>>>>
>>>>>>> (Sorry, I am not sure if I understand your question here. Could
>>>>>>> you explain a little more if needed?)
>>>>>>>
>>>>>>
>>>>>> I misunderstood what I was using the tagset timeout for. We don't
>>>>>> want this here, if we're dropping a config for an nbd device and we
>>>>>> want to reset it to defaults then we need to add this to
>>>>>> nbd_config_put(). Thanks,
>>>>>
>>>>> AFAIK If we killed a nbd server, then restarted it and reconfigured
>>>>> the nbd socket, I think we might not reconfigure IO timeout to 0 since
>>>>> nbd_config_put() is not called in such case. So could we still
>>>>> restore default timeout here. Or am I missing something?
>>>>>
>>>>
>>>> If you kill the NBD server then the config is going to be dropped and
>>>> need to be reconfigured, so nbd_config_put() will definitely be
>>>> called. The only case it wouldn't be is if you are using the netlink
>>>> interface, in which case the device is going to keep all of its
>>>> original settings. Are you not seeing the final nbd_config_put()
>>>> being done when you kill the nbd server? That seems like a bug if
>>>> not, and that should be fixed, and then this timeout thing going in
>>>> there will fix your issue. Thanks,
>>>
>>> I was using the netlink interface. So I could use the reconnect
>>> feature to update the nbd server without impacting the user of
>>> nbd device.
>>>
>>> I did not see the final nbd_config_put() when I killed the nbd server.
>>> After I killed the nbd server, the recv_work() put 1 config_ref.
>>> Another ref count is still held by nbd_genl_connect(). I thought it
>>> was as expected.
>>>
>>> Beside in nbd_genl_reconfigure(), it is checked nbd->config_refs should
>>> not be zero by:
>>> if (!refcount_inc_not_zero(&nbd->config_refs)) {
>>> dev_err(nbd_to_dev(nbd),
>>> "not configured, cannot reconfigure\n");
>>> nbd_put(nbd);
>>> return -EINVAL;
>>> }
>>> So AFAIK this behavior is as expected.
>>
>> Ahh ok I see what you're getting at. Ok I agree, you can add
>>
>> Reviewed-by: Josef Bacik <josef@toxicpanda.com>
>
> Thanks for your review,
Applied, thanks.
--
Jens Axboe
Reply to: