[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: [PATCH] nbd: restore default timeout when setting it to zero





On 2020/8/26 1:29 上午, Josef Bacik wrote:
On 8/25/20 4:27 AM, Hou Pu wrote:


On 2020/8/24 10:02 PM, Josef Bacik wrote:
On 8/23/20 11:23 PM, Hou Pu wrote:


On 2020/8/21 9:57 PM, Josef Bacik wrote:
On 8/21/20 3:21 AM, Hou Pu wrote:


On 2020/8/21 3:03 AM, Josef Bacik wrote:
On 8/10/20 8:00 AM, Hou Pu wrote:
If we configured io timeout of nbd0 to 100s. Later after we
finished using it, we configured nbd0 again and set the io
timeout to 0. We expect it would timeout after 30 seconds
and keep retry. But in fact we could not change the timeout
when we set it to 0. the timeout is still the original 100s.

So change the timeout to default 30s when we set it to zero.
It also behaves same as commit 2da22da57348 ("nbd: fix zero
cmd timeout handling v2").

It becomes more important if we were reconfigure a nbd device
and the io timeout it set to zero. Because it could take 30s
to detect the new socket and thus io could be completed more
quickly compared to 100s.

Signed-off-by: Hou Pu <houpu@bytedance.com>
---
  drivers/block/nbd.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index ce7e9f223b20..bc9dc1f847e1 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1360,6 +1360,8 @@ static void nbd_set_cmd_timeout(struct nbd_device *nbd, u64 timeout)
      nbd->tag_set.timeout = timeout * HZ;
      if (timeout)
          blk_queue_rq_timeout(nbd->disk->queue, timeout * HZ);
+    else
+        blk_queue_rq_timeout(nbd->disk->queue, 30 * HZ);
  }
  /* Must be called with config_lock held */


What about the tag_set.timeout?  Thanks,

I think user space could set io timeout to 0, thus we set tag_set.timeout = 0 here and also we should tell the block layer
to restore 30s timeout in case it is not. tag_set.timeout == 0
imply 30s io timeout and retrying after timeout.

(Sorry, I am not sure if I understand your question here. Could
you explain a little more if needed?)


I misunderstood what I was using the tagset timeout for.  We don't want this here, if we're dropping a config for an nbd device and we want to reset it to defaults then we need to add this to nbd_config_put().  Thanks,

AFAIK If we killed a nbd server, then restarted it and reconfigured
the nbd socket, I think we might not reconfigure IO timeout to 0 since
nbd_config_put() is not called in such case. So could we still
restore default timeout here. Or am I missing something?


If you kill the NBD server then the config is going to be dropped and need to be reconfigured, so nbd_config_put() will definitely be called. The only case it wouldn't be is if you are using the netlink interface, in which case the device is going to keep all of its original settings. Are you not seeing the final nbd_config_put() being done when you kill the nbd server?  That seems like a bug if not, and that should be fixed, and then this timeout thing going in there will fix your issue.  Thanks,

I was using the netlink interface. So I could use the reconnect
feature to update the nbd server without impacting the user of
nbd device.

I did not see the final nbd_config_put() when I killed the nbd server.
After I killed the nbd server, the recv_work() put 1 config_ref.
Another ref count is still held by nbd_genl_connect(). I thought it
was as expected.

Beside in nbd_genl_reconfigure(), it is checked nbd->config_refs should
not be zero by:
         if (!refcount_inc_not_zero(&nbd->config_refs)) {
                 dev_err(nbd_to_dev(nbd),
                         "not configured, cannot reconfigure\n");
                 nbd_put(nbd);
                 return -EINVAL;
         }
So AFAIK this behavior is as expected.

Ahh ok I see what you're getting at.  Ok I agree, you can add

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks for your review,
Hou

Thanks,

Josef


Reply to: