Re: [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req()

To: Ming Lei <ming.lei@redhat.com>
Cc: <axboe@kernel.dk>, <josef@toxicpanda.com>, <hch@infradead.org>, <linux-block@vger.kernel.org>, <linux-kernel@vger.kernel.org>, <nbd@other.debian.org>, <yi.zhang@huawei.com>
Subject: Re: [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req()
From: "yukuai (C)" <yukuai3@huawei.com>
Date: Wed, 15 Sep 2021 11:36:47 +0800
Message-id: <60f68f6b-5fff-6a39-b77b-4bbb86f1c87e@huawei.com>
In-reply-to: <YUFldgfRYrJyxFtz@T590>
References: <20210909141256.2606682-1-yukuai3@huawei.com> <20210909141256.2606682-6-yukuai3@huawei.com> <YT/2z4PSeW5oJWMq@T590> <c6af73a2-f12d-eeef-616e-ae0cdb4f6f2d@huawei.com> <YUBE4BJ7+kN1c4l8@T590> <374c6b37-b4b2-fe01-66be-ca2dbbc283e9@huawei.com> <YUBTVBioqJ7qas2R@T590> <b8301834-5541-76ee-13a9-0fa565fce7e3@huawei.com> <YUCzr2ysb+vJ1x0W@T590> <8f1849a3-6bf2-6b14-7ef9-4969a9a5425b@huawei.com> <YUFldgfRYrJyxFtz@T590>

On 2021/09/15 11:16, Ming Lei wrote:

On Wed, Sep 15, 2021 at 09:54:09AM +0800, yukuai (C) wrote:

On 2021/09/14 22:37, Ming Lei wrote:

On Tue, Sep 14, 2021 at 05:19:31PM +0800, yukuai (C) wrote:

On 2021/09/14 15:46, Ming Lei wrote:

If the above can happen, blk_mq_find_and_get_req() may not fix it either.
Just wondering: why not take the following simpler approach to avoid the UAF?

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 5170a630778d..dfa5cce71f66 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -795,9 +795,13 @@ static void recv_work(struct work_struct *work)
    						     work);
    	struct nbd_device *nbd = args->nbd;
    	struct nbd_config *config = nbd->config;
+	struct request_queue *q = nbd->disk->queue;
    	struct nbd_cmd *cmd;
    	struct request *rq;
+	if (!percpu_ref_tryget(&q->q_usage_counter))
+                return;
+
    	while (1) {
    		cmd = nbd_read_stat(nbd, args->index);
    		if (IS_ERR(cmd)) {
@@ -813,6 +817,7 @@ static void recv_work(struct work_struct *work)
    		if (likely(!blk_should_fake_timeout(rq->q)))
    			blk_mq_complete_request(rq);
    	}
+	blk_queue_exit(q);
    	nbd_config_put(nbd);
    	atomic_dec(&config->recv_threads);
    	wake_up(&config->recv_wq);


Hi, Ming

This approach is wrong.

If blk_mq_freeze_queue() is called and nbd is waiting for all
requests to complete, percpu_ref_tryget() will fail here, and a deadlock
will occur because the requests can't complete in recv_work().


No, percpu_ref_tryget() won't fail until ->q_usage_counter is zero, when
it is perfectly fine to do nothing in recv_work().


Hi Ming

This approach is a good idea; however, we should not get q_usage_counter
in recv_work(), because it will block freezing the queue.

How about getting q_usage_counter in nbd_read_stat(), and putting it in the
error path or after request completion?


OK, looks like I missed that nbd_read_stat() needs to wait for an incoming
reply first, so how about the following change, which splits nbd_read_stat()
into nbd_read_reply() and nbd_handle_reply()?


Hi, Ming

The change looks good to me.

Do you want to send a patch to fix this?

Thanks,
Kuai


diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 5170a630778d..477fe057fc93 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -683,38 +683,47 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
  	return 0;
  }

-/* NULL returned = something went wrong, inform userspace */
-static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
+static int nbd_read_reply(struct nbd_device *nbd, int index,
+		struct nbd_reply *reply)
  {
-	struct nbd_config *config = nbd->config;
  	int result;
-	struct nbd_reply reply;
-	struct nbd_cmd *cmd;
-	struct request *req = NULL;
-	u64 handle;
-	u16 hwq;
-	u32 tag;
-	struct kvec iov = {.iov_base = &reply, .iov_len = sizeof(reply)};
+	struct kvec iov = {.iov_base = reply, .iov_len = sizeof(*reply)};
  	struct iov_iter to;
-	int ret = 0;

-	reply.magic = 0;
+	reply->magic = 0;
  	iov_iter_kvec(&to, READ, &iov, 1, sizeof(reply));
  	result = sock_xmit(nbd, index, 0, &to, MSG_WAITALL, NULL);
-	if (result <= 0) {
-		if (!nbd_disconnected(config))
+	if (result < 0) {
+		if (!nbd_disconnected(nbd->config))
  			dev_err(disk_to_dev(nbd->disk),
  				"Receive control failed (result %d)\n", result);
-		return ERR_PTR(result);
+		return result;
  	}

-	if (ntohl(reply.magic) != NBD_REPLY_MAGIC) {
+	if (ntohl(reply->magic) != NBD_REPLY_MAGIC) {
  		dev_err(disk_to_dev(nbd->disk), "Wrong magic (0x%lx)\n",
-				(unsigned long)ntohl(reply.magic));
-		return ERR_PTR(-EPROTO);
+				(unsigned long)ntohl(reply->magic));
+		return -EPROTO;
  	}

-	memcpy(&handle, reply.handle, sizeof(handle));
+	return 0;
+}
+
+/* NULL returned = something went wrong, inform userspace */
+static struct nbd_cmd *nbd_handle_reply(struct nbd_device *nbd, int index,
+		struct nbd_reply *reply)
+{
+	struct nbd_config *config = nbd->config;
+	int result;
+	struct nbd_cmd *cmd;
+	struct request *req = NULL;
+	u64 handle;
+	u16 hwq;
+	u32 tag;
+	struct iov_iter to;
+	int ret = 0;
+
+	memcpy(&handle, reply->handle, sizeof(handle));
  	tag = nbd_handle_to_tag(handle);
  	hwq = blk_mq_unique_tag_to_hwq(tag);
  	if (hwq < nbd->tag_set.nr_hw_queues)
@@ -747,9 +756,9 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
  		ret = -ENOENT;
  		goto out;
  	}
-	if (ntohl(reply.error)) {
+	if (ntohl(reply->error)) {
  		dev_err(disk_to_dev(nbd->disk), "Other side returned error (%d)\n",
-			ntohl(reply.error));
+			ntohl(reply->error));
  		cmd->status = BLK_STS_IOERR;
  		goto out;
  	}
@@ -795,24 +804,36 @@ static void recv_work(struct work_struct *work)
  						     work);
  	struct nbd_device *nbd = args->nbd;
  	struct nbd_config *config = nbd->config;
+	struct request_queue *q = nbd->disk->queue;
+	struct nbd_sock *nsock;
  	struct nbd_cmd *cmd;
  	struct request *rq;

  	while (1) {
-		cmd = nbd_read_stat(nbd, args->index);
-		if (IS_ERR(cmd)) {
-			struct nbd_sock *nsock = config->socks[args->index];
+		struct nbd_reply reply;

-			mutex_lock(&nsock->tx_lock);
-			nbd_mark_nsock_dead(nbd, nsock, 1);
-			mutex_unlock(&nsock->tx_lock);
+		if (nbd_read_reply(nbd, args->index, &reply))
  			break;
-		}

+		if (!percpu_ref_tryget(&q->q_usage_counter))
+			break;
+
+		cmd = nbd_handle_reply(nbd, args->index, &reply);
+		if (IS_ERR(cmd)) {
+			blk_queue_exit(q);
+			break;
+		}
  		rq = blk_mq_rq_from_pdu(cmd);
  		if (likely(!blk_should_fake_timeout(rq->q)))
  			blk_mq_complete_request(rq);
+		blk_queue_exit(q);
  	}
+
+	nsock = config->socks[args->index];
+	mutex_lock(&nsock->tx_lock);
+	nbd_mark_nsock_dead(nbd, nsock, 1);
+	mutex_unlock(&nsock->tx_lock);
+
  	nbd_config_put(nbd);
  	atomic_dec(&config->recv_threads);
  	wake_up(&config->recv_wq);


Thanks,
Ming
