
Re: blktests failures with v6.17-rc1 kernel



On Mon, Sep 01, 2025 at 10:34:23AM +0200, Daniel Wagner wrote:
> The test is removing the ports while the host driver is about to
> reconnect and accesses a stale pointer.
> 
> nvme_fc_create_association is calling nvme_fc_ctlr_inactive_on_rport in
> the error path. The problem is that nvme_fc_create_association gets half
> through the setup and then fails. In the cleanup path
> 
> 	dev_warn(ctrl->ctrl.device,
> 		"NVME-FC{%d}: create_assoc failed, assoc_id %llx ret %d\n",
> 		ctrl->cnum, ctrl->association_id, ret);
> 
> is issued and then nvme_fc_ctlr_inactive_on_rport is called. And there
> is the log message above, so it's clear the error path is taken.
> 
> But the thing is fcloop is not supposed to remove the ports when the
> host driver is still using it. So there is a race window where it's
> possible to enter nvme_fc_create_association and fcloop removing the
> ports.
> 
> So between nvme_fc_create_association and nvme_fc_ctlr_active_on_rport.

I think the problem is that nvme_fc_create_association is not holding
the rport lock when checking the port_state and marking the rport
active. This races with nvme_fc_unregister_remoteport.

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 3e12d4683ac7..03987f497a5b 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3032,11 +3032,17 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)

 	++ctrl->ctrl.nr_reconnects;

-	if (ctrl->rport->remoteport.port_state != FC_OBJSTATE_ONLINE)
+	spin_lock_irqsave(&ctrl->rport->lock, flags);
+	if (ctrl->rport->remoteport.port_state != FC_OBJSTATE_ONLINE) {
+		spin_unlock_irqrestore(&ctrl->rport->lock, flags);
 		return -ENODEV;
+	}

-	if (nvme_fc_ctlr_active_on_rport(ctrl))
+	if (nvme_fc_ctlr_active_on_rport(ctrl)) {
+		spin_unlock_irqrestore(&ctrl->rport->lock, flags);
 		return -ENOTUNIQ;
+	}
+	spin_unlock_irqrestore(&ctrl->rport->lock, flags);

 	dev_info(ctrl->ctrl.device,
 		"NVME-FC{%d}: create association : host wwpn 0x%016llx "

I'll try to reproduce it and see if this patch makes a difference.
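
For anyone following along, the idea behind the patch (check the port
state and mark the rport active inside one critical section) can be
sketched in userspace. This is only an illustration, not the driver
code: demo_rport, demo_activate and demo_unregister are made-up names,
a pthread mutex stands in for the rport spinlock, and it assumes the
unregister side changes the state under the same lock.

```c
/* Userspace sketch only; all names here are hypothetical. */
#include <errno.h>
#include <pthread.h>

enum demo_port_state { DEMO_PORT_ONLINE, DEMO_PORT_DELETED };

struct demo_rport {
	pthread_mutex_t lock;		/* stands in for the rport spinlock */
	enum demo_port_state state;
	int act_ctrl_cnt;		/* number of active controllers */
};

/*
 * Check the state and mark the port active in one critical section,
 * so a concurrent unregister cannot slip in between the two steps.
 */
static int demo_activate(struct demo_rport *rport)
{
	int ret = 0;

	pthread_mutex_lock(&rport->lock);
	if (rport->state != DEMO_PORT_ONLINE)
		ret = -ENODEV;
	else
		rport->act_ctrl_cnt++;
	pthread_mutex_unlock(&rport->lock);

	return ret;
}

/* Teardown side: changes the state under the same lock (assumption). */
static void demo_unregister(struct demo_rport *rport)
{
	pthread_mutex_lock(&rport->lock);
	rport->state = DEMO_PORT_DELETED;
	pthread_mutex_unlock(&rport->lock);
}
```

With both sides taking the same lock, demo_activate either sees the
port online and bumps the count before the unregister runs, or sees it
deleted and bails out with -ENODEV. It can no longer observe "online"
and then mark a port active that has been torn down in the meantime.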

