
Re: blktests failures with v6.17-rc1 kernel



On Sep 01, 2025 / 11:02, Daniel Wagner wrote:
> On Mon, Sep 01, 2025 at 10:34:23AM +0200, Daniel Wagner wrote:
> > The test is removing the ports while the host driver is about to
> > reconnect and accesses a stale pointer.
> > 
> > nvme_fc_create_association is calling nvme_fc_ctlr_inactive_on_rport in
> > the error path. The problem is that nvme_fc_create_association gets half
> > through the setup and then fails. In the cleanup path
> > 
> > 	dev_warn(ctrl->ctrl.device,
> > 		"NVME-FC{%d}: create_assoc failed, assoc_id %llx ret %d\n",
> > 		ctrl->cnum, ctrl->association_id, ret);
> > 
> > is issued and then nvme_fc_ctlr_inactive_on_rport is called. And there
> > is the log message above, so it's clear the error path is taken.
> > 
> > But the thing is fcloop is not supposed to remove the ports while the
> > host driver is still using them. So there is a race window where it's
> > possible to enter nvme_fc_create_association while fcloop is removing
> > the ports.
> > 
> > That window is between nvme_fc_create_association and
> > nvme_fc_ctlr_active_on_rport.
> 
> I think the problem is that nvme_fc_create_association is not holding
> the rport locks when checking the port_state and marking the rport
> active. This races with nvme_fc_unregister_remoteport.
> 
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index 3e12d4683ac7..03987f497a5b 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -3032,11 +3032,17 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
> 
>  	++ctrl->ctrl.nr_reconnects;
> 
> -	if (ctrl->rport->remoteport.port_state != FC_OBJSTATE_ONLINE)
> +	spin_lock_irqsave(&ctrl->rport->lock, flags);
> +	if (ctrl->rport->remoteport.port_state != FC_OBJSTATE_ONLINE) {
> +		spin_unlock_irqrestore(&ctrl->rport->lock, flags);
>  		return -ENODEV;
> +	}
> 
> -	if (nvme_fc_ctlr_active_on_rport(ctrl))
> +	if (nvme_fc_ctlr_active_on_rport(ctrl)) {
> +		spin_unlock_irqrestore(&ctrl->rport->lock, flags);
>  		return -ENOTUNIQ;
> +	}
> +	spin_unlock_irqrestore(&ctrl->rport->lock, flags);
> 
>  	dev_info(ctrl->ctrl.device,
>  		"NVME-FC{%d}: create association : host wwpn 0x%016llx "
> 
> I'll try to reproduce it and see if this patch makes a difference.

I applied the fix patch above together with the previous fix patch on top of
v6.17-rc3, then repeated nvme/061 with the fc transport hundreds of times. I
did not observe the KASAN slab-use-after-free. The fix appears to work well.
Thanks!
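
For anyone following the thread, the idea behind the patch (check the port
state and mark the port active under the same lock that the unregister path
takes) can be sketched in userspace roughly as below. This is only an
illustration under assumptions: struct rport, the active counter, the helper
names and the C11 atomic_flag spinlock are all hypothetical stand-ins, not
the kernel's types or the actual fc.c code.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the remote port state; not the kernel's types. */
enum port_state { PORT_ONLINE, PORT_DELETED };

struct rport {
	atomic_flag lock;	/* toy spinlock standing in for rport->lock */
	enum port_state state;
	int active;		/* number of controllers active on this port */
};

#define RPORT_INIT { .lock = ATOMIC_FLAG_INIT, .state = PORT_ONLINE, .active = 0 }

static void rport_lock(struct rport *rp)
{
	/* Spin until the flag was previously clear, i.e. we own the lock. */
	while (atomic_flag_test_and_set_explicit(&rp->lock, memory_order_acquire))
		;
}

static void rport_unlock(struct rport *rp)
{
	atomic_flag_clear_explicit(&rp->lock, memory_order_release);
}

/*
 * Check the state and mark the port active in one critical section, so
 * the unregister side cannot slip in between the check and the mark.
 */
static int rport_get_active(struct rport *rp)
{
	int ret = 0;

	rport_lock(rp);
	if (rp->state != PORT_ONLINE)
		ret = -ENODEV;
	else
		rp->active++;
	rport_unlock(rp);
	return ret;
}

/* Drop the active mark taken by a successful rport_get_active(). */
static void rport_put_active(struct rport *rp)
{
	rport_lock(rp);
	assert(rp->active > 0);
	rp->active--;
	rport_unlock(rp);
}

/*
 * Unregister side: flip the state under the same lock and report whether
 * the port was idle, i.e. whether teardown may proceed right away.
 */
static bool rport_unregister(struct rport *rp)
{
	bool idle;

	rport_lock(rp);
	rp->state = PORT_DELETED;
	idle = (rp->active == 0);
	rport_unlock(rp);
	return idle;
}
```

With both sides serialized on one lock, a caller either marks the port active
before the unregister side sees it, or gets -ENODEV afterwards. It can never
pass the state check and then mark a port that is already going away.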

