[Open-FCoE] [PATCH] libfc: Bug fix for race in rport code

Joe Eykholt jeykholt at cisco.com
Wed Apr 8 01:22:41 UTC 2009


Abhijeet Arvind Joglekar (abjoglek) wrote:
> Bug: libfc/fnic connected to JBODs. Rapidly pull/push in the JBODs to
> generate multiple RSCNs rapidly. There is a race in the remote port/disc
> state machine that causes multiple entries for the same remote port
> in /sys/class/fc_remote_ports.
> 
> Rogue ports are now tracked in a separate rogue list until they go to
> READY state and get moved to the real rports list. An incoming RSCN
> that causes rediscovery or causes a re-plogi to a remote port does not
> search the rogues list, and thus fails to log it off before starting a
> new plogi. This causes a race condition where 2 rogue ports are created
> to the same remote port. (The same problem existed when rogues were not
> tracked in any list)
> 
> This patch adds a fix by searching the rogue list in addition to the
> regular rport list on incoming RSCNs.
> 
> fc_disc_lookup_rport() is also called for incoming rport requests. I
> wanted to avoid modifying that function to also search in the rogue list,
> since that would result into a case where an incoming Plogi would find the
> rogue session, and significant change is required to support that.
> 
> So, instead, I added a new function fc_disc_lookup_all_ports() which
> searches in both lists. It is called on incoming RSCNs. On incoming rport
> requests, like Plogi etc, existing fc_disc_lookup_rport() function is used,
> thus avoiding changes to that code path.
> 
> Also, before re-discovery is initiated, the rogue list is searched and
> all rogue ports are first logged off.
> 
> Test: Tested the fix by running the same test case as above. This time,
> remote ports were logged off and re-logged on correctly, and resulted
> in single entry per remote port.
> 
> Please verify bug and fix for libfc/fcoe driver.
> 
> Signed-off-by: Abhijeet Joglekar <abjoglek at cisco.com>
> ---
>  drivers/scsi/libfc/fc_disc.c |   44 ++++++++++++++++++++++++++++++++++++++++--
>  1 files changed, 42 insertions(+), 2 deletions(-)
> 
> 
> diff --git a/drivers/scsi/libfc/fc_disc.c b/drivers/scsi/libfc/fc_disc.c
> index 6fabf66..f089db6 100644
> --- a/drivers/scsi/libfc/fc_disc.c
> +++ b/drivers/scsi/libfc/fc_disc.c
> @@ -56,6 +56,39 @@ static void fc_disc_single(struct fc_disc *, struct fc_disc_port *);
>  static void fc_disc_restart(struct fc_disc *);
>  
>  /**
> + * fc_disc_lookup_all_rports() - lookup a remote port by port_id in both
> + * the rports and the rogue_rports list

The function summary must be entirely on the first line.  If you want to
put more text about the function, it can appear after all the parameters.
Run the document generator after applying the patch to test it.

> + * @lport: Fibre Channel host port instance
> + * @port_id: remote port port_id to match
> + */
> +struct fc_rport *fc_disc_lookup_all_rports(const struct fc_lport *lport,
> +					   u32 port_id)
> +{
> +	const struct fc_disc *disc = &lport->disc;
> +	struct fc_rport *rport, *found = NULL;
> +	struct fc_rport_libfc_priv *rdata;
> +
> +	list_for_each_entry(rdata, &disc->rports, peers) {
> +		rport = PRIV_TO_RPORT(rdata);
> +		if (rport->port_id == port_id) {
> +			found = rport;
> +			goto out;
> +		}
> +	}
> +
> +	list_for_each_entry(rdata, &disc->rogue_rports, peers) {
> +		rport = PRIV_TO_RPORT(rdata);
> +		if (rport->port_id == port_id) {
> +			found = rport;
> +			goto out;
> +		}
> +	}
> +
> +out:
> +	return found;
> +}
> +
> +/**
>   * fc_disc_lookup_rport() - lookup a remote port by port_id
>   * @lport: Fibre Channel host port instance
>   * @port_id: remote port port_id to match
> @@ -249,10 +282,12 @@ static void fc_disc_recv_rscn_req(struct fc_seq *sp, struct fc_frame *fp,
>  			    redisc, lport->state, disc->pending);
>  		list_for_each_entry_safe(dp, next, &disc_ports, peers) {
>  			list_del(&dp->peers);
> -			rport = lport->tt.rport_lookup(lport, dp->ids.port_id);
> +			rport = fc_disc_lookup_all_rports(lport,
> +							  dp->ids.port_id);

>  			if (rport) {
>  				rdata = rport->dd_data;
> -				list_del(&rdata->peers);
> +				if (rdata->trans_state != FC_PORTSTATE_ROGUE)
> +					list_del(&rdata->peers);
>  				lport->tt.rport_logoff(rport);
>  			}
>  			fc_disc_single(disc, dp);
> @@ -320,6 +355,11 @@ static void fc_disc_restart(struct fc_disc *disc)
>  		lport->tt.rport_logoff(rport);
>  	}
>  
> +	list_for_each_entry_safe(rdata, next, &disc->rogue_rports, peers) {
> +		rport = PRIV_TO_RPORT(rdata);
> +		lport->tt.rport_logoff(rport);
> +	}
> +
>  	disc->requested = 1;
>  	if (!disc->pending)
>  		fc_disc_gpn_ft_req(disc);
> 
> 
> _______________________________________________
> devel mailing list
> devel at open-fcoe.org
> http://www.open-fcoe.org/mailman/listinfo/devel
> 




More information about the devel mailing list