[Open-FCoE] [PATCH] libfc: Bug fix for race in rport code

Abhijeet Arvind Joglekar (abjoglek) abjoglek at cisco.com
Wed Apr 8 01:56:57 UTC 2009


> -----Original Message-----
> From: Joe Eykholt (jeykholt) 
> Sent: Tuesday, April 07, 2009 6:23 PM
> To: Abhijeet Arvind Joglekar (abjoglek)
> Cc: devel at open-fcoe.org
> Subject: Re: [Open-FCoE] [PATCH] libfc: Bug fix for race in rport code
> 
> Abhijeet Arvind Joglekar (abjoglek) wrote:
> > Bug: libfc/fnic connected to JBODs. Rapidly pulling and reinserting the
> > JBODs generates multiple RSCNs in quick succession. There is a race in the
> > remote port/disc state machine that causes multiple entries for the same
> > remote port in /sys/class/fc_remote_ports.
> > 
> > Rogue ports are now tracked in a separate rogue list until they go to
> > READY state and get moved to the real rports list. An incoming RSCN that
> > causes rediscovery or a re-PLOGI to a remote port does not search the
> > rogue list, and thus fails to log the rogue port off before starting a new
> > PLOGI. This causes a race condition where two rogue ports are created for
> > the same remote port. (The same problem existed when rogues were not
> > tracked in any list.)
> > 
> > This patch fixes the race by searching the rogue list in addition to the
> > regular rport list on incoming RSCNs.
> > 
> > fc_disc_lookup_rport() is also called for incoming rport requests. I
> > wanted to avoid modifying that function to also search the rogue list,
> > since an incoming PLOGI would then find the rogue session, and significant
> > changes are required to support that.
> > 
> > So, instead, I added a new function, fc_disc_lookup_all_rports(), which
> > searches both lists. It is called on incoming RSCNs. On incoming rport
> > requests, such as PLOGI, the existing fc_disc_lookup_rport() function is
> > used, thus avoiding changes to that code path.
> > 
> > Also, before rediscovery is initiated, the rogue list is searched and
> > all rogue ports are first logged off.
> > 
> > Test: Tested the fix by running the same test case as above. This time,
> > remote ports were logged off and logged back on correctly, resulting in a
> > single entry per remote port.
> > 
> > Please verify the bug and the fix with the libfc/fcoe driver.
> > 
> > Signed-off-by: Abhijeet Joglekar <abjoglek at cisco.com>
> > ---
> >  drivers/scsi/libfc/fc_disc.c |   44 ++++++++++++++++++++++++++++++++++++++++--
> >  1 files changed, 42 insertions(+), 2 deletions(-)
> > 
> > 
> > diff --git a/drivers/scsi/libfc/fc_disc.c b/drivers/scsi/libfc/fc_disc.c
> > index 6fabf66..f089db6 100644
> > --- a/drivers/scsi/libfc/fc_disc.c
> > +++ b/drivers/scsi/libfc/fc_disc.c
> > @@ -56,6 +56,39 @@ static void fc_disc_single(struct fc_disc *, struct fc_disc_port *);
> >  static void fc_disc_restart(struct fc_disc *);
> >  
> >  /**
> > + * fc_disc_lookup_all_rports() - lookup a remote port by port_id in both
> > + * the rports and the rogue_rports list
> 
> The function summary must be entirely on the first line.  If 
> you want to put more text about the function, it can appear 
> after all the parameters.
> Run the document generator after applying the patch to test it.

Ok, will re-send patch with the fix.
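
For reference, here is a minimal sketch of the comment layout being asked for: a one-line summary, then the parameters, then any longer description. It is attached to a userspace stand-in for the two-list lookup. Everything named sketch_* is hypothetical and is not libfc code. The quoted hunk is cut off before the parameter list, so @disc and @port_id are assumptions as well, as is the order in which the two lists are searched.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a remote port entry; not the libfc structure. */
struct sketch_rport {
	uint32_t port_id;
	struct sketch_rport *next;
};

/* Hypothetical stand-in for a discovery context that owns both lists. */
struct sketch_disc {
	struct sketch_rport *rports;       /* ports that have reached READY */
	struct sketch_rport *rogue_rports; /* ports that are not yet READY */
};

/* Walk one singly linked list and return the entry matching port_id. */
struct sketch_rport *sketch_find(struct sketch_rport *head, uint32_t port_id)
{
	for (; head != NULL; head = head->next)
		if (head->port_id == port_id)
			return head;
	return NULL;
}

/**
 * sketch_lookup_all_rports() - Look up a remote port by port ID in both lists
 * @disc: discovery context holding the rports and rogue_rports lists
 * @port_id: port ID to search for
 *
 * Any longer description goes here, after the parameters. The regular
 * list is searched first (an assumption of this sketch), then the rogue
 * list.
 *
 * Return: the matching entry, or NULL if neither list contains it.
 */
struct sketch_rport *sketch_lookup_all_rports(struct sketch_disc *disc,
					      uint32_t port_id)
{
	struct sketch_rport *found = sketch_find(disc->rports, port_id);

	if (found == NULL)
		found = sketch_find(disc->rogue_rports, port_id);
	return found;
}
```

The RSCN path would call the combined lookup, while incoming requests such as PLOGI would keep using a lookup that only walks the regular list, which is the split described in the patch description.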

Thanks.



