[Open-FCoE] [PATCH] libfc: Bug fix for race in rport code
Abhijeet Arvind Joglekar (abjoglek)
abjoglek at cisco.com
Wed Apr 8 01:56:57 UTC 2009
> -----Original Message-----
> From: Joe Eykholt (jeykholt)
> Sent: Tuesday, April 07, 2009 6:23 PM
> To: Abhijeet Arvind Joglekar (abjoglek)
> Cc: devel at open-fcoe.org
> Subject: Re: [Open-FCoE] [PATCH] libfc: Bug fix for race in rport code
> Abhijeet Arvind Joglekar (abjoglek) wrote:
> > Bug: libfc/fnic connected to JBODs. Rapidly pull/push in
> the JBODs to
> > generate multiple RSCNs rapidly. There is a race in the remote
> > port/disc state machine that causes multiple entries for the same
> > remote port in /sys/class/fc_remote_ports.
> > Rogue ports are now tracked in a separate rogue list until
> they go to
> > READY state and get moved to the real rports list. An incoming RSCN
> > that causes rediscovery or causes a re-plogi to a remote
> port does not
> > search the rogues list, and thus fails to log it off before
> starting a
> > new plogi. This causes a race condition where 2 rogue ports are
> > created to the same remote port. (The same problem existed
> when rogues
> > were not tracked in any list)
> > This patch adds a fix by searching the rogue list in
> addition to the
> > regular rport list on incoming RSCNs.
> > fc_disc_lookup_rport() is also called for incoming rport
> requests. I
> > wanted to avoid modifying that function to also search in the rogue
> > list, since that would result into a case where an incoming Plogi
> > would find the rogue session, and significant change is
> required to support that.
> > So, instead, I added a new function
> fc_disc_lookup_all_ports() which
> > searches in both lists. It is called on incoming RSCNs. On incoming
> > rport requests, like Plogi etc, existing fc_disc_lookup_rport()
> > function is used, thus avoiding changes to that code path.
> > Also, before re-discovery is initiated, the rogue list is
> searched and
> > all rogue ports are first logged off.
> > Test: Tested the fix by running the same test case as above. This
> > time, remote ports were logged off and re-logged on correctly, and
> > resulted in single entry per remote port.
> > Please verify bug and fix for libfc/fcoe driver.
> > Signed-off-by: Abhijeet Joglekar <abjoglek at cisco.com>
> > ---
> > drivers/scsi/libfc/fc_disc.c | 44
> > 1 files changed, 42 insertions(+), 2 deletions(-)
> > diff --git a/drivers/scsi/libfc/fc_disc.c
> > b/drivers/scsi/libfc/fc_disc.c index 6fabf66..f089db6 100644
> > --- a/drivers/scsi/libfc/fc_disc.c
> > +++ b/drivers/scsi/libfc/fc_disc.c
> > @@ -56,6 +56,39 @@ static void fc_disc_single(struct
> fc_disc *, struct
> > fc_disc_port *); static void fc_disc_restart(struct fc_disc *);
> > /**
> > + * fc_disc_lookup_all_rports() - lookup a remote port by
> port_id in
> > + both
> > + * the rports and the rogue_rports list
> The function summary must be entirely on the first line. If
> you want to put more text about the function, it can appear
> after all the parameters.
> Run the document generator after applying the patch to test it.
Ok, will re-send patch with the fix.
More information about the devel