[Open-FCoE] deadlock in testing (Re: [RFC PATCH V3 11/13] fcoe: use RTNL mutex to serialize create/destroy and protect hostlist)

Chris Leech christopher.leech at intel.com
Mon Jul 13 18:53:39 UTC 2009

On Thu, Jul 09, 2009 at 05:39:24PM -0700, Joe Eykholt wrote:

> I forgot that you're now trying to destroy from the notifier.
> Is that really necessary?  I can see how it's handy, but making
> the administrator shut down the fcoe instances or do 'rmmod fcoe'
> isn't so bad and might actually prevent accidents.
> I don't think there's a clean way of doing the destroy from the notifier
> because of this flush issue and the callbacks you mentioned.  What could
> be possible is to disassociate with the netdev somehow and let the
> destroy happen asynchronously later or just leave fcoe around but
> disabled.
> If we did arrange to do a work item to do the destroy, would that
> let the rmmod of the NIC driver work?  Does rmmod wait long enough
> for the work item to do the destroy?  I guess you must've tried this
> and had it work.
> I'd vote for scrapping the ability to unload the NIC driver while fcoe is
> using it.

The main problem I see with this is the belief that you can prevent the
netdev from going away (the device itself, not the module).  Even with
the driver module referenced to prevent unloading, it is possible to hot
remove the device itself.

Regardless, the locking around trying to solve all of these issues is
certainly more complicated than I thought at first.  I think it might be
worth a recap of what's been learned recently.

1) The RTNL mutex cannot be held over the entire path of an FCoE
interface destroy, because the destroy path needs to flush several work
queue items.

The exact rules to keep in mind here are:

a) You can never call flush_scheduled_work() or flush_work() on the
default/event queue while holding RTNL.  The linkwatch events take RTNL
in the workqueue thread.

b) You can call cancel_work_sync() holding RTNL, if the work item you
are canceling does not take RTNL.

"b" doesn't help us because the FIP receive work might call
fcoe_update_src_mac() which takes RTNL.

2) Using a new mutex in the fcoe module does allow for proper managing
of create, destroy, module load and module unload.  If it were just
these cases, it might be possible to remove the hostlist lock and rely
on just this mutex.

This still leaves some open issues in trying to handle netdev
unregistered events:

a) Because the netdev notifier is called with RTNL held, the interface
destroy path cannot be called directly from here.

b) If some sort of delayed destroy were used here, the device should at
least be removed from the hostlist right away to prevent racing between
multiple destroy calls from different causes.  That causes the hostlist
lock to remain separate from the new create/destroy mutex in order to
prevent circular locking issues with RTNL.

c) The destroy can't be made asynchronous using schedule_work(), because
it then attempts to flush the very workqueue thread it's running in.  It
might be possible if it is safe to change all those flushes to

- Chris

More information about the devel mailing list