[Open-FCoE] [RESUBMIT-v2 PATCH] libfc, fcoe: fixed locking issues with lport->lp_mutex around lport->link_status

Vasu Dev vasu.dev at linux.intel.com
Wed Apr 22 00:32:09 UTC 2009

On Mon, 2009-04-20 at 23:47 -0500, Mike Christie wrote:
> Mike Christie wrote:
> > It makes a giant improvement. I am only down about 5%. There is a weird 
> > period during the test startup where it takes a couple seconds to get to 
> > its max throughput. If I just set it to 3 commands there is no delay seen.
> I meant the max throughtput I can get with your patch, which was 5% 
> lower than the 3 command case or the case without the qfull change.

Good to know that this is better now and thanks for quick turnaround on
this. I also did some measurement using previously mentioned single lun
test "disktest -PT -T30 -h1 -K128 -B256k -ID /dev/sdx  -D 0:100" with or
without recently submitted patches improving fcoe pending queue handling
on this top commit of fcoe-fixes tree:-

	commit b4e00c6751c5c38a8763859aacbbeae5ba39af1a
	Author: Abhijeet Joglekar <abjoglek at cisco.com>
	Date:   Fri Apr 3 14:58:30 2009 -0700
    	libfc: whenever queueing delete ev for rport, set state to NONE

I did three measurements for each case listed below and then selected
the best of three to understand relative impact on performance for any
need to tune cmd_per_lun or remove qfull check, identified cases were:-

 	1. no parameter change.
 	2. ignore added qfull check.
 	3. reduce cmd per lun to 3 from current 32
I'm using only relative changed % value from least measured value to
understand impact for any need to change cmd_per_lun or qfull check, I
think relative values will be more helpful here beside I just used my
devel setup which is not tuned for best performance to measure true
absolute performance values.

I used two types of luns with same 10G NIC with Cisco nexus switch, one
clarion lun and other RAM based lun and here are my findings with lun

1. Lun : DGC, model: RAID 5, ver:0326 

Title:		fcoe-fixes tip,	tip+fcoe handling patches
No change:	36.034%,	42.35% (highest)
Ignore qfull:	38.10%,		39.29%
cmd_per_lun=3:	0% (least),	5.78%

2. Lun : SANBlaze Model:VLUN FC RAMDisk  Rev: 4.1".

Title: 		fcoe-fixes tip,	tip+fcoe handling patches
No change:	0%,(least)	22.11% (close to high)
Ignore qfull:	6.52%,		23.24% (highest)
cmd_per_lun=3:	13.63%,		22.44% (close to high)

I see the case one with fcoe queue handling patches applied as win for
both luns for above mentioned disktest. 

However this is just for one lun case for 256K writes so using
cmd_per_lun=3 might be still needed in case of using too many
luns(>=256), how about I make another patch with tunable cmd_per_lun
fcoe parameter with default value as 3. I'm suggesting default as 3
since other FC HBA are using this value beside it will likely be helpful
with too many luns or other IO sizes. Although I do see -ve perf impact
on first lun above with 3 but you noticed 5% better with cmd_per_lun=3,
so I guess your lun could be close to lun-2 above since this case has
close numbers for all three cases with patches applied (right column).

I'm not sure if I need to remove qfull check at this point by looking at
data above, however I'm fine either way but would prefer to remove this
check once we are sure this is impacting perf adversely after added
recent fcoe queue handling improvements.

What do you think, especially on suggested tune able cmd_per_lun with
default value 3.


More information about the devel mailing list