[Open-FCoE] Few persistent errors/crashes on Initiator during initialization and Data IO

Vandana Salve vandana at linsyssoft.com
Fri Sep 26 05:35:34 UTC 2008


Hi,

We are persistently getting few crashes and errors on initiator (re-arch
code base kernel-2.6.27-rc5) such as

1. LUN Reset command failing
2. ABORT command failing
3. fc_fcp_rec_error/ fc_fcp_lock_pkt Invalid scsi cmd pointer on fcp packet

And these errors are not very constant, they are coming randomly. These are
coming during initialization. the disks gets visible on initiator but keeps
on crashing.

 Any pointers to debug them would be of great help.

Thanks,
Vandana

Below is the snippet of the messages seen on initiator


Sep 25 11:01:34 localhost kernel: fcoe_percpu_receive_thread : sid = 010102
did = 010101
Sep 25 11:01:34 localhost kernel: fc_exch_recv_seq_resp call resp function
Sep 25 11:01:34 localhost kernel: fc_eh_abort :  calling fc_fcp_pkt_abort
Sep 25 11:01:34 localhost kernel: fc_fcp_pkt_abort: fc_fcp_send_abort FAILED
Sep 25 11:01:34 localhost kernel: fc_eh_abort :  calling fc_fcp_pkt_abort
Sep 25 11:01:34 localhost kernel: fc_fcp_pkt_abort: fc_fcp_send_abort FAILED
Sep 25 11:01:34 localhost kernel: fc_eh_abort :  calling fc_fcp_pkt_abort
Sep 25 11:01:34 localhost kernel: fc_fcp_pkt_abort: fc_fcp_send_abort FAILED
Sep 25 11:01:34 localhost kernel: fc_eh_abort :  calling fc_fcp_pkt_abort
Sep 25 11:01:34 localhost kernel: fc_fcp_pkt_abort: fc_fcp_send_abort FAILED
Sep 25 11:01:34 localhost kernel: fc_eh_abort :  calling fc_fcp_pkt_abort
Sep 25 11:01:34 localhost kernel: fc_fcp_pkt_abort: fc_fcp_send_abort FAILED
Sep 25 11:01:34 localhost kernel: fc_eh_abort :  calling fc_fcp_pkt_abort
Sep 25 11:01:34 localhost kernel: fc_fcp_pkt_abort: fc_fcp_send_abort FAILED
Sep 25 11:01:34 localhost kernel: fc_eh_abort :  calling fc_fcp_pkt_abort
Sep 25 11:01:34 localhost kernel: fc_fcp_pkt_abort: fc_fcp_send_abort FAILED
Sep 25 11:01:34 localhost kernel: fc_eh_abort :  calling fc_fcp_pkt_abort
Sep 25 11:01:34 localhost kernel: fc_fcp_pkt_abort: fc_fcp_send_abort FAILED
Sep 25 11:01:34 localhost kernel: fcoe_xmit :fh->fh_r_ctl = 6 fh->fh_type =
8
Sep 25 11:01:34 localhost kernel: fcoe_xmit : sid = 010101 did = 010102
Sep 25 11:01:34 localhost kernel: fcoe_percpu_receive_thread: fh->fh_r_ctl =
7 fh->fh_type = 8
Sep 25 11:01:34 localhost kernel: fcoe_percpu_receive_thread : sid = 010102
did = 010101
Sep 25 11:01:34 localhost kernel: fc_exch_recv_seq_resp call resp function

ep 25 11:00:34 localhost kernel: fc_exch_recv_seq_resp call resp function
Sep 25 11:00:34 localhost kernel: fc_fcp_lock_pkt Invalid scsi cmd pointer
on fcp packet.
Sep 25 11:00:34 localhost kernel:
Sep 25 11:00:34 localhost kernel: fc_fcp_rec_resp : fc_fcp_lock_pkti goto
out
Sep 25 11:00:34 localhost kernel: fcoe_percpu_receive_thread: fh->fh_r_ctl =
23 fh->fh_type = 1
Sep 25 11:00:34 localhost kernel: fcoe_percpu_receive_thread : sid = 010102
did = 010101
Sep 25 11:00:34 localhost kernel: fc_exch_recv_seq_resp call resp function
Sep 25 11:00:34 localhost kernel: fc_fcp_lock_pkt Invalid scsi cmd pointer
on fcp packet.
Sep 25 11:00:34 localhost kernel:
Sep 25 11:00:34 localhost kernel: fc_fcp_rec_resp : fc_fcp_lock_pkti goto
out
Sep 25 11:00:34 localhost kernel: fcoe_percpu_receive_thread: fh->fh_r_ctl =
23 fh->fh_type = 1
Sep 25 11:00:34 localhost kernel: fcoe_percpu_receive_thread : sid = 010102
did = 010101
Sep 25 11:00:34 localhost kernel: fc_exch_recv_seq_resp call resp function
Sep 25 11:00:34 localhost kernel: fc_fcp_lock_pkt Invalid scsi cmd pointer
on fcp packet.



Sep 25 11:01:44 localhost kernel: fc_lun_reset lun reset failed
Sep 25 11:01:44 localhost kernel: ------------[ cut here ]------------
Sep 25 11:01:44 localhost kernel: WARNING: at include/scsi/libfc/libfc.h:489
fc_lport_enter_reset+0x61/0xad [libfc]()
Sep 25 11:01:44 localhost kernel: Modules linked in: fcoe libfc
scsi_transport_fc scsi_tgt autofs4 sunrpc loop dm_multipath ipv6 parport_pc
sr_mod i5000_edac parport edac_core cdrom pcspkr tg3 libphy iTCO_wdt
iTCO_vendor_support sfc mii dcdbas serio_raw button sg dm_snapshot dm_zero
dm_mirror dm_log dm_mod ata_generic pata_acpi ata_piix libata sd_mod
scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: libfc]
Sep 25 11:01:44 localhost kernel: Pid: 3928, comm: scsi_eh_5 Tainted:
G        W 2.6.27-rc5-fcoe #1
Sep 25 11:01:44 localhost kernel:  [<c0424ae3>] warn_on_slowpath+0x40/0x63
Sep 25 11:01:44 localhost kernel:  [<c0419989>] __wake_up_common+0x2e/0x58
Sep 25 11:01:44 localhost kernel:  [<c041aafd>] __wake_up+0x29/0x39
Sep 25 11:01:44 localhost kernel:  [<c04e7b6b>] kobject_get+0xf/0x13
Sep 25 11:01:44 localhost kernel:  [<f884f0b7>] scsi_device_put+0x21/0x2e
[scsi_mod]
Sep 25 11:01:44 localhost kernel:  [<f884f212>]
__scsi_iterate_devices+0x50/0x58 [scsi_mod]
Sep 25 11:01:44 localhost kernel:  [<c042be74>] lock_timer_base+0x19/0x35
Sep 25 11:01:44 localhost kernel:  [<c042bf87>] __mod_timer+0x99/0xa3
Sep 25 11:01:44 localhost kernel:  [<c04318a7>]
queue_delayed_work_on+0x9a/0xa6
Sep 25 11:01:44 localhost kernel:  [<f89a4c50>]
fc_lport_enter_reset+0x61/0xad [libfc]
Sep 25 11:01:44 localhost kernel:  [<f89a8abc>] fc_eh_host_reset+0x12/0xa5
[libfc]
Sep 25 11:01:44 localhost kernel:  [<f8851bc5>]
scsi_try_host_reset+0x3a/0x88 [scsi_mod]
Sep 25 11:01:44 localhost kernel:  [<f88525f6>]
scsi_eh_ready_devs+0x3e9/0x4c7 [scsi_mod]
Sep 25 11:01:44 localhost kernel:  [<f8852cdc>]
scsi_error_handler+0x2ab/0x3f5 [scsi_mod]
Sep 25 11:01:44 localhost kernel:  [<f8852a31>] scsi_error_handler+0x0/0x3f5
[scsi_mod]
Sep 25 11:01:44 localhost kernel:  [<c0433b4f>] kthread+0x38/0x5d
Sep 25 11:01:44 localhost kernel:  [<c0433b17>] kthread+0x0/0x5d
Sep 25 11:01:44 localhost kernel:  [<c04044af>]
kernel_thread_helper+0x7/0x10
Sep 25 11:01:44 localhost kernel:  =======================
Sep 25 11:01:44 localhost kernel: ---[ end trace 14246bb417b6176b ]---
Sep 25 11:01:44 localhost kernel: fc_fcp_recv : IS_ERR goto errout
Sep 25 11:01:44 localhost kernel:
Sep 25 11:01:44 localhost kernel: fc_fcp_recv : at errout : calling
fc_fcp_error
Sep 25 11:01:44 localhost kernel:
Sep 25 11:01:44 localhost kernel: fc_fcp_error : fc_fcp_retry_cmd : goto
unlock
Sep 25 11:01:44 localhost kernel:
Sep 25 11:01:44 localhost kernel: fc_fcp_error : at unlock: calling
fc_fcp_unlock_pkt
Sep 25 11:01:44 localhost kernel: End of fc_fcp_recv
Sep 25 11:01:44 localhost kernel: ------------[ cut here ]------------
Sep 25 11:01:44 localhost kernel: WARNING: at include/scsi/libfc/libfc.h:489
fc_lport_enter_flogi+0x3c/0x138 [libfc]()
Sep 25 11:01:44 localhost kernel: Modules linked in: fcoe libfc
scsi_transport_fc scsi_tgt autofs4 sunrpc loop dm_multipath ipv6 parport_pc
sr_mod i5000_edac parport edac_core cdrom pcspkr tg3 libphy iTCO_wdt
iTCO_vendor_support sfc mii dcdbas serio_raw button sg dm_snapshot dm_zero
dm_mirror dm_log dm_mod ata_generic pata_acpi ata_piix libata sd_mod
scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: libfc]
Sep 25 11:01:44 localhost kernel: Pid: 3928, comm: scsi_eh_5 Tainted:
G        W 2.6.27-rc5-fcoe #1
Sep 25 11:01:44 localhost kernel:  [<c0424ae3>] warn_on_slowpath+0x40/0x63
Sep 25 11:01:44 localhost kernel:  [<c0425469>] printk+0x14/0x18
Sep 25 11:01:44 localhost kernel:  [<f89a986e>] fc_fcp_recv+0xa3b/0xb1d
[libfc]
Sep 25 11:01:44 localhost kernel:  [<c041aafd>] __wake_up+0x29/0x39
Sep 25 11:01:44 localhost kernel:  [<c04747bc>] add_partial+0x2c/0x33
Sep 25 11:01:44 localhost kernel:  [<c047580e>] __slab_free+0x62/0x1de
Sep 25 11:01:44 localhost kernel:  [<c045b922>] mempool_free+0x63/0x67
Sep 25 11:01:44 localhost kernel:  [<c0476401>] kmem_cache_free+0x60/0x69
Sep 25 11:01:44 localhost kernel:  [<c045b922>] mempool_free+0x63/0x67
Sep 25 11:01:44 localhost kernel:  [<c0617add>] _spin_lock_bh+0x8/0x1e
Sep 25 11:01:44 localhost kernel:  [<f89a7420>] fc_rport_reset+0x56/0xce
[libfc]
Sep 25 11:01:44 localhost kernel:  [<f89a6355>]
fc_rport_reset_list+0x2c/0x54 [libfc]
Sep 25 11:01:44 localhost kernel:  [<f89a4af3>]
fc_lport_enter_flogi+0x3c/0x138 [libfc]
Sep 25 11:01:44 localhost kernel:  [<f89a4c98>]
fc_lport_enter_reset+0xa9/0xad [libfc]
Sep 25 11:01:44 localhost kernel:  [<f89a8abc>] fc_eh_host_reset+0x12/0xa5
[libfc]
Sep 25 11:01:44 localhost kernel:  [<f8851bc5>]
scsi_try_host_reset+0x3a/0x88 [scsi_mod]
Sep 25 11:01:44 localhost kernel:  [<f88525f6>]
scsi_eh_ready_devs+0x3e9/0x4c7 [scsi_mod]
Sep 25 11:01:44 localhost kernel:  [<f8852cdc>]
scsi_error_handler+0x2ab/0x3f5 [scsi_mod]
Sep 25 11:01:44 localhost kernel:  [<f8852a31>] scsi_error_handler+0x0/0x3f5
[scsi_mod]

Sep 25 11:01:57 localhost kernel:
Sep 25 11:01:57 localhost kernel: fc_fcp_timeout : calling fc_fcp_lock_pkt
Sep 25 11:01:57 localhost kernel: fcoe_xmit :fh->fh_r_ctl = 22 fh->fh_type =
1
Sep 25 11:01:57 localhost kernel: fcoe_xmit : sid = 010101 did = 010102
Sep 25 11:01:59 localhost kernel:
Sep 25 11:01:59 localhost kernel: fc_fcp_rec_error : calling fc_fcp_lock_pkt
Sep 25 11:01:59 localhost kernel: fc_fcp_rec_error REC fid 10102 error error
-1 retry 0/3
Sep 25 11:01:59 localhost kernel: fcoe_xmit :fh->fh_r_ctl = 22 fh->fh_type =
1
Sep 25 11:01:59 localhost kernel: fcoe_xmit : sid = 010101 did = 010102
Sep 25 11:01:59 localhost kernel: fcoe_xmit :fh->fh_r_ctl = 81 fh->fh_type =
0
Sep 25 11:01:59 localhost kernel: fcoe_xmit : sid = 010101 did = 010102
Sep 25 11:02:01 localhost kernel:
Sep 25 11:02:01 localhost kernel: fc_fcp_rec_error : calling fc_fcp_lock_pkt
Sep 25 11:02:01 localhost kernel: fc_fcp_rec_error REC fid 10102 error error
-1 retry 1/3
Sep 25 11:02:01 localhost kernel: fcoe_xmit :fh->fh_r_ctl = 22 fh->fh_type =
1
Sep 25 11:02:01 localhost kernel: fcoe_xmit : sid = 010101 did = 010102
Sep 25 11:02:01 localhost kernel: fcoe_xmit :fh->fh_r_ctl = 81 fh->fh_type =
0
Sep 25 11:02:01 localhost kernel: fcoe_xmit : sid = 010101 did = 010102
Sep 25 11:02:03 localhost kernel:
Sep 25 11:02:03 localhost kernel: fc_fcp_rec_error : calling fc_fcp_lock_pkt
Sep 25 11:02:03 localhost kernel: fc_fcp_rec_error REC fid 10102 error error
-1 retry 2/3
Sep 25 11:02:03 localhost kernel: fcoe_xmit :fh->fh_r_ctl = 22 fh->fh_type =
1
Sep 25 11:02:03 localhost kernel: fcoe_xmit : sid = 010101 did = 010102
Sep 25 11:02:03 localhost kernel: fcoe_xmit :fh->fh_r_ctl = 81 fh->fh_type =
0
Sep 25 11:02:03 localhost kernel: fcoe_xmit : sid = 010101 did = 010102
Sep 25 11:02:05 localhost kernel:
Sep 25 11:02:05 localhost kernel: fc_fcp_rec_error : calling fc_fcp_lock_pkt
Sep 25 11:02:05 localhost kernel: fc_fcp_rec_error REC fid 10102 error error
-1 retry 3/3
Sep 25 11:02:05 localhost kernel: fcoe_xmit :fh->fh_r_ctl = 81 fh->fh_type =
0
Sep 25 11:02:05 localhost kernel: fcoe_xmit : sid = 010101 did = 010102
Sep 25 11:02:05 localhost kernel: fcoe_xmit :fh->fh_r_ctl = 81 fh->fh_type =
0
Sep 25 11:02:05 localhost kernel: fcoe_xmit : sid = 010101 did = 010102
Sep 25 11:02:05 localhost kernel: fc_eh_abort :  calling fc_fcp_pkt_abort
Sep 25 11:02:05 localhost kernel: fc_fcp_pkt_abort: fc_fcp_send_abort FAILED
Sep 25 11:02:05 localhost kernel: fcoe_xmit :fh->fh_r_ctl = 6 fh->fh_type =
8



More information about the devel mailing list