Commits · ea3e2e72eeb3e8a9440a5da965914f9b12088626 · Kirill Smelkov / linux

21 Dec, 2010 40 commits

[SCSI] libfc: fix exchange being deleted when the abort itself is timed out · ea3e2e72

Yi Zou authored Nov 30, 2010

Should not continue when the abort itself is being timeout since in that case
the exchange will be deleted and relesased. We still want to call the
associated response handler to let the layer, e.g., fcp, know the exchange
itself is being timed out.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

ea3e2e72

[SCSI] libfc: do not fc_io_compl on fsp w/o any scsi_cmnd associated · d889b30a

Yi Zou authored Nov 30, 2010

Do not call fc_io_compl() on fsp w/o any scsi_cmnd, e.g., lun reset is built
inside fc_fcp, not from a scsi command from queuecommnd from scsi-ml, so in
in case target is buggy that is invalid flags in the FCP_RSP, as we have seen
in some SAN Blaze target where all bits in flags are 0, we do not want to call
io_compl on this fsp.

[ Comment block added by Robert Love ]
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

d889b30a

[SCSI] libfc: add print of exchange id for debugging fc_fcp · 9b90dc80

Yi Zou authored Nov 30, 2010

This is very helpful to match up the corresponding exchange to the actual I/O
described by the fsp, particularly when you do a side-by-side comparison of
the syslog with your trace.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

9b90dc80

[SCSI] drivers/scsi/fcoe: Update WARN uses · 11aa9900

Joe Perches authored Nov 30, 2010

Add missing newlines.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

11aa9900

[SCSI] libfc: fix memory leakage in remote port · 0e9e3d3b

Hillf Danton authored Nov 30, 2010

There seems rdata should get put before return.
Signed-off-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

0e9e3d3b

[SCSI] libfc: fix memory leakage in local port · 72e0daad

Hillf Danton authored Nov 30, 2010

There seems info should get freed when error encountered.
Signed-off-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

72e0daad

[SCSI] libfc: fix memory leakage in local port · 2d6dfb00

Hillf Danton authored Nov 30, 2010

There seems info should get freed when error encountered.
Signed-off-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

2d6dfb00

[SCSI] libfc: remove tgt_flags from fc_fcp_pkt struct · 05fee645

john fastabend authored Nov 30, 2010

We can easily remove the tgt_flags from fc_fcp_pkt struct
and use rpriv->tgt_flags directly where needed.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

05fee645

[SCSI] libfc: use rport timeout values for fcp recovery · e0883a3c

john fastabend authored Nov 30, 2010

Use the rport value for rec_tov for timeout values when
sending fcp commands. Currently, defaults are being used
which may or may not match the advertised values.

The default may cause i/o to timeout on networks that
set this value larger then the default value. To make
the timeout more configurable in the non-REC mode we
remove the FC_SCSI_ER_TIMEOUT completely allowing the
scsi-ml to do the timeout. This removes an unneeded
timer and allows the i/o timeout to be configured
using the scsi-ml knobs.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

e0883a3c

[SCSI] libfc: incorrect scsi host byte codes returned to scsi-ml · ac17ea8d

john fastabend authored Nov 30, 2010

The fcp packet recovery handler fc_fcp_recover() is called
when errors occurr in a fcp session. Currently it is
generically setting the status code to FC_CMD_RECOVERY for
all error types. This results in DID_BUS_BUSY errors
being returned to the scsi-ml.

DID_BUS_BUSY errors indicate "BUS stayed busy through time
out period" according to scsi.h. Many of the error reported
by fc_rcp_recovery() are pkt errors. Here we update
fc_fcp_recovery to use better host byte codes.

With certain FAST FAIL flags set DID_BUS_BUSY and DID_ERROR
will have different behaviors this was causing dm multipath
to fail quickly in some cases where a retry would be a
better action.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

ac17ea8d

[SCSI] libfc: fix stats computation in fc_queuecommand() · e90ff5ef

Hillf Danton authored Nov 30, 2010

There seems accumulation needed.
Signed-off-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

e90ff5ef

[SCSI] libfc: fix mem leak in fc_seq_assign() · 530994d6

Hillf Danton authored Nov 30, 2010

There is a typo cleaned, which triggers memory leakage.
Signed-off-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

530994d6

[SCSI] libfc: Fix incorrect locking and unlocking in FCP · 3c2c3bf2

Robert Love authored Nov 30, 2010

The error handler grabs the si->scsi_queue_lock, but
in the case where the fsp pointer is NULL it releases
the scsi_host lock. This can lead to a variety of
system hangs depending on which is used first- the
scsi_host lock or the scsi_queue_lock.

This patch simply unlocks the correct lock when fcp
is NULL.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

3c2c3bf2

[SCSI] libfc: tune fc_exch_em_alloc() to be O(2) · 2034c19c

Hillf Danton authored Nov 30, 2010

For allocating new exch from pool,  scanning for free slot in exch
array fluctuates when exch pool is close to exhaustion.

The fluctuation is smoothed, and the scan looks to be O(2).
Signed-off-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

2034c19c

[SCSI] libfc: fix mem leak in fc_exch_recv_seq_resp() · 8236554a

Hillf Danton authored Nov 30, 2010

There seems that ep should get released, or it will no longer get freed.
Signed-off-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

8236554a

[SCSI] libfc: fix NULL pointer dereference bug in fc_fcp_pkt_release · 80e736f8

Yi Zou authored Nov 30, 2010

This happens when then tearing down the fcoe interface with active I/O.
The back trace shows dead000000200200 in RAX, i.e., LIST_POISON2, indicating
that the fsp is already being dequeued, which is probably why no complaining
was seen in fc_fcp_destroy() about outstanding fsp not freed, since we dequeue
it in the end of fc_io_compl() before releasing it. The bug is due to the
fact that we have already destroyed lport's scsi_pkt_pool while on-going i/o
is still accessing it through fc_fcp_pkt_release(), like this trace or the
similar code path from scsi-ml to fc_eh_abort, etc. This is fixed by moving
the fc_fcp_destroy() after lport is detached from scsi-ml since fc_fcp_destroy
is supposed to called only once where no lport lock is taken, otherwise the
fc_fcp_pkt_release() would have to grab the lport lock.

 BUG: unable to handle kernel NULL pointer dereference at (null)
 .......
 RIP: 0010:[<0000000000000000>]
 [<(null)>] (null)
 RSP: 0018:ffff8803270f7b88  EFLAGS: 00010282
 RAX: dead000000200200 RBX: ffff880197d2fbc0 RCX: 0000000000005908
 RDX: ffff880195ea6d08 RSI: 0000000000000282 RDI: ffff880180f4fec0
 RBP: ffff8803270f7bc0 R08: ffff880197d2fbe0 R09: 0000000000000000
 R10: ffff88032867f090 R11: 0000000000000000 R12: ffff880195ea6d08
 R13: 0000000000000282 R14: ffff880180f4fec0 R15: 0000000000000000
 FS:  0000000000000000(0000) GS:ffff8801b5820000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
 CR2: 0000000000000000 CR3: 00000001a6eae000 CR4: 00000000000006e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process fc_rport_eq (pid: 5278, threadinfo ffff8803270f6000, task ffff880326254ab0)
 Stack:
 ffffffffa02c39ca ffff8803270f7ba0 ffff88019331cbc0 ffff880197d2fbc0
 0000000000000000 ffff8801a8c895e0 ffff8801a8c895e0 ffff8803270f7c10
 ffffffffa02c4962 ffff8803270f7be0 ffffffff814c94ab ffff8803270f7c10
 Call Trace:
 [<ffffffffa02c39ca>] ? fc_io_compl+0x10a/0x530 [libfc]
 [<ffffffffa02c4962>] fc_fcp_complete_locked+0x72/0x150 [libfc]
 [<ffffffff814c94ab>] ? _spin_unlock_bh+0x1b/0x20
 [<ffffffffa02b98ff>] ? fc_exch_done+0x3f/0x60 [libfc]
 [<ffffffffa02c4a8f>] fc_fcp_retry_cmd+0x4f/0x60 [libfc]
 [<ffffffffa02c6150>] fc_fcp_recv+0x9b0/0xc30 [libfc]
 [<ffffffff8106ba7a>] ? _call_console_drivers+0x4a/0x80
 [<ffffffff8107d5ec>] ? lock_timer_base+0x3c/0x70
 [<ffffffff8107e06b>] ? try_to_del_timer_sync+0x7b/0xe0
 [<ffffffffa02b9dcf>] fc_exch_mgr_reset+0x1df/0x250 [libfc]
 [<ffffffffa02c57a0>] ? fc_fcp_recv+0x0/0xc30 [libfc]
 [<ffffffffa02c1042>] fc_rport_work+0xf2/0x4e0 [libfc]
 [<ffffffff8109203e>] ? prepare_to_wait+0x4e/0x80
 [<ffffffffa02c0f50>] ? fc_rport_work+0x0/0x4e0 [libfc]
 [<ffffffff8108c6c0>] worker_thread+0x170/0x2a0
 [<ffffffff81091d50>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8108c550>] ? worker_thread+0x0/0x2a0
 [<ffffffff810919e6>] kthread+0x96/0xa0
 [<ffffffff810141ca>] child_rip+0xa/0x20
 [<ffffffff81091950>] ? kthread+0x0/0xa0
 [<ffffffff810141c0>] ? child_rip+0x0/0x20
 Code:
 Bad RIP value.

 RIP
 [<(null)>] (null)
 RSP <ffff8803270f7b88>
 CR2: 0000000000000000
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

80e736f8

[SCSI] libfc: remove define of fc_seq_exch in fc_exch.c · 12137f5c

Hillf Danton authored Nov 30, 2010

The define for fc_seq_exch is unnecessary, since it also appears in scsi/libfc.h
Signed-off-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

12137f5c

[SCSI] bfa: fix endianess sparse check warnings · 50444a34

Maggie authored Nov 29, 2010

First round of fix for the endianess check warnings from make C=2 CF="-D__CHECK_ENDIAN__".
Signed-off-by: Maggie <xmzhang@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

50444a34

[SCSI] bfa: fix regular sparse check warnings. · 52f94b6f

Maggie authored Nov 29, 2010

Fix all sparse check warnings from make C=2.
Signed-off-by: Maggie <xmzhang@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

52f94b6f