- 04 Dec, 2009 38 commits
-
-
Joe Eykholt authored
Timer crashes were caused by freeing a struct fc_rport_priv with a timer pending, causing the timer facility list to be corrupted. This was during FC uplink flap tests with a lot of targets. After discovery, we were doing an PLOGI on an rdata that was in DELETE state but not yet removed from the lookup list. This moved the rdata from DELETE state to PLOGI state. If the PLOGI exchange allocation failed and needed to be retried, the timer scheduling could race with the free being done by fc_rport_work(). When fc_rport_login() is called on a rport in DELETE state, move it to a new state RESTART. In fc_rport_work, when handling a LOGO, STOPPED or FAILED event, look for restart state. In the RESTART case, don't take the rdata off the list and after the transport remote port is deleted and exchanges are reset, re-login to the remote port. Note that the new RESTART state also corrects a problem we had when re-discovering a port that had moved to DELETE state. In that case, a new rdata was created, but the old rdata would do an exchange manager reset affecting the FC_ID for both the new rdata and old rdata. With the new state, the new port isn't logged into until after any old exchanges are reset. Signed-off-by: Joe Eykholt <jeykholt@cisco.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Abhijeet Joglekar authored
Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Abhijeet Joglekar authored
Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Abhijeet Joglekar authored
Driver was processing a fixed max number of cq descriptors per ISR. For instance, for the SCSI IO queue, number of IOs processed per ISR were 8. If hardware writes 9 cq descriptors to the cq and generates an interrupt, driver would process only 8 descriptors and decrement the outstanding credit count by 8. Unless another interrupt event happens, the hw does not generate any additional interrupt. This results in the cq descriptor sitting in the queue without being procesed and can cause IO timeouts and aborts. Modify all ISR functions to process all queued cq descriptors in one shot. Since bulk of ELS frame processing is done in thread context and bulk of SCSI IO processing is done in soft ISR deferred context, the cycles spent in the ISR per cq descriptor is small. Signed-off-by: Herman Lee <hermlee@cisco.com> Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Chris Leech authored
I was running into several different panics under stress, which I traced down to a few different possible slab corruption issues in error handling paths. I have not yet looked into why these exchange sends fail, but with these fixes my test system is much more stable under stress than before. fc_elsct_send() could fail and either leave the passed in frame intact (failure in fc_ct/els_fill) or the frame could have been freed if the failure was is fc_exch_seq_send(). The caller had no way of knowing, and there was a potential double free in the error handling in fc_fcp_rec(). Make fc_elsct_send() always free the frame before returning, and remove the fc_frame_free() call in fc_fcp_rec(). While fc_exch_seq_send() did always consume the frame, there were double free bugs in the error handling of fc_fcp_cmd_send() and fc_fcp_srr() as well. Numerous calls to error handling routines (fc_disc_error(), fc_lport_error(), fc_rport_error_retry() ) were passing in a frame pointer that had already been freed in the case of an error. I have changed the call sites to pass in a NULL pointer, but there may be more appropriate error codes to use. Question: Why do these error routines take a frame pointer anyway? I understand passing in a pointer encoded error to the response handlers, but the error routines take no action on a valid pointer and should never be called that way. Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Yi Zou authored
Calls ndo_fcoe_enabled() of the associated netdev upon creating the FCoE instance to make sure LLD has all necessary resources allocated and setup properly before passing FCoE traffic. Similarly, calls ndo_fcoe_disable() upon destroying the FCoE instance on the associated netdev to allow the LLD to release all allocated resources for FCoE. Signed-off-by: Yi Zou <yi.zou@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Yi Zou authored
In case of sequence offload, in fc_fcp_send_data(), the skb_fill_page_info() called may end up adding more frags to the skb_shinfo(fp_skb(fp))->frags[], exceeding SKB_MAX_FRAGS, this eventually corrupts the memory. I am adding the FR_FRAME_SG_LEN back, but as SKB_MAX_FRAGS -1, leaving 1 for our fcoe_eof_crc page. And send will be broken into multiple large sends if the frame already contains more frags than skb handle. Signed-off-by: Yi Zou <yi.zou@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Yi Zou authored
Add a define of FCOE_MTU as 2158 bytes and use FCOE_MTU when the LLD is found to support NETIF_F_FCOE_MTU. The lport->mfs is then calculated out of the 2158 FCOE_MTU. Otherwise, we stick with the netdev->mtu, i.e., LAN MTU. Also, change the notification on NETDEV_CHANGEMTU event to bypass changing mfs when LAN MTU is changed if NETIF_F_FCOE_MTU is supported. Signed-off-by: Yi Zou <yi.zou@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Mike Christie authored
When doing echo ethX > /sys..../destroy I am getting errors when the tear down succeeds. It looks like the reason for this is because the rc var is not getting set when the destruction works. This just sets it to zero. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Vasu Dev authored
Reported-by: Alex Lyakas <alexl@mellanox.co.il> Signed-off-by: Vasu Dev <vasu.dev@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Vasu Dev authored
Adds missing exch release when RRQ is accepted by calling fc_seq_ls_acc. Adds common exch release for fc_exch_els_rrq by use of out label. Reported-by: Alex Lyakas <alexl@mellanox.co.il> Signed-off-by: Vasu Dev <vasu.dev@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Vasu Dev authored
Initializing these libfc globals per lport could mess up exch allocation/free for existing lport. So this patch moves their initialization to fc_setup_exch_mgr so that these globals gets initialized only once for libfc. Reported-by: Alex Lyakas <alexl@mellanox.co.il> Signed-off-by: Vasu Dev <vasu.dev@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Joe Eykholt authored
It's possible and harmless to get FLOGI timeouts while in RESET state. Don't do a WARN_ON in that case. Also, split out the other WARN_ONs in fc_lport_timeout, so we can tell which one is hit by its line number. Signed-off-by: Joe Eykholt <jeykholt@cisco.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Joe Eykholt authored
Fix minor errors. A debug message said an RLIR was received instead of ECHO. "Expected" was misspelled in several places. Fix a type cast from u32 to __be32. Rob, Some of these may have been also taken care of in your other doc cleanup patch. Feel free to fold them in. Signed-off-by: Joe Eykholt <jeykholt@cisco.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Yi Zou authored
This bug is exposed when there is a link flap in LLD. Particularly, when it happens right after a SCSI write command is sent out, no FCP_DATA is sent, causing fsp->status_code to be set as FC_DATA_UNDRUN in fc_fcp_complete_locked even no SCSI status is received. Consequently, fc_io_compl treats this as DID_OK. This results in SCSI returning successful to the initial I/O request even there is no DATA actually sent. Particularly, if you run an I/O tool w/ data verification on, the read back for verification is gonna fail. This is fixed here by checking when FC_DATA_UNDRUN happens, SCSI status is received w/ FC_SRB_RCV_STATUS set in fsp->state. Signed-off-by: Yi Zou <yi.zou@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Robert Love authored
This argument isn't used, let's not pass it into the routine. Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Robert Love authored
These are a few functions that were not used by other modules. They did not need to be exported so this patch removes the EXPORT_SYMBOLS call for each. Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Yi Zou authored
Remove the redundant checking of netdev->netdev_ops as it will never be NULL. Signed-off-by: Yi Zou <yi.zou@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Yi Zou authored
xid 0 was used as an indication of invalid xid before but now xid 0 can be used as a valid exchange i. This patch fixes the ddp completion in fcp layer, i.e., in fc_fcp.c:fc_fcp_ddp_done() function, to make sure it does not use xid 0 for indication of an invalid xid, instead, it now uses use FC_XID_UNKNOWN for such indication. Signed-off-by: Yi Zou <yi.zou@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Joe Eykholt authored
A received Fibre Channel ELS PRLI request contains a bit that indicates whether the remote port supports certain retry processing sequences. The test for this bit was somehow coded to use multiply instead of AND! This case would apply only for target mode operation, and it is unlikely to be noticed as an initiator. Signed-off-by: Joe Eykholt <jeykholt@cisco.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Brian King authored
Bump driver version to 1.0.7. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Brian King authored
Adds support for FC passthru via BSG. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Brian King authored
When issuing a Cancel to the virtual fibre channel adapter, the interface specifies a flags field for the client to indicate what kind of error recovery is being performed. Fix up these flags for terminate_rport_io to indicate an abort task set rather than a target reset. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Brian King authored
Remove a parameter to ibmvfc_init_host which is always set to zero by all callers. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Brian King authored
Need to grab the host lock around the call to ibmvfc_link_down. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Brian King authored
When processing the response to either a LUN reset, target reset, or an abort task set, the ibmvfc driver needs to treat as success receiving a response with a non-zero status in the response IU along with a general transport error with the FCP response code being zero. The VIOS currently guarantees this cannot happen, but a future version of VIOS may allow this to be returned, so ensure we handle this response combination correctly for TMFs, as we already do for SCSI commands. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Martin K. Petersen authored
This version fixes 64-bit modulo on 32-bit as well as inadvertent map updates when TP was disabled. Implement support for thin provisioning in scsi_debug. No actual memory de-allocation is taking place. The intent is to emulate a thinly provisioned storage device, not to be one. There are four new module options: - unmap_granularity specifies the granularity at which to track mapped blocks (specified in number of logical blocks). 2048 (1 MB) is a realistic value for disk arrays although some may have a finer granularity. - unmap_alignment specifies the first LBA which is naturally aligned on an unmap_granularity boundary. - unmap_max_desc specifies the maximum number of ranges that can be unmapped using one UNMAP command. If this is 0, only WRITE SAME is supported and UNMAP will cause a check condition. - unmap_max_blocks specifies the maximum number of blocks that can be unmapped using a single UNMAP command. Default is 0xffffffff. These parameters are reported in the new and extended block limits VPD. If unmap_granularity is specified the device is tagged as thin provisioning capable in READ CAPACITY(16). A bitmap is allocated to track whether blocks are mapped or not. A WRITE request will cause a block to be mapped. So will WRITE SAME unless the UNMAP bit is set. Blocks can be unmapped using either WRITE SAME or UNMAP. No accounting is done to track partial blocks. This means that only whole blocks will be marked free. This is how the array people tell me their firmwares work. GET LBA STATUS is also supported. This command reports whether a block is mapped or not, and how long the adjoining mapped/unmapped extent is. The block allocation bitmap can also be viewed from user space via: /sys/bus/pseudo/drivers/scsi_debug/map Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Martin K. Petersen authored
Add definitions for UNMAP, WRITE SAME{16,32} and GET LBA STATUS commands. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Giridhar Malavali authored
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Lalit Chandivade authored
Correct issues where the lower scsi-status would be improperly cleared, instead, allow the midlayer to process the status after the proper residual-count checks are performed. Finally, validate firmware status flags prior to assigning values from the FCP_RSP frame. Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com> Signed-off-by: Michael Hernandez <michael.hernandez@qlogic.com> Signed-off-by: Ravi Anand <ravi.anand@qlogic.com> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Andrew Vasquez authored
Original code would not register FC4 nor FDMI information after a logical tear-down of an VFC link. Code now triggers registration date during processing of a 'Report ID Acquisition IOCB', which is submitted after a FLOGI or FDISC completes. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Andrew Vasquez authored
Original code discarded response-info field information and assumed the command completed successfully without verifying the target's status within the FCP_RSP packet. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Andrew Vasquez authored
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Giridhar Malavali authored
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Lalit Chandivade authored
In some case, the MPI and PHY versions when retrieved after the Execute-FW mailbox-command are incorrect (255.255.255.255). Instead, query the information after the check for firmware ready is done in the abort ISP path. Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Andrew Vasquez authored
The mailbox register values may assist in debugging efforts. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Andrew Vasquez authored
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Jing Huang authored
This patch fixes checkpatch errors/warnings in bfad files. Signed-off-by: Jing Huang <huangj@brocade.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
- 29 Oct, 2009 2 commits
-
-
Michael Reed authored
I was doing some large lun count testing with 2.6.31 and hit a BUG_ON() in fc_timeout_deleted_rport(), and it seems like it should have been just a matter of time before someone did. It seems invalid to set port_state under lock, then expect it to remain set after releasing the lock. Another thread called fc_remote_port_add() when the lock was released, changing the port_state. This patch removes the BUG_ON and moves the test of the port_state to inside the host_lock. It's been running for several weeks now with no ill effect. Signed-off-by: Michael Reed <mdr@sgi.com> Acked-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-
Mike Christie authored
When the Integrity check is done in scsi_io_completion it will set error to -EILSEQ. However, at this point error is no longer used, and blk_end_request_err has -EIO hardcoded. It looks like there was just porting mistake with this patch http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=3e695f89c5debb735e4ff051e9e58d8fb4e95110 and we meant to send error upwards, so this patch changes the hard coded EIO to the error variable. I have only boot tested this patch. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
-