  1. 01 Oct, 2019 2 commits
    • scsi: lpfc: Fix device recovery errors after PLOGI failures · 0f154226
      James Smart authored
      When target-side fault injections are made, the driver isn't reconnecting
      to the remote port. The driver is logging "2753" error messages which
      state:
      
      "PLOGI failure DID:1B2400 Status:x3/xf0240008"
      
      The failure status indicates an Illegal Field error, which points to
      the Temporary RPI field used for the ELS. This error typically means
      the driver used an RPI that was already registered (it shouldn't be
      registered when used in this context).
      
      Analysis found that if the driver encountered an error during discovery,
      it did not flag the temporary rpi as being in error. Yet the rpi was
      released for reallocation in these error paths, so another ELS could
      allocate it. In the failure situation, a retry was done on an ELS that
      had encountered an error and, as the rpi wasn't marked in error, the
      ELS reused the rpi it originally allocated. But that rpi had since been
      allocated by a different ELS, issued after the original error and before
      the retry attempt. The different ELS had succeeded and the RPI was
      registered.
      
      Fix by marking the rpi state for the node as in error, i.e. as needing
      reallocation, upon an error in the ELS processing. Error state marking is
      now always done prior to release back to the internal rpi free list, which
      the driver previously did not do in these cases.
      
      Also enhance some of the logging to help with future troubleshooting.
      
      Link: https://lore.kernel.org/r/20190922035906.10977-7-jsmart2021@gmail.com
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
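The ordering described in this commit (flag the node's rpi as in error before the rpi goes back to the free list, so a later retry reallocates instead of reusing it) can be illustrated with a minimal sketch. All names here (`rpi_pool`, `node`, `els_error_release_rpi`, `els_retry_get_rpi`) are hypothetical stand-ins for illustration and are not the lpfc driver's actual structures or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RPI_POOL_SIZE 4
#define RPI_NONE (-1)

/* Hypothetical stand-ins; not the lpfc driver's real structures. */
struct rpi_pool {
    bool in_use[RPI_POOL_SIZE];
};

struct node {
    int  rpi;           /* temporary rpi held by this node, or RPI_NONE */
    bool rpi_in_error;  /* set when the rpi must be reallocated before reuse */
};

/* Take the lowest free rpi from the pool, or RPI_NONE if exhausted. */
static int rpi_alloc(struct rpi_pool *pool)
{
    for (int i = 0; i < RPI_POOL_SIZE; i++) {
        if (!pool->in_use[i]) {
            pool->in_use[i] = true;
            return i;
        }
    }
    return RPI_NONE;
}

/* ELS error path: flag the node first, then return the rpi to the pool. */
static void els_error_release_rpi(struct rpi_pool *pool, struct node *n)
{
    assert(pool != NULL && n != NULL);
    n->rpi_in_error = true;  /* mark before the rpi becomes reusable */
    if (n->rpi != RPI_NONE)
        pool->in_use[n->rpi] = false;
}

/* ELS retry path: never reuse a flagged rpi; allocate a fresh one instead. */
static int els_retry_get_rpi(struct rpi_pool *pool, struct node *n)
{
    assert(pool != NULL && n != NULL);
    if (n->rpi_in_error || n->rpi == RPI_NONE) {
        n->rpi = rpi_alloc(pool);
        n->rpi_in_error = false;
    }
    return n->rpi;
}
```

In this sketch, a second node that picks up the freed rpi between the error and the retry no longer collides with the first node, because the retry sees the error flag and allocates again.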
    • scsi: lpfc: Fix rpi release when deleting vport · 97acd001
      James Smart authored
      A prior use-after-free mailbox fix solved its problem by nulling an ndlp
      pointer. However, further testing has shown that this change causes a
      later state change to occasionally be skipped. As a result, a reference
      count is never decremented, the rpi is never released, and a vport delete
      never succeeds.
      
      Revise the fix in the prior patch to no longer null the ndlp. Instead the
      RELEASE_RPI flag is set which will drive the release of the rpi.
      
      Given the new code was added at a deep indentation level, refactor the code
      block using a new routine that avoids the indentation issues.
      
      Fixes: 9b164068 ("scsi: lpfc: Fix use-after-free mailbox cmd completion")
      Link: https://lore.kernel.org/r/20190922035906.10977-6-jsmart2021@gmail.com
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
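A rough sketch of the idea in this commit: the mailbox completion keeps the node pointer and only sets a release flag, so the later state change still runs and the reference count can reach zero. The types and the `NODE_RELEASE_RPI` constant below are assumptions modeled on the RELEASE_RPI flag named in the message, not the driver's real definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NODE_RELEASE_RPI 0x1u  /* hypothetical flag, modeled on RELEASE_RPI */

/* Hypothetical stand-ins; not the lpfc driver's real structures. */
struct node {
    unsigned int flags;
    int          refcnt;
    bool         rpi_held;
};

struct mbox {
    struct node *ndlp;  /* node this mailbox command refers to */
};

/* Mailbox completion: keep the node pointer, only request the release. */
static void mbox_complete(struct mbox *mb)
{
    assert(mb != NULL);
    if (mb->ndlp != NULL)
        mb->ndlp->flags |= NODE_RELEASE_RPI;
}

/* Later state change: runs because mb->ndlp was not nulled. */
static void node_state_change(struct mbox *mb)
{
    struct node *n = mb->ndlp;

    if (n == NULL)
        return;  /* with a nulled pointer, the decrement below is skipped */
    if (n->refcnt > 0)
        n->refcnt--;
    if (n->refcnt == 0 && (n->flags & NODE_RELEASE_RPI)) {
        n->rpi_held = false;  /* the flag drives the rpi release */
        n->flags &= ~NODE_RELEASE_RPI;
    }
}

/* A vport delete can only finish once no references or rpis remain. */
static bool vport_can_delete(const struct node *n)
{
    return n->refcnt == 0 && !n->rpi_held;
}
```

If `mbox_complete` instead set `mb->ndlp = NULL`, `node_state_change` would return early and `vport_can_delete` would stay false, which mirrors the hang the commit describes.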
  2. 07 Sep, 2019 1 commit
  3. 20 Aug, 2019 6 commits
  4. 21 Jun, 2019 1 commit
  5. 13 Apr, 2019 1 commit
  6. 08 Apr, 2019 1 commit
  7. 04 Apr, 2019 1 commit
  8. 19 Mar, 2019 3 commits
  9. 06 Feb, 2019 3 commits
  10. 20 Dec, 2018 2 commits
  11. 13 Dec, 2018 1 commit
  12. 08 Dec, 2018 3 commits
    • scsi: lpfc: Defer LS_ACC to FLOGI on point to point logins · 0a9e9687
      James Smart authored
      In the current discovery state machine, the driver treated FLOGI oddly. In
      point-to-point mode, an FLOGI is to be exchanged by the two ports, with the
      port with the most significant WWN then proceeding with PLOGI. The
      implementation in the driver was keyed too closely to "what have I sent",
      not to what has happened between the two endpoints. Thus, it would
      blatantly ACC an FLOGI, but reject PLOGIs until its own FLOGI had been
      ACC'd. The problem is that the sending of FLOGI may be delayed for some
      reason, or the response to FLOGI held off by the other side. In the failing
      situation, the other side sent an FLOGI, which was ACC'd, then sent PLOGIs,
      which were rejected until the retry count for the PLOGIs was exceeded and
      the port gave up. The FLOGI may have been very late in transmission, or the
      response held off until the PLOGIs failed. Given the other port had the
      higher WWN, no PLOGIs would occur and communication stopped.
      
      Correct the situation by changing the FLOGI handling. Defer any response to
      an FLOGI until the driver has sent its FLOGI as well. Then, upon either
      completion of the sent FLOGI, or upon sending an ACC to a received FLOGI
      (which may be received before or just after the FLOGI was sent), the driver
      will act on who has the higher WWN. If the other port does, the driver will
      noop any handling of an FLOGI response (if outstanding) and wait for PLOGI.
      If the local port does, the driver will transition to sending PLOGI and
      will noop any action on responding to an FLOGI (if not yet received).
      
      Fortunately, implementing this only took another state flag and
      deferring any FLOGI response if the FLOGI has yet to be transmitted. All
      subsequent actions were already in place.

      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
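The deferred-response behavior described in this commit can be sketched as a small state machine: hold the ACC to a received FLOGI until the local FLOGI has been sent, then let the higher WWN decide who originates PLOGI. Everything below (`p2p_state`, the `P2P_*` actions, the function names) is hypothetical, and the sketch assumes WWNs are stored as 8 bytes in wire (big-endian) order so that `memcmp` compares them numerically.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WWN_LEN 8

enum p2p_action {
    P2P_PENDING,     /* nothing to decide yet */
    P2P_DEFER_ACC,   /* FLOGI received before ours was sent: hold the ACC */
    P2P_SEND_PLOGI,  /* local WWN is higher: local port originates PLOGI */
    P2P_WAIT_PLOGI   /* remote WWN is higher: wait for the remote PLOGI */
};

/* Hypothetical stand-in; not the lpfc driver's real state. */
struct p2p_state {
    uint8_t local_wwn[WWN_LEN];
    uint8_t remote_wwn[WWN_LEN];
    bool    flogi_sent;
    bool    flogi_rcvd;
    bool    acc_deferred;  /* the extra state flag the message mentions */
};

/* Decide roles; assumes both WWNs are in big-endian byte order. */
static enum p2p_action p2p_resolve(const struct p2p_state *s)
{
    return memcmp(s->local_wwn, s->remote_wwn, WWN_LEN) > 0
               ? P2P_SEND_PLOGI
               : P2P_WAIT_PLOGI;
}

/* A FLOGI arrived from the other port. */
static enum p2p_action p2p_on_flogi_rcvd(struct p2p_state *s,
                                         const uint8_t *remote_wwn)
{
    assert(s != NULL && remote_wwn != NULL);
    memcpy(s->remote_wwn, remote_wwn, WWN_LEN);
    s->flogi_rcvd = true;
    if (!s->flogi_sent) {
        s->acc_deferred = true;  /* do not ACC until our FLOGI is out */
        return P2P_DEFER_ACC;
    }
    return p2p_resolve(s);
}

/* The local FLOGI has been transmitted. */
static enum p2p_action p2p_on_flogi_sent(struct p2p_state *s)
{
    assert(s != NULL);
    s->flogi_sent = true;
    if (!s->flogi_rcvd)
        return P2P_PENDING;
    s->acc_deferred = false;     /* the held ACC can go out now */
    return p2p_resolve(s);
}
```

The point of the design is that the decision depends on what both endpoints have exchanged, not only on what the local port has sent.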
    • scsi: lpfc: Fix discovery failures during port failovers with lots of vports · dea16bda
      James Smart authored
      The driver is getting hit with hundreds of RSCNs during remote port address
      changes. Each of those RSCNs ends up generating UNREG_RPI and REG_RPI
      mailbox commands. The discovery engine within the driver doesn't wait for
      the mailbox command completions. Instead it sets state flags and moves
      forward. At some point, there's a massive backlog of mailbox commands, which
      take time for the adapter to process. Additionally, it appears there were
      duplicate events from the switch, so the driver generated duplicate mailbox
      commands for the same remote port. During this window, failures on PLOGI
      and PRLI ELSs are seen, as the adapter rejects them because they are for
      remote ports that still have pending mailbox commands.
      
      Streamline the discovery engine so that the PLOGI logic checks for
      outstanding UNREG_RPIs and defers the processing until the commands
      complete. This better synchronizes the ELS transmission with the RPI
      registrations.
      
      Filter out multiple UNREG_RPIs being queued up for the same remote port.
      
      Beef up log messages in this area.

      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
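The two changes in this commit (filtering duplicate UNREG_RPIs for one remote port, and deferring PLOGI while an UNREG_RPI is outstanding) might look roughly like the following. The `rport` structure, its counters, and the function names are illustrative assumptions and do not reflect the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a remote port; not the driver's real node. */
struct rport {
    bool unreg_pending;   /* an UNREG_RPI is queued or in flight */
    bool plogi_deferred;  /* a PLOGI is waiting for that UNREG_RPI */
    int  unreg_queued;    /* UNREG_RPI commands actually queued */
    int  plogi_sent;      /* PLOGIs actually transmitted */
};

/* Queue an UNREG_RPI unless one is already outstanding for this port. */
static bool queue_unreg_rpi(struct rport *rp)
{
    assert(rp != NULL);
    if (rp->unreg_pending)
        return false;     /* duplicate filtered out */
    rp->unreg_pending = true;
    rp->unreg_queued++;
    return true;
}

/* Send a PLOGI, or defer it while an UNREG_RPI is outstanding. */
static bool issue_plogi(struct rport *rp)
{
    assert(rp != NULL);
    if (rp->unreg_pending) {
        rp->plogi_deferred = true;
        return false;
    }
    rp->plogi_sent++;
    return true;
}

/* UNREG_RPI completion: clear the pending state and resume a deferred PLOGI. */
static void unreg_rpi_complete(struct rport *rp)
{
    assert(rp != NULL);
    rp->unreg_pending = false;
    if (rp->plogi_deferred) {
        rp->plogi_deferred = false;
        issue_plogi(rp);
    }
}
```

With this shape, a burst of duplicate events produces one mailbox command per port, and the ELS only goes out after the adapter has finished with the old registration.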
    • scsi: lpfc: refactor mailbox structure context fields · 3e1f0718
      James Smart authored
      The driver data structure for managing a mailbox command contained two
      context fields. Unfortunately, the contexts were considered "generic", to
      be used at the whim of the command code. Of course, one section of code
      used the fields one way while another used them a different way, and
      eventually there were mixups.
      
      Refactored the structure so that the generic contexts become a node context
      and a buffer context and all code standardizes on their use.

      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
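To illustrate why dedicated contexts help, here is a small sketch contrasting two interchangeable generic slots with one node context and one buffer context. The structure and field names (`mbox_generic`, `mbox_typed`, `node_ctx`, `buf_ctx`) are made up for this example and are not claimed to match the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a remote-port node and a DMA buffer. */
struct node    { unsigned int did; };
struct dma_buf { void *virt; size_t len; };

/* Before: two generic slots; each caller decides what goes where. */
struct mbox_generic {
    void *context1;
    void *context2;
};

/* After: one slot per purpose, so every caller agrees on the meaning. */
struct mbox_typed {
    struct node    *node_ctx;  /* node the command refers to */
    struct dma_buf *buf_ctx;   /* buffer owned by the command */
};

/* Attach both contexts in one place. */
static void mbox_set_ctx(struct mbox_typed *mb, struct node *n,
                         struct dma_buf *b)
{
    assert(mb != NULL);
    mb->node_ctx = n;
    mb->buf_ctx = b;
}

/* Completion code can read the node without guessing which slot holds it. */
static unsigned int mbox_node_did(const struct mbox_typed *mb)
{
    assert(mb != NULL);
    return mb->node_ctx != NULL ? mb->node_ctx->did : 0u;
}
```

With typed pointers, storing a buffer in the node slot becomes a compiler diagnostic rather than a silent runtime mixup.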
  13. 07 Nov, 2018 4 commits
  14. 17 Oct, 2018 1 commit
  15. 12 Sep, 2018 2 commits
  16. 11 Jul, 2018 1 commit
  17. 29 May, 2018 1 commit
  18. 18 Apr, 2018 1 commit
  19. 13 Mar, 2018 1 commit
  20. 23 Feb, 2018 1 commit
  21. 12 Feb, 2018 2 commits
  22. 05 Dec, 2017 1 commit