1. 21 May, 2012 1 commit
    • Merge tag 'isci-for-3.5' into misc · e3469333
      James Bottomley authored
      isci update for 3.5
      
      1/ Rework remote-node-context (RNC) handling for proper management of
         the silicon state machine in error handling and hot-plug conditions.
         Further details are below; in short, if the RNC is mismanaged, the
         silicon state machines may lock up.
      
      2/ Refactor the initialization code so it can be reused for
         suspend/resume support.
      
      3/ Miscellaneous bug fixes to address discovery issues and hardware
         compatibility.
      
      RNC rework details from Jeff Skirvin:
      
      In the controller, devices as they appear on a SAS domain (or
      direct-attached SATA devices) are represented by memory structures known
      as "Remote Node Contexts" (RNCs).  These structures are transferred from
      main memory to the controller using a set of register commands; these
      commands include setting up the context ("posting"), removing the
      context ("invalidating"), and commands to control the scheduling of
      commands and connections to that remote device ("suspensions" and
      "resumptions").  There is a similar path to control RNC scheduling from
      the protocol engine, which interprets the results of command and data
      transmission and reception.
      
      In general, the controller chooses among non-suspended RNCs to find one
      with work that requires scheduling the transmission of command and data
      frames to a target.  Likewise, when a target tries to return data to
      the initiator, the controller uses the state of the RNC to determine
      how to treat the incoming request.  As an example, if the RNC
      is in the state "TX/RX Suspended", incoming SSP connection requests from
      the target will be rejected by the controller hardware.  When an RNC is
      "TX Suspended", it will not be selected by the controller hardware to
      start outgoing command or data operations (with certain priority-based
      exceptions).
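
      The two suspension states above can be sketched as a small model.
      This is an illustration only: the type and function names are
      hypothetical and are not the driver's identifiers, the priority-based
      exceptions are ignored, and accepting incoming connections while "TX
      Suspended" is an assumption the text does not state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the RNC scheduling states named in the text. */
enum rnc_sched_state {
    RNC_READY,           /* not suspended */
    RNC_TX_SUSPENDED,    /* not selected for new outgoing operations */
    RNC_TX_RX_SUSPENDED  /* also rejects incoming SSP connection requests */
};

struct rnc_model {
    enum rnc_sched_state state;
    unsigned int pending_tx;  /* outgoing command/data frames waiting */
};

/* May the controller select this RNC to start outgoing work? */
bool rnc_may_start_tx(const struct rnc_model *rnc)
{
    assert(rnc != NULL);
    return rnc->state == RNC_READY && rnc->pending_tx > 0;
}

/* Would an incoming SSP connection request be accepted?
 * Assumption: only "TX/RX Suspended" rejects incoming requests. */
bool rnc_accepts_incoming_ssp(const struct rnc_model *rnc)
{
    assert(rnc != NULL);
    return rnc->state != RNC_TX_RX_SUSPENDED;
}
```

      In this model an RNC in "TX Suspended" is skipped for outgoing work
      but still accepts the target's connection requests.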
      
      As mentioned above, there are two sources for management of the RNC
      states: commands from driver software, and the result of transmission
      and reception conditions of commands and data signaled by the controller
      hardware.  As an example of the latter, if an outgoing SSP command ends
      with an OPEN_REJECT(BAD_DESTINATION) status, the RNC transitions to
      the "TX Suspended" state, and the controller hardware signals this both
      in the completion status of the pending command and in a controller
      hardware event.  Examples of the former are included in the patch
      changelogs.
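
      As a rough sketch of the two sources of state changes, the model below
      records whether hardware or software caused the last transition.  All
      names are hypothetical, and only the single completion status
      mentioned above is modeled.

```c
#include <assert.h>
#include <stddef.h>

enum rnc_sched_state { RNC_READY, RNC_TX_SUSPENDED, RNC_TX_RX_SUSPENDED };

/* Who caused the most recent state change (hypothetical bookkeeping). */
enum rnc_change_source { RNC_SRC_NONE, RNC_SRC_SOFTWARE, RNC_SRC_HARDWARE };

/* Hypothetical completion statuses; only the one from the text is modeled. */
enum cmd_status { CMD_OK, CMD_OPEN_REJECT_BAD_DESTINATION };

struct rnc_model {
    enum rnc_sched_state state;
    enum rnc_change_source last_source;
};

/* Hardware path: a command completion may itself suspend the RNC. */
void rnc_on_command_completion(struct rnc_model *rnc, enum cmd_status status)
{
    assert(rnc != NULL);
    if (status == CMD_OPEN_REJECT_BAD_DESTINATION &&
        rnc->state == RNC_READY) {
        rnc->state = RNC_TX_SUSPENDED;
        rnc->last_source = RNC_SRC_HARDWARE;
    }
}

/* Software path: the driver requests a full TX/RX suspension. */
void rnc_software_suspend_tx_rx(struct rnc_model *rnc)
{
    assert(rnc != NULL);
    rnc->state = RNC_TX_RX_SUSPENDED;
    rnc->last_source = RNC_SRC_SOFTWARE;
}
```

      Tracking the source is a choice made for this sketch; it makes it
      visible that a driver may find an RNC already suspended by hardware
      when it begins its own handling.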
      
      Driver software is required to place the RNC in the "TX/RX Suspended"
      state before any outstanding commands can be terminated.  Failure to
      guarantee this can lead to a complete hardware hang.  Earlier
      versions of the driver software did not guarantee that an RNC was
      correctly managed before I/O termination, and so operated in an unsafe
      way.
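
      The ordering rule can be expressed as a guard on termination.  The
      sketch below is a hypothetical model, not the driver's code: it refuses
      to terminate outstanding commands unless the RNC is already "TX/RX
      Suspended".  Returning -EBUSY is a choice made for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum rnc_sched_state { RNC_READY, RNC_TX_SUSPENDED, RNC_TX_RX_SUSPENDED };

struct rnc_model {
    enum rnc_sched_state state;
    unsigned int outstanding;  /* commands not yet completed or terminated */
};

/* Terminate all outstanding commands for this RNC.
 * Returns 0 on success, or -EBUSY if the RNC is not yet TX/RX suspended,
 * in which case nothing is terminated. */
int rnc_terminate_outstanding(struct rnc_model *rnc)
{
    assert(rnc != NULL);
    if (rnc->state != RNC_TX_RX_SUSPENDED)
        return -EBUSY;
    rnc->outstanding = 0;
    return 0;
}
```

      A caller that sees -EBUSY would first request the suspension and retry
      only after it has taken effect.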
      
      Further, the driver performed unnecessary contortions to preserve the
      remote device command state and so was more complicated than it needed
      to be.  A simplifying driver assumption is that once an I/O has entered
      the error handler path without having completed in the target, the
      driver is required only to end all use of the sas_task.
      Beyond that, recovery of operation depends on libsas and other
      components to reset, rediscover and reconfigure the device before normal
      operation can restart.  In the driver, this simplifying assumption meant
      that RNC management could be reduced to entering the suspended
      state, terminating the targeted I/O request, and resuming the RNC as
      needed for device-specific management such as an SSP Abort Task or LUN
      Reset Management request.
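
      The reduced sequence described above (suspend, terminate, optionally
      resume) might be modeled as follows.  This is a sketch under
      assumptions: the names are hypothetical, the steps complete
      synchronously (real hardware would signal them asynchronously), and
      the trace array exists only to make the ordering observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum rnc_sched_state { RNC_READY, RNC_TX_SUSPENDED, RNC_TX_RX_SUSPENDED };
enum recovery_step { STEP_SUSPEND, STEP_TERMINATE, STEP_RESUME };

#define MAX_STEPS 3

struct rnc_model {
    enum rnc_sched_state state;
    unsigned int outstanding;            /* I/O requests still in flight */
    enum recovery_step trace[MAX_STEPS]; /* order in which steps ran */
    size_t trace_len;
};

static void trace_step(struct rnc_model *rnc, enum recovery_step step)
{
    if (rnc->trace_len < MAX_STEPS)
        rnc->trace[rnc->trace_len++] = step;
}

/* Error-handler path: suspend first, then terminate, then resume only if
 * a task management request (e.g. Abort Task, LUN Reset) must be sent. */
void rnc_recover(struct rnc_model *rnc, bool resume_for_task_mgmt)
{
    assert(rnc != NULL);

    rnc->state = RNC_TX_RX_SUSPENDED;
    trace_step(rnc, STEP_SUSPEND);

    rnc->outstanding = 0;
    trace_step(rnc, STEP_TERMINATE);

    if (resume_for_task_mgmt) {
        rnc->state = RNC_READY;
        trace_step(rnc, STEP_RESUME);
    }
}
```

      When no task management request is needed, the RNC simply stays
      suspended and the later reset and rediscovery is left to libsas, as
      described above.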
  2. 17 May, 2012 39 commits