    scsi: lpfc: Fix incomplete NVME discovery when target · be0709e4
    James Smart authored
    NVMe device re-discovery does not complete. Dev_loss_tmo messages seen on
    initiator after recovery from a link disturbance.
    
    The failing case is the following:
    
    When the driver (as a NVME target) receives a PLOGI, the driver initiates
    an "unreg rpi" mailbox command. While the mailbox command is in progress,
    the driver requests that an ACC be sent to the initiator. The target's ACC
    is received by the initiator and the initiator then transmits a PRLI. The
    driver receives the PRLI prior to receiving the completion for the PLOGI
    response WQE that sent the ACC. (The two arrive via different delivery
    sources from the hw, so the race is very possible.) Given the PRLI is prior
    to the ACC completion (signifying PLOGI exchange complete), the driver
    LS_RJT's the PRLI. The
    "unreg rpi" mailbox then completes. Since PRLI has been received, the
    driver transmits a PLOGI to restart discovery, which the initiator then
    ACC's.  If the driver processes the (re)PLOGI ACC prior to completing
    the handling for the earlier ACC it sent for the initiator's original PLOGI,
    there is no state change for completion of the (re)PLOGI. The ndlp remains
    in "PLOGI Sent" and the initiator continues sending PRLI's which are
    rejected by the target until timeout or retry is reached.
    
    Fix by: When in target mode, defer sending an ACC for the received PLOGI
    until unreg RPI completes.
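
    The deferral can be pictured with a minimal sketch. All names here
    (struct node, rcv_plogi, unreg_rpi_done, send_plogi_acc) are hypothetical
    and are not the lpfc driver's actual structures or functions; the sketch
    only illustrates the ordering, assuming the ACC is held until the
    "unreg rpi" mailbox completion runs.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of a remote node; not the real lpfc ndlp. */
struct node {
    bool unreg_rpi_pending; /* "unreg rpi" mailbox command in flight */
    bool acc_deferred;      /* PLOGI ACC held until the mailbox completes */
    int  acc_sent;          /* number of ACCs actually transmitted */
};

/* Stand-in for queuing the PLOGI ACC response. */
static void send_plogi_acc(struct node *n)
{
    n->acc_sent++;
    printf("PLOGI ACC sent\n");
}

/* PLOGI received while acting as a target. */
static void rcv_plogi(struct node *n)
{
    n->unreg_rpi_pending = true; /* issue the "unreg rpi" mailbox command */
    n->acc_deferred = true;      /* do not ACC yet: defer to the completion */
}

/* Completion handler for the "unreg rpi" mailbox command. */
static void unreg_rpi_done(struct node *n)
{
    n->unreg_rpi_pending = false;
    if (n->acc_deferred) {
        n->acc_deferred = false;
        send_plogi_acc(n);       /* ACC goes out only after unreg completes */
    }
}
```

    In this model the initiator cannot see the ACC, and so cannot send a
    PRLI, until the unreg has finished, which closes the window described
    above.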
    
    Link: https://lore.kernel.org/r/20191218235808.31922-2-jsmart2021@gmail.com
    Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
    Signed-off-by: James Smart <jsmart2021@gmail.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
lpfc_nportdisc.c 93.5 KB