    ib/cm: Change reject message type when destroying cm_id · c29ed5a4
    Ted Kim authored
    Problem reported by: Ted Kim <ted.h.kim@oracle.com>:
    
    We have a case where a Linux system and a non-Linux system are
    trying to interoperate.  The Linux host is the active side and
    starts the connection establishment, but later decides to not go
    through with the connection setup and does rdma_destroy_id().
    
    The rdma_destroy_id() eventually works its way down to cm_destroy_id()
    in core/cm.c, where a REJ is sent. The non-Linux system
    has some trouble recognizing the REJ because of:
    
    A. CM states which can't receive the REJ
    B. Some issues about REJ formatting (missing comm ID)
    
    ISSUE A: The relevant part of the spec says that a Consumer Reject
    REJ can be sent for a connection abort, but it goes further and
    qualifies this: a REJ message with a "Consumer Reject" Reason code
    can be sent if the sender is in a CM state (i.e. REP Rcvd,
    MRA(REP) Sent, REQ Rcvd, MRA Sent) that allows a REJ to be sent
    (lines 35-38).
    
    The states listed in that sentence would seem to limit the active
    side to using the Consumer Reject (for the abort case) in just the
    REP-Rcvd and MRA-REP-Sent states, that is, only after the active
    side has seen a REP. (Alternatively, the active side goes down the
    state transitions to timeout, in which case a Timeout REJ is sent.)
    
    As a fix, in cm_destroy_id() move the IB_CM_MRA_REQ_RCVD case
    to the same handling as IB_CM_REQ_SENT.  Essentially, a REJ sent
    after getting an MRA on the active side becomes a Timeout REJ
    rather than a Consumer Reject, which is arguably more consistent
    with the CM state diagrams for the states before a REP is received.
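
    The intended mapping from CM state to REJ reason on destroy can be
    sketched roughly as below. This is a simplified, standalone model
    and not the code in core/cm.c: the enum names and the function
    rej_reason_on_destroy() are hypothetical, and the Consumer Reject
    states are taken from the spec sentence quoted above rather than
    from the actual switch statement.

```c
/* Hypothetical, simplified model of the CM states discussed above.
 * Names are illustrative only and do not match the kernel's enums. */
enum cm_state {
    CM_IDLE,
    CM_REQ_SENT,      /* active side: REQ sent, no reply seen yet */
    CM_MRA_REQ_RCVD,  /* active side: MRA received for our REQ */
    CM_REQ_RCVD,      /* passive side: REQ received */
    CM_MRA_REQ_SENT,  /* passive side: MRA sent for the REQ */
    CM_REP_RCVD,      /* active side: REP received */
    CM_MRA_REP_SENT   /* active side: MRA sent for the REP */
};

enum rej_reason { REJ_NONE, REJ_TIMEOUT, REJ_CONSUMER };

/* Pick the REJ reason to use when the cm_id is destroyed in 'state'. */
enum rej_reason rej_reason_on_destroy(enum cm_state state)
{
    switch (state) {
    case CM_REQ_SENT:
    case CM_MRA_REQ_RCVD:  /* after the fix: grouped with REQ sent */
        return REJ_TIMEOUT;
    case CM_REQ_RCVD:
    case CM_MRA_REQ_SENT:
    case CM_REP_RCVD:
    case CM_MRA_REP_SENT:  /* states the spec sentence lists */
        return REJ_CONSUMER;
    default:
        return REJ_NONE;   /* assumption: no REJ in other states */
    }
}
```

    In this model the active side only produces a Consumer Reject once
    it has seen a REP; before that, an abort is reported to the peer as
    a timeout.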
    Signed-off-by: Ted Kim <ted.h.kim@oracle.com>
    Signed-off-by: Sean Hefty <sean.hefty@intel.com>