    RDMA/nes: Fix crash in nes_accept() · c5a7d489
    Faisal Latif authored
    While running IMP_EXT's window test, we saw a crash in nes_accept().
    Here is the sequence of what happened:
    
    (1) In MVAPICH2, connect request is received for port #0.
    
    FIX:  Add a check in nes_connect() to make sure that neither the
          local nor the remote TCP port is 0.
    
    (2) The remote (passive) node's TCP stack sends a reset when it gets
        a connect request, because the port is 0.  On receiving the RST
        from the remote node, the active side set the connect error to
        IW_CM_EVENT_STATUS_REJECTED.
    
    FIX: The correct error code is -ECONNRESET.
    
    (3) The wrong error code of IW_CM_EVENT_STATUS_REJECTED causes the
        core to destroy its listener ports.  At that point there may be
        connections that have sent an MPA request up and are waiting for
        an accept or reject, but the listener and its cm_nodes have
        already been freed, which causes the crash we saw.
    
    FIX: When the listener is destroyed, a cm_node is freed only if its
         state is not NES_CM_STATE_MPAREQ_RCVD.  If the cm_node's state
         is NES_CM_STATE_MPAREQ_RCVD, its state is instead set to
         NES_CM_STATE_LISTENER_DESTROYED and it is not freed.  When
         nes_accept() or nes_reject() is later called, the state is
         checked for NES_CM_STATE_LISTENER_DESTROYED; in that case the
         cm_node is freed and an error is returned.
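
    The deferred free described in the FIX for (3) can be sketched in
    plain C as below.  This is a minimal illustration, not the driver
    code: the enum, struct, and function names are hypothetical
    stand-ins, the "freed" flag replaces the real reference drop so the
    behavior can be observed, and -EINVAL as the returned error is an
    assumption.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the driver's cm_node states. */
enum cm_state {
    CM_STATE_ESTABLISHED,
    CM_STATE_MPAREQ_RCVD,
    CM_STATE_LISTENER_DESTROYED,
};

struct cm_node {
    enum cm_state state;
    bool freed; /* stands in for the real free, so the sketch is testable */
};

/* Called for each cm_node that belongs to a listener being destroyed. */
static void listener_destroy_node(struct cm_node *node)
{
    if (node->state == CM_STATE_MPAREQ_RCVD) {
        /* An accept or reject is still pending: mark it, do not free. */
        node->state = CM_STATE_LISTENER_DESTROYED;
        return;
    }
    node->freed = true;
}

/* Entry check shared by the accept and reject paths. */
static int accept_or_reject(struct cm_node *node)
{
    if (node->state == CM_STATE_LISTENER_DESTROYED) {
        /* The listener is gone: do the deferred free and fail the call. */
        node->freed = true;
        return -EINVAL; /* assumed error code */
    }
    return 0; /* normal accept/reject processing would continue here */
}
```

    The point of the extra state is that the node outlives its listener
    just long enough for the pending accept or reject to find it and
    release it safely.
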
    Signed-off-by: Faisal Latif <faisal.latif@intel.com>
    Signed-off-by: Roland Dreier <rolandd@cisco.com>
nes_cm.h 10.9 KB