    RDMA/cma: Fix device removal race · 6e35aabe
    Krishna Kumar authored
    The race is as follows:
    
    Process A: cma_process_remove() calls cma_remove_id_dev(),
    	    which sets the id state to CMA_DEVICE_REMOVAL and
    	    calls wait_event(dev_remove).
    
    Process B: cma_req_handler() has already incremented dev_remove.
    	    It calls cma_acquire_ib_dev() and, on failure,
    	    calls cma_release_remove(), which wakes up
    	    cma_process_remove(). cma_req_handler() then
    	    calls rdma_destroy_id().
    
    Process A: cma_remove_id_dev() is woken and checks the
    	    state of the id. Since it is still (wrongly)
    	    CMA_DEVICE_REMOVAL, it calls notify_user(id),
    	    and if that fails, the caller, cma_process_remove(),
    	    calls rdma_destroy_id(id). Two processes can thus
    	    call rdma_destroy_id(), and one of them ends up
    	    dereferencing the already kfree()d id_priv.
    
    The fix is for process B to set CMA_DESTROYING in cma_req_handler()
    so that process A returns instead of calling rdma_destroy_id().
    
    Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
    Signed-off-by: Sean Hefty <sean.hefty@intel.com>
    Signed-off-by: Roland Dreier <rolandd@cisco.com>
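
The interleaving described in the commit message can be replayed in a small userspace model. This is only a sketch under stated assumptions, not the kernel code: the struct, the `MODEL_*` states, and every function name below are hypothetical stand-ins, and the two "processes" are run sequentially in the order the message describes rather than as real threads. It counts how many times the id would be destroyed with and without process B marking the id as being destroyed.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the id states named in the commit message. */
enum model_state { MODEL_CONNECT, MODEL_DEVICE_REMOVAL, MODEL_DESTROYING };

struct id_model {
    enum model_state state;
    int dev_remove;     /* outstanding-callback count that process A waits on */
    int destroy_calls;  /* how many times the id has been destroyed */
};

/* Process A, first half: mark the id for removal.
 * In the real code it would now sleep until dev_remove drops to zero. */
static void remover_begin(struct id_model *id)
{
    id->state = MODEL_DEVICE_REMOVAL;
}

/* Process B: a request handler whose device acquisition failed. */
static void handler_fail(struct id_model *id, bool apply_fix)
{
    if (apply_fix)
        id->state = MODEL_DESTROYING; /* the fix: claim destruction first */
    id->dev_remove--;                 /* models the wake-up of process A */
    id->destroy_calls++;              /* models process B destroying the id */
}

/* Process A, second half: runs after the wake-up. */
static void remover_finish(struct id_model *id, bool notify_fails)
{
    /* If another path has claimed the id, return without touching it. */
    if (id->state != MODEL_DEVICE_REMOVAL)
        return;
    if (notify_fails)
        id->destroy_calls++;          /* models the caller destroying the id */
}

/* Replay the interleaving and report how many destroys happened. */
static int replay_race(bool apply_fix)
{
    struct id_model id = { MODEL_CONNECT, 1, 0 };

    remover_begin(&id);
    handler_fail(&id, apply_fix);
    remover_finish(&id, true);
    return id.destroy_calls;
}
```

In this model, `replay_race(false)` reports two destroys (the double free described above), while `replay_race(true)` reports one, because process A sees that the state is no longer the removal state and returns early.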
cma.c 52.9 KB