Commit 6e35aabe authored by Krishna Kumar's avatar Krishna Kumar Committed by Roland Dreier

RDMA/cma: Fix device removal race

The race is as follows:

A process : cma_process_remove() calls cma_remove_id_dev(),
	    which sets id state to CMA_DEVICE_REMOVAL and
	    calls wait_event(dev_remove).

B process : cma_req_handler() had incremented dev_remove,
	    and calls cma_acquire_ib_dev() and on failure
	    calls cma_release_remove(), which does a
	    wake_up of cma_process_remove(). Then
	    cma_req_handler() calls rdma_destroy_id();

A Process : cma_remove_id_dev() gets woken and checks the
	    state of id, and since it is still (wrongly)
	    CMA_DEVICE_REMOVAL, it calls notify_user(id)
	    and if that fails, the caller - cma_process_remove()
	    calls rdma_destroy_id(id). Two processes can
	    call rdma_destroy_id(), resulting in one
	    de-referencing kfreed id_priv.

Fix is for process B to set CMA_DESTROYING in cma_req_handler()
so that process A will return instead of doing a rdma_destroy_id().
Signed-off-by: default avatarKrishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: default avatarSean Hefty <sean.hefty@intel.com>
Signed-off-by: default avatarRoland Dreier <rolandd@cisco.com>
parent 675a027c
...@@ -932,6 +932,7 @@ static int cma_req_handler(struct ib_cm_id *cm_id, struct ib_cm_event *ib_event) ...@@ -932,6 +932,7 @@ static int cma_req_handler(struct ib_cm_id *cm_id, struct ib_cm_event *ib_event)
mutex_unlock(&lock); mutex_unlock(&lock);
if (ret) { if (ret) {
ret = -ENODEV; ret = -ENODEV;
cma_exch(conn_id, CMA_DESTROYING);
cma_release_remove(conn_id); cma_release_remove(conn_id);
rdma_destroy_id(&conn_id->id); rdma_destroy_id(&conn_id->id);
goto out; goto out;
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment