Commit 3c3f2e32 authored by Sage Weil

ceph: fix connection fault con_work reentrancy problem

The messenger fault was clearing the BUSY bit, for reasons unclear.  This
made it possible for the con->ops->fault function to reopen the connection,
and requeue work in the workqueue--even though the current thread was
already in con_work.

This avoids a problem where the client busy loops with connection failures
on an unreachable OSD, but doesn't address the root cause of that problem.

Signed-off-by: Sage Weil <sage@newdream.net>
parent e4cb4cb8
@@ -1836,8 +1836,6 @@ static void ceph_fault(struct ceph_connection *con)
 		goto out;
 	}
 
-	clear_bit(BUSY, &con->state);  /* to avoid an improbable race */
-
 	mutex_lock(&con->mutex);
 	if (test_bit(CLOSED, &con->state))
 		goto out_unlock;
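
The reentrancy described in the commit message can be illustrated with a small single-threaded model. This is only a sketch, not the kernel code: every `sketch_*` name is hypothetical, plain `bool` fields stand in for the atomic bit operations on `con->state`, and it assumes the queueing path skips a connection that is already marked busy, which this diff does not show.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct ceph_connection; not the kernel type. */
struct sketch_con {
	bool busy;             /* models the BUSY bit in con->state */
	bool in_work;          /* true while sketch_con_work() is running */
	int  reentrant_queues; /* work items queued while in_work was true */
};

/* Assumption: queueing is skipped when the connection is already busy. */
static void sketch_queue_con(struct sketch_con *con)
{
	if (con->busy)
		return; /* the running worker is expected to pick this up */
	if (con->in_work)
		con->reentrant_queues++; /* queued although a worker is active */
}

/* Models the fault path; clear_busy selects the pre-patch behaviour. */
static void sketch_fault(struct sketch_con *con, bool clear_busy)
{
	if (clear_busy)
		con->busy = false;   /* the line this commit removes */
	sketch_queue_con(con);       /* stands in for con->ops->fault reopening */
}

/* Models a worker that hits a fault while it holds the busy flag. */
static void sketch_con_work(struct sketch_con *con, bool clear_busy)
{
	if (con->busy)
		return;              /* another worker owns the connection */
	con->busy = true;
	con->in_work = true;
	sketch_fault(con, clear_busy);
	con->in_work = false;
	con->busy = false;
}

/* Returns how many work items were queued from inside the worker. */
static int sketch_run(bool clear_busy)
{
	struct sketch_con con = { false, false, 0 };

	sketch_con_work(&con, clear_busy);
	return con.reentrant_queues;
}
```

In this model `sketch_run(true)` returns 1, because the fault path drops the busy flag and the reopen queues work while the worker is still active, and `sketch_run(false)` returns 0, which corresponds to the behaviour after the patch.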