Commit d73a572d authored by Pavel Begunkov, committed by Jens Axboe

io_uring: optimize local tw add ctx pinning

We currently pin the ctx for io_req_local_work_add() with
percpu_ref_get/put, which implies two rcu_read_lock/unlock pairs and some
extra overhead on top in the fast path. Replace it with a pure rcu read
and let io_ring_exit_work() synchronise against it.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/cbdfcb6b232627f30e9e50ef91f13c4f05910247.1680782017.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
parent ab1c590f
@@ -1332,9 +1332,9 @@ void __io_req_task_work_add(struct io_kiocb *req, bool allow_local)
 	struct io_ring_ctx *ctx = req->ctx;
 
 	if (allow_local && ctx->flags & IORING_SETUP_DEFER_TASKRUN) {
-		percpu_ref_get(&ctx->refs);
+		rcu_read_lock();
 		io_req_local_work_add(req);
-		percpu_ref_put(&ctx->refs);
+		rcu_read_unlock();
 		return;
 	}
@@ -3052,6 +3052,10 @@ static __cold void io_ring_exit_work(struct work_struct *work)
 	spin_lock(&ctx->completion_lock);
 	spin_unlock(&ctx->completion_lock);
 
+	/* pairs with RCU read section in io_req_local_work_add() */
+	if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
+		synchronize_rcu();
+
 	io_ring_ctx_free(ctx);
 }