- 29 Jan, 2023 (40 commits)
-
Dylan Yudaken authored
REQ_F_FORCE_ASYNC was being ignored when re-queueing linked requests. Obey that flag instead.
Signed-off-by: Dylan Yudaken <dylany@meta.com>
Link: https://lore.kernel.org/r/20230127135227.3646353-2-dylany@meta.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
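As an illustration, a hedged liburing sketch of the case the fix covers: a link whose second request carries IOSQE_ASYNC (which sets REQ_F_FORCE_ASYNC in the kernel) and which should stay forced-async when the link is re-queued. fd and buf are placeholders.

    struct io_uring ring;
    io_uring_queue_init(8, &ring, 0);

    struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
    io_uring_prep_read(sqe, fd, buf, 512, 0);
    sqe->flags |= IOSQE_IO_LINK;   /* chain to the next SQE */

    sqe = io_uring_get_sqe(&ring);
    io_uring_prep_write(sqe, fd, buf, 512, 0);
    sqe->flags |= IOSQE_ASYNC;     /* force async punt, even on re-queue */

    io_uring_submit(&ring);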
-
Jens Axboe authored
If CONFIG_PREEMPT_NONE is set and the task_work chains are long, we could be running into issues blocking others for too long. Add a reschedule check in handle_tw_list(), and flush the ctx if we need to reschedule.
Cc: stable@vger.kernel.org # 5.10+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
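A hedged sketch of the pattern; the loop shape follows handle_tw_list() but the body is illustrative rather than the exact upstream diff:

    /* Run queued task_work items, yielding when the scheduler asks so
     * that long chains cannot stall other tasks on PREEMPT_NONE. */
    while (node) {
            struct llist_node *next = node->next;
            struct io_kiocb *req = container_of(node, struct io_kiocb,
                                                io_task_work.node);

            req->io_task_work.func(req, &locked);
            node = next;

            if (need_resched()) {
                    /* flush ctx state (completions, locks) before yielding */
                    cond_resched();
            }
    }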
-
Jens Axboe authored
If the kernel is configured with CONFIG_PREEMPT_NONE, we could be sitting in a tight loop reaping events but not giving them a chance to finish. This results in a trace ala:

rcu: INFO: rcu_sched self-detected stall on CPU
rcu: 2-...!: (5249 ticks this GP) idle=935c/1/0x4000000000000000 softirq=4265/4274 fqs=1
(t=5251 jiffies g=465 q=4135 ncpus=4)
rcu: rcu_sched kthread starved for 5249 jiffies! g465 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_sched state:R running task stack:0 pid:12 ppid:2 flags:0x00000008
Call trace:
 __switch_to+0xb0/0xc8
 __schedule+0x43c/0x520
 schedule+0x4c/0x98
 schedule_timeout+0xbc/0xdc
 rcu_gp_fqs_loop+0x308/0x344
 rcu_gp_kthread+0xd8/0xf0
 kthread+0xb8/0xc8
 ret_from_fork+0x10/0x20
rcu: Stack dump where RCU GP kthread last ran:
Task dump for CPU 0:
task:kworker/u8:10 state:R running task stack:0 pid:89 ppid:2 flags:0x0000000a
Workqueue: events_unbound io_ring_exit_work
Call trace:
 __switch_to+0xb0/0xc8
 0xffff0000c8fefd28
CPU: 2 PID: 95 Comm: kworker/u8:13 Not tainted 6.2.0-rc5-00042-g40316e337c80-dirty #2759
Hardware name: linux,dummy-virt (DT)
Workqueue: events_unbound io_ring_exit_work
pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
pc : io_do_iopoll+0x344/0x360
lr : io_do_iopoll+0xb8/0x360
sp : ffff800009bebc60
x29: ffff800009bebc60 x28: 0000000000000000 x27: 0000000000000000
x26: ffff0000c0f67d48 x25: ffff0000c0f67840 x24: ffff800008950024
x23: 0000000000000001 x22: 0000000000000000 x21: ffff0000c27d3200
x20: ffff0000c0f67840 x19: ffff0000c0f67800 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
x14: 0000000000000001 x13: 0000000000000001 x12: 0000000000000000
x11: 0000000000000179 x10: 0000000000000870 x9 : ffff800009bebd60
x8 : ffff0000c27d3ad0 x7 : fefefefefefefeff x6 : 0000646e756f626e
x5 : ffff0000c0f67840 x4 : 0000000000000000 x3 : ffff0000c2398000
x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000
Call trace:
 io_do_iopoll+0x344/0x360
 io_uring_try_cancel_requests+0x21c/0x334
 io_ring_exit_work+0x90/0x40c
 process_one_work+0x1a4/0x254
 worker_thread+0x1ec/0x258
 kthread+0xb8/0xc8
 ret_from_fork+0x10/0x20

Add a cond_resched() in the cancelation IOPOLL loop to fix this.
Cc: stable@vger.kernel.org # 5.10+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
io_submit_flush_completions() may produce new task_work items, so it's a good idea to recheck the task_work list after flushing completions. The optimisation is not new and was accidentally removed by f88262e6 ("io_uring: lockless task list").
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/a7ed5ede84de190832cc33ebbcdd6e91cd90f5b6.1674484266.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Merge almost identical sections of tctx_task_work(); this will make later code modifications easier and also inlines handle_tw_list().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/d06592d91e3e7559e7a4dbb8907d110863008dc7.1674484266.git.asml.silence@gmail.com
[axboe: fold in setting count to zero patch from Tom Rix]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Add a helper for putting refs from the target task context, rename __io_put_task() and add a couple of comments around it. Use the remote version for __io_req_complete_post(); the local one is only needed for __io_submit_flush_completions().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/3bf92ebd594769d8a5d648472a8e335f2031d542.1674484266.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Follow the io_get_sqe pattern of returning the result via a pointer, and hide request cache refill inside io_alloc_req().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/8c37c2e8a3cb5e4cd6a8ae3b91371227a92708a6.1674484266.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Return the SQE from io_get_sqe() via an out parameter and use the function's return value to indicate whether it failed. This enables the compiler to compile out the sqe NULL check when we know the returned SQE is valid.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/9cceb11329240ea097dffef6bf0a675bca14cf42.1674484266.git.asml.silence@gmail.com
[axboe: remove bogus const modifier on return value]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
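The shape of the resulting API, as a hedged sketch; sqe_available() and next_sqe() are hypothetical helpers standing in for the real ring arithmetic:

    static bool io_get_sqe(struct io_ring_ctx *ctx, const struct io_uring_sqe **sqe)
    {
            if (unlikely(!sqe_available(ctx)))  /* hypothetical helper */
                    return false;
            *sqe = next_sqe(ctx);               /* hypothetical helper */
            return true;
    }

    /* Caller: after a successful return the compiler knows *sqe is valid
     * and can drop the NULL check on it. */
    const struct io_uring_sqe *sqe;
    if (unlikely(!io_get_sqe(ctx, &sqe)))
            break;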
-
Pavel Begunkov authored
__io_cqring_overflow_flush() doesn't return anything anymore; remove the outdated comment.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/4ce2bcbb17eac80cdf883fd1459d5ee6586e238c.1674484266.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
We return POLLIN from io_uring_poll() depending on whether there are CQEs for userspace, so we should use the user-visible tail pointer instead of a transient cached value.
Cc: stable@vger.kernel.org
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/228ffcbf30ba98856f66ffdb9a6a60ead1dd96c0.1674484266.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
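A hedged sketch of the distinction; the shared-ring field names follow the io_uring uAPI layout, while cqes_pending() is a hypothetical helper:

    /* The user-visible tail lives in the shared ring pages and is what
     * userspace actually observes; ctx->cached_cq_tail is a transient
     * kernel-side value that may not match it while completions are
     * being batched. */
    static bool cqes_pending(struct io_ring_ctx *ctx)
    {
            return READ_ONCE(ctx->rings->cq.tail) != READ_ONCE(ctx->rings->cq.head);
    }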
-
Jens Axboe authored
This generates better code for me, avoiding an extra load on arm64, and both call sites already have this variable available for easy passing.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Breno Leitao authored
Every io_uring request is represented by struct io_kiocb, which is cached locally by io_uring (not SLAB/SLUB) in the list called submit_state.freelist. This patch simply enables KASAN for this free list. The list is initially created by KMEM_CACHE but later managed by io_uring. The patch poisons objects that are not in use (i.e., sitting on the free list) and unpoisons an object when it is allocated/removed from the list. Touching a poisoned object while it is on the freelist will cause a KASAN warning.
Suggested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
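A hedged sketch of the poison/unpoison pairing. kasan_poison_object_data() and kasan_unpoison_object_data() are the kernel's KASAN helpers; req_cachep is the KMEM_CACHE the requests come from, and freelist_push()/freelist_pop() are hypothetical stand-ins for the real freelist ops:

    #include <linux/kasan.h>

    /* Freeing into the io_uring-managed cache: poison the object so any
     * stray access while it sits on the free list trips KASAN. */
    static void req_cache_free(struct io_ring_ctx *ctx, struct io_kiocb *req)
    {
            kasan_poison_object_data(req_cachep, req);
            freelist_push(&ctx->submit_state.freelist, req);   /* hypothetical */
    }

    /* Allocating from the cache: unpoison before handing the object out. */
    static struct io_kiocb *req_cache_alloc(struct io_ring_ctx *ctx)
    {
            struct io_kiocb *req = freelist_pop(&ctx->submit_state.freelist); /* hypothetical */

            kasan_unpoison_object_data(req_cachep, req);
            return req;
    }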
-
Jens Axboe authored
If TIF_NOTIFY_RESUME is set, then we need to call resume_user_mode_work() for PF_IO_WORKER threads. They never return to usermode, hence never get a chance to process any items that are marked by this flag. Most notably this includes the final put of files, but also any throttling markers set by block cgroups.
Cc: stable@vger.kernel.org # 5.10+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
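A hedged sketch of the check an io_run_task_work()-style helper can make; resume_user_mode_work(NULL) is the standard kernel helper for processing these items:

    #include <linux/resume_user_mode.h>

    /* PF_IO_WORKER threads never return to userspace, so handle
     * TIF_NOTIFY_RESUME work (file puts, cgroup throttling) in-kernel. */
    if (test_thread_flag(TIF_NOTIFY_RESUME)) {
            __set_current_state(TASK_RUNNING);
            resume_user_mode_work(NULL);
    }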
-
Jens Axboe authored
If the target ring is using IORING_SETUP_SINGLE_ISSUER and we're posting a message from a different thread, then we need to ensure that the fallback task_work that posts the CQE knows about the flags being passed as well. If not, we'll always be posting 0 as the flags.
Fixes: 3563d7ed58a5 ("io_uring/msg_ring: Pass custom flags to the cqe")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Breno Leitao authored
This patch removes some "cold" fields from `struct io_issue_def`. The plan is to keep only highly used fields in `struct io_issue_def`, so it stays hot in the cache. The hot fields are basically all the bitfields and the callback functions for .issue and .prep. The less frequently used fields now live in a secondary, cold struct called `io_cold_def`. The struct sizes:

Before: io_issue_def = 56 bytes
After:  io_issue_def = 24 bytes; io_cold_def = 40 bytes

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/20230112144411.2624698-2-leitao@debian.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
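A hedged sketch of the split; the field choice follows the description above, but the exact member list is illustrative rather than the upstream layout:

    /* Hot: consulted on every prep/issue, kept small so per-opcode
     * entries pack tightly into cache lines. */
    struct io_issue_def {
            unsigned needs_file : 1;
            unsigned pollin : 1;
            unsigned pollout : 1;
            /* ...more feature bits... */

            int (*issue)(struct io_kiocb *req, unsigned int issue_flags);
            int (*prep)(struct io_kiocb *req, const struct io_uring_sqe *sqe);
    };

    /* Cold: rarely touched per request, moved out of the hot table. */
    struct io_cold_def {
            const char *name;
            unsigned short async_size;
            int (*prep_async)(struct io_kiocb *req);
            void (*cleanup)(struct io_kiocb *req);
            void (*fail)(struct io_kiocb *req);
    };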
-
Breno Leitao authored
The current io_op_def struct is becoming huge and the name is a bit generic. The goal of this patch is to rename it to `io_issue_def`. This struct will contain the hot functions associated with the issue code path. For now, this patch only renames the structure; an upcoming patch will break it up in two, moving the non-issue fields to a secondary struct.
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/20230112144411.2624698-1-leitao@debian.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Keep the parts of __io_req_complete_post() relying on req->flags together so the value can be cached.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/2b4fbb42f404a0e75c4d9f0a5b16f314a839d0a9.1673887636.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
There may be a different cost for reading just one byte versus more, so it's beneficial to keep ctx flag bits that we access together in a single byte. That affected code generation of __io_cq_unlock_post_flush() and removed one memory load.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/bbe8ca4705704690319d65e45845f9fc9d35f420.1673887636.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
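A hedged illustration of the idea; the reduced struct is a sketch, not the real io_ring_ctx layout, and the flag pairing is an example:

    struct example_ctx {
            /* Declared adjacently, these 1-bit flags share a single byte,
             * so code testing both needs only one narrow load. */
            unsigned int task_complete : 1;
            unsigned int syscall_iopoll : 1;
    };

    static bool both_set(struct example_ctx *ctx)
    {
            return ctx->task_complete && ctx->syscall_iopoll;
    }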
-
Pavel Begunkov authored
Lock the ring with uring_lock in io_fallback_req_func(), which should make it a bit safer and easier. With that we also don't need refs pinning, as io_ring_exit_work() will wait until uring_lock is released.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/56170e6a0cbfc8edee2794c6613e8f6f1d76d276.1673887636.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
io_put_task() is only used in io_uring.c, so enclose it there together with __io_put_task().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/43c7f9227e2ab215f1a6069dadbc5382bed346fe.1673887636.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
io_submit_flush_completions() may queue new requests for tw execution; this is especially true for linked requests. Recheck the tw list for emptiness after flushing completions. Note that this doesn't really fix the commit referenced below, but it does reinstate an optimisation that existed before that commit was merged.
Fixes: f88262e6 ("io_uring: lockless task list")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/6328acdbb5e60efc762b18003382de077e6e1367.1673887636.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Quanfa Fu authored
Change the return type to void since it always returns 0, and there is no need to do the checking in the io_uring_enter syscall.
Signed-off-by: Quanfa Fu <quanfafu@gmail.com>
Link: https://lore.kernel.org/r/20230115071519.554282-1-quanfafu@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
We needed fake nodes in __io_run_local_work() to avoid unnecessary wake ups while the task is already running task_works, but we don't need them anymore since wake ups are protected by cq_waiting, which is always cleared by the time we're executing deferred task_work items. Note that because of the loose sync around cq_waiting clearing, io_req_local_work_add() may wake the task more than once, but that's fine and should be rare enough not to hurt performance.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/8839534891f0a2f1076e78554a31ea7e099f7de5.1673274244.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Don't wake the master task after queueing a deferred tw unless it's currently waiting in io_cqring_wait().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/717702d772825a6647e6c315b4690277ba84c3fc.1673274244.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
With DEFER_TASKRUN only ctx->submitter_task might be waiting for CQEs, and we can use this to optimise io_cqring_wait(). Replace the ->cq_wait waitqueue with waking the task directly. It works, but misses an important optimisation covered by the following patch, so this patch without the follow-ups might hurt performance.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/103d174d35d919d4cb0922d8a9c93a8f0c35f74a.1673274244.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
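A hedged sketch of the direct wakeup; wake_up_state() is the stock kernel primitive, and the surrounding enqueue logic is elided:

    /* Instead of wake_up(&ctx->cq_wait), wake the one known waiter.
     * TASK_INTERRUPTIBLE matches the state io_cqring_wait() sleeps in. */
    if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
            wake_up_state(ctx->submitter_task, TASK_INTERRUPTIBLE);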
-
Pavel Begunkov authored
Flushing completions is done either from the submit syscall or by task_work; both run in the context of the submitter task, and for single-threaded rings, as implied by ->task_complete, there won't be any waiters on ->cq_wait other than the master task. That means no tasks can be sleeping on cq_wait while we run __io_submit_flush_completions(), and so the wake up can be skipped.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/60ad9768ec74435a0ddaa6eec0ffa7729474f69f.1673274244.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Even though io_poll_wq_wake()'s waitqueue_active check reuses a barrier we do for another waitqueue, that's not going to be the case in the future, so we want a fast path for when the ring has never been polled. Move poll_wq wake ups into __io_commit_cqring_flush() using a new flag called ->poll_activated. The idea behind the flag is to set it when the ring is polled for the first time. This requires additional synchronisation to not miss events, which is done here by using task_work for ->task_complete rings, and by enabling the flag by default for all other ring types.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/060785e8e9137a920b232c0c7f575b131af19cac.1673274244.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Don't use ->cq_wait for ring polling; add a separate wait queue for it instead. We need it for the following patches.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/dea0be0bf990503443c5c6c337fc66824af7d590.1673274244.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
io_run_local_work_locked() is only used in io_uring.c; move it there. With that we can also make __io_run_local_work() static.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/91757bcb33e5774e49fed6f2b6e058630608119b.1673274244.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
io_run_local_work() is enclosed in io_uring.c; we don't need to export it.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b477fb81f5e77044f724a06fe245d5c078659364.1673274244.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
The CQ waiting loop sets TASK_RUNNING before trying to execute task_work; there is no need to repeat it in io_run_local_work().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/9d9422c429ef3f9457b4f4b8288bf4789564f33b.1673274244.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Remove the local variable ctx in io_wake_function(); we don't need it if io_should_wake() triggers the wake up.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/e60eb1008aebe286aab7d34c772ed01c447bddb1.1673274244.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
->submitter_task is used somewhat more frequently now than before, i.e. for local tw enqueue and run, so let's move it from the end of ctx, which is full of cold data, to the first cacheline together with the mostly-constant fields.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/415ca91dc5ad1dec612b892e489cda98e1069542.1673274244.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Dmitrii Bundin authored
The IS_ERR() function uses the IS_ERR_VALUE() macro under the hood, which already wraps the condition in unlikely().
Signed-off-by: Dmitrii Bundin <dmitrii.bundin.a@gmail.com>
Link: https://lore.kernel.org/r/20230109185854.25698-1-dmitrii.bundin.a@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
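For reference, the relevant definitions in include/linux/err.h look like this (quoted from the kernel tree from memory; treat as indicative):

    #define IS_ERR_VALUE(x) unlikely((unsigned long)(void *)(x) >= (unsigned long)-MAX_ERRNO)

    static inline bool __must_check IS_ERR(__force const void *ptr)
    {
            return IS_ERR_VALUE((unsigned long)ptr);
    }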
-
Breno Leitao authored
This patch adds a new flag (IORING_MSG_RING_FLAGS_PASS) to the message ring operation (IORING_OP_MSG_RING). The new flag enables the sender to specify custom flags, which will be copied over to cqe->flags in the receiving ring. These custom flags should be specified using the sqe->file_index field. This mechanism provides additional flexibility when sending messages between rings.
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20230103160507.617416-1-leitao@debian.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
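A hedged userspace sketch of a sender using the new flag; target_ring_fd, the user data, and the flag value are placeholders, and the sqe fields are set per the description above (assuming an already-initialized struct io_uring ring):

    struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);

    io_uring_prep_msg_ring(sqe, target_ring_fd, /*len=*/0, /*data=*/0x1234, 0);
    sqe->msg_ring_flags = IORING_MSG_RING_FLAGS_PASS; /* enable flag passing */
    sqe->file_index = 0x1;  /* placeholder value copied into cqe->flags on the target */
    io_uring_submit(&ring);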
-
Pavel Begunkov authored
Move the waiting timeout into io_wait_queue.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/e4b48a9e26a3b1cf97c80121e62d4b5ab873d28d.1672916894.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
Unlike the jiffy-based scheduling version, schedule_hrtimeout() jumps through a few functions before getting into schedule(), even when there is no actual timeout needed. Some tests showed that it takes up to 1% of CPU cycles.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/89f880574eceee6f4899783377ead234df7b3d04.1672916894.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
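A hedged sketch of the resulting fast path; timeout_set and timeout are hypothetical io_wait_queue fields standing in for the real bookkeeping:

    /* With no timeout requested, plain schedule() skips the hrtimer
     * setup/teardown that schedule_hrtimeout() always performs. */
    if (!iowq->timeout_set) {          /* hypothetical field */
            schedule();
    } else {
            schedule_hrtimeout(&iowq->timeout, HRTIMER_MODE_ABS);  /* hypothetical field */
    }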
-
Pavel Begunkov authored
Instead of constantly checking whether the task state is back to running before executing tw or taking locks in io_cqring_wait(), switch it back to TASK_RUNNING immediately.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/246dddee247d89fd52023f785ed17cc34962a008.1672916894.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
->work_llist should never be non-empty for a non-DEFER_TASKRUN ring, so we can safely skip checking the flag.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/26af9f73c09a56c9a035f94db56127358688f3aa.1672916894.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Pavel Begunkov authored
io_cqring_wait_schedule() is called after we start waiting on the cq wq and set the state to TASK_INTERRUPTIBLE; for that reason we have to constantly worry about whether we have returned the state back to running or not. Leave only quick checks in io_cqring_wait_schedule() and move the rest, including running task work, to the callers. Note that we run tw in the loop after the sched checks because of the fast path at the beginning of the function.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/2814fabe75e2e019e7ca43ea07daa94564349805.1672916894.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-