1. 14 Nov, 2019 1 commit
  2. 13 Nov, 2019 3 commits
    • io-wq: ensure we have a stable view of ->cur_work for cancellations · 36c2f922
      Jens Axboe authored
      worker->cur_work is currently protected by the lock of the wqe that the
      worker belongs to. When we send a signal to a worker, we need a stable
      view of ->cur_work, so we need to hold that lock. But this doesn't work
      so well, since we have the opposite order potentially on queueing work.
      If POLL_ADD is used with a signalfd, then io_poll_wake() is called with
      the signal lock, and that sometimes needs to insert work items.
      
      Add a specific worker lock that protects the current work item. Then we
      can guarantee that the task we're sending a signal to is currently
      processing the exact work we think it is.
      Reported-by: Paul E. McKenney <paulmck@kernel.org>
      Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_wq: add get/put_work handlers to io_wq_create() · 7d723065
      Jens Axboe authored
      For cancellation, we need to ensure that the work item stays valid for
      as long as ->cur_work is valid. Right now we can't safely dereference
      the work item even under the wqe->lock, because while the ->cur_work
      pointer will remain valid, the work could be completing and be freed
      in parallel.
      
      Only invoke ->get/put_work() on items we know that the caller queued
      themselves. Add IO_WQ_WORK_INTERNAL for io-wq to use, which is needed
      when we're queueing a flush item, for instance.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: check for validity of ->rings in teardown · 15dff286
      Jens Axboe authored
      Normally the rings are always valid; the exception is if we failed to
      allocate the rings at setup time. syzbot reports this:
      
      RSP: 002b:00007ffd6e8aa078 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441229
      RDX: 0000000000000002 RSI: 0000000020000140 RDI: 0000000000000d0d
      RBP: 00007ffd6e8aa090 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: ffffffffffffffff
      R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] PREEMPT SMP KASAN
      CPU: 1 PID: 8903 Comm: syz-executor410 Not tainted 5.4.0-rc7-next-20191113
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
      Google 01/01/2011
      RIP: 0010:__read_once_size include/linux/compiler.h:199 [inline]
      RIP: 0010:__io_commit_cqring fs/io_uring.c:496 [inline]
      RIP: 0010:io_commit_cqring+0x1e1/0xdb0 fs/io_uring.c:592
      Code: 03 0f 8e df 09 00 00 48 8b 45 d0 4c 8d a3 c0 00 00 00 4c 89 e2 48 c1
      ea 03 44 8b b8 c0 01 00 00 48 b8 00 00 00 00 00 fc ff df <0f> b6 14 02 4c
      89 e0 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 61
      RSP: 0018:ffff88808f51fc08 EFLAGS: 00010006
      RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff815abe4a
      RDX: 0000000000000018 RSI: ffffffff81d168d5 RDI: ffff8880a9166100
      RBP: ffff88808f51fc70 R08: 0000000000000004 R09: ffffed1011ea3f7d
      R10: ffffed1011ea3f7c R11: 0000000000000003 R12: 00000000000000c0
      R13: ffff8880a91661c0 R14: 1ffff1101522cc10 R15: 0000000000000000
      FS:  0000000001e7a880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020000140 CR3: 000000009a74c000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
        io_cqring_overflow_flush+0x6b9/0xa90 fs/io_uring.c:673
        io_ring_ctx_wait_and_kill+0x24f/0x7c0 fs/io_uring.c:4260
        io_uring_create fs/io_uring.c:4600 [inline]
        io_uring_setup+0x1256/0x1cc0 fs/io_uring.c:4626
        __do_sys_io_uring_setup fs/io_uring.c:4639 [inline]
        __se_sys_io_uring_setup fs/io_uring.c:4636 [inline]
        __x64_sys_io_uring_setup+0x54/0x80 fs/io_uring.c:4636
        do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x441229
      Code: e8 5c ae 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7
      48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
      ff 0f 83 bb 0a fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007ffd6e8aa078 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441229
      RDX: 0000000000000002 RSI: 0000000020000140 RDI: 0000000000000d0d
      RBP: 00007ffd6e8aa090 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: ffffffffffffffff
      R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000
      Modules linked in:
      ---[ end trace b0f5b127a57f623f ]---
      RIP: 0010:__read_once_size include/linux/compiler.h:199 [inline]
      RIP: 0010:__io_commit_cqring fs/io_uring.c:496 [inline]
      RIP: 0010:io_commit_cqring+0x1e1/0xdb0 fs/io_uring.c:592
      Code: 03 0f 8e df 09 00 00 48 8b 45 d0 4c 8d a3 c0 00 00 00 4c 89 e2 48 c1
      ea 03 44 8b b8 c0 01 00 00 48 b8 00 00 00 00 00 fc ff df <0f> b6 14 02 4c
      89 e0 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 61
      RSP: 0018:ffff88808f51fc08 EFLAGS: 00010006
      RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff815abe4a
      RDX: 0000000000000018 RSI: ffffffff81d168d5 RDI: ffff8880a9166100
      RBP: ffff88808f51fc70 R08: 0000000000000004 R09: ffffed1011ea3f7d
      R10: ffffed1011ea3f7c R11: 0000000000000003 R12: 00000000000000c0
      R13: ffff8880a91661c0 R14: 1ffff1101522cc10 R15: 0000000000000000
      FS:  0000000001e7a880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020000140 CR3: 000000009a74c000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      
      which is exactly the case of failing to allocate the SQ/CQ rings, and
      then entering shutdown. Check if the rings are valid before trying to
      access them at shutdown time.
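
      As a rough userspace illustration of the check described above (not
      the kernel code), the sketch below skips the CQ commit when the rings
      pointer is NULL. The toy_* names and struct layout are assumptions
      made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared SQ/CQ ring memory. */
struct toy_rings {
    unsigned cq_tail;
};

/* Hypothetical stand-in for the ring context. */
struct toy_ring_ctx {
    struct toy_rings *rings;   /* NULL if ring allocation failed at setup */
    unsigned cached_cq_tail;
};

/*
 * Publish the cached CQ tail, but only if the rings exist.
 * Returns true if the tail was published, false if there was nothing
 * to touch because setup never allocated the rings.
 */
bool toy_commit_cqring(struct toy_ring_ctx *ctx)
{
    if (!ctx->rings)
        return false;          /* teardown after a failed setup */
    ctx->rings->cq_tail = ctx->cached_cq_tail;
    return true;
}
```

      Without the NULL test, the failed-setup case would dereference
      ctx->rings, which is the shape of the fault in the report above.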
      
      Reported-by: syzbot+21147d79607d724bd6f3@syzkaller.appspotmail.com
      Fixes: 1d7bb1d5 ("io_uring: add support for backlogged CQ ring")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  3. 12 Nov, 2019 2 commits
    • io_uring: fix potential deadlock in io_poll_wake() · 7c9e7f0f
      Jens Axboe authored
      We attempt to run the poll completion inline, but we're using trylock to
      do so. This avoids a deadlock since we're grabbing the locks in reverse
      order at this point, we already hold the poll wq lock and we're trying
      to grab the completion lock, while the normal rules are the reverse of
      that order.
      
      IO completion for a timeout link will need to grab the completion lock,
      but that's not safe from this context. Put the completion under the
      completion_lock in io_poll_wake(), and mark the request as entering
      the completion with the completion_lock already held.
      
      Fixes: 2665abfd ("io_uring: add support for linked SQE timeouts")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: use correct "is IO worker" helper · 960e432d
      Jens Axboe authored
      Since we switched to io-wq, the dependent link optimization for when to
      pass back work inline has been broken. Fix this by providing a suitable
      io-wq helper for io_uring to use to detect when to do this.
      
      Fixes: 561fb04a ("io_uring: replace workqueue usage with io-wq")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  4. 11 Nov, 2019 11 commits
    • io_uring: fix -ENOENT issue with linked timer with short timeout · 76a46e06
      Jens Axboe authored
      If you prep a read (for example) that needs to get punted to async
      context with a timer, if the timeout is sufficiently short, the timer
      request will get completed with -ENOENT as it could not find the read.
      
      The issue is that we prep and start the timer before we start the read.
      Hence the timer can trigger before the read is even started, and the end
      result is then that the timer completes with -ENOENT, while the read
      starts instead of being cancelled by the timer.
      
      Fix this by splitting the linked timer into two parts:
      
      1) Prep and validate the linked timer
      2) Start timer
      
      The read is then started between steps 1 and 2, so we know that the
      timer will always have a consistent view of the read request state.
      Reported-by: Hrvoje Zeba <zeba.hrvoje@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: don't do flush cancel under inflight_lock · 768134d4
      Jens Axboe authored
      We can't safely cancel under the inflight lock. If the work hasn't been
      started yet, then io_wq_cancel_work() simply marks the work as cancelled
      and invokes the work handler. But if the work completion needs to grab
      the inflight lock because it's grabbing user files, then we'll deadlock
      trying to finish the work as we already hold that lock.
      
      Instead grab a reference to the request, if it isn't already zero. If
      it's zero, then we know it's going through completion anyway, and we
      can safely ignore it. If it's not zero, then we can drop the lock and
      attempt to cancel from there.
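
      The "grab a reference if it isn't already zero" step can be sketched
      in portable C11 as below. This is only an illustration of the
      pattern (the kernel has refcount_inc_not_zero() for it); struct
      toy_ref_req and toy_req_get_not_zero() are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a request carrying a reference count. */
struct toy_ref_req {
    atomic_int refs;
};

/*
 * Take a reference only if the count is still non-zero. A zero count
 * means the request is already going through completion, so the
 * caller should simply ignore it.
 */
bool toy_req_get_not_zero(struct toy_ref_req *req)
{
    int old = atomic_load(&req->refs);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&req->refs, &old, old + 1))
            return true;
    }
    return false;
}
```

      Once the reference is held, the request cannot be freed underneath
      the caller, so the lock can be dropped before attempting the cancel.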
      
      This also fixes a missing finish_wait() at the end of
      io_uring_cancel_files().
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: flag SQPOLL busy condition to userspace · c1edbf5f
      Jens Axboe authored
      Now that we have backpressure, for SQPOLL, we have one more condition
      that warrants flagging that the application needs to enter the kernel:
      we failed to submit IO due to backpressure. Make sure we catch that
      and flag it appropriately.
      
      If we run into backpressure issues with the SQPOLL thread, flag it
      as such to the application by setting IORING_SQ_NEED_WAKEUP. This will
      cause the application to enter the kernel, and that will flush the
      backlog and clear the condition.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: make ASYNC_CANCEL work with poll and timeout · 47f46768
      Jens Axboe authored
      It's a little confusing that we have multiple types of command
      cancellation opcodes now that we have a generic one. Make the generic
      one work with POLL_ADD and TIMEOUT commands as well, which makes for
      an easier to use API for the application.
      
      Add a helper that takes care of it, so we can use it from both
      IORING_OP_ASYNC_CANCEL and from the linked timeout cancellation.
      Reported-by: Hrvoje Zeba <zeba.hrvoje@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: provide fallback request for OOM situations · 0ddf92e8
      Jens Axboe authored
      One thing that really sucks for userspace APIs is if the kernel passes
      back -ENOMEM/-EAGAIN for resource shortages. The application really has
      no idea of what to do in those cases. Should it try and reap
      completions? Probably a good idea. Will it solve the issue? Who knows.
      
      This patch adds a simple fallback mechanism if we fail to allocate
      memory for a request. If we fail allocating memory from the slab for a
      request, we punt to a pre-allocated request. There's just one of these
      per io_ring_ctx, but the important part is that if we ever return
      -EBUSY to the application, the application knows that it can wait for
      events and make forward progress when events have completed.
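
      A toy model of the fallback path, assuming a single embedded spare
      request and an "in use" flag. The names (toy_oom_ctx, toy_fb_req,
      toy_fb_get_req, toy_fb_put_req) and the alloc_fails knob are
      hypothetical and only exist to make the behaviour testable; this is
      not how the kernel tracks the fallback request.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical request type. */
struct toy_fb_req {
    int opcode;
};

/* Hypothetical context holding one pre-allocated fallback request. */
struct toy_oom_ctx {
    struct toy_fb_req fallback;
    bool fallback_in_use;
};

/*
 * Get a request. 'alloc_fails' simulates the normal allocation
 * failing. Returns NULL only when the fallback is also taken, which
 * is the case the caller would report as -EBUSY.
 */
struct toy_fb_req *toy_fb_get_req(struct toy_oom_ctx *ctx, bool alloc_fails)
{
    struct toy_fb_req *req = alloc_fails ? NULL : malloc(sizeof(*req));

    if (req)
        return req;
    if (!ctx->fallback_in_use) {
        ctx->fallback_in_use = true;
        return &ctx->fallback;
    }
    return NULL;
}

/* Release a request, returning the fallback to the context if used. */
void toy_fb_put_req(struct toy_oom_ctx *ctx, struct toy_fb_req *req)
{
    if (req == &ctx->fallback)
        ctx->fallback_in_use = false;
    else
        free(req);
}
```

      Because the fallback becomes available again as soon as its request
      completes, waiting for events is always enough to make progress.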
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: convert accept4() -ERESTARTSYS into -EINTR · 8e3cca12
      Jens Axboe authored
      If we cancel a pending accept operation with a signal, we get
      -ERESTARTSYS returned. Turn that into -EINTR for userspace; we should
      not be returning -ERESTARTSYS.
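
      The conversion amounts to a one-line fixup on the result. A minimal
      sketch, assuming a hypothetical helper name; ERESTARTSYS is a
      kernel-internal code (512) that userspace headers do not export, so
      it is defined locally here.

```c
#include <errno.h>

/* Kernel-internal restart code; defined locally for this sketch. */
#define TOY_ERESTARTSYS 512

/*
 * Map the kernel-internal restart code to -EINTR before the result
 * is handed to userspace. Every other value passes through untouched.
 */
int toy_fixup_accept_result(int ret)
{
    if (ret == -TOY_ERESTARTSYS)
        return -EINTR;
    return ret;
}
```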
      
      Fixes: 17f2fe35 ("io_uring: add support for IORING_OP_ACCEPT")
      Reported-by: Hrvoje Zeba <zeba.hrvoje@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: fix error clear of ->file_table in io_sqe_files_register() · 46568e9b
      Jens Axboe authored
      syzbot reports that when using failslab and friends, we can get a double
      free in io_sqe_files_unregister():
      
      BUG: KASAN: double-free or invalid-free in
      io_sqe_files_unregister+0x20b/0x300 fs/io_uring.c:3185
      
      CPU: 1 PID: 8819 Comm: syz-executor452 Not tainted 5.4.0-rc6-next-20191108
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
      Google 01/01/2011
      Call Trace:
        __dump_stack lib/dump_stack.c:77 [inline]
        dump_stack+0x197/0x210 lib/dump_stack.c:118
        print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
        kasan_report_invalid_free+0x65/0xa0 mm/kasan/report.c:468
        __kasan_slab_free+0x13a/0x150 mm/kasan/common.c:450
        kasan_slab_free+0xe/0x10 mm/kasan/common.c:480
        __cache_free mm/slab.c:3426 [inline]
        kfree+0x10a/0x2c0 mm/slab.c:3757
        io_sqe_files_unregister+0x20b/0x300 fs/io_uring.c:3185
        io_ring_ctx_free fs/io_uring.c:3998 [inline]
        io_ring_ctx_wait_and_kill+0x348/0x700 fs/io_uring.c:4060
        io_uring_release+0x42/0x50 fs/io_uring.c:4068
        __fput+0x2ff/0x890 fs/file_table.c:280
        ____fput+0x16/0x20 fs/file_table.c:313
        task_work_run+0x145/0x1c0 kernel/task_work.c:113
        exit_task_work include/linux/task_work.h:22 [inline]
        do_exit+0x904/0x2e60 kernel/exit.c:817
        do_group_exit+0x135/0x360 kernel/exit.c:921
        __do_sys_exit_group kernel/exit.c:932 [inline]
        __se_sys_exit_group kernel/exit.c:930 [inline]
        __x64_sys_exit_group+0x44/0x50 kernel/exit.c:930
        do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x43f2c8
      Code: 31 b8 c5 f7 ff ff 48 8b 5c 24 28 48 8b 6c 24 30 4c 8b 64 24 38 4c 8b
      6c 24 40 4c 8b 74 24 48 4c 8b 7c 24 50 48 83 c4 58 c3 66 <0f> 1f 84 00 00
      00 00 00 48 8d 35 59 ca 00 00 0f b6 d2 48 89 fb 48
      RSP: 002b:00007ffd5b976008 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000043f2c8
      RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
      RBP: 00000000004bf0a8 R08: 00000000000000e7 R09: ffffffffffffffd0
      R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000001
      R13: 00000000006d1180 R14: 0000000000000000 R15: 0000000000000000
      
      This happens if we fail allocating the file tables. For that case we do
      free the file table correctly, but we forget to set it to NULL. This
      means that ring teardown will see it as being non-NULL, and attempt to
      free it again.
      
      Fix this by clearing the file_table pointer if we free the table.
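
      The bug class is the usual "freed but not cleared" pointer. Below is
      a small userspace sketch of it, with hypothetical names
      (toy_files_ctx, toy_files_register, toy_files_unregister) and a
      simulate_failure knob standing in for failslab.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical context; a NULL table means nothing is registered. */
struct toy_files_ctx {
    int *file_table;
};

/* Register a table; 'simulate_failure' forces the error path. */
int toy_files_register(struct toy_files_ctx *ctx, size_t nr,
                       int simulate_failure)
{
    ctx->file_table = calloc(nr, sizeof(*ctx->file_table));
    if (!ctx->file_table)
        return -ENOMEM;

    if (simulate_failure) {
        free(ctx->file_table);
        ctx->file_table = NULL;   /* the fix: leave no stale pointer */
        return -ENOMEM;
    }
    return 0;
}

/* Teardown frees the table only if one is still registered. */
int toy_files_unregister(struct toy_files_ctx *ctx)
{
    if (!ctx->file_table)
        return -ENXIO;            /* toy error code for "no table" */
    free(ctx->file_table);
    ctx->file_table = NULL;
    return 0;
}
```

      If the error path skipped the NULL assignment, the later unregister
      call would free the same allocation a second time.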
      
      Reported-by: syzbot+3254bc44113ae1e331ee@syzkaller.appspotmail.com
      Fixes: 65e19f54 ("io_uring: support for larger fixed file sets")
      Reviewed-by: Bob Liu <bob.liu@oracle.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: separate the io_free_req and io_free_req_find_next interface · c69f8dbe
      Jackie Liu authored
      Mirroring the distinction between io_put_req and io_put_req_find_next,
      split io_free_req the same way. No functional changes.
      Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: keep io_put_req only responsible for release and put req · ec9c02ad
      Jackie Liu authored
      We already have io_put_req_find_next to find the next req of the link,
      so we should not use the io_put_req function to find it. The two
      should be functions of the same level.
      Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: remove passed in 'ctx' function parameter ctx if possible · a197f664
      Jackie Liu authored
      Many times, the core of the function is req, and req has already set
      req->ctx at initialization time, so there is no need to pass in the
      ctx from the caller.
      
      Cleanup, no functional change.
      Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: reduce/pack size of io_ring_ctx · 206aefde
      Jens Axboe authored
      With the recent flurry of additions and changes to io_uring, the
      layout of io_ring_ctx has become a bit stale. We're right now at
      704 bytes in size on my x86-64 build, or 11 cachelines. This
      patch does two things:
      
      - We have two completion structs embedded, that we only use for
        quiesce of the ctx (or shutdown) and for sqthread init cases.
        That's 2x32 bytes right there, let's dynamically allocate them.
      
      - Reorder the struct a bit with an eye on cachelines, use cases,
        and holes.
      
      With this patch, we're down to 512 bytes, or 8 cachelines.
      Reviewed-by: Jackie Liu <liuyun01@kylinos.cn>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  5. 07 Nov, 2019 2 commits
    • io_uring: properly mark async work as bounded vs unbounded · 5f8fd2d3
      Jens Axboe authored
      Now that io-wq supports separating the two request lifetime types, mark
      the following IO as having unbounded runtimes:
      
      - Any read/write to a non-regular file
      - Any specific networked IO
      - Any poll command
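
      The three rules above can be written as a small classifier. This is
      a sketch only: the toy_op enum and toy_work_is_unbounded() are
      hypothetical, and the send/recv opcodes are just examples of
      "networked IO".

```c
#include <stdbool.h>

/* Hypothetical opcode set for this sketch. */
enum toy_op {
    TOY_OP_READ,
    TOY_OP_WRITE,
    TOY_OP_SENDMSG,
    TOY_OP_RECVMSG,
    TOY_OP_POLL_ADD
};

/*
 * Decide whether async work should be treated as unbounded.
 * 'is_regular_file' describes the file the request targets.
 */
bool toy_work_is_unbounded(enum toy_op op, bool is_regular_file)
{
    switch (op) {
    case TOY_OP_READ:
    case TOY_OP_WRITE:
        /* Reads/writes are bounded only on regular files. */
        return !is_regular_file;
    case TOY_OP_SENDMSG:
    case TOY_OP_RECVMSG:
    case TOY_OP_POLL_ADD:
        /* Networked IO and poll can wait indefinitely. */
        return true;
    }
    return false;   /* default: bounded */
}
```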
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io-wq: add support for bounded vs unbunded work · c5def4ab
      Jens Axboe authored
      io_uring supports request types that basically have two different
      lifetimes:
      
      1) Bounded completion time. These are requests like disk reads or writes,
         which we know will finish in a finite amount of time.
      2) Unbounded completion time. These are generally networked IO, where we
         have no idea how long they will take to complete. Another example is
         POLL commands.
      
      This patch provides support for io-wq to handle these differently, so we
      don't starve bounded requests by tying up workers for too long. By default
      all work is bounded, unless otherwise specified in the work item.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  6. 09 Nov, 2019 2 commits
    • io-wq: io_wqe_run_queue() doesn't need to use list_empty_careful() · 91d666ea
      Jens Axboe authored
      We hold the wqe lock at this point (which is also annotated), so there's
      no need to use the careful variant of list_empty().
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add support for backlogged CQ ring · 1d7bb1d5
      Jens Axboe authored
      Currently we drop completion events, if the CQ ring is full. That's fine
      for requests with bounded completion times, but it may make it harder or
      impossible to use io_uring with networked IO where request completion
      times are generally unbounded. Or with POLL, for example, which is also
      unbounded.
      
      After this patch, we never overflow the ring, we simply store requests
      in a backlog for later flushing. This flushing is done automatically by
      the kernel. To prevent the backlog from growing indefinitely, if the
      backlog is non-empty, we apply back pressure on IO submissions. Any
      attempt to submit new IO with a non-empty backlog will get an -EBUSY
      return from the kernel. This is a signal to the application that it has
      backlogged CQ events, and that it must reap those before being allowed
      to submit more IO.
      
      Note that if we do return -EBUSY, we will have filled whatever
      backlogged events into the CQ ring first, if there's room. This means
      the application can safely reap events WITHOUT entering the kernel and
      waiting for them, they are already available in the CQ ring.
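
      A toy model of the behaviour described above may help: completions
      that don't fit are parked in a backlog, and submission returns
      -EBUSY while the backlog is non-empty, after first moving whatever
      fits into the ring. Everything here (toy_cq_ctx, the toy_cq_*
      functions, the tiny sizes, the array-based backlog) is an assumption
      for illustration and not the kernel implementation.

```c
#include <errno.h>

#define TOY_CQ_ENTRIES  2   /* tiny ring so overflow is easy to hit */
#define TOY_BACKLOG_MAX 8   /* toy bound on the backlog */

struct toy_cq_ctx {
    int cq[TOY_CQ_ENTRIES];
    unsigned cq_count;          /* events waiting in the CQ ring */
    int backlog[TOY_BACKLOG_MAX];
    unsigned backlog_count;     /* events parked outside the ring */
};

/* Move backlogged events into the ring while there is room. */
static void toy_cq_flush_backlog(struct toy_cq_ctx *ctx)
{
    unsigned moved = 0;

    while (moved < ctx->backlog_count && ctx->cq_count < TOY_CQ_ENTRIES)
        ctx->cq[ctx->cq_count++] = ctx->backlog[moved++];

    /* Shift the entries that did not fit to the front. */
    for (unsigned i = moved; i < ctx->backlog_count; i++)
        ctx->backlog[i - moved] = ctx->backlog[i];
    ctx->backlog_count -= moved;
}

/* Post a completion: park it in the backlog instead of dropping it. */
int toy_cq_post(struct toy_cq_ctx *ctx, int res)
{
    if (ctx->backlog_count == 0 && ctx->cq_count < TOY_CQ_ENTRIES) {
        ctx->cq[ctx->cq_count++] = res;
        return 0;
    }
    if (ctx->backlog_count == TOY_BACKLOG_MAX)
        return -ENOMEM;         /* limit of this toy model only */
    ctx->backlog[ctx->backlog_count++] = res;
    return 0;
}

/* Submission path: refuse new IO while a backlog exists. */
int toy_cq_submit(struct toy_cq_ctx *ctx)
{
    if (ctx->backlog_count) {
        toy_cq_flush_backlog(ctx);   /* fill the ring first, if room */
        return -EBUSY;
    }
    return 0;
}

/* The application reaps every event currently in the ring. */
unsigned toy_cq_reap_all(struct toy_cq_ctx *ctx)
{
    unsigned n = ctx->cq_count;

    ctx->cq_count = 0;
    return n;
}
```

      In this model an application that sees -EBUSY reaps the ring and
      retries; the retry that drains the last of the backlog still
      returns -EBUSY, and the one after it succeeds.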
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  7. 08 Nov, 2019 3 commits
    • io_uring: pass in io_kiocb to fill/add CQ handlers · 78e19bbe
      Jens Axboe authored
      This is in preparation for handling CQ ring overflow a bit smarter. We
      should not have any functional changes in this patch. Most of the
      changes are fairly straightforward, the only ones that stick out a bit
      are the ones that change __io_free_req() to take the reference count
      into account. If the request hasn't been submitted yet, we know it's
      safe to simply ignore references and free it. But let's clean these up
      too, as later patches will depend on the caller doing the right thing if
      the completion logging grabs a reference to the request.
      Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: make io_cqring_events() take 'ctx' as argument · 84f97dc2
      Jens Axboe authored
      The rings can be derived from the ctx, and we need the ctx there for
      a future change.
      
      No functional changes in this patch.
      Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add support for linked SQE timeouts · 2665abfd
      Jens Axboe authored
      While we have support for generic timeouts, we don't have a way to tie
      a timeout to a specific SQE. The generic timeouts simply trigger wakeups
      on the CQ ring.
      
      This adds support for IORING_OP_LINK_TIMEOUT. This command is only valid
      as a link to a previous command. The timeout specified can be either
      relative or absolute, following the same rules as IORING_OP_TIMEOUT. If
      the timeout triggers before the dependent command completes, it will
      attempt to cancel that command. Likewise, if the dependent command
      completes before the timeout triggers, it will cancel the timeout.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  8. 07 Nov, 2019 4 commits
  9. 06 Nov, 2019 4 commits
  10. 05 Nov, 2019 2 commits
  11. 04 Nov, 2019 2 commits
  12. 02 Nov, 2019 1 commit
  13. 01 Nov, 2019 3 commits
    • io_uring: remove io_uring_add_to_prev() trace event · 0069fc6b
      Jens Axboe authored
      This internal logic was killed with the conversion to io-wq, so we no
      longer have a need for this particular trace. Kill it.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: set -EINTR directly when a signal wakes up in io_cqring_wait · e9ffa5c2
      Jackie Liu authored
      We don't use -ERESTARTSYS to tell the application layer to restart the
      system call, but instead return -EINTR. We can set -EINTR directly when
      woken up by a signal, which saves an assignment and a comparison.
      Reviewed-by: Bob Liu <bob.liu@oracle.com>
      Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: support for generic async request cancel · 62755e35
      Jens Axboe authored
      This adds support for IORING_OP_ASYNC_CANCEL, which will attempt to
      cancel requests that have been punted to async context and are now
      in-flight. This works for regular read/write requests to files, as
      long as they haven't been started yet. For socket based IO (or things
      like accept4(2)), we can cancel work that is already running as well.
      
      To cancel a request, the sqe must have ->addr set to the user_data of
      the request it wishes to cancel. If the request is cancelled
      successfully, the original request is completed with -ECANCELED
      and the cancel request is completed with a result of 0. If the
      request was already running, the original may or may not complete
      in error. The cancel request will complete with -EALREADY for that
      case. And finally, if the request to cancel wasn't found, the cancel
      request is completed with -ENOENT.
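
      The result codes above can be summarised as a small lookup. The enum
      and function names below are hypothetical and only model the
      outcomes listed in this message; they are not kernel interfaces.

```c
#include <errno.h>

/* Hypothetical states of the request being cancelled. */
enum toy_target_state {
    TOY_TARGET_NOT_FOUND,   /* no request with that user_data */
    TOY_TARGET_QUEUED,      /* punted to async context, not started */
    TOY_TARGET_RUNNING      /* already executing */
};

/* Result that the cancel request itself completes with. */
int toy_cancel_result(enum toy_target_state state)
{
    switch (state) {
    case TOY_TARGET_QUEUED:
        return 0;           /* target completes with -ECANCELED */
    case TOY_TARGET_RUNNING:
        return -EALREADY;   /* target may or may not complete in error */
    case TOY_TARGET_NOT_FOUND:
    default:
        return -ENOENT;
    }
}

/* Result the target completes with after a successful cancel. */
int toy_cancelled_target_result(void)
{
    return -ECANCELED;
}
```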
      Signed-off-by: Jens Axboe <axboe@kernel.dk>