1. 18 May, 2022 4 commits
    • io_uring: add support for ring mapped supplied buffers · c7fb1942
      Jens Axboe authored
      Provided buffers allow an application to supply io_uring with buffers
      that can then be grabbed for a read/receive request when the data
      source is ready to deliver data. The existing scheme relies on using
      IORING_OP_PROVIDE_BUFFERS to do that, but it can be difficult to use
      in real world applications. It's pretty efficient if the application
      is able to supply back batches of provided buffers when they have been
      consumed and the application is ready to recycle them, but if
      fragmentation occurs in the buffer space, it can become difficult to
      supply enough buffers at a time. This hurts efficiency.
      
      Add a register op, IORING_REGISTER_PBUF_RING, which allows an
      application to set up a shared queue for each buffer group of
      provided buffers. The application can then supply buffers simply by
      adding them to this ring, and the kernel can consume them just as
      easily. The ring shares the head with the application, while the
      tail remains private to the kernel.
      
      Provided buffers set up with IORING_REGISTER_PBUF_RING cannot use
      IORING_OP_{PROVIDE,REMOVE}_BUFFERS for adding entries to or removing
      entries from the ring; they must use the mapped ring. Mapped provided
      buffer rings can co-exist with normal provided buffers, just not
      within the same group ID.
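
      As a rough illustration of the userspace side, here is a minimal,
      hedged sketch of registering such a ring for one buffer group. It
      assumes liburing's io_uring_register_buf_ring() wrapper for the new
      IORING_REGISTER_PBUF_RING opcode and the struct io_uring_buf_reg /
      struct io_uring_buf layout added here; the wrapper is the userspace
      counterpart, not part of this kernel patch, and error handling is
      trimmed.

      #include <liburing.h>
      #include <sys/mman.h>

      #define BUF_RING_ENTRIES 128    /* must be a power of two */

      /* map memory for the shared ring and register it for group 'bgid' */
      static struct io_uring_buf_ring *setup_buf_ring(struct io_uring *ring,
                                                      unsigned short bgid)
      {
              struct io_uring_buf_reg reg = { };
              struct io_uring_buf_ring *br;
              size_t ring_size = BUF_RING_ENTRIES * sizeof(struct io_uring_buf);

              br = mmap(NULL, ring_size, PROT_READ | PROT_WRITE,
                        MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
              if (br == MAP_FAILED)
                      return NULL;

              reg.ring_addr = (unsigned long) br;
              reg.ring_entries = BUF_RING_ENTRIES;
              reg.bgid = bgid;

              /* hands the ring to the kernel via IORING_REGISTER_PBUF_RING */
              if (io_uring_register_buf_ring(ring, &reg, 0) < 0) {
                      munmap(br, ring_size);
                      return NULL;
              }
              return br;
      }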
      
      To gauge the overhead of the existing scheme and evaluate the mapped
      ring approach, a simple NOP benchmark was written. It uses a ring of
      128 entries, and submits/completes 32 at a time. 'Replenish' is how
      many buffers are provided back at a time after they have been
      consumed:
      
      Test			Replenish			NOPs/sec
      ================================================================
      No provided buffers	NA				~30M
      Provided buffers	32				~16M
      Provided buffers	 1				~10M
      Ring buffers		32				~27M
      Ring buffers		 1				~27M
      
      The ring mapped buffers perform almost as well as not using provided
      buffers at all, and they don't care whether you provide 1 or more
      back at the same time. This means applications can just replenish as
      they go, rather than needing to batch and compact, further reducing
      overhead in the application. The NOP benchmark above doesn't need to
      do any compaction, so that overhead isn't even reflected in the
      above test.
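
      To make the "replenish as they go" point concrete, here is a hedged
      sketch of handing a single consumed buffer straight back, assuming
      liburing's helpers for the mapped ring (io_uring_buf_ring_add(),
      io_uring_buf_ring_advance() and io_uring_buf_ring_mask() are the
      userspace counterparts, again not part of this kernel patch):

      #include <liburing.h>

      /* recycle one buffer into the mapped ring as soon as it is consumed */
      static void recycle_buffer(struct io_uring_buf_ring *br, void *addr,
                                 unsigned int len, unsigned short bid)
      {
              int mask = io_uring_buf_ring_mask(128);  /* 128-entry ring */

              io_uring_buf_ring_add(br, addr, len, bid, mask, 0);
              io_uring_buf_ring_advance(br, 1);        /* publish just one */
      }
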
      Co-developed-by: Dylan Yudaken <dylany@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add io_pin_pages() helper · d8c2237d
      Jens Axboe authored
      Abstract this out from io_sqe_buffer_register() so we can use it
      elsewhere too without duplicating this code.
      
      No intended functional changes in this patch.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: add buffer selection support to IORING_OP_NOP · 3d200242
      Jens Axboe authored
      Obviously not really useful since it doesn't transfer any data, but
      it is helpful for benchmarking the overhead of provided buffers.
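
      For illustration, a hedged liburing-style sketch of the kind of
      benchmark request this enables: a NOP SQE that asks for a provided
      buffer via IOSQE_BUFFER_SELECT and a buffer group ID (the liburing
      calls here are assumptions about the userspace side, not part of
      this kernel patch):

      #include <errno.h>
      #include <liburing.h>

      static int submit_nop_with_buffer(struct io_uring *ring, unsigned short bgid)
      {
              struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

              if (!sqe)
                      return -EBUSY;

              io_uring_prep_nop(sqe);
              sqe->flags |= IOSQE_BUFFER_SELECT;  /* select a provided buffer */
              sqe->buf_group = bgid;              /* from this buffer group */

              return io_uring_submit(ring);       /* submitted count or -errno */
      }
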
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: fix locking state for empty buffer group · e7637a49
      Jens Axboe authored
      io_provided_buffer_select() must drop the submit lock, if needed, even
      in the error handling case. Failure to do so will leave us with the
      ctx->uring_lock held, causing spew like:
      
      ====================================
      WARNING: iou-wrk-366/368 still has locks held!
      5.18.0-rc6-00294-gdf8dc7004331 #994 Not tainted
      ------------------------------------
      1 lock held by iou-wrk-366/368:
       #0: ffff0000c72598a8 (&ctx->uring_lock){+.+.}-{3:3}, at: io_ring_submit_lock+0x20/0x48
      
      stack backtrace:
      CPU: 4 PID: 368 Comm: iou-wrk-366 Not tainted 5.18.0-rc6-00294-gdf8dc7004331 #994
      Hardware name: linux,dummy-virt (DT)
      Call trace:
       dump_backtrace.part.0+0xa4/0xd4
       show_stack+0x14/0x5c
       dump_stack_lvl+0x88/0xb0
       dump_stack+0x14/0x2c
       debug_check_no_locks_held+0x84/0x90
       try_to_freeze.isra.0+0x18/0x44
       get_signal+0x94/0x6ec
       io_wqe_worker+0x1d8/0x2b4
       ret_from_fork+0x10/0x20
      
      and triggering later hangs off get_signal() because we attempt to
      re-grab the lock.
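
      The shape of the fix, as an illustrative sketch only (hypothetical
      helper names, not the actual kernel diff): the conditional unlock has
      to run on the empty-list error path too, not just on success.

      /* io_ring_submit_lock/unlock() stand in for the real conditional
       * locking helpers; pick_buffer() is hypothetical. */
      static void *select_buffer_locked(struct io_ring_ctx *ctx,
                                        struct list_head *buf_list,
                                        unsigned int issue_flags)
      {
              void *buf;

              io_ring_submit_lock(ctx, issue_flags);

              if (list_empty(buf_list)) {
                      /* the bug: this early return used to skip the unlock */
                      io_ring_submit_unlock(ctx, issue_flags);
                      return NULL;
              }

              buf = pick_buffer(buf_list);
              io_ring_submit_unlock(ctx, issue_flags);
              return buf;
      }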
      
      Reported-by: syzbot+987d7bb19195ae45208c@syzkaller.appspotmail.com
      Fixes: 149c69b0 ("io_uring: abstract out provided buffer list selection")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  2. 14 May, 2022 4 commits
  3. 13 May, 2022 8 commits
  4. 09 May, 2022 11 commits
  5. 05 May, 2022 4 commits
  6. 30 Apr, 2022 7 commits
  7. 25 Apr, 2022 2 commits
    • io_uring: fix compile warning for 32-bit builds · 69cc1b6f
      Jens Axboe authored
      If IO_URING_SCM_ALL isn't set, as it would not be on 32-bit builds,
      then we trigger a warning:
      
      fs/io_uring.c: In function '__io_sqe_files_unregister':
      fs/io_uring.c:8992:13: warning: unused variable 'i' [-Wunused-variable]
       8992 |         int i;
            |             ^
      
      Move the ifdef up to include the 'i' variable declaration.
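
      The pattern, as a simplified, self-contained illustration (the config
      symbol and function names here are made up, not the kernel's): with
      the declaration inside the same conditional block as its only user,
      builds that compile the block out never see an unused variable.

      #include <stdio.h>

      static void walk_entries(void)
      {
      #ifdef CONFIG_EXAMPLE_FEATURE
              int i;                          /* declared only where it is used */

              for (i = 0; i < 4; i++)
                      printf("entry %d\n", i);
      #endif
      }

      int main(void)
      {
              walk_entries();                 /* no-op when the ifdef is off */
              return 0;
      }
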
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Fixes: 5e45690a ("io_uring: store SCM state in io_fixed_file->file_ptr")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: return an error when cqe is dropped · 155bc950
      Dylan Yudaken authored
      Right now io_uring will not actively inform userspace if a CQE is
      dropped. This is extremely rare, requiring both a CQ ring overflow
      and a GFP_ATOMIC kmalloc failure. However the consequences could be
      serious: for example, an application might end up in an undefined
      state, possibly waiting for a CQE that never arrives.
      
      Return an error code (EBADR) in these cases. Since this is expected
      to be incredibly rare, avoid affecting the hot code paths as much as
      possible; the error is only returned lazily, and only when there are
      no other CQEs available.
      
      Once the error is returned, reset the error condition assuming the user is
      either ok with it or will clean up appropriately.
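
      A hedged sketch of how an application might react, assuming the new
      -EBADR return propagates out of liburing's io_uring_wait_cqe() (the
      resync strategy itself is entirely application specific):

      #include <errno.h>
      #include <stdio.h>
      #include <liburing.h>

      static int wait_one(struct io_uring *ring)
      {
              struct io_uring_cqe *cqe;
              int ret = io_uring_wait_cqe(ring, &cqe);

              if (ret == -EBADR) {
                      /* at least one CQE was dropped; pending completions
                       * may never arrive, so resynchronize in-flight state */
                      fprintf(stderr, "CQE dropped, resyncing\n");
                      return ret;
              }
              if (ret < 0)
                      return ret;

              /* normal completion handling would go here */
              io_uring_cqe_seen(ring, cqe);
              return 0;
      }
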
      Signed-off-by: Dylan Yudaken <dylany@fb.com>
      Link: https://lore.kernel.org/r/20220421091345.2115755-6-dylany@fb.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>