    io_uring: free io_buffer_list entries via RCU · 5cf4f52e
    Jens Axboe authored
    mmap_lock nests under uring_lock out of necessity, as we may be doing
    user copies with uring_lock held. However, for mmap of provided buffer
    rings, we attempt to grab uring_lock with mmap_lock already held from
    do_mmap(). This makes lockdep, rightfully, complain:
    
    WARNING: possible circular locking dependency detected
    6.7.0-rc1-00009-gff3337ebaf94-dirty #4438 Not tainted
    ------------------------------------------------------
    buf-ring.t/442 is trying to acquire lock:
    ffff00020e1480a8 (&ctx->uring_lock){+.+.}-{3:3}, at: io_uring_validate_mmap_request.isra.0+0x4c/0x140
    
    but task is already holding lock:
    ffff0000dc226190 (&mm->mmap_lock){++++}-{3:3}, at: vm_mmap_pgoff+0x124/0x264
    
    which lock already depends on the new lock.
    
    the existing dependency chain (in reverse order) is:
    
    -> #1 (&mm->mmap_lock){++++}-{3:3}:
           __might_fault+0x90/0xbc
           io_register_pbuf_ring+0x94/0x488
           __arm64_sys_io_uring_register+0x8dc/0x1318
           invoke_syscall+0x5c/0x17c
           el0_svc_common.constprop.0+0x108/0x130
           do_el0_svc+0x2c/0x38
           el0_svc+0x4c/0x94
           el0t_64_sync_handler+0x118/0x124
           el0t_64_sync+0x168/0x16c
    
    -> #0 (&ctx->uring_lock){+.+.}-{3:3}:
           __lock_acquire+0x19a0/0x2d14
           lock_acquire+0x2e0/0x44c
           __mutex_lock+0x118/0x564
           mutex_lock_nested+0x20/0x28
           io_uring_validate_mmap_request.isra.0+0x4c/0x140
           io_uring_mmu_get_unmapped_area+0x3c/0x98
           get_unmapped_area+0xa4/0x158
           do_mmap+0xec/0x5b4
           vm_mmap_pgoff+0x158/0x264
           ksys_mmap_pgoff+0x1d4/0x254
           __arm64_sys_mmap+0x80/0x9c
           invoke_syscall+0x5c/0x17c
           el0_svc_common.constprop.0+0x108/0x130
           do_el0_svc+0x2c/0x38
           el0_svc+0x4c/0x94
           el0t_64_sync_handler+0x118/0x124
           el0t_64_sync+0x168/0x16c
    
    From that mmap(2) path, we really just need to ensure that the buffer
    list doesn't go away from underneath us. For the lower indexed entries,
    they never go away until the ring is freed and we can always sanely
    reference those as long as the caller has a file reference. For the
    higher indexed ones in our xarray, we just need to ensure that the
    buffer list remains valid while we return the address of it.
    
    Free the higher indexed io_buffer_list entries via RCU. With that we can
    avoid needing ->uring_lock inside mmap(2), and simply hold the RCU read
    lock around the buffer list lookup and address check.
    
    To ensure that an RCU lookup in the array only ever returns a valid,
    fully initialized entry, add an 'is_ready' flag that we access with
    acquire and release memory ordering. This isn't needed for the xarray
    lookups, but doesn't hurt either. Since this isn't a fast path, retain
    it across both types. Similarly, for the allocated array inside the ctx,
    ensure we use the proper load/acquire as setup could in theory be
    running in parallel with mmap.
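    
    The flag handshake described above can be sketched in userspace C11
    as follows. This is only an illustration, not the code from this
    commit: struct buf_list, buf_list_publish() and buf_list_get_address()
    are hypothetical names, and C11 atomics stand in for the kernel's
    smp_store_release()/smp_load_acquire().

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a buffer list entry. */
struct buf_list {
    void *buf_ring;       /* address handed back to the mmap-style reader */
    int is_mmap;          /* nonzero if the ring may be mapped */
    atomic_int is_ready;  /* written last, with release ordering */
};

/* Setup side: fill in every field, then publish with a release store. */
static void buf_list_publish(struct buf_list *bl, void *ring)
{
    bl->buf_ring = ring;
    bl->is_mmap = 1;
    /* Orders the plain stores above before the flag becomes visible. */
    atomic_store_explicit(&bl->is_ready, 1, memory_order_release);
}

/* Lookup side: acquire-load the flag before touching any other field. */
static void *buf_list_get_address(struct buf_list *bl)
{
    if (bl == NULL)
        return NULL;
    /* Pairs with the release store in buf_list_publish(). */
    if (!atomic_load_explicit(&bl->is_ready, memory_order_acquire))
        return NULL;
    if (!bl->is_mmap)
        return NULL;
    return bl->buf_ring;
}
```

    A reader that sees is_ready set is guaranteed to also see the fields
    written before the release store. A reader that races with setup gets
    NULL and can fail the request instead of using a half-built entry.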
    
    While in there, add a few lockdep checks for documentation purposes.
    
    Cc: stable@vger.kernel.org
    Fixes: c56e022c ("io_uring: add support for user mapped provided buffer ring")
    Signed-off-by: Jens Axboe <axboe@kernel.dk>