• Dave Wysochanski's avatar
    NFS: Fix nfs_netfs_issue_read() xarray locking for writeback interrupt · fd5860ab
    Dave Wysochanski authored
    The loop inside nfs_netfs_issue_read() currently does not disable
    interrupts while iterating through pages in the xarray to submit
    for NFS read.  This is not safe though since after taking xa_lock,
    another page in the mapping could be processed for writeback inside
    an interrupt, and deadlock can occur.  The fix is simple and clean
    if we use xa_for_each_range(), which handles the iteration with RCU
    while reducing code complexity.
    
    The problem is easily reproduced with the following test:
     mount -o vers=3,fsc 127.0.0.1:/export /mnt/nfs
     dd if=/dev/zero of=/mnt/nfs/file1.bin bs=4096 count=1
     echo 3 > /proc/sys/vm/drop_caches
     dd if=/mnt/nfs/file1.bin of=/dev/null
     umount /mnt/nfs
    
    On the console with a lockdep-enabled kernel a message similar to
    the following will be seen:
    
     ================================
     WARNING: inconsistent lock state
     6.7.0-lockdbg+ #10 Not tainted
     --------------------------------
     inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
     test5/1708 [HC0[0]:SC0[0]:HE1:SE1] takes:
     ffff888127baa598 (&xa->xa_lock#4){+.?.}-{3:3}, at:
    nfs_netfs_issue_read+0x1b2/0x4b0 [nfs]
     {IN-SOFTIRQ-W} state was registered at:
       lock_acquire+0x144/0x380
       _raw_spin_lock_irqsave+0x4e/0xa0
       __folio_end_writeback+0x17e/0x5c0
       folio_end_writeback+0x93/0x1b0
       iomap_finish_ioend+0xeb/0x6a0
       blk_update_request+0x204/0x7f0
       blk_mq_end_request+0x30/0x1c0
       blk_complete_reqs+0x7e/0xa0
       __do_softirq+0x113/0x544
       __irq_exit_rcu+0xfe/0x120
       irq_exit_rcu+0xe/0x20
       sysvec_call_function_single+0x6f/0x90
       asm_sysvec_call_function_single+0x1a/0x20
       pv_native_safe_halt+0xf/0x20
       default_idle+0x9/0x20
       default_idle_call+0x67/0xa0
       do_idle+0x2b5/0x300
       cpu_startup_entry+0x34/0x40
       start_secondary+0x19d/0x1c0
       secondary_startup_64_no_verify+0x18f/0x19b
     irq event stamp: 176891
     hardirqs last  enabled at (176891): [<ffffffffa67a0be4>]
    _raw_spin_unlock_irqrestore+0x44/0x60
     hardirqs last disabled at (176890): [<ffffffffa67a0899>]
    _raw_spin_lock_irqsave+0x79/0xa0
     softirqs last  enabled at (176646): [<ffffffffa515d91e>]
    __irq_exit_rcu+0xfe/0x120
     softirqs last disabled at (176633): [<ffffffffa515d91e>]
    __irq_exit_rcu+0xfe/0x120
    
     other info that might help us debug this:
      Possible unsafe locking scenario:
    
            CPU0
            ----
       lock(&xa->xa_lock#4);
       <Interrupt>
         lock(&xa->xa_lock#4);
    
      *** DEADLOCK ***
    
     2 locks held by test5/1708:
      #0: ffff888127baa498 (&sb->s_type->i_mutex_key#22){++++}-{4:4}, at:
          nfs_start_io_read+0x28/0x90 [nfs]
      #1: ffff888127baa650 (mapping.invalidate_lock#3){.+.+}-{4:4}, at:
          page_cache_ra_unbounded+0xa4/0x280
    
     stack backtrace:
     CPU: 6 PID: 1708 Comm: test5 Kdump: loaded Not tainted 6.7.0-lockdbg+
     Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-1.fc39
    04/01/2014
     Call Trace:
      dump_stack_lvl+0x5b/0x90
      mark_lock+0xb3f/0xd20
      __lock_acquire+0x77b/0x3360
      _raw_spin_lock+0x34/0x80
      nfs_netfs_issue_read+0x1b2/0x4b0 [nfs]
      netfs_begin_read+0x77f/0x980 [netfs]
      nfs_netfs_readahead+0x45/0x60 [nfs]
      nfs_readahead+0x323/0x5a0 [nfs]
      read_pages+0xf3/0x5c0
      page_cache_ra_unbounded+0x1c8/0x280
      filemap_get_pages+0x38c/0xae0
      filemap_read+0x206/0x5e0
      nfs_file_read+0xb7/0x140 [nfs]
      vfs_read+0x2a9/0x460
      ksys_read+0xb7/0x140
    
    Fixes: 000dbe0b ("NFS: Convert buffered read paths to use netfs when fscache is enabled")
    Suggested-by: default avatarJeff Layton <jlayton@redhat.com>
    Signed-off-by: default avatarDave Wysochanski <dwysocha@redhat.com>
    Reviewed-by: default avatarJeff Layton <jlayton@kernel.org>
    Reviewed-by: default avatarDavid Howells <dhowells@redhat.com>
    Signed-off-by: default avatarTrond Myklebust <trond.myklebust@hammerspace.com>
    fd5860ab
fscache.c 10 KB