    bcachefs: Don't call lock_graph_descend() with wait lock held · 6547ebab
    Kent Overstreet authored
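    This fixes a deadlock reported by lockdep: print_cycle() allocates
    memory, and so may enter fs_reclaim, while bch2_check_for_deadlock()
    holds a six lock's wait_lock, but the key cache shrinker takes
    wait_lock from within reclaim: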
    
    01305 WARNING: possible circular locking dependency detected
    01305 6.3.0-ktest-gf4de9bee61af #5305 Tainted: G        W
    01305 ------------------------------------------------------
    01305 cat/14658 is trying to acquire lock:
    01305 ffffffc00982f460 (fs_reclaim){+.+.}-{0:0}, at: __kmem_cache_alloc_node+0x48/0x278
    01305
    01305 but task is already holding lock:
    01305 ffffff8011aaf040 (&lock->wait_lock){+.+.}-{2:2}, at: bch2_check_for_deadlock+0x4b8/0xa58
    01305
    01305 which lock already depends on the new lock.
    01305
    01305
    01305 the existing dependency chain (in reverse order) is:
    01305
    01305 -> #2 (&lock->wait_lock){+.+.}-{2:2}:
    01305        _raw_spin_lock+0x54/0x70
    01305        __six_lock_wakeup+0x40/0x1b0
    01305        six_unlock_ip+0xe8/0x248
    01305        bch2_btree_key_cache_scan+0x720/0x940
    01305        shrink_slab.constprop.0+0x284/0x770
    01305        shrink_node+0x390/0x828
    01305        balance_pgdat+0x390/0x6d0
    01305        kswapd+0x2e4/0x718
    01305        kthread+0x184/0x1a8
    01305        ret_from_fork+0x10/0x20
    01305
    01305 -> #1 (&c->lock#2){+.+.}-{3:3}:
    01305        __mutex_lock+0x104/0x14a0
    01305        mutex_lock_nested+0x30/0x40
    01305        bch2_btree_key_cache_scan+0x5c/0x940
    01305        shrink_slab.constprop.0+0x284/0x770
    01305        shrink_node+0x390/0x828
    01305        balance_pgdat+0x390/0x6d0
    01305        kswapd+0x2e4/0x718
    01305        kthread+0x184/0x1a8
    01305        ret_from_fork+0x10/0x20
    01305
    01305 -> #0 (fs_reclaim){+.+.}-{0:0}:
    01305        __lock_acquire+0x19d0/0x2930
    01305        lock_acquire+0x1dc/0x458
    01305        fs_reclaim_acquire+0x9c/0xe0
    01305        __kmem_cache_alloc_node+0x48/0x278
    01305        __kmalloc_node_track_caller+0x5c/0x278
    01305        krealloc+0x94/0x180
    01305        bch2_printbuf_make_room.part.0+0xac/0x118
    01305        bch2_prt_printf+0x150/0x1e8
    01305        bch2_btree_bkey_cached_common_to_text+0x170/0x298
    01305        bch2_btree_trans_to_text+0x244/0x348
    01305        print_cycle+0x7c/0xb0
    01305        break_cycle+0x254/0x528
    01305        bch2_check_for_deadlock+0x59c/0xa58
    01305        bch2_btree_deadlock_read+0x174/0x200
    01305        full_proxy_read+0x94/0xf0
    01305        vfs_read+0x15c/0x3a8
    01305        ksys_read+0xb8/0x148
    01305        __arm64_sys_read+0x48/0x60
    01305        invoke_syscall.constprop.0+0x64/0x138
    01305        do_el0_svc+0x84/0x138
    01305        el0_svc+0x34/0x80
    01305        el0t_64_sync_handler+0xb0/0xb8
    01305        el0t_64_sync+0x14c/0x150
    01305
    01305 other info that might help us debug this:
    01305
    01305 Chain exists of:
    01305   fs_reclaim --> &c->lock#2 --> &lock->wait_lock
    01305
    01305  Possible unsafe locking scenario:
    01305
    01305        CPU0                    CPU1
    01305        ----                    ----
    01305   lock(&lock->wait_lock);
    01305                                lock(&c->lock#2);
    01305                                lock(&lock->wait_lock);
    01305   lock(fs_reclaim);
    01305
    01305  *** DEADLOCK ***
    Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
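
The general pattern behind a fix like this can be sketched in userspace C. This is only an illustration under stated assumptions, not the bcachefs change itself: `struct demo_lock`, `describe_waiters()` and `report_waiters()` are hypothetical names, and a pthread mutex stands in for the kernel's `wait_lock` spinlock. The idea is to copy what is needed while the lock is held, drop the lock, and only then call code that may allocate.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a lock whose waiter list is guarded by wait_lock. */
struct demo_lock {
	pthread_mutex_t wait_lock;
	int waiters[8];
	size_t nr_waiters;
};

/*
 * Formats waiter ids as a comma-separated string.
 * This allocates, so it must not be called with wait_lock held.
 */
static char *describe_waiters(const int *ids, size_t nr)
{
	size_t cap = nr * 16 + 1;	/* 16 bytes per id is ample for "%d" plus a comma */
	char *buf = malloc(cap);
	size_t off = 0;

	if (!buf)
		return NULL;
	buf[0] = '\0';
	for (size_t i = 0; i < nr; i++)
		off += (size_t) snprintf(buf + off, cap - off, "%s%d",
					 i ? "," : "", ids[i]);
	return buf;
}

/* Snapshot under the lock, unlock, then do the allocating work. */
static char *report_waiters(struct demo_lock *l)
{
	int snapshot[8];
	size_t nr;

	pthread_mutex_lock(&l->wait_lock);
	nr = l->nr_waiters;
	memcpy(snapshot, l->waiters, nr * sizeof(snapshot[0]));
	pthread_mutex_unlock(&l->wait_lock);

	/* wait_lock is no longer held here, so allocating is safe. */
	return describe_waiters(snapshot, nr);
}
```

The caller owns the returned string and must `free()` it. The snapshot lives in a fixed-size stack buffer so that nothing is allocated inside the critical section.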
btree_locking.c 19 KB