• Josef Bacik's avatar
    btrfs: use btrfs_get_dev_args_from_path in dev removal ioctls · 1a15eb72
    Josef Bacik authored
    For device removal and replace we call btrfs_find_device_by_devspec,
    which if we give it a device path and nothing else will call
    btrfs_get_dev_args_from_path, which opens the block device and reads the
    super block and then looks up our device based on that.
    
    However at this point we're holding the sb write "lock", so reading the
    block device pulls in the dependency of ->open_mutex, which produces the
    following lockdep splat
    
    ======================================================
    WARNING: possible circular locking dependency detected
    5.14.0-rc2+ #405 Not tainted
    ------------------------------------------------------
    losetup/11576 is trying to acquire lock:
    ffff9bbe8cded938 ((wq_completion)loop0){+.+.}-{0:0}, at: flush_workqueue+0x67/0x5e0
    
    but task is already holding lock:
    ffff9bbe88e4fc68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]
    
    which lock already depends on the new lock.
    
    the existing dependency chain (in reverse order) is:
    
    -> #4 (&lo->lo_mutex){+.+.}-{3:3}:
           __mutex_lock+0x7d/0x750
           lo_open+0x28/0x60 [loop]
           blkdev_get_whole+0x25/0xf0
           blkdev_get_by_dev.part.0+0x168/0x3c0
           blkdev_open+0xd2/0xe0
           do_dentry_open+0x161/0x390
           path_openat+0x3cc/0xa20
           do_filp_open+0x96/0x120
           do_sys_openat2+0x7b/0x130
           __x64_sys_openat+0x46/0x70
           do_syscall_64+0x38/0x90
           entry_SYSCALL_64_after_hwframe+0x44/0xae
    
    -> #3 (&disk->open_mutex){+.+.}-{3:3}:
           __mutex_lock+0x7d/0x750
           blkdev_get_by_dev.part.0+0x56/0x3c0
           blkdev_get_by_path+0x98/0xa0
           btrfs_get_bdev_and_sb+0x1b/0xb0
           btrfs_find_device_by_devspec+0x12b/0x1c0
           btrfs_rm_device+0x127/0x610
           btrfs_ioctl+0x2a31/0x2e70
           __x64_sys_ioctl+0x80/0xb0
           do_syscall_64+0x38/0x90
           entry_SYSCALL_64_after_hwframe+0x44/0xae
    
    -> #2 (sb_writers#12){.+.+}-{0:0}:
           lo_write_bvec+0xc2/0x240 [loop]
           loop_process_work+0x238/0xd00 [loop]
           process_one_work+0x26b/0x560
           worker_thread+0x55/0x3c0
           kthread+0x140/0x160
           ret_from_fork+0x1f/0x30
    
    -> #1 ((work_completion)(&lo->rootcg_work)){+.+.}-{0:0}:
           process_one_work+0x245/0x560
           worker_thread+0x55/0x3c0
           kthread+0x140/0x160
           ret_from_fork+0x1f/0x30
    
    -> #0 ((wq_completion)loop0){+.+.}-{0:0}:
           __lock_acquire+0x10ea/0x1d90
           lock_acquire+0xb5/0x2b0
           flush_workqueue+0x91/0x5e0
           drain_workqueue+0xa0/0x110
           destroy_workqueue+0x36/0x250
           __loop_clr_fd+0x9a/0x660 [loop]
           block_ioctl+0x3f/0x50
           __x64_sys_ioctl+0x80/0xb0
           do_syscall_64+0x38/0x90
           entry_SYSCALL_64_after_hwframe+0x44/0xae
    
    other info that might help us debug this:
    
    Chain exists of:
      (wq_completion)loop0 --> &disk->open_mutex --> &lo->lo_mutex
    
     Possible unsafe locking scenario:
    
           CPU0                    CPU1
           ----                    ----
      lock(&lo->lo_mutex);
                                   lock(&disk->open_mutex);
                                   lock(&lo->lo_mutex);
      lock((wq_completion)loop0);
    
     *** DEADLOCK ***
    
    1 lock held by losetup/11576:
     #0: ffff9bbe88e4fc68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]
    
    stack backtrace:
    CPU: 0 PID: 11576 Comm: losetup Not tainted 5.14.0-rc2+ #405
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
    Call Trace:
     dump_stack_lvl+0x57/0x72
     check_noncircular+0xcf/0xf0
     ? stack_trace_save+0x3b/0x50
     __lock_acquire+0x10ea/0x1d90
     lock_acquire+0xb5/0x2b0
     ? flush_workqueue+0x67/0x5e0
     ? lockdep_init_map_type+0x47/0x220
     flush_workqueue+0x91/0x5e0
     ? flush_workqueue+0x67/0x5e0
     ? verify_cpu+0xf0/0x100
     drain_workqueue+0xa0/0x110
     destroy_workqueue+0x36/0x250
     __loop_clr_fd+0x9a/0x660 [loop]
     ? blkdev_ioctl+0x8d/0x2a0
     block_ioctl+0x3f/0x50
     __x64_sys_ioctl+0x80/0xb0
     do_syscall_64+0x38/0x90
     entry_SYSCALL_64_after_hwframe+0x44/0xae
    RIP: 0033:0x7f31b02404cb
    
    Instead what we want to do is populate our device lookup args before we
    grab any locks, and then pass these args into btrfs_rm_device().  From
    there we can find the device and do the appropriate removal.
    Suggested-by: default avatarAnand Jain <anand.jain@oracle.com>
    Reviewed-by: default avatarAnand Jain <anand.jain@oracle.com>
    Signed-off-by: default avatarJosef Bacik <josef@toxicpanda.com>
    Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
    1a15eb72
volumes.h 18.1 KB