    btrfs: fix race between quota disable and relocation · 8a4a0b2a
    Filipe Manana authored
    If we disable quotas while we have a relocation of a metadata block group
    that has extents belonging to the quota root, we can cause the relocation
    to fail with -ENOENT. This is because relocation builds backref nodes for
    extents of the quota root and later needs to walk the backrefs and access
    the quota root. However, if a task disables quotas in between, the quota
    root is deleted from the root tree (with btrfs_del_root(), called from
    btrfs_quota_disable()).
    
    This can be sporadically triggered by test case btrfs/255 from fstests:
    
      $ ./check btrfs/255
      FSTYP         -- btrfs
      PLATFORM      -- Linux/x86_64 debian0 6.4.0-rc6-btrfs-next-134+ #1 SMP PREEMPT_DYNAMIC Thu Jun 15 11:59:28 WEST 2023
      MKFS_OPTIONS  -- /dev/sdc
      MOUNT_OPTIONS -- /dev/sdc /home/fdmanana/btrfs-tests/scratch_1
    
      btrfs/255 6s ... _check_dmesg: something found in dmesg (see /home/fdmanana/git/hub/xfstests/results//btrfs/255.dmesg)
      - output mismatch (see /home/fdmanana/git/hub/xfstests/results//btrfs/255.out.bad)
          --- tests/btrfs/255.out	2023-03-02 21:47:53.876609426 +0000
          +++ /home/fdmanana/git/hub/xfstests/results//btrfs/255.out.bad	2023-06-16 10:20:39.267563212 +0100
          @@ -1,2 +1,4 @@
           QA output created by 255
          +ERROR: error during balancing '/home/fdmanana/btrfs-tests/scratch_1': No such file or directory
          +There may be more info in syslog - try dmesg | tail
           Silence is golden
          ...
          (Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/btrfs/255.out /home/fdmanana/git/hub/xfstests/results//btrfs/255.out.bad'  to see the entire diff)
      Ran: btrfs/255
      Failures: btrfs/255
      Failed 1 of 1 tests
    
    To fix this, make the quota disable operation take the cleaner mutex, as
    relocation of a block group also takes this mutex. This is also what we
    do when deleting a subvolume/snapshot: we take the cleaner mutex in the
    cleaner kthread (at cleaner_kthread()) and then call btrfs_del_root()
    at btrfs_drop_snapshot() while under the protection of the cleaner mutex.
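
    The locking rule described above can be illustrated with a small userspace
    model. This is only a sketch using pthreads, not the kernel change itself:
    struct fs_model, relocate_block_group(), quota_disable() and run_model()
    are hypothetical names made up for the illustration, and a boolean stands
    in for the quota root item in the root tree. Because both paths take the
    same mutex, the root cannot disappear between building the backref nodes
    and walking them.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the filesystem state; not kernel code. */
struct fs_model {
	pthread_mutex_t cleaner_mutex;
	bool quota_root_present;	/* models the quota root in the root tree */
};

/* Models relocation of one block group, done entirely under the mutex. */
static int relocate_block_group(struct fs_model *fs)
{
	int ret = 0;

	pthread_mutex_lock(&fs->cleaner_mutex);
	/* Step 1: build backref nodes, noting whether the quota root is used. */
	bool saw_quota_root = fs->quota_root_present;
	sched_yield();	/* widen the window in which a racing disable could run */
	/* Step 2: walk the backrefs; the quota root must still be there. */
	if (saw_quota_root && !fs->quota_root_present)
		ret = -ENOENT;
	pthread_mutex_unlock(&fs->cleaner_mutex);
	return ret;
}

/* Models quota disable: the root is deleted only while holding the mutex. */
static void quota_disable(struct fs_model *fs)
{
	pthread_mutex_lock(&fs->cleaner_mutex);
	fs->quota_root_present = false;
	pthread_mutex_unlock(&fs->cleaner_mutex);
}

/* Repeats the relocation model until it fails or 1000 rounds have passed. */
static void *relocation_thread(void *arg)
{
	struct fs_model *fs = arg;
	intptr_t ret = 0;

	for (int i = 0; i < 1000 && ret == 0; i++)
		ret = relocate_block_group(fs);
	return (void *)ret;
}

/* Runs relocation concurrently with a quota disable; returns 0 or -ENOENT. */
static int run_model(void)
{
	struct fs_model fs = { .quota_root_present = true };
	pthread_t reloc;
	void *res = NULL;
	int err;

	pthread_mutex_init(&fs.cleaner_mutex, NULL);
	err = pthread_create(&reloc, NULL, relocation_thread, &fs);
	assert(err == 0);
	quota_disable(&fs);
	err = pthread_join(reloc, &res);
	assert(err == 0);
	pthread_mutex_destroy(&fs.cleaner_mutex);
	return (int)(intptr_t)res;
}
```

    In this model, dropping the lock/unlock pair from quota_disable() would
    let the flag change between the two steps of relocate_block_group(),
    which corresponds to the sporadic -ENOENT failure described earlier.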
    
    Fixes: bed92eae ("Btrfs: qgroup implementation and prototypes")
    CC: stable@vger.kernel.org # 5.4+
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>