    Btrfs: fix infinite loop during fsync after rename operations · b5e4ff9d
    Filipe Manana authored
    Recently fsstress (from fstests) sporadically started to trigger an
    infinite loop during fsync operations. This turned out to be because
    support for the rename exchange and whiteout operations was added to
    fsstress in fstests. These operations, unlike any others in fsstress,
    cause file names to be reused, hence triggering this issue. However,
    rename exchange and rename whiteout operations are not necessary to
    trigger this issue: simple rename operations and file creations are
    enough.
    
    The issue boils down to how we log conflicting inodes (inodes that, in
    the previous transaction, had a name now used by an inode we need to
    log during the fsync operation): we keep logging them even if they were
    already logged before, and after logging each one we check whether any
    other inode conflicts with it and, if so, add that inode to the list of
    inodes to log, again. Skipping already logged inodes fixes the issue.
    
    Consider the following example:
    
      $ mkfs.btrfs -f /dev/sdb
      $ mount /dev/sdb /mnt
    
      $ mkdir /mnt/testdir                           # inode 257
    
      $ touch /mnt/testdir/zz                        # inode 258
      $ ln /mnt/testdir/zz /mnt/testdir/zz_link
    
      $ touch /mnt/testdir/a                         # inode 259
    
      $ sync
    
      # The following 3 renames achieve the same result as a rename exchange
      # operation (<rename_exchange> /mnt/testdir/zz_link to /mnt/testdir/a).
    
      $ mv /mnt/testdir/a /mnt/testdir/tmp
      $ mv /mnt/testdir/zz_link /mnt/testdir/a
      $ mv /mnt/testdir/tmp /mnt/testdir/zz_link
    
      # The following rename and file creation give the same result as a
      # rename whiteout operation (<rename_whiteout> zz to a2).
    
      $ mv /mnt/testdir/zz /mnt/testdir/a2
      $ touch /mnt/testdir/zz                        # inode 260
    
      $ xfs_io -c fsync /mnt/testdir/zz
        --> results in the infinite loop
    
    The following steps happen:
    
    1) When logging inode 260, we find that its reference named "zz" was
       used by inode 258 in the previous transaction (through the commit
       root), so inode 258 is added to the list of conflicting inodes that
       need to be logged;
    
    2) After logging inode 258, we find that its reference named "a" was
       used by inode 259 in the previous transaction, and therefore we add
       inode 259 to the list of conflicting inodes to be logged;
    
    3) After logging inode 259, we find that its reference named "zz_link"
       was used by inode 258 in the previous transaction, so we add inode
       258 to the list of conflicting inodes to log again, even though it
       was already logged at step 2. After logging it again, we find again
       that inode 259 conflicts with it, and we add 259 to the list again,
       and so on: we end up repeating all the previous steps.
    
    So fix this by skipping logging of conflicting inodes that were already
    logged.
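
    The loop and the fix can be illustrated with a small user-space model.
    This is only a sketch under simplifying assumptions, not the code in
    tree-log.c: struct model_inode, log_with_conflicts(), run_example() and
    MAX_STEPS are hypothetical names, each inode is assumed to have at most
    one conflicting inode, and a step budget stands in for the infinite
    loop. The conflict links are the ones from the steps above (260 -> 258,
    258 -> 259, 259 -> 258).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an inode; this is not a kernel structure. */
struct model_inode {
	unsigned long ino;
	int conflict;	/* index of the inode that previously owned one of
			 * this inode's names, or -1 if there is none */
	bool logged;	/* set once the inode has been logged */
};

#define MAX_STEPS 64	/* budget that stands in for "loops forever" */

/*
 * Log the inode at index 'start' and then every conflicting inode found
 * along the way. Returns the number of log operations performed, or -1 if
 * the step budget was exhausted.
 */
static int log_with_conflicts(struct model_inode *inodes, int start,
			      bool skip_logged)
{
	int queue[MAX_STEPS + 1];
	int head = 0, tail = 0, steps = 0;

	queue[tail++] = start;
	while (head < tail) {
		int cur = queue[head++];

		/* The fix: an inode that was already logged is skipped. */
		if (skip_logged && inodes[cur].logged)
			continue;
		if (steps == MAX_STEPS)
			return -1;

		inodes[cur].logged = true;
		steps++;

		/* After logging, queue the inode that conflicts with it. */
		if (inodes[cur].conflict >= 0) {
			assert(tail <= MAX_STEPS);
			queue[tail++] = inodes[cur].conflict;
		}
	}
	return steps;
}

/* Builds the scenario from the example and starts at inode 260. */
static int run_example(bool skip_logged)
{
	struct model_inode inodes[] = {
		{ .ino = 258, .conflict = 1,  .logged = false }, /* "a": 259 */
		{ .ino = 259, .conflict = 0,  .logged = false }, /* "zz_link": 258 */
		{ .ino = 260, .conflict = 0,  .logged = false }, /* "zz": 258 */
	};

	return log_with_conflicts(inodes, 2, skip_logged);
}
```

    In this model, run_example(false) keeps bouncing between inodes 258
    and 259 until the budget runs out, while run_example(true) logs inodes
    260, 258 and 259 once each and then stops.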
    
    Fixes: 6b5fc433 ("Btrfs: fix fsync after succession of renames of different files")
    CC: stable@vger.kernel.org # 5.1+
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Reviewed-by: Josef Bacik <josef@toxicpanda.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
tree-log.c 171 KB