• Filipe Manana's avatar
    btrfs: use inode_logged() at need_log_inode() · bf1f4fd3
    Filipe Manana authored
    At need_log_inode() we directly check the ->logged_trans field of the
    given inode to check if it was previously logged in the transaction, with
    the goal of skipping logging the inode again when it's not necessary.
    The ->logged_trans field in not persisted in the inode item or elsewhere,
    it's only stored in memory (struct btrfs_inode), so it's transient and
    lost once the inode is evicted and then loaded again. Once an inode is
    loaded, we are conservative and set ->logged_trans to 0, which may mean
    that either the inode was never logged in the current transaction or it
    was logged but evicted before being loaded again.
    
    Instead of checking the inode's ->logged_trans field directly, we can
    use instead the helper inode_logged(), which will really check if the
    inode was logged before in the current transaction in case we have a
    ->logged_trans field with a value of 0. This will prevent unnecessarily
    logging an inode when it's not needed, and in some cases preventing a
    transaction commit, in case the logging requires a fallback to a
    transaction commit. The following test script shows a scenario where
    due to eviction we fallback a transaction commit when trying to fsync
    a file that was renamed:
    
      $ cat test.sh
      #!/bin/bash
    
      DEV=/dev/nullb0
      MNT=/mnt/nullb0
    
      num_init_files=10000
      num_new_files=10000
    
      mkfs.btrfs -f $DEV
      mount -o ssd $DEV $MNT
    
      mkdir $MNT/testdir
      for ((i = 1; i <= $num_init_files; i++)); do
          echo -n > $MNT/testdir/file_$i
      done
    
      echo -n > $MNT/testdir/foo
    
      sync
    
      # Add some files so that there's more work in the transaction other
      # than just renaming file foo.
      for ((i = 1; i <= $num_new_files; i++)); do
          echo -n > $MNT/testdir/new_file_$i
      done
    
      # Fsync the directory first.
      xfs_io -c "fsync" $MNT/testdir
    
      # Rename file foo.
      mv $MNT/testdir/foo $MNT/testdir/bar
    
      # Now trigger eviction of the test directory's inode.
      # Once loaded again, it will have logged_trans set to 0 and
      # last_unlink_trans set to the current transaction.
      echo 2 > /proc/sys/vm/drop_caches
    
      # Fsync file bar (ex-foo).
      # Before the patch the fsync would result in a transaction commit
      # because the inode for file bar has last_unlink_trans set to the
      # current transaction, so it will attempt to log the parent directory
      # as well, which will fallback to a full transaction commit because
      # it also has its last_unlink_trans set to the current transaction,
      # due to the inode eviction.
      start=$(date +%s%N)
      xfs_io -c "fsync" $MNT/testdir/bar
      end=$(date +%s%N)
      dur=$(( (end - start) / 1000000 ))
    
      echo "file fsync took: $dur milliseconds"
    
      umount $MNT
    
    Before this patch:  fsync took 22 milliseconds
    After this patch:   fsync took  8 milliseconds
    Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
    Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
    bf1f4fd3
tree-log.c 210 KB