• Filipe Manana's avatar
    Btrfs: add missing inode update when punching hole · e8c1c76e
    Filipe Manana authored
    When punching a file hole if we endup only zeroing parts of a page,
    because the start offset isn't a multiple of the sector size or the
    start offset and length fall within the same page, we were not updating
    the inode item. This prevented an fsync from doing anything, if no other
    file changes happened in the current transaction, because the fields
    in btrfs_inode used to check if the inode needs to be fsync'ed weren't
    updated.
    
    This issue is easy to reproduce and the following excerpt from the
    xfstest case I made shows how to trigger it:
    
      _scratch_mkfs >> $seqres.full 2>&1
      _init_flakey
      _mount_flakey
    
      # Create our test file.
      $XFS_IO_PROG -f -c "pwrite -S 0x22 -b 16K 0 16K" \
          $SCRATCH_MNT/foo | _filter_xfs_io
    
      # Fsync the file, this makes btrfs update some btrfs inode specific fields
      # that are used to track if the inode needs to be written/updated to the fsync
      # log or not. After this fsync, the new values for those fields indicate that
      # a subsequent fsync does not need to touch the fsync log.
      $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo
    
      # Force a commit of the current transaction. After this point, any operation
      # that modifies the data or metadata of our file, should update those fields in
      # the btrfs inode with values that make the next fsync operation write to the
      # fsync log.
      sync
    
      # Punch a hole in our file. This small range affects only 1 page.
      # This made the btrfs hole punching implementation write only some zeroes in
      # one page, but it did not update the btrfs inode fields used to determine if
      # the next fsync needs to write to the fsync log.
      $XFS_IO_PROG -c "fpunch 8000 4K" $SCRATCH_MNT/foo
    
      # Another variation of the previously mentioned case.
      $XFS_IO_PROG -c "fpunch 15000 100" $SCRATCH_MNT/foo
    
      # Now fsync the file. This was a no-operation because the previous hole punch
      # operation didn't update the inode's fields mentioned before, so they remained
      # with the values they had after the first fsync - that is, they indicate that
      # it is not needed to write to fsync log.
      $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo
    
      echo "File content before:"
      od -t x1 $SCRATCH_MNT/foo
    
      # Simulate a crash/power loss.
      _load_flakey_table $FLAKEY_DROP_WRITES
      _unmount_flakey
    
      # Enable writes and mount the fs. This makes the fsync log replay code run.
      _load_flakey_table $FLAKEY_ALLOW_WRITES
      _mount_flakey
    
      # Because the last fsync didn't do anything, here the file content matched what
      # it was after the first fsync, before the holes were punched, and not what it
      # was after the holes were punched.
      echo "File content after:"
      od -t x1 $SCRATCH_MNT/foo
    
    This issue has been around since 2012, when the punch hole implementation
    was added, commit 2aaa6655 ("Btrfs: add hole punching").
    
    A test case for xfstests follows soon.
    Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
    Reviewed-by: default avatarLiu Bo <bo.li.liu@oracle.com>
    Signed-off-by: default avatarChris Mason <clm@fb.com>
    e8c1c76e
file.c 75.6 KB