• Filipe Manana's avatar
    Btrfs: fix data corruption due to cloning of eof block · ec6d90a4
    Filipe Manana authored
    commit ac765f83 upstream.
    
    We currently allow cloning a range from a file which includes the last
    block of the file even if the file's size is not aligned to the block
    size. This is fine and useful when the destination file has the same size,
    but when it does not and the range ends somewhere in the middle of the
    destination file, it leads to corruption because the bytes between the EOF
    and the end of the block have undefined data (when there is support for
    discard/trimming they have a value of 0x00).
    
    Example:
    
     $ mkfs.btrfs -f /dev/sdb
     $ mount /dev/sdb /mnt
    
     $ export foo_size=$((256 * 1024 + 100))
     $ xfs_io -f -c "pwrite -S 0x3c 0 $foo_size" /mnt/foo
     $ xfs_io -f -c "pwrite -S 0xb5 0 1M" /mnt/bar
    
     $ xfs_io -c "reflink /mnt/foo 0 512K $foo_size" /mnt/bar
    
     $ od -A d -t x1 /mnt/bar
     0000000 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5
     *
     0524288 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c
     *
     0786528 3c 3c 3c 3c 00 00 00 00 00 00 00 00 00 00 00 00
     0786544 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
     *
     0790528 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5
     *
     1048576
    
    The bytes in the range from 786532 (512Kb + 256Kb + 100 bytes) to 790527
    (512Kb + 256Kb + 4Kb - 1) got corrupted, having now a value of 0x00 instead
    of 0xb5.
    
    This is similar to the problem we had for deduplication that got recently
    fixed by commit de02b9f6 ("Btrfs: fix data corruption when
    deduplicating between different files").
    
    Fix this by not allowing such operations to be performed and return the
    errno -EINVAL to user space. This is what XFS is doing as well at the VFS
    level. This change however now makes us return -EINVAL instead of
    -EOPNOTSUPP for cases where the source range maps to an inline extent and
    the destination range's end is smaller then the destination file's size,
    since the detection of inline extents is done during the actual process of
    dropping file extent items (at __btrfs_drop_extents()). Returning the
    -EINVAL error is done early on and solely based on the input parameters
    (offsets and length) and destination file's size. This makes us consistent
    with XFS and anyone else supporting cloning since this case is now checked
    at a higher level in the VFS and is where the -EINVAL will be returned
    from starting with kernel 4.20 (the VFS changed was introduced in 4.20-rc1
    by commit 07d19dc9 ("vfs: avoid problematic remapping requests into
    partial EOF block"). So this change is more geared towards stable kernels,
    as it's unlikely the new VFS checks get removed intentionally.
    
    A test case for fstests follows soon, as well as an update to filter
    existing tests that expect -EOPNOTSUPP to accept -EINVAL as well.
    
    CC: <stable@vger.kernel.org> # 4.4+
    Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
    Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
    Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
    ec6d90a4
ioctl.c 146 KB