    btrfs: search for delalloc more efficiently during lseek/fiemap · 8ddc8274
    Filipe Manana authored
    During lseek (SEEK_HOLE/DATA) and fiemap, when processing a file range
    that corresponds to a hole or a prealloc extent, we have to check if
    there's any delalloc in the range. We do it by searching for delalloc
    ranges in the inode's io_tree (for unflushed delalloc) and in the inode's
    extent map tree (for delalloc that is flushing).
    
    We avoid searching the extent map tree if the number of outstanding
    extents is 0, as in that case we can't have extent maps for our search
    range in the tree that correspond to delalloc that is flushing. However,
    if we have any unflushed delalloc, due to buffered writes or mmap writes,
    then the outstanding extents counter is not 0 and we'll search the extent
    map tree. The tree may be large because it can have lots of extent maps
    that were loaded by reads or created by previous writes, so searching it
    can take a significant amount of time, especially if we have a file with
    a lot of holes and/or prealloc extents.
    
    We can improve on this by searching the ordered extents tree of the inode
    instead of the extent map tree, since when delalloc is flushing we create
    an ordered extent along with the new extent map, while holding the
    respective file range locked in the inode's io_tree. The
    ordered extents tree is typically much smaller, since ordered extents have
    a short life and get removed from the tree once they are completed, while
    extent maps can stay for a very long time in the extent map tree, either
    created by previous writes or loaded by read operations.
    
    So use the ordered extents tree instead of the extent map tree.
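    
    As an illustration only, the check described above can be modelled in
    userspace C. This is a toy sketch, not the kernel code: toy_range,
    toy_inode and toy_range_has_delalloc() are hypothetical names, plain
    arrays with a linear scan stand in for the real trees, and ranges are
    half-open [start, end). It only shows that a range has delalloc if it
    overlaps either unflushed delalloc (io_tree) or an ordered extent
    (delalloc that is flushing).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: a byte range [start, start + len). Not a kernel structure. */
struct toy_range {
	uint64_t start;
	uint64_t len;
};

/* Toy stand-in for an inode: two arrays instead of the real trees. */
struct toy_inode {
	const struct toy_range *unflushed;	/* models delalloc ranges in the io_tree */
	size_t nr_unflushed;
	const struct toy_range *ordered;	/* models ordered extents (flushing delalloc) */
	size_t nr_ordered;
};

/* True if any range in the set overlaps the half-open range [start, end). */
static bool toy_any_overlap(const struct toy_range *set, size_t n,
			    uint64_t start, uint64_t end)
{
	for (size_t i = 0; i < n; i++) {
		if (set[i].start < end && set[i].start + set[i].len > start)
			return true;
	}
	return false;
}

/*
 * Hypothetical helper: does [start, end) contain delalloc, either not yet
 * flushed or currently being flushed?
 */
bool toy_range_has_delalloc(const struct toy_inode *inode,
			    uint64_t start, uint64_t end)
{
	if (toy_any_overlap(inode->unflushed, inode->nr_unflushed, start, end))
		return true;
	/* The second search is cheap when few ordered extents are in flight. */
	return toy_any_overlap(inode->ordered, inode->nr_ordered, start, end);
}
```

    In this model the cost of the second search depends only on how many
    ordered extents are in flight, not on how many extent maps are cached.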
    
    This change is part of a patchset that aims to improve performance for
    applications that use lseek's SEEK_HOLE and SEEK_DATA modes to iterate
    over the extents of a file. Two examples are the cp program from
    coreutils 9.0+ and the tar program (when using its --sparse / -S option).
    A sample test and results are listed in the changelog of the last patch
    in the series:
    
      1/9 btrfs: remove leftover setting of EXTENT_UPTODATE state in an inode's io_tree
      2/9 btrfs: add an early exit when searching for delalloc range for lseek/fiemap
      3/9 btrfs: skip unnecessary delalloc searches during lseek/fiemap
      4/9 btrfs: search for delalloc more efficiently during lseek/fiemap
      5/9 btrfs: remove no longer used btrfs_next_extent_map()
      6/9 btrfs: allow passing a cached state record to count_range_bits()
      7/9 btrfs: update stale comment for count_range_bits()
      8/9 btrfs: use cached state when looking for delalloc ranges with fiemap
      9/9 btrfs: use cached state when looking for delalloc ranges with lseek
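    
    For reference, the access pattern this patchset targets looks roughly
    like the userspace sketch below, which walks the data segments of a file
    with lseek's SEEK_DATA and SEEK_HOLE. The helper name count_data_bytes()
    is hypothetical and this is not taken from cp or tar. It assumes Linux
    with glibc, where _GNU_SOURCE exposes both constants.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* exposes SEEK_DATA and SEEK_HOLE in glibc */
#endif
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: sum the sizes of all data segments of fd.
 * Returns the number of data bytes, or -1 on error (errno is set).
 */
off_t count_data_bytes(int fd)
{
	off_t total = 0;
	off_t pos = 0;

	for (;;) {
		/* Find the start of the next data segment at or after pos. */
		off_t data = lseek(fd, pos, SEEK_DATA);

		if (data < 0) {
			if (errno == ENXIO)
				break;	/* no more data before EOF */
			return -1;
		}

		/* Find where that segment ends; EOF counts as a hole. */
		off_t hole = lseek(fd, data, SEEK_HOLE);

		if (hole < 0)
			return -1;

		total += hole - data;
		pos = hole;
	}

	return total;
}
```

    Each iteration issues two lseek calls, so a file with many holes or
    prealloc extents triggers many delalloc searches in the kernel.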
    
    Reported-by: Wang Yugui <wangyugui@e16-tech.com>
    Link: https://lore.kernel.org/linux-btrfs/20221106073028.71F9.409509F4@e16-tech.com/
    Link: https://lore.kernel.org/linux-btrfs/CAL3q7H5NSVicm7nYBJ7x8fFkDpno8z3PYt5aPU43Bajc1H0h1Q@mail.gmail.com/
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
file.c 107 KB