    btrfs: make NOCOW checks for existence of checksums in a range more efficient · 8d2a83a9
    Filipe Manana authored
    Before deciding if we can do a NOCOW write into a range, one of the things
    we have to do is check if there are checksum items for that range. We do
    that through the btrfs_lookup_csums_list() function, which searches for
    checksums and adds them to a list supplied by the caller.
    
    But all we need is to check whether any checksum exists. We don't need to
    look for all of them and collect them into a list, which requires more
    search time in the checksums tree, allocating memory for checksum items
    to add to the list, copying checksums from a leaf into those list items,
    then freeing that memory, etc. This is all unnecessary overhead, wasting
    mostly CPU time, and perhaps some occasional IO if we need to read any
    extent buffers from disk.
    
    So change btrfs_lookup_csums_list() to allow it to return immediately
    when it finds any checksum, without the need to read it from a leaf and
    add it to a list. This is accomplished by allowing a NULL list parameter
    and making the function return 1 if it found any checksum, 0 if it didn't
    find any, and a negative value in case of an error.
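
    The return convention described above can be illustrated with a small
    userspace model. This is only a sketch and not the kernel code: the
    names lookup_csums(), can_nocow_range(), struct csum_range and struct
    csum_list are hypothetical, and a sorted array stands in for the
    checksums tree. It shows how a NULL list turns the lookup into an
    early-exit existence check.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one checksum item covering [start, end]. */
struct csum_range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/* Hypothetical output list: a fixed-capacity array, not a kernel list_head. */
struct csum_list {
	struct csum_range items[16];
	size_t count;
};

/*
 * Search 'tree' (sorted by start, non-overlapping) for checksum ranges that
 * intersect [start, end].
 *
 * If 'list' is NULL, return as soon as the first match is seen, without
 * copying anything. Otherwise collect every match into 'list'.
 *
 * Returns 1 if any checksum was found, 0 if none was found, and a negative
 * errno value on error.
 */
int lookup_csums(const struct csum_range *tree, size_t nr,
		 uint64_t start, uint64_t end, struct csum_list *list)
{
	const size_t cap = sizeof(list->items) / sizeof(list->items[0]);
	int found = 0;

	if (start > end)
		return -EINVAL;
	if (list)
		list->count = 0;

	for (size_t i = 0; i < nr; i++) {
		if (tree[i].end < start)
			continue;	/* entirely before the range */
		if (tree[i].start > end)
			break;		/* sorted: nothing further can match */
		if (!list)
			return 1;	/* existence check only: stop here */
		if (list->count == cap)
			return -ENOMEM;	/* model of an allocation failure */
		list->items[list->count++] = tree[i];
		found = 1;
	}
	return found;
}

/*
 * Simplified model of the NOCOW check: only the checksum condition is
 * considered here. Returns 1 if no checksum exists in the range, 0 if at
 * least one exists, and a negative errno value on error.
 */
int can_nocow_range(const struct csum_range *tree, size_t nr,
		    uint64_t start, uint64_t end)
{
	int ret = lookup_csums(tree, nr, start, end, NULL);

	if (ret < 0)
		return ret;
	return ret == 0;
}
```

    In this model the caller that only needs a yes/no answer passes NULL
    and pays for neither the copies nor the rest of the search, while a
    caller that needs the checksums themselves still passes a list.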
    
    The following test with fio was used to measure performance:
    
      $ cat test.sh
      #!/bin/bash
    
      DEV=/dev/nullb0
      MNT=/mnt/nullb0
    
      cat <<EOF > /tmp/fio-job.ini
      [global]
      name=fio-rand-write
      filename=$MNT/fio-rand-write
      rw=randwrite
      bssplit=4k/20:8k/20:16k/20:32k/20:64k/20
      direct=1
      numjobs=16
      fallocate=posix
      time_based
      runtime=300
    
      [file1]
      size=8G
      ioengine=io_uring
      iodepth=16
      EOF
    
      umount $MNT &> /dev/null
      mkfs.btrfs -f $DEV
      mount -o ssd $DEV $MNT
    
      fio /tmp/fio-job.ini
      umount $MNT
    
    The test was run on a release kernel (Debian's default kernel config).
    
    The results before this patch:
    
      WRITE: bw=139MiB/s (146MB/s), 8204KiB/s-9504KiB/s (8401kB/s-9732kB/s), io=17.0GiB (18.3GB), run=125317-125344msec
    
    The results after this patch:
    
      WRITE: bw=153MiB/s (160MB/s), 9241KiB/s-10.0MiB/s (9463kB/s-10.5MB/s), io=17.0GiB (18.3GB), run=114054-114071msec

    Reviewed-by: Qu Wenruo <wqu@suse.com>
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>