• Filipe Manana's avatar
    btrfs: make extent state merges more efficient during insertions · c91ea4bf
    Filipe Manana authored
    When inserting a new extent state record into an io tree that happens to
    be mergeable, we currently do the following:
    
    1) Insert the extent state record in the io tree's rbtree. This requires
       going down the tree to find where to insert it, and during the
       insertion we often need to balance the rbtree;
    
    2) We then check if the previous node is mergeable, so we call rb_prev()
       to find it, which requires some looping to find the previous node;
    
    3) If the previous node is mergeable, we adjust our node to include the
       range of the previous node and then delete the previous node from the
       rbtree, which again may need to balance the rbtree;
    
    4) Then we check if the next node is mergeable with the node we inserted,
       so we call rb_next(), which requires some looping too. If the next node
       is indeed mergeable, we expand the range of our node to include the
       next node's range and then delete the next node from the rbtree, which
       again may need to balance the tree.
    
    So these are quite of lot of iterations and looping over the rbtree, and
    some of the operations may need to rebalance the rb tree. This can be made
    a bit more efficient by:
    
    1) When iterating the rbtree, once we find a node that is mergeable with
       the node we want to insert, we can just adjust that node's range with
       the range of the node to insert - this avoids continuing iterating
       over the tree and deleting a node from the rbtree;
    
    2) If we expand the range of a mergeable node, then we find the next or
       the previous node, depending on other we merged a range to the right or
       to the left of the node we are currently at during the iteration. This
       merging is as before, we find the next or previous node with rb_next()
       or rb_prev() and if that other node is mergeable with the current one,
       we adjust the range of the current node and remove the other node from
       the rbtree;
    
    3) Whenever we need to insert the new extent state record it's because
       we don't have any extent state record in the rbtree which can be
       merged, so we can remove the call to merge_state() after the insertion,
       saving rb_next() and rb_prev() calls, which require some looping.
    
    So update the insertion function insert_state() to have this behaviour.
    
    Running dbench for 120 seconds and capturing the execution times of
    set_extent_bit() at pin_down_extent(), resulted in the following data
    (time values are in nanoseconds):
    
    Before this change:
    
      Count: 2278299
      Range:  0.000 - 4003728.000; Mean: 713.436; Median: 612.000; Stddev: 3606.952
      Percentiles:  90th: 1187.000; 95th: 1350.000; 99th: 1724.000
           0.000 -       7.534:       5 |
           7.534 -      35.418:      36 |
          35.418 -     154.403:     273 |
         154.403 -     662.138: 1244016 #####################################################
         662.138 -    2828.745: 1031335 ############################################
        2828.745 -   12074.102:    1395 |
       12074.102 -   51525.930:     806 |
       51525.930 -  219874.955:     162 |
      219874.955 -  938254.688:      22 |
      938254.688 - 4003728.000:       3 |
    
    After this change:
    
      Count: 2275862
      Range:  0.000 - 1605175.000; Mean: 678.903; Median: 590.000; Stddev: 2149.785
      Percentiles:  90th: 1105.000; 95th: 1245.000; 99th: 1590.000
           0.000 -      10.219:      10 |
          10.219 -      40.957:      36 |
          40.957 -     155.907:     262 |
         155.907 -     585.789: 1127214 ####################################################
         585.789 -    2193.431: 1145134 #####################################################
        2193.431 -    8205.578:    1648 |
        8205.578 -   30689.378:    1039 |
       30689.378 -  114772.699:     362 |
      114772.699 -  429221.537:      52 |
      429221.537 - 1605175.000:      10 |
    
    Maximum duration (range), average duration, percentiles and standard
    deviation are all better.
    Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
    Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
    Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
    c91ea4bf
extent-io-tree.c 48.1 KB