    Btrfs: Make btrfs_drop_snapshot work in larger and more efficient chunks · bd56b302
    Chris Mason authored
    Every transaction in btrfs creates a new snapshot, and then schedules the
    snapshot from the last transaction for deletion.  Snapshot deletion
    works by walking down the btree and dropping the reference counts
    on each btree block during the walk.
    
    If a given leaf or node has a reference count greater than one,
    the reference count is decremented and the subtree pointed to by that
    node is ignored.
    
    If the reference count is one, walking continues down into that node
    or leaf, and the references of everything it points to are decremented.
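    
    The reference count rule above can be sketched roughly as follows.  This
    is only an illustration under simplifying assumptions, not the btrfs
    code: struct block, its fields, MAX_CHILDREN and drop_block() are
    hypothetical names, and the tree is held entirely in memory.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 4  /* hypothetical fan-out, for illustration only */

/* Hypothetical in-memory stand-in for a btree block (node or leaf). */
struct block {
    int refs;                             /* reference count */
    int freed;                            /* set once the block is dropped */
    size_t nr_children;                   /* 0 for a leaf */
    struct block *children[MAX_CHILDREN];
};

/*
 * Drop one reference on b.  Returns the number of blocks freed.
 *
 * refs > 1:  the block is still shared, so decrement the count and
 *            ignore the subtree below it.
 * refs == 1: this was the last reference, so drop a reference on
 *            everything the block points to, then free the block.
 */
static int drop_block(struct block *b)
{
    int nr_freed = 0;

    assert(b->refs >= 1);

    if (b->refs > 1) {
        b->refs--;
        return 0;
    }

    for (size_t i = 0; i < b->nr_children; i++)
        nr_freed += drop_block(b->children[i]);

    b->refs = 0;
    b->freed = 1;
    return nr_freed + 1;
}
```

    In this sketch a shared subtree costs a single decrement, which is why
    dropping a snapshot that shares most of its blocks with a newer one
    touches only a small part of the tree.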
    
    The old code would try to work in small pieces, walking down the tree
    until it found the lowest leaf or node to free and then returning.  This
    was very friendly to the rest of the FS because it didn't have a huge
    impact on other operations.
    
    But it couldn't always keep up with the rate at which new commits queued
    snapshots for deletion, and it was inefficient for the extent
    allocation tree because it didn't find leaves that were close together
    on disk and process them at the same time.
    
    This changes things to walk down to a level 1 node and then process it
    in bulk.  All the leaf pointers are sorted and the leaves are dropped
    in order based on their extent number.
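    
    The sorting step can be sketched like this.  It is a minimal
    illustration, not the kernel implementation: struct leaf_ref, its
    fields and sort_leaf_refs() are hypothetical names, and "extent number"
    is assumed here to mean the on-disk byte offset of each leaf.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record for one leaf pointer found in a level 1 node. */
struct leaf_ref {
    uint64_t bytenr;  /* assumed: on-disk byte offset of the leaf */
    int slot;         /* slot in the level 1 node that points to it */
};

/* Order leaf pointers by ascending disk offset. */
static int cmp_bytenr(const void *a, const void *b)
{
    const struct leaf_ref *x = a;
    const struct leaf_ref *y = b;

    /* Compare explicitly; subtracting 64-bit values could overflow int. */
    if (x->bytenr < y->bytenr)
        return -1;
    if (x->bytenr > y->bytenr)
        return 1;
    return 0;
}

/*
 * Sort all leaf pointers collected from one level 1 node so that the
 * caller can drop the leaves in disk order.
 */
static void sort_leaf_refs(struct leaf_ref *refs, size_t nr)
{
    qsort(refs, nr, sizeof(*refs), cmp_bytenr);
}
```

    Processing the leaves in this order means neighbouring extents are
    handled together, which is the locality the old one-block-at-a-time
    walk was missing.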
    
    The extent allocation tree and commit code are now fast enough for
    this kind of bulk processing to work without slowing the rest of the FS
    down.  Overall it does less IO and is better able to keep up with
    snapshot deletions under high load.
    
    Signed-off-by: Chris Mason <chris.mason@oracle.com>
inode.c 135 KB