    Btrfs: fix race between scrub and block group deletion · 020d5b73
    Filipe Manana authored
    Scrub can race with the cleaner kthread deleting unused block groups
    (and with relocation too), causing scrub to fail with -EINVAL, which is
    then returned to user space.
    
    The following diagram illustrates how it happens:
    
                  CPU 1                                 CPU 2
    
     cleaner kthread
       btrfs_delete_unused_bgs()
    
         gets block group X from
         fs_info->unused_bgs
    
         sets block group to RO
    
           btrfs_remove_chunk(bg X)
    
             deletes device extents
    
                                             scrub_enumerate_chunks()
    
                                               searches device tree using
                                               its commit root
    
                                               finds device extent for
                                               block group X
    
                                               gets block group X from the tree
                                               fs_info->block_group_cache_tree
                                               (via btrfs_lookup_block_group())
    
                                               sets bg X to RO (again)
    
              btrfs_remove_block_group(bg X)
    
                deletes block group from
                fs_info->block_group_cache_tree
    
                removes extent map from
                fs_info->mapping_tree
    
                                                   scrub_chunk(offset X)
    
                                                     searches fs_info->mapping_tree
                                                     for extent map starting at
                                                     offset X
    
                                                        --> doesn't find any such
                                                            extent map
                                                        --> returns -EINVAL and scrub
                                                            errors out to userspace
                                                            with -EINVAL
    
    Fix this by dealing with an extent map lookup failure as an indicator of
    block group deletion.
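
The decision described above can be sketched as a small user-space model. This is only an illustration, not the kernel change itself: the types, the `removed` flag, and `model_scrub_chunk()` are hypothetical names introduced here, and locking is left out entirely. It assumes the block group records that it has been deleted, so a missing extent map can be told apart from a genuine inconsistency.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a block group; only the state the check needs. */
struct model_block_group {
    bool removed; /* assumed flag: set once the block group has been deleted */
};

/* Hypothetical stand-in for an extent map found in the mapping tree. */
struct model_extent_map {
    unsigned long long start;
};

/*
 * Model of what scrub decides after looking up the extent map for a chunk.
 * em == NULL means the lookup found no extent map at the chunk's offset.
 */
int model_scrub_chunk(const struct model_block_group *bg,
                      const struct model_extent_map *em)
{
    if (em == NULL) {
        /*
         * No extent map: if the block group was deleted concurrently,
         * there is nothing left to scrub and this is not an error.
         * Otherwise the missing mapping is unexpected.
         */
        return bg->removed ? 0 : -EINVAL;
    }

    /* Normal path: the chunk would be scrubbed here. */
    return 0;
}
```

With this model, a lookup failure on a deleted block group yields 0, so scrub can move on to the next chunk, while a lookup failure on a block group that still exists keeps returning -EINVAL.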
    
    Issue reproduced with fstest btrfs/071.
    
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: Chris Mason <clm@fb.com>
scrub.c 114 KB