    btrfs: Don't remove block group that still has pinned down bytes · 43794446
    Qu Wenruo authored
    [BUG]
    Under certain KVM loads and LTP tests, it is possible to hit the
    following call trace if quota is enabled:
    
    BTRFS critical (device vda2): unable to find logical 8820195328 length 4096
    BTRFS critical (device vda2): unable to find logical 8820195328 length 4096
    
    WARNING: CPU: 0 PID: 49 at ../block/blk-core.c:172 blk_status_to_errno+0x1a/0x30
    CPU: 0 PID: 49 Comm: kworker/u2:1 Not tainted 4.12.14-15-default #1 SLE15 (unreleased)
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
    Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
    task: ffff9f827b340bc0 task.stack: ffffb4f8c0304000
    RIP: 0010:blk_status_to_errno+0x1a/0x30
    Call Trace:
     submit_extent_page+0x191/0x270 [btrfs]
     ? btrfs_create_repair_bio+0x130/0x130 [btrfs]
     __do_readpage+0x2d2/0x810 [btrfs]
     ? btrfs_create_repair_bio+0x130/0x130 [btrfs]
     ? run_one_async_done+0xc0/0xc0 [btrfs]
     __extent_read_full_page+0xe7/0x100 [btrfs]
     ? run_one_async_done+0xc0/0xc0 [btrfs]
     read_extent_buffer_pages+0x1ab/0x2d0 [btrfs]
     ? run_one_async_done+0xc0/0xc0 [btrfs]
     btree_read_extent_buffer_pages+0x94/0xf0 [btrfs]
     read_tree_block+0x31/0x60 [btrfs]
     read_block_for_search.isra.35+0xf0/0x2e0 [btrfs]
     btrfs_search_slot+0x46b/0xa00 [btrfs]
     ? kmem_cache_alloc+0x1a8/0x510
     ? btrfs_get_token_32+0x5b/0x120 [btrfs]
     find_parent_nodes+0x11d/0xeb0 [btrfs]
     ? leaf_space_used+0xb8/0xd0 [btrfs]
     ? btrfs_leaf_free_space+0x49/0x90 [btrfs]
     ? btrfs_find_all_roots_safe+0x93/0x100 [btrfs]
     btrfs_find_all_roots_safe+0x93/0x100 [btrfs]
     btrfs_find_all_roots+0x45/0x60 [btrfs]
     btrfs_qgroup_trace_extent_post+0x20/0x40 [btrfs]
     btrfs_add_delayed_data_ref+0x1a3/0x1d0 [btrfs]
     btrfs_alloc_reserved_file_extent+0x38/0x40 [btrfs]
     insert_reserved_file_extent.constprop.71+0x289/0x2e0 [btrfs]
     btrfs_finish_ordered_io+0x2f4/0x7f0 [btrfs]
     ? pick_next_task_fair+0x2cd/0x530
     ? __switch_to+0x92/0x4b0
     btrfs_worker_helper+0x81/0x300 [btrfs]
     process_one_work+0x1da/0x3f0
     worker_thread+0x2b/0x3f0
     ? process_one_work+0x3f0/0x3f0
     kthread+0x11a/0x130
     ? kthread_create_on_node+0x40/0x40
     ret_from_fork+0x35/0x40
    
    BTRFS critical (device vda2): unable to find logical 8820195328 length 16384
    BTRFS: error (device vda2) in btrfs_finish_ordered_io:3023: errno=-5 IO failure
    BTRFS info (device vda2): forced readonly
    BTRFS error (device vda2): pending csums is 2887680
    
    [CAUSE]
    It's caused by a race with block group auto removal:
    
    - There is a metadata block group X, which contains only one tree block.
      The tree block belongs to fs tree 257.
    - In the current transaction, some operation modifies fs tree 257.
      The tree block gets COWed, so block group X becomes empty, is marked
      as unused, and is queued for deletion.
    - Some workload (like fsync) wakes up cleaner_kthread(), which calls
      btrfs_delete_unused_bgs() to remove unused block groups.
      So block group X, along with its chunk map, gets removed.
    - Some delalloc work finishes for fs tree 257.
      Quota needs to get the original reference of the extent, which
      requires reading tree blocks of the commit root of 257.
      Since the chunk map has been removed, the read fails and the above
      warning is triggered.
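
The sequence above can be modeled in a few lines of user-space C. This is only a sketch under stated assumptions: `struct model_bg`, `model_map_logical()`, `model_cow_tree_block()` and `model_delete_unused_old()` are hypothetical names invented for illustration, not kernel code, and the sizes are arbitrary. It shows how a block group with `used == 0` but `pinned != 0` can lose its chunk mapping while the commit root still points into it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified model of one block group (not kernel code). */
struct model_bg {
	uint64_t start;   /* logical start of the block group */
	uint64_t length;  /* size of the block group in bytes */
	uint64_t used;    /* bytes referenced by the running transaction */
	uint64_t pinned;  /* bytes freed in this transaction, still needed by the commit root */
	bool mapped;      /* true while the chunk map entry exists */
};

/* Models a chunk map lookup: succeeds only while the mapping exists. */
bool model_map_logical(const struct model_bg *bg, uint64_t logical)
{
	return bg->mapped &&
	       logical >= bg->start && logical - bg->start < bg->length;
}

/* Models COW of a tree block: the old copy is no longer used, but stays pinned. */
void model_cow_tree_block(struct model_bg *bg, uint64_t nodesize)
{
	assert(bg->used >= nodesize);
	bg->used -= nodesize;
	bg->pinned += nodesize;
}

/* Models the cleaner pass before the fix (assumption): only 'used' is checked. */
void model_delete_unused_old(struct model_bg *bg)
{
	if (bg->used == 0)
		bg->mapped = false; /* block group and its chunk map are removed */
}
```

Starting from a block group holding a single tree block, calling `model_cow_tree_block()` followed by `model_delete_unused_old()` makes `model_map_logical()` fail for the old block's address, which corresponds to the "unable to find logical" message in the report.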
    
    [FIX]
    Just let btrfs_delete_unused_bgs() skip block groups which still have
    pinned bytes.
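
As a rough sketch of the check described above, assuming a simplified user-space model rather than the real kernel structures: `struct model_block_group` and `model_can_delete_unused()` are hypothetical names, and the non-pinned conditions are included only as plausible examples of other reasons a block group might be skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the per-block-group accounting (not kernel code). */
struct model_block_group {
	uint64_t used;      /* bytes currently allocated to extents */
	uint64_t reserved;  /* bytes reserved for pending allocations */
	uint64_t pinned;    /* bytes freed in the running transaction, not yet reusable */
	bool ro;            /* block group is read-only */
};

/* Returns true only if the modeled block group is safe to remove. */
bool model_can_delete_unused(const struct model_block_group *bg)
{
	assert(bg);

	/* Conditions assumed to exist already: anything in use blocks removal. */
	if (bg->used || bg->reserved || bg->ro)
		return false;

	/* The new condition from this fix: pinned bytes may still be read
	 * through the commit root, so the chunk map must stay in place. */
	if (bg->pinned)
		return false;

	return true;
}
```

With this predicate, a block group whose only tree block was COWed in the running transaction (`used == 0`, `pinned != 0`) is skipped instead of being removed.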
    
    However, there is a minor side effect: currently we only queue empty
    block groups at update_block_group(). An empty block group with pinned
    bytes won't go through update_block_group() again, so such a block
    group won't be removed until a new extent is allocated from it and
    then freed.
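
The side effect can be illustrated with a small state model. This is a sketch, not kernel code: `struct model_bg_state` and the `model_*()` helpers are hypothetical, and the model assumes that the cleaner pass takes every visited entry off the unused list whether or not it removes the block group.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one block group's lifecycle (not kernel code). */
struct model_bg_state {
	uint64_t used;    /* bytes allocated to extents */
	uint64_t pinned;  /* bytes freed in the running transaction */
	bool queued;      /* on the modeled "unused" list */
	bool removed;     /* block group has been deleted */
};

/* Models allocating an extent from the block group. */
void model_alloc_bytes(struct model_bg_state *bg, uint64_t bytes)
{
	bg->used += bytes;
}

/* Models freeing an extent: the only place where the group gets queued. */
void model_free_bytes(struct model_bg_state *bg, uint64_t bytes)
{
	assert(bg->used >= bytes);
	bg->used -= bytes;
	bg->pinned += bytes;
	if (bg->used == 0)
		bg->queued = true;
}

/* Models a transaction commit: pinned bytes become free space again. */
void model_commit(struct model_bg_state *bg)
{
	bg->pinned = 0;
}

/* Models the cleaner pass with the fix: dequeue, then skip if still pinned. */
void model_cleaner_pass(struct model_bg_state *bg)
{
	if (!bg->queued)
		return;
	bg->queued = false;
	if (bg->used == 0 && bg->pinned == 0)
		bg->removed = true;
}
```

In this model, a block group that is skipped once because of pinned bytes stays around after the commit, since nothing queues it again. It is removed only after another allocate/free cycle puts it back on the list and a later cleaner pass finds it with no pinned bytes.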
    
    Signed-off-by: Qu Wenruo <wqu@suse.com>
    Reviewed-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
extent-tree.c 302 KB