• Josef Bacik's avatar
    btrfs: don't drop extent_map for free space inode on write error · 5571e41e
    Josef Bacik authored
    While running the CI for an unrelated change I hit the following panic
    with generic/648 on btrfs_holes_spacecache.
    
    assertion failed: block_start != EXTENT_MAP_HOLE, in fs/btrfs/extent_io.c:1385
    ------------[ cut here ]------------
    kernel BUG at fs/btrfs/extent_io.c:1385!
    invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
    CPU: 1 PID: 2695096 Comm: fsstress Kdump: loaded Tainted: G        W          6.8.0-rc2+ #1
    RIP: 0010:__extent_writepage_io.constprop.0+0x4c1/0x5c0
    Call Trace:
     <TASK>
     extent_write_cache_pages+0x2ac/0x8f0
     extent_writepages+0x87/0x110
     do_writepages+0xd5/0x1f0
     filemap_fdatawrite_wbc+0x63/0x90
     __filemap_fdatawrite_range+0x5c/0x80
     btrfs_fdatawrite_range+0x1f/0x50
     btrfs_write_out_cache+0x507/0x560
     btrfs_write_dirty_block_groups+0x32a/0x420
     commit_cowonly_roots+0x21b/0x290
     btrfs_commit_transaction+0x813/0x1360
     btrfs_sync_file+0x51a/0x640
     __x64_sys_fdatasync+0x52/0x90
     do_syscall_64+0x9c/0x190
     entry_SYSCALL_64_after_hwframe+0x6e/0x76
    
    This happens because we fail to write out the free space cache in one
    instance, come back around and attempt to write it again.  However on
    the second pass through we go to call btrfs_get_extent() on the inode to
    get the extent mapping.  Because this is a new block group, and with the
    free space inode we always search the commit root to avoid deadlocking
    with the tree, we find nothing and return a EXTENT_MAP_HOLE for the
    requested range.
    
    This happens because the first time we try to write the space cache out
    we hit an error, and on an error we drop the extent mapping.  This is
    normal for normal files, but the free space cache inode is special.  We
    always expect the extent map to be correct.  Thus the second time
    through we end up with a bogus extent map.
    
    Since we're deprecating this feature, the most straightforward way to
    fix this is to simply skip dropping the extent map range for this failed
    range.
    
    I shortened the test by using error injection to stress the area to make
    it easier to reproduce.  With this patch in place we no longer panic
    with my error injection test.
    
    CC: stable@vger.kernel.org # 4.14+
    Reviewed-by: default avatarFilipe Manana <fdmanana@suse.com>
    Signed-off-by: default avatarJosef Bacik <josef@toxicpanda.com>
    Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
    5571e41e
inode.c 315 KB