    Btrfs: fix missing delayed iputs on unmount · d6fd0ae2
    Omar Sandoval authored
    There's a race between close_ctree() and cleaner_kthread().
    close_ctree() sets btrfs_fs_closing(), and the cleaner stops when it
    sees it set, but this is racy; the cleaner might have already checked
    the bit and could be cleaning stuff. In particular, if it deletes unused
    block groups, it will create delayed iputs for the free space cache
    inodes. As of "btrfs: don't run delayed_iputs in commit", we're no
    longer running delayed iputs after a commit. Therefore, if the cleaner
    creates more delayed iputs after delayed iputs are run in
    btrfs_commit_super(), we will leak inodes on unmount and get a busy
    inode crash from the VFS.
    
    Fix it by parking the cleaner before we actually close anything. Then,
    any remaining delayed iputs will always be handled in
    btrfs_commit_super(). This also ensures that the commit in close_ctree()
    is really the last commit, so we can get rid of the commit in
    cleaner_kthread().
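
The ordering described above (park the cleaner first, then run the final commit) can be sketched as a userspace analogy with pthreads. This is not the patch itself: `struct fs_model`, `cleaner_thread()` and `unmount_model()` are hypothetical names made up for illustration, and the kernel side would presumably rely on the kthread parking helpers (`kthread_park()` and friends) rather than a condition variable.

```c
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names come from the patch. */
struct fs_model {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool park_requested;	/* set by the unmount path */
	bool parked;		/* set by the cleaner once it has stopped working */
	bool stop;		/* set by the unmount path after the final commit */
	int delayed_iputs;	/* pending deferred inode releases */
};

static void *cleaner_thread(void *arg)
{
	struct fs_model *fs = arg;

	pthread_mutex_lock(&fs->lock);
	while (!fs->park_requested) {
		/* One cleaner pass: removing an unused block group queues an iput. */
		fs->delayed_iputs++;
		pthread_mutex_unlock(&fs->lock);
		sched_yield();
		pthread_mutex_lock(&fs->lock);
	}
	/* Parked: acknowledge, then do nothing until told to exit. */
	fs->parked = true;
	pthread_cond_broadcast(&fs->cond);
	while (!fs->stop)
		pthread_cond_wait(&fs->cond, &fs->lock);
	pthread_mutex_unlock(&fs->lock);
	return NULL;
}

/* Returns the number of delayed iputs left over at unmount (0 means no leak). */
int unmount_model(void)
{
	struct fs_model fs = { 0 };
	pthread_t cleaner;
	int leaked;

	pthread_mutex_init(&fs.lock, NULL);
	pthread_cond_init(&fs.cond, NULL);
	if (pthread_create(&cleaner, NULL, cleaner_thread, &fs) != 0)
		return -1;

	/* Step 1: park the cleaner and wait until it has really stopped. */
	pthread_mutex_lock(&fs.lock);
	fs.park_requested = true;
	while (!fs.parked)
		pthread_cond_wait(&fs.cond, &fs.lock);

	/* Step 2: the final "commit" drains every iput; nobody can add more. */
	fs.delayed_iputs = 0;

	/* Step 3: let the parked thread exit and reap it. */
	fs.stop = true;
	pthread_cond_broadcast(&fs.cond);
	pthread_mutex_unlock(&fs.lock);
	pthread_join(cleaner, NULL);

	leaked = fs.delayed_iputs;
	pthread_cond_destroy(&fs.cond);
	pthread_mutex_destroy(&fs.lock);
	return leaked;
}
```

The point of the model is the handshake: the unmount path does not merely set a flag and hope the cleaner notices, it waits for the cleaner to confirm it is parked before draining. Checking a "closing" flag alone leaves a window in which the cleaner has already passed the check and can still queue work after the drain.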
    
    Running fstests generic/475 followed by generic/476 can trigger a crash
    that manifests as slab corruption, caused by a wake-up function
    accessing the freed kthread structure. Sample trace:
    
    [ 5657.077612] BUG: unable to handle kernel NULL pointer dereference at 00000000000000cc
    [ 5657.079432] PGD 1c57a067 P4D 1c57a067 PUD da10067 PMD 0
    [ 5657.080661] Oops: 0000 [#1] PREEMPT SMP
    [ 5657.081592] CPU: 1 PID: 5157 Comm: fsstress Tainted: G        W         4.19.0-rc8-default+ #323
    [ 5657.083703] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626cc-prebuilt.qemu-project.org 04/01/2014
    [ 5657.086577] RIP: 0010:shrink_page_list+0x2f9/0xe90
    [ 5657.091937] RSP: 0018:ffffb5c745c8f728 EFLAGS: 00010287
    [ 5657.092953] RAX: 0000000000000074 RBX: ffffb5c745c8f830 RCX: 0000000000000000
    [ 5657.094590] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff9a8747fdf3d0
    [ 5657.095987] RBP: ffffb5c745c8f9e0 R08: 0000000000000000 R09: 0000000000000000
    [ 5657.097159] R10: ffff9a8747fdf5e8 R11: 0000000000000000 R12: ffffb5c745c8f788
    [ 5657.098513] R13: ffff9a877f6ff2c0 R14: ffff9a877f6ff2c8 R15: dead000000000200
    [ 5657.099689] FS:  00007f948d853b80(0000) GS:ffff9a877d600000(0000) knlGS:0000000000000000
    [ 5657.101032] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 5657.101953] CR2: 00000000000000cc CR3: 00000000684bd000 CR4: 00000000000006e0
    [ 5657.103159] Call Trace:
    [ 5657.103776]  shrink_inactive_list+0x194/0x410
    [ 5657.104671]  shrink_node_memcg.constprop.84+0x39a/0x6a0
    [ 5657.105750]  shrink_node+0x62/0x1c0
    [ 5657.106529]  try_to_free_pages+0x1a4/0x500
    [ 5657.107408]  __alloc_pages_slowpath+0x2c9/0xb20
    [ 5657.108418]  __alloc_pages_nodemask+0x268/0x2b0
    [ 5657.109348]  kmalloc_large_node+0x37/0x90
    [ 5657.110205]  __kmalloc_node+0x236/0x310
    [ 5657.111014]  kvmalloc_node+0x3e/0x70
    
    Fixes: 30928e9b ("btrfs: don't run delayed_iputs in commit")
    Signed-off-by: Omar Sandoval <osandov@fb.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    [ add trace ]
    Signed-off-by: David Sterba <dsterba@suse.com>
disk-io.c 123 KB