    fs: handle freezing from multiple devices · 7366f8b6
    Christian Brauner authored
    Before [1], freezing a filesystem through the block layer only worked
    for the main block device because the owning superblock of additional
    block devices could not be found. Any filesystem that made use of
    multiple block devices would only be freezable via its main block device.
    
    For example, consider xfs over device mapper with /dev/dm-0 as main
    block device and /dev/dm-1 as external log device. Two freeze requests
    before [1]:
    
    (1) dmsetup suspend /dev/dm-0 on the main block device
    
        bdev_freeze(dm-0)
        -> dm-0->bd_fsfreeze_count++
        -> freeze_super(xfs-sb)
    
        The owning superblock is found and the filesystem gets frozen.
        Returns 0.
    
    (2) dmsetup suspend /dev/dm-1 on the log device
    
        bdev_freeze(dm-1)
        -> dm-1->bd_fsfreeze_count++
    
        The owning superblock isn't found and only the block device freeze
        count is incremented. Returns 0.
    
    Two freeze requests after [1]:
    
    (1') dmsetup suspend /dev/dm-0 on the main block device
    
        bdev_freeze(dm-0)
        -> dm-0->bd_fsfreeze_count++
        -> freeze_super(xfs-sb)
    
        The owning superblock is found and the filesystem gets frozen.
        Returns 0.
    
    (2') dmsetup suspend /dev/dm-1 on the log device
    
        bdev_freeze(dm-1)
        -> dm-1->bd_fsfreeze_count++
        -> freeze_super(xfs-sb)

        The owning superblock is found, but the filesystem is already
        frozen. Returns -EBUSY.
    
    When (2') is called we initiate a freeze from another block device of
    the same superblock. So we increment the bd_fsfreeze_count for that
    additional block device. But we now also find the owning superblock for
    additional block devices and call freeze_super() again which reports
    -EBUSY.
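
The two traces above can be modeled in userspace. This is a minimal sketch,
not kernel code: struct toy_sb, struct toy_bdev and the toy_* functions are
hypothetical stand-ins, only bd_fsfreeze_count is a name taken from the
traces, and error unwinding and locking are left out. An owner of NULL models
the situation before [1], where the superblock of an additional device could
not be found.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a superblock with a single, non-nesting freeze. */
struct toy_sb {
    int frozen;
};

/* Hypothetical model of a block device; owner == NULL means the owning
 * superblock cannot be found (the pre-[1] case for the log device). */
struct toy_bdev {
    int bd_fsfreeze_count;
    struct toy_sb *owner;
};

/* Non-nesting freeze: a second request on a frozen superblock fails. */
static int toy_freeze_super(struct toy_sb *sb)
{
    if (sb->frozen)
        return -EBUSY;
    sb->frozen = 1;
    return 0;
}

/* Follows the steps of the traces: bump the per-device count, then
 * forward the freeze to the owning superblock if one is found. */
int toy_bdev_freeze(struct toy_bdev *bdev)
{
    bdev->bd_fsfreeze_count++;
    if (bdev->owner == NULL)
        return 0;
    return toy_freeze_super(bdev->owner);
}
```

With both devices pointing at the same toy_sb, the first toy_bdev_freeze()
returns 0 and the second returns -EBUSY, which is the behavior seen in (2').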
    
    This can be reproduced through xfstests via:
    
        mkfs.xfs -f -m crc=1,reflink=1,rmapbt=1, -i sparse=1 -lsize=1g,logdev=/dev/nvme1n1p4 /dev/nvme1n1p3
        mkfs.xfs -f -m crc=1,reflink=1,rmapbt=1, -i sparse=1 -lsize=1g,logdev=/dev/nvme1n1p6 /dev/nvme1n1p5
    
        FSTYP=xfs
        export TEST_DEV=/dev/nvme1n1p3
        export TEST_DIR=/mnt/test
        export TEST_LOGDEV=/dev/nvme1n1p4
        export SCRATCH_DEV=/dev/nvme1n1p5
        export SCRATCH_MNT=/mnt/scratch
        export SCRATCH_LOGDEV=/dev/nvme1n1p6
        export USE_EXTERNAL=yes
    
        sudo ./check generic/311
    
    Current semantics allow two concurrent freezers: one initiated from
    userspace via FREEZE_HOLDER_USERSPACE and one initiated from the kernel
    via FREEZE_HOLDER_KERNEL. If there are multiple concurrent freeze
    requests from the same holder, be it FREEZE_HOLDER_USERSPACE or
    FREEZE_HOLDER_KERNEL, -EBUSY is returned.
    
    We need to preserve these semantics because they are uapi via the
    FIFREEZE and FITHAW ioctl()s. IOW, freezes don't nest for FIFREEZE and
    FITHAW. Other kernel consumers rely on non-nesting freezes as well.
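
For reference, the userspace side of that uapi can be exercised roughly as
follows. This is a sketch: fs_freeze(), fs_thaw() and fs_ioctl() are
hypothetical helper names, while FIFREEZE and FITHAW come from <linux/fs.h>.
The caller needs CAP_SYS_ADMIN and a path on a mounted filesystem that
supports freezing.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>   /* FIFREEZE, FITHAW */
#include <sys/ioctl.h>
#include <unistd.h>

/* Open the given path and issue one freeze/thaw ioctl on it.
 * Returns 0 on success or a negative errno value on failure. */
static int fs_ioctl(const char *mountpoint, unsigned long request)
{
    int ret = 0;
    int fd = open(mountpoint, O_RDONLY);

    if (fd < 0)
        return -errno;
    if (ioctl(fd, request, 0) < 0)
        ret = -errno;   /* capture errno before close() can change it */
    close(fd);
    return ret;
}

/* Hypothetical wrappers around the two ioctls named above. */
int fs_freeze(const char *mountpoint)
{
    return fs_ioctl(mountpoint, FIFREEZE);
}

int fs_thaw(const char *mountpoint)
{
    return fs_ioctl(mountpoint, FITHAW);
}
```

Given the non-nesting semantics described above, a second fs_freeze() on an
already frozen filesystem is expected to return -EBUSY, and a single
fs_thaw() undoes the freeze.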
    
    For freezes initiated from the block layer, freezes need to nest if the
    same superblock is frozen via multiple devices. So we need to start
    counting the number of freeze requests.
    
    If FREEZE_MAY_NEST is passed alongside FREEZE_HOLDER_KERNEL or
    FREEZE_HOLDER_USERSPACE we allow the caller to nest freeze calls.
    
    To accommodate the old semantics we split the freeze counter into two,
    counting kernel-initiated and userspace-initiated freezes separately. We
    can then also stop recording FREEZE_HOLDER_* in struct sb_writers.
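
The counting rules can be illustrated with a small userspace model. This is
a sketch under stated assumptions, not the code in super.c: struct model_sb,
its field names and the model_* functions are hypothetical, the flag values
are arbitrary, and returning -EINVAL for an unmatched thaw is a modeling
choice. Only the FREEZE_* names are taken from this message.

```c
#include <errno.h>
#include <stddef.h>

/* Names follow the commit message; the values are arbitrary here. */
enum {
    FREEZE_HOLDER_KERNEL    = 1 << 0,
    FREEZE_HOLDER_USERSPACE = 1 << 1,
    FREEZE_MAY_NEST         = 1 << 2,
};

/* Hypothetical per-superblock state: one counter per holder type. */
struct model_sb {
    int kernel_count;   /* kernel-initiated freezes */
    int user_count;     /* userspace-initiated freezes */
};

/* Pick the counter that belongs to the requesting holder. */
static int *holder_count(struct model_sb *sb, int who)
{
    if (who & FREEZE_HOLDER_KERNEL)
        return &sb->kernel_count;
    if (who & FREEZE_HOLDER_USERSPACE)
        return &sb->user_count;
    return NULL;
}

/* A holder may freeze if it holds no freeze yet, or if it asked to nest. */
int model_freeze(struct model_sb *sb, int who)
{
    int *count = holder_count(sb, who);

    if (count == NULL)
        return -EINVAL;
    if (*count > 0 && !(who & FREEZE_MAY_NEST))
        return -EBUSY;
    (*count)++;
    return 0;
}

/* Drop one freeze request of the given holder. */
int model_thaw(struct model_sb *sb, int who)
{
    int *count = holder_count(sb, who);

    if (count == NULL || *count == 0)
        return -EINVAL;
    (*count)--;
    return 0;
}

/* The filesystem stays frozen until both counters are back at zero. */
int model_is_frozen(const struct model_sb *sb)
{
    return sb->kernel_count + sb->user_count > 0;
}
```

In this model a second FREEZE_HOLDER_USERSPACE request without
FREEZE_MAY_NEST fails with -EBUSY, while two FREEZE_HOLDER_KERNEL |
FREEZE_MAY_NEST requests, as issued via two block devices of the same
superblock, both succeed and need two thaws to undo.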
    
    We also simplify freezing by making all concurrent freezers share a
    single active superblock reference count instead of having separate
    references for kernel and userspace. I don't see why we would need two
    active reference counts. Neither FREEZE_HOLDER_KERNEL nor
    FREEZE_HOLDER_USERSPACE can put the active reference as long as they are
    concurrent freezers anyway. That was already true before we allowed
    nesting freezes.
    
    Survives various fstests runs with different options including the
    reproducer, online scrub, online repair, fsfreeze, and so on. Also
    survives blktests.
    
    Link: https://lore.kernel.org/linux-block/87bkccnwxc.fsf@debian-BULLSEYE-live-builder-AMD64
    Link: https://lore.kernel.org/r/20231104-vfs-multi-device-freeze-v2-2-5b5b69626eac@kernel.org
    Fixes: 288d8706abfc ("bdev: implement freeze and thaw holder operations") [1] # no backport needed
    Tested-by: Chandan Babu R <chandanbabu@kernel.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Reported-by: Chandan Babu R <chandanbabu@kernel.org>
    Signed-off-by: Christian Brauner <brauner@kernel.org>