• Darrick J. Wong's avatar
    xfs: only clear log incompat flags at clean unmount · 5302a5c8
    Darrick J. Wong authored
    While reviewing the online fsck patchset, someone spied the
    xfs_swapext_can_use_without_log_assistance function and wondered why we
    go through this inverted-bitmask dance to avoid setting the
    XFS_SB_FEAT_INCOMPAT_LOG_SWAPEXT feature.
    
    (The same principles apply to the logged extended attribute update
    feature bit in the since-merged LARP series.)
    
    The reason for this dance is that xfs_add_incompat_log_feature is an
    expensive operation -- it forces the log, pushes the AIL, and then if
    nobody's beaten us to it, sets the feature bit and issues a synchronous
    write of the primary superblock.  That could be a one-time cost
    amortized over the life of the filesystem, but the log quiesce and cover
    operations call xfs_clear_incompat_log_features to remove feature bits
    opportunistically.  On a moderately loaded filesystem this leads to us
    cycling those bits on and off over and over, which hurts performance.
    
    Why do we clear the log incompat bits?  Back in ~2020 I think Dave and I
    had a conversation on IRC[2] about what the log incompat bits represent.
    IIRC in that conversation we decided that the log incompat bits protect
    unrecovered log items so that old kernels won't try to recover them and
    barf.  Since a clean log has no protected log items, we could clear the
    bits at cover/quiesce time.
    
    As Dave Chinner pointed out in the thread, clearing log incompat bits at
    unmount time has positive effects for golden root disk image generator
    setups, since the generator could be running a newer kernel than what
    gets written to the golden image -- if there are log incompat fields set
    in the golden image that was generated by a newer kernel/OS image
    builder then the provisioning host cannot mount the filesystem even
    though the log is clean and recovery is unnecessary to mount the
    filesystem.
    
    Given that it's expensive to set log incompat bits, we really only want
    to do that once per bit per mount.  Therefore, I propose that we only
    clear log incompat bits as part of writing a clean unmount record.  Do
    this by adding an operational state flag to the xfs mount that guards
    whether or not the feature bit clearing can actually take place.
    
    This eliminates the l_incompat_users rwsem that we use to protect a log
    cleaning operation from clearing a feature bit that a frontend thread is
    trying to set -- this lock adds another way to fail w.r.t. locking.  For
    the swapext series, I shard that into multiple locks just to work around
    the lockdep complaints, and that's fugly.
    
    Link: https://lore.kernel.org/linux-xfs/20240131230043.GA6180@frogsfrogsfrogs/Signed-off-by: default avatarDarrick J. Wong <djwong@kernel.org>
    Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
    Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
    5302a5c8
xfs_log_recover.c 98.2 KB