• Eric Biggers's avatar
    fs: fix lazytime expiration handling in __writeback_single_inode() · 1e249cb5
    Eric Biggers authored
    When lazytime is enabled and an inode is being written due to its
    in-memory updated timestamps having expired, either due to a sync() or
    syncfs() system call or due to dirtytime_expire_interval having elapsed,
    the VFS needs to inform the filesystem so that the filesystem can copy
    the inode's timestamps out to the on-disk data structures.
    
    This is done by __writeback_single_inode() calling
    mark_inode_dirty_sync(), which then calls ->dirty_inode(I_DIRTY_SYNC).
    
    However, this occurs after __writeback_single_inode() has already
    cleared the dirty flags from ->i_state.  This causes two bugs:
    
    - mark_inode_dirty_sync() redirties the inode, causing it to remain
      dirty.  This wastefully causes the inode to be written twice.  But
      more importantly, it breaks cases where sync_filesystem() is expected
      to clean dirty inodes.  This includes the FS_IOC_REMOVE_ENCRYPTION_KEY
      ioctl (as reported at
      https://lore.kernel.org/r/20200306004555.GB225345@gmail.com), as well
      as possibly filesystem freezing (freeze_super()).
    
    - Since ->i_state doesn't contain I_DIRTY_TIME when ->dirty_inode() is
      called from __writeback_single_inode() for lazytime expiration,
      xfs_fs_dirty_inode() ignores the notification.  (XFS only cares about
      lazytime expirations, and it assumes that i_state will contain
      I_DIRTY_TIME during those.)  Therefore, lazy timestamps aren't
      persisted by sync(), syncfs(), or dirtytime_expire_interval on XFS.
    
    Fix this by moving the call to mark_inode_dirty_sync() to earlier in
    __writeback_single_inode(), before the dirty flags are cleared from
    i_state.  This makes filesystems be properly notified of the timestamp
    expiration, and it avoids incorrectly redirtying the inode.
    
    This fixes xfstest generic/580 (which tests
    FS_IOC_REMOVE_ENCRYPTION_KEY) when run on ext4 or f2fs with lazytime
    enabled.  It also fixes the new lazytime xfstest I've proposed, which
    reproduces the above-mentioned XFS bug
    (https://lore.kernel.org/r/20210105005818.92978-1-ebiggers@kernel.org).
    
    Alternatively, we could call ->dirty_inode(I_DIRTY_SYNC) directly.  But
    due to the introduction of I_SYNC_QUEUED, mark_inode_dirty_sync() is the
    right thing to do because mark_inode_dirty_sync() now knows not to move
    the inode to a writeback list if it is currently queued for sync.
    
    Fixes: 0ae45f63 ("vfs: add support for a lazytime mount option")
    Cc: stable@vger.kernel.org
    Depends-on: 5afced3b ("writeback: Avoid skipping inode writeback")
    Link: https://lore.kernel.org/r/20210112190253.64307-2-ebiggers@kernel.orgSuggested-by: default avatarJan Kara <jack@suse.cz>
    Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
    Reviewed-by: default avatarJan Kara <jack@suse.cz>
    Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
    Signed-off-by: default avatarJan Kara <jack@suse.cz>
    1e249cb5
fs-writeback.c 74.7 KB