    xfs: periodically relog deferred intent items · 4e919af7
    Darrick J. Wong authored
    There's a subtle design flaw in the deferred log item code that can lead
    to pinning the log tail.  Taking up the defer ops chain examples from
    the previous commit, we can get trapped in sequences like this:
    
    Caller hands us a transaction t0 with D0-D3 attached.  The defer ops
    chain will look like the following if the transaction rolls succeed:
    
    t1: D0(t0), D1(t0), D2(t0), D3(t0)
    t2: d4(t1), d5(t1), D1(t0), D2(t0), D3(t0)
    t3: d5(t1), D1(t0), D2(t0), D3(t0)
    ...
    t9: d9(t7), D3(t0)
    t10: D3(t0)
    t11: d10(t10), d11(t10)
    t12: d11(t10)
    
    In t9, we finish d9 and try to roll to t10 while still holding onto an
    intent item for D3 that we logged in t0.  Until D3 is finished or its
    intent is relogged, the log tail stays pinned where t0 was logged.
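
    The pinning effect can be illustrated with a small userspace model.
    This is only a sketch, not kernel code: log_tail() is a hypothetical
    helper, and transaction numbers stand in for real log sequence
    numbers.  The tail is taken to be the oldest transaction that still
    has a live intent item.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model: logged_in[i] is the transaction number in which the
 * i-th still-pending intent was last logged.  The log tail can advance no
 * further than the oldest of these; with nothing pending it equals head.
 */
static unsigned int log_tail(const unsigned int *logged_in, size_t n,
			     unsigned int head)
{
	unsigned int tail = head;

	assert(logged_in != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (logged_in[i] < tail)
			tail = logged_in[i];
	}
	return tail;
}
```

    Feeding this the t9 state from the example, d9(t7) and D3(t0), gives
    a tail of 0 even though the head has reached 9.  If both intents had
    been relogged in t9, the tail would be 9.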
    
    The previous commit changed the order in which we place new defer ops in
    the defer ops processing chain to reduce the maximum chain length.  Now
    make xfs_defer_finish_noroll capable of relogging the entire chain
    periodically so that we can always move the log tail forward.  Most
    chains will never get relogged, except for operations that generate very
    long chains (large extents containing many blocks with different sharing
    levels) or are on filesystems with small logs and a lot of ongoing
    metadata updates.
    
    Callers are now required to ensure that the transaction reservation is
    large enough to handle logging done items and new intent items for the
    maximum possible chain length.  Most callers are careful to keep the
    chain lengths low, so the overhead should be minimal.
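
    As a rough illustration of that requirement, the extra space a
    caller has to budget grows linearly with the maximum chain length.
    The struct, the function names, and the byte counts below are all
    hypothetical; the real reservation calculations in XFS are more
    involved than this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-item log space costs, in bytes. */
struct relog_cost {
	size_t	intent_bytes;	/* space to log one new intent item */
	size_t	done_bytes;	/* space to log one done item */
};

/*
 * Worst case for one relog pass: every item in the chain needs a done
 * item for its old intent plus a fresh intent item.
 */
static size_t relog_overhead(const struct relog_cost *c, size_t max_chain_len)
{
	assert(c != NULL);
	return max_chain_len * (c->intent_bytes + c->done_bytes);
}

/* Does the reservation cover the caller's own needs plus a full relog? */
static bool reservation_covers_relog(size_t reserved, size_t base_need,
				     const struct relog_cost *c,
				     size_t max_chain_len)
{
	return reserved >= base_need + relog_overhead(c, max_chain_len);
}
```

    With made-up costs of 64 bytes per intent and 32 bytes per done
    item, a chain of four items adds 384 bytes to the reservation.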
    
    The decision to relog an intent item is made based on whether the intent
    was logged in a previous checkpoint, since there's no point in relogging
    an intent into the same checkpoint.
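
    The relog decision can be sketched in userspace C as follows.  The
    types and function names here are hypothetical stand-ins, not the
    kernel's, and relogging is modelled as nothing more than stamping
    the intent with the current checkpoint sequence number.  In the real
    code a relog logs a done item for the old intent and a new intent
    in the current transaction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a logged intent item. */
struct intent {
	unsigned long	logged_seq;	/* checkpoint it was last logged in */
};

/* Hypothetical stand-in for one pending deferred op in the chain. */
struct pending {
	struct intent	*intent;	/* NULL if no intent was logged */
	struct pending	*next;
};

static bool intent_in_current_chkpt(const struct intent *ip,
				    unsigned long cur_seq)
{
	return ip->logged_seq == cur_seq;
}

/*
 * Walk the chain and relog every intent that belongs to an older
 * checkpoint.  Intents already in the current checkpoint are skipped,
 * since relogging them would not move the log tail.  Returns the number
 * of intents relogged.
 */
static size_t relog_chain(struct pending *chain, unsigned long cur_seq)
{
	size_t relogged = 0;

	for (struct pending *dfp = chain; dfp != NULL; dfp = dfp->next) {
		if (dfp->intent == NULL ||
		    intent_in_current_chkpt(dfp->intent, cur_seq))
			continue;

		/* An intent cannot come from a future checkpoint. */
		assert(dfp->intent->logged_seq < cur_seq);
		dfp->intent->logged_seq = cur_seq;
		relogged++;
	}
	return relogged;
}
```

    A second pass over the same chain within the same checkpoint relogs
    nothing, which matches the expectation that most chains are never
    relogged.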
    
    Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
    Reviewed-by: Brian Foster <bfoster@redhat.com>
xfs_trace.h 119 KB