    xfs: allow queued AG intents to drain before scrubbing · d5c88131
    Darrick J. Wong authored
    When a writer thread executes a chain of log intent items, the AG header
    buffer locks will cycle during a transaction roll to get from one intent
    item to the next in a chain.  Although scrub takes all AG header buffer
    locks, this isn't sufficient to guard against scrub checking an AG while
    that writer thread is in the middle of finishing a chain because there's
    no higher level locking primitive guarding allocation groups.
    
    When there's a collision, cross-referencing between data structures
    (e.g. rmapbt and refcountbt) yields false corruption events; if repair
    is running, this results in incorrect repairs, which is catastrophic.
    
    Fix this by adding a count of active intents to the perag structure
    and making scrub wait until it holds both AG header buffer locks and
    the intent counter has reached zero.
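
    The wait condition can be illustrated with a minimal userspace sketch.
    This is not the kernel implementation: struct demo_perag, its fields,
    and every demo_* function are hypothetical names, and a plain boolean
    stands in for the AG header buffer locks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-AG structure (not the real xfs_perag). */
struct demo_perag {
	atomic_int intent_count;   /* queued intents that have not finished */
	bool       headers_locked; /* stand-in for the AG header buffer locks */
};

/* Called when a deferred work item targeting this AG is created. */
static void demo_intent_bump(struct demo_perag *pag)
{
	atomic_fetch_add(&pag->intent_count, 1);
}

/* Called when that deferred work item has been finished. */
static void demo_intent_drop(struct demo_perag *pag)
{
	int old = atomic_fetch_sub(&pag->intent_count, 1);

	assert(old > 0); /* a drop must always pair with an earlier bump */
	(void)old;
}

/*
 * Take the header "locks", then check the counter.  If intents are still
 * pending, release the locks and report failure so that the caller can
 * wait for the chain to drain and try again.
 */
static bool demo_scrub_try_lock_ag(struct demo_perag *pag)
{
	pag->headers_locked = true;
	if (atomic_load(&pag->intent_count) == 0)
		return true;
	pag->headers_locked = false;
	return false;
}
```

    In this sketch scrub proceeds only when both conditions hold at once;
    holding the header locks alone is not treated as sufficient.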
    
    One quirk of the drain code is that deferred bmap updates also bump and
    drop the intent counter.  A fundamental decision made during the design
    phase of the reverse mapping feature is that updates to the rmapbt
    records are always made by the same code that updates the primary
    metadata.  In other words, callers of bmapi functions expect that the
    bmapi functions will queue deferred rmap updates.
    
    Some parts of the reflink code queue deferred refcount (CUI) and bmap
    (BUI) updates in the same head transaction, but the deferred work
    manager completely finishes the CUI before the BUI work is started.  As
    a result, the CUI drops the intent count long before the deferred rmap
    (RUI) update even has a chance to bump the intent count.  The only way
    to keep the intent count elevated between the CUI and RUI is for the BUI
    to bump the counter until the RUI has been created.
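
    The CUI/BUI/RUI ordering described above can be traced with a small
    simulation.  This is only a sketch under stated assumptions: the
    chain_* names are hypothetical, the counter is a plain integer, and
    the chain is reduced to the three steps named in this message.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Records the lowest counter value seen while the chain is unfinished. */
struct chain_trace {
	int count;
	int lowest;
};

static void chain_bump(struct chain_trace *t)
{
	t->count++;
}

static void chain_drop(struct chain_trace *t)
{
	assert(t->count > 0);
	t->count--;
}

static void chain_sample(struct chain_trace *t)
{
	if (t->count < t->lowest)
		t->lowest = t->count;
}

/*
 * Walk a reflink-style chain: CUI and BUI are queued together, the CUI is
 * finished first, then the BUI work queues an RUI.  Returns the lowest
 * counter value observed before the RUI finishes.
 */
static int chain_lowest_count(bool bui_counts)
{
	struct chain_trace t = { .count = 0, .lowest = INT_MAX };

	chain_bump(&t);                 /* CUI queued in head transaction */
	if (bui_counts)
		chain_bump(&t);         /* BUI queued in head transaction */
	chain_sample(&t);

	chain_drop(&t);                 /* CUI completely finished */
	chain_sample(&t);               /* scrub could look here */

	chain_bump(&t);                 /* BUI work creates the RUI */
	if (bui_counts)
		chain_drop(&t);         /* BUI finished */
	chain_sample(&t);

	chain_drop(&t);                 /* RUI finished, chain is done */
	return t.lowest;
}
```

    With bui_counts set the counter never falls below 1 mid-chain; without
    it the counter reaches 0 after the CUI finishes, which is the window
    in which scrub could wrongly conclude that the AG is quiescent.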
    
    A second quirk of the intent drain code is that deferred work items must
    increment the intent counter as soon as the work item is added to the
    transaction.  When a BUI completes and queues an RUI, the RUI must
    increment the counter before the BUI decrements it.  The only way to
    accomplish this is to require that the counter be bumped as soon as the
    deferred work item is created in memory.
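
    The ordering requirement for the BUI-to-RUI handoff can be sketched
    the same way.  The handoff_* names are hypothetical, and "bump later"
    below simply means any point after the BUI has dropped its count; it
    is an illustration, not a description of a real kernel code path.

```c
#include <assert.h>
#include <stdbool.h>

struct handoff_counter {
	int  count;
	bool hit_zero; /* set if the counter ever reached zero */
};

static void handoff_bump(struct handoff_counter *c)
{
	c->count++;
}

static void handoff_drop(struct handoff_counter *c)
{
	assert(c->count > 0);
	c->count--;
	if (c->count == 0)
		c->hit_zero = true;
}

/*
 * Returns true if the counter touched zero before the RUI finished,
 * i.e. if there was a gap during which scrub could have slipped in.
 */
static bool handoff_has_gap(bool bump_at_create)
{
	/* The running BUI already holds one count. */
	struct handoff_counter c = { .count = 1, .hit_zero = false };
	bool gap;

	if (bump_at_create)
		handoff_bump(&c);   /* RUI counted when created in memory */

	handoff_drop(&c);           /* BUI completes */

	if (!bump_at_create)
		handoff_bump(&c);   /* RUI counted only at some later point */

	gap = c.hit_zero;           /* sample before the RUI finishes */
	handoff_drop(&c);           /* RUI finishes */
	return gap;
}
```

    Bumping at creation time keeps the counter at 1 or above across the
    handoff; bumping later lets it pass through zero in between.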
    
    In the next patches we'll improve on this facility, but this patch
    provides the basic functionality.
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
xfs_bmap_item.c 19 KB