    xfs: log new intent items created as part of finishing recovered intent items · 93293bcb
    Darrick J. Wong authored
    During a code inspection, I found a serious bug in the log intent item
    recovery code when an intent item cannot complete all the work and
    decides to requeue itself to get that done.  When this happens, the
    item recovery creates a new incore deferred op representing the
    remaining work and attaches it to the transaction that it allocated.  At
    the end of _item_recover, it moves the entire chain of deferred ops to
    the dummy parent_tp that xlog_recover_process_intents passed to it, but
    fails to log a new intent item for the remaining work before committing
    the transaction for the single unit of work.
    
    xlog_finish_defer_ops logs those new intent items once recovery has
    finished dealing with the intent items that it recovered, but this isn't
    sufficient.  If the log is forced to disk after a recovered log item
    decides to requeue itself and the system goes down before we call
    xlog_finish_defer_ops, the second log recovery will never see the new
    intent item and therefore has no idea that there was more work to do.
    It will finish recovery leaving the filesystem in a corrupted state.
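
    The failure window can be sketched with a toy model.  This is not
    kernel code: toy_disk, recover_one_step, recover_all and
    units_after_crash are hypothetical names invented for illustration,
    and the "log" is reduced to a single counter of units that an
    unfinished on-disk intent still covers.  The only assumption carried
    over from the description above is that each recovered intent
    completes one unit of work per transaction and requeues the rest.

```c
#include <assert.h>
#include <stdbool.h>

#define TOTAL_UNITS 3   /* work covered by the original intent (made-up number) */

/* Hypothetical stand-in for durable state; not an XFS structure. */
struct toy_disk {
    int pending_units;  /* units covered by an unfinished intent in the log */
    int applied_units;  /* units of work already applied to the filesystem */
};

/*
 * Recover one intent: commit a single unit of work and mark the old
 * intent done.  Returns the units still owed, which exist only in memory
 * unless log_new_intent is set.
 */
int recover_one_step(struct toy_disk *d, bool log_new_intent)
{
    assert(d->pending_units > 0);
    int remaining = d->pending_units - 1;

    d->applied_units += 1;
    d->pending_units = 0;       /* old intent now has its done item */

    /* Behavior with the fix: the new intent is part of the same commit. */
    if (log_new_intent && remaining > 0)
        d->pending_units = remaining;

    return remaining;
}

/* A recovery pass that is allowed to run to completion. */
void recover_all(struct toy_disk *d, bool log_new_intent)
{
    int deferred = 0;

    if (d->pending_units > 0)
        deferred = recover_one_step(d, log_new_intent);

    /* Stand-in for the later pass that finishes the deferred work. */
    d->applied_units += deferred;
    d->pending_units = 0;
}

/* Crash after the first commit, then run a second recovery. */
int units_after_crash(bool log_new_intent)
{
    struct toy_disk d = { .pending_units = TOTAL_UNITS, .applied_units = 0 };

    /* First recovery: the commit reaches disk, then the system goes down.
     * The in-memory deferred work is lost with it. */
    (void)recover_one_step(&d, log_new_intent);

    /* Second recovery sees only what the log recorded. */
    recover_all(&d, log_new_intent);
    return d.applied_units;
}
```

    In this model units_after_crash(true) applies all 3 units, while
    units_after_crash(false) stops at 1 because the second recovery
    finds no intent describing the other two.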
    
    The same logic applies to /any/ deferred ops added during intent item
    recovery, not just the one handling the remaining work.

    Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
xfs_refcount_item.c 18.3 KB