1. 05 Oct, 2016 6 commits
  2. 04 Oct, 2016 6 commits
  3. 03 Oct, 2016 18 commits
  4. 02 Oct, 2016 8 commits
  5. 25 Sep, 2016 2 commits
    • Brian Foster's avatar
      xfs: log recovery tracepoints to track current lsn and buffer submission · 5cd9cee9
      Brian Foster authored
      Log recovery has particular rules around buffer submission along with
      tricky corner cases where independent transactions can share an LSN. As
      such, it can be difficult to follow when/why buffers are submitted
      during recovery.
      
      Add a couple tracepoints to post the current LSN of a record when a new
      record is being processed and when a buffer is being skipped due to LSN
      ordering. Also, update the recover item class to include the LSN of the
      current transaction for the item being processed.
      Signed-off-by: default avatarBrian Foster <bfoster@redhat.com>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      5cd9cee9
    • Brian Foster's avatar
      xfs: update metadata LSN in buffers during log recovery · 60a4a222
      Brian Foster authored
      Log recovery is currently broken for v5 superblocks in that it never
      updates the metadata LSN of buffers written out during recovery. The
      metadata LSN is recorded in various bits of metadata to provide recovery
      ordering criteria that prevents transient corruption states reported by
      buffer write verifiers. Without such ordering logic, buffer updates can
      be replayed out of order and lead to false positive transient corruption
      states. This is generally not a corruption vector on its own, but
      corruption detection shuts down the filesystem and ultimately prevents a
      mount if it occurs during log recovery. This requires an xfs_repair run
      that clears the log and potentially loses filesystem updates.
      
      This problem is avoided in most cases as metadata writes during normal
      filesystem operation update the metadata LSN appropriately. The problem
      with log recovery not updating metadata LSNs manifests if the system
      happens to crash shortly after log recovery itself. In this scenario, it
      is possible for log recovery to complete all metadata I/O such that the
      filesystem is consistent. If a crash occurs after that point but before
      the log tail is pushed forward by subsequent operations, however, the
      next mount performs the same log recovery over again. If a buffer is
      updated multiple times in the dirty range of the log, an earlier update
      in the log might not be valid based on the current state of the
      associated buffer after all of the updates in the log had been replayed
      (before the previous crash). If a verifier happens to detect such a
      problem, the filesystem claims corruption and immediately shuts down.
      
      This commonly manifests in practice as directory block verifier failures
      such as the following, likely due to directory verifiers being
      particularly detailed in their checks as compared to most others:
      
        ...
        Mounting V5 Filesystem
        XFS (dm-0): Starting recovery (logdev: internal)
        XFS (dm-0): Internal error XFS_WANT_CORRUPTED_RETURN at line ... of \
          file fs/xfs/libxfs/xfs_dir2_data.c.  Caller xfs_dir3_data_verify ...
        ...
      
      Update log recovery to update the metadata LSN of recovered buffers.
      Since metadata LSNs are already updated by write verifer functions via
      attached log items, attach a dummy log item to the buffer during
      validation and explicitly set the LSN of the current transaction. This
      ensures that the metadata LSN of a buffer is updated based on whether
      the recovery I/O actually completes, and if so, that subsequent recovery
      attempts identify that the buffer is already up to date with respect to
      the current transaction.
      Signed-off-by: default avatarBrian Foster <bfoster@redhat.com>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      60a4a222