1. 16 Aug, 2021 10 commits
    • Dave Chinner's avatar
      xfs: move xlog_commit_record to xfs_log_cil.c · 2ce82b72
      Dave Chinner authored
      It is only used by the CIL checkpoints, and is the counterpart to
      start record formatting and writing that is already local to
      xfs_log_cil.c.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarDarrick J. Wong <djwong@kernel.org>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDarrick J. Wong <djwong@kernel.org>
      2ce82b72
    • Dave Chinner's avatar
      xfs: log head and tail aren't reliable during shutdown · 2562c322
      Dave Chinner authored
      I'm seeing assert failures from xlog_space_left() after a shutdown
      has begun that look like:
      
      XFS (dm-0): log I/O error -5
      XFS (dm-0): xfs_do_force_shutdown(0x2) called from line 1338 of file fs/xfs/xfs_log.c. Return address = xlog_ioend_work+0x64/0xc0
      XFS (dm-0): Log I/O Error Detected.
      XFS (dm-0): Shutting down filesystem. Please unmount the filesystem and rectify the problem(s)
      XFS (dm-0): xlog_space_left: head behind tail
      XFS (dm-0):   tail_cycle = 6, tail_bytes = 2706944
      XFS (dm-0):   GH   cycle = 6, GH   bytes = 1633867
      XFS: Assertion failed: 0, file: fs/xfs/xfs_log.c, line: 1310
      ------------[ cut here ]------------
      Call Trace:
       xlog_space_left+0xc3/0x110
       xlog_grant_push_threshold+0x3f/0xf0
       xlog_grant_push_ail+0x12/0x40
       xfs_log_reserve+0xd2/0x270
       ? __might_sleep+0x4b/0x80
       xfs_trans_reserve+0x18b/0x260
      .....
      
      There are two things here. Firstly, after a shutdown, the log head
      and tail can be out of whack as things abort and release (or don't
      release) resources, so checking them for sanity doesn't make much
      sense. Secondly, xfs_log_reserve() can race with shutdown and so it
      can still fail like this even though it has already checked for a
      log shutdown before calling xlog_grant_push_ail().
      
      So, before ASSERT failing in xlog_space_left(), make sure we haven't
      already shut down....
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <djwong@kernel.org>
      Signed-off-by: default avatarDarrick J. Wong <djwong@kernel.org>
      2562c322
    • Dave Chinner's avatar
      xfs: don't run shutdown callbacks on active iclogs · 502a01fa
      Dave Chinner authored
      When the log is shutdown, it currently walks all the iclogs and runs
      callbacks that are attached to the iclogs, regardless of whether the
      iclog is queued for IO completion or not. This creates a problem for
      contexts attaching callbacks to iclogs in that a racing shutdown can
      run the callbacks even before the attaching context has finished
      processing the iclog and releasing it for IO submission.
      
      If the callback processing of the iclog frees the structure that is
      attached to the iclog, then this leads to an UAF scenario that can
      only be protected against by holding the icloglock from the point
      callbacks are attached through to the release of the iclog. While we
      currently do this, it is not practical or sustainable.
      
      Hence we need to make shutdown processing the responsibility of the
      context that holds active references to the iclog. We know that the
      contexts attaching callbacks to the iclog must have active
      references to the iclog, and that means they must be in either
      ACTIVE or WANT_SYNC states. xlog_state_do_callback() will skip over
      iclogs in these states -except- when the log is shut down.
      
      xlog_state_do_callback() checks the state of the iclogs while
      holding the icloglock, therefore the reference count/state change
      that occurs in xlog_state_release_iclog() after the callbacks are
      atomic w.r.t. shutdown processing.
      
      We can't push the responsibility of callback cleanup onto the CIL
      context because we can have ACTIVE iclogs that have callbacks
      attached that have already been released. Hence we really need to
      internalise the cleanup of callbacks into xlog_state_release_iclog()
      processing.
      
      Indeed, we already have that internalisation via:
      
      xlog_state_release_iclog
        drop last reference
          ->SYNCING
        xlog_sync
          xlog_write_iclog
            if (log_is_shutdown)
              xlog_state_done_syncing()
      	  xlog_state_do_callback()
      	    <process shutdown on iclog that is now in SYNCING state>
      
      The problem is that xlog_state_release_iclog() aborts before doing
      anything if the log is already shut down. It assumes that the
      callbacks have already been cleaned up, and it doesn't need to do
      any cleanup.
      
      Hence the fix is to remove the xlog_is_shutdown() check from
      xlog_state_release_iclog() so that reference counts are correctly
      released from the iclogs, and when the reference count is zero we
      always transition to SYNCING if the log is shut down. Hence we'll
      always enter the xlog_sync() path in a shutdown and eventually end
      up erroring out the iclog IO and running xlog_state_do_callback() to
      process the callbacks attached to the iclog.
      
      This allows us to stop processing referenced ACTIVE/WANT_SYNC iclogs
      directly in the shutdown code, and in doing so gets rid of the UAF
      vector that currently exists. This then decouples the adding of
      callbacks to the iclogs from xlog_state_release_iclog() as we
      guarantee that xlog_state_release_iclog() will process the callbacks
      if the log has been shut down before xlog_state_release_iclog() has
      been called.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <djwong@kernel.org>
      Signed-off-by: default avatarDarrick J. Wong <djwong@kernel.org>
      502a01fa
    • Dave Chinner's avatar
      xfs: separate out log shutdown callback processing · aad7272a
      Dave Chinner authored
      The iclog callback processing done during a forced log shutdown has
      different logic to normal runtime IO completion callback processing.
      Separate out the shutdown callbacks into their own function and call
      that from the shutdown code instead.
      
      We don't need this shutdown specific logic in the normal runtime
      completion code - we'll always run the shutdown version on shutdown,
      and it will do what shutdown needs regardless of whether there are
      racing IO completion callbacks scheduled or in progress. Hence we
      can also simplify the normal IO completion callpath and only abort
      if shutdown occurred while we actively were processing callbacks.
      
      Further, separating out the IO completion logic from the shutdown
      logic avoids callback race conditions from being triggered by log IO
      completion after a shutdown. IO completion will now only run
      callbacks on iclogs that are in the correct state for a callback to
      be run, avoiding the possibility of running callbacks on a
      referenced iclog that hasn't yet been submitted for IO.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <djwong@kernel.org>
      Signed-off-by: default avatarDarrick J. Wong <djwong@kernel.org>
      aad7272a
    • Dave Chinner's avatar
      xfs: rework xlog_state_do_callback() · 8bb92005
      Dave Chinner authored
      Clean it up a bit by factoring and rearranging some of the code.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <djwong@kernel.org>
      Signed-off-by: default avatarDarrick J. Wong <djwong@kernel.org>
      8bb92005
    • Dave Chinner's avatar
      xfs: make forced shutdown processing atomic · b36d4651
      Dave Chinner authored
      The running of a forced shutdown is a bit of a mess. It does racy
      checks for XFS_MOUNT_SHUTDOWN in xfs_do_force_shutdown(), then
      does more racy checks in xfs_log_force_unmount() before finally
      setting XFS_MOUNT_SHUTDOWN and XLOG_IO_ERROR under the
      log->icloglock.
      
      Move the checking and setting of XFS_MOUNT_SHUTDOWN into
      xfs_do_force_shutdown() so we only process a shutdown once and once
      only. Serialise this with the mp->m_sb_lock spinlock so that the
      state change is atomic and won't race. Move all the mount specific
      shutdown state changes from xfs_log_force_unmount() to
      xfs_do_force_shutdown() so they are done atomically with setting
      XFS_MOUNT_SHUTDOWN.
      
      Then get rid of the racy xlog_is_shutdown() check from
      xlog_force_shutdown(), and gate the log shutdown on the
      test_and_set_bit(XLOG_IO_ERROR) test under the icloglock. This
      means that the log is shutdown once and once only, and code that
      needs to prevent races with shutdown can do so by holding the
      icloglock and checking the return value of xlog_is_shutdown().
      
      This results in a predictable shutdown execution process - we set the
      shutdown flags once and process the shutdown once rather than the
      current "as many concurrent shutdowns as can race to the flag
      setting" situation we have now.
      
      Also, now that shutdown is atomic, alway emit a stack trace when the
      error level for the filesystem is high enough. This means that we
      always get a stack trace when trying to diagnose the cause of
      shutdowns in the field, rather than just for SHUTDOWN_CORRUPT_INCORE
      cases.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarDarrick J. Wong <djwong@kernel.org>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDarrick J. Wong <djwong@kernel.org>
      b36d4651
    • Dave Chinner's avatar
      xfs: convert log flags to an operational state field · e1d06e5f
      Dave Chinner authored
      log->l_flags doesn't actually contain "flags" as such, it contains
      operational state information that can change at runtime. For the
      shutdown state, this at least should be an atomic bit because
      it is read without holding locks in many places and so using atomic
      bitops for the state field modifications makes sense.
      
      This allows us to use things like test_and_set_bit() on state
      changes (e.g. setting XLOG_TAIL_WARN) to avoid races in setting the
      state when we aren't holding locks.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <djwong@kernel.org>
      Signed-off-by: default avatarDarrick J. Wong <djwong@kernel.org>
      e1d06e5f
    • Dave Chinner's avatar
      xfs: move recovery needed state updates to xfs_log_mount_finish · fd67d8a0
      Dave Chinner authored
      xfs_log_mount_finish() needs to know if recovery is needed or not to
      make decisions on whether to flush the log and AIL.  Move the
      handling of the NEED_RECOVERY state out to this function rather than
      needing a temporary variable to store this state over the call to
      xlog_recover_finish().
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <djwong@kernel.org>
      Signed-off-by: default avatarDarrick J. Wong <djwong@kernel.org>
      fd67d8a0
    • Dave Chinner's avatar
      xfs: XLOG_STATE_IOERROR must die · 5112e206
      Dave Chinner authored
      We don't need an iclog state field to tell us the log has been shut
      down. We can just check the xlog_is_shutdown() instead. The avoids
      the need to have shutdown overwrite the current iclog state while
      being active used by the log code and so having to ensure that every
      iclog state check handles XLOG_STATE_IOERROR appropriately.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <djwong@kernel.org>
      Signed-off-by: default avatarDarrick J. Wong <djwong@kernel.org>
      5112e206
    • Dave Chinner's avatar
      xfs: convert XLOG_FORCED_SHUTDOWN() to xlog_is_shutdown() · 2039a272
      Dave Chinner authored
      Make it less shouty and a static inline before adding more calls
      through the log code.
      
      Also convert internal log code that uses XFS_FORCED_SHUTDOWN(mount)
      to use xlog_is_shutdown(log) as well.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarDarrick J. Wong <djwong@kernel.org>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDarrick J. Wong <djwong@kernel.org>
      2039a272
  2. 11 Aug, 2021 2 commits
  3. 09 Aug, 2021 20 commits
  4. 06 Aug, 2021 8 commits