1. 06 Nov, 2017 23 commits
  2. 03 Nov, 2017 2 commits
    • xfs: scrub: avoid uninitialized return code · 0dca060c
      Darrick J. Wong authored
      The newly added xfs_scrub_da_btree_block() function has one code path
      that returns the 'error' variable without initializing it first, as
      shown by this compiler warning:
      
      fs/xfs/scrub/dabtree.c: In function 'xfs_scrub_da_btree_block':
      fs/xfs/scrub/dabtree.c:462:9: error: 'error' may be used uninitialized in this function [-Werror=maybe-uninitialized]
      
      Return zero since the caller will exit the scrub code if we don't produce a
      buffer pointer.
      
      Fixes: 7c4a07a4 ("xfs: scrub directory/attribute btrees")
      Reported-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
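      The pattern behind this fix can be sketched in user space. This is a
      minimal illustration only, not the kernel code: demo_read_block() and
      struct demo_buf are hypothetical names, and the -5 error value is just an
      errno-style placeholder. The point is that 'error' is initialized to zero
      so that the early-exit path returns a defined value and the caller acts on
      the missing buffer pointer.

```c
#include <stddef.h>

/* Hypothetical stand-in for a buffer; not a kernel structure. */
struct demo_buf {
    int data;
};

/*
 * Returns 0 with *bpp == NULL when there is nothing to read, so the
 * caller can bail out based on the missing buffer pointer.
 */
static int demo_read_block(int have_block, struct demo_buf *backing,
                           struct demo_buf **bpp)
{
    int error = 0;      /* initialized: every return path is defined */

    *bpp = NULL;
    if (!have_block)
        goto out;       /* early exit relies on the initializer above */

    if (backing == NULL) {
        error = -5;     /* illustrative errno-style failure */
        goto out;
    }
    *bpp = backing;
out:
    return error;
}
```

      A caller in this sketch checks the return value first and then treats a
      NULL buffer pointer as "nothing to do".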
    • xfs: truncate pagecache before writeback in xfs_setattr_size() · 350976ae
      Eryu Guan authored
      On truncate down, if the new size is not block size aligned, we zero the
      rest of the block to avoid exposing stale data to the user, and
      iomap_truncate_page() skips zeroing if the range is already in
      unwritten state or a hole. Then we write back from the on-disk i_size to
      the new size if this range hasn't been written to disk yet, truncate
      the page cache beyond the new EOF, and set the in-core i_size.
      
      The problem is that we could write data between di_size and newsize
      before removing the page cache beyond newsize, as the extents may
      still be in unwritten state right after a buffer write. As such, the
      page of data that newsize lies in has not been zeroed by page cache
      invalidation before it is written, and xfs_do_writepage() hasn't
      triggered its "zero data beyond EOF" case because we haven't
      updated in-core i_size yet. Then a subsequent mmap read could see
      non-zeros past EOF.
      
      I occasionally see this in fsx runs in fstests generic/112. A
      simplified fsx operation sequence looks like this (assuming a 4k block
      size xfs):
      
        fallocate 0x0 0x1000 0x0 keep_size
        write 0x0 0x1000 0x0
        truncate 0x0 0x800 0x1000
        punch_hole 0x0 0x800 0x800
        mapread 0x0 0x800 0x800
      
      Here fallocate allocates an unwritten extent but doesn't update
      i_size; the buffer write populates the page cache while the extent is
      still unwritten; truncate skips zeroing the page past the new EOF and
      writes the page to disk; punch_hole invalidates the page cache; and
      finally mapread reads the block back and sees non-zero data beyond EOF.
      
      Fix it by moving truncate_setsize() to before writeback so the page
      cache invalidation zeros the partial page at the new EOF. This also
      triggers "zero data beyond EOF" in xfs_do_writepage() at writeback
      time, because newsize has been set and page straddles the newsize.
      
      Also fix the wrong 'end' param of the filemap_write_and_wait_range()
      call while we're at it: 'end' is inclusive and should be
      'newsize - 1'.
      Suggested-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Eryu Guan <eguan@redhat.com>
      Acked-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
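      Two details from this message can be modelled in plain C. The sketch
      below is a user-space illustration under stated assumptions, not the XFS
      or page cache code: demo_zero_past_eof() and demo_inclusive_end() are
      hypothetical helpers. The first zeroes the part of an in-memory page that
      lies beyond the new EOF, which is the effect the message attributes to
      page cache invalidation; the second shows why an inclusive 'end' is
      'newsize - 1'.

```c
#include <stddef.h>
#include <string.h>

/*
 * Zero the part of one in-memory "page" that lies at or beyond new_size.
 * page_off is the file offset of the first byte of the page.
 */
static void demo_zero_past_eof(unsigned char *page, size_t page_size,
                               unsigned long long page_off,
                               unsigned long long new_size)
{
    if (new_size <= page_off) {
        memset(page, 0, page_size);     /* whole page is past EOF */
        return;
    }
    if (new_size - page_off < page_size) {
        size_t keep = (size_t)(new_size - page_off);

        /* Page straddles EOF: keep the head, zero the tail. */
        memset(page + keep, 0, page_size - keep);
    }
}

/* Last byte offset of a range that stops at new_size, end inclusive. */
static unsigned long long demo_inclusive_end(unsigned long long new_size)
{
    return new_size - 1;    /* caller must ensure new_size > 0 */
}
```

      With a 4k page at offset 0 and a new size of 0x800, bytes 0x0 through
      0x7ff are kept and bytes 0x800 through 0xfff are zeroed in this model.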
  3. 01 Nov, 2017 4 commits
  4. 31 Oct, 2017 1 commit
  5. 27 Oct, 2017 4 commits
  6. 26 Oct, 2017 6 commits
    • xfs: validate sb_logsunit is a multiple of the fs blocksize · 9c92ee20
      Darrick J. Wong authored
      Make sure the log stripe unit is sane before proceeding with mounting.
      AFAICT this means that logsunit has to be 0, 1, or a multiple of the fs
      block size.  Found this by setting the LSB of logsunit in xfs/350 and
      watching the system crash as soon as we try to write to the log.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
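      The rule stated above (0, 1, or a multiple of the fs block size) can be
      written as a small predicate. This is a sketch based only on the wording
      of the message; demo_logsunit_ok() is a hypothetical name and the actual
      superblock validation may be structured differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if logsunit is 0, 1, or a multiple of blocksize. */
static bool demo_logsunit_ok(uint32_t logsunit, uint32_t blocksize)
{
    if (logsunit <= 1)
        return true;            /* 0 and 1 are accepted as-is */
    if (blocksize == 0)
        return false;           /* avoid dividing by zero on bad input */
    return logsunit % blocksize == 0;
}
```

      For a 4096-byte block size, 32768 passes this check while 4097 (a
      multiple with its low bit set) does not.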
    • xfs: drain the buffer LRU on mount · f1b92bbc
      Brian Foster authored
      Log recovery of v4 filesystems does not use buffer verifiers because
      log recovery historically can result in transient buffer corruption
      when target buffers might be ahead of the log after a crash. v5
      filesystems work around this problem with metadata LSN ordering.
      
      While this log recovery verifier behavior is necessary on v4 supers,
      it can result in leaving buffers around in the LRU without verifiers
      attached for a significant amount of time. This leads to use of
      unverified buffers while the filesystem is in active use, long after
      recovery has completed.
      
      To address this problem, drain all buffers from the LRU as a final
      step of the log mount sequence. Note that this is done
      unconditionally to provide a consistently clean cache footprint,
      regardless of superblock version or log state. As a side effect,
      this ensures that all cache resident, unverified buffers are
      reclaimed after log recovery and therefore must be recreated with
      verifiers on subsequent use.
      Reported-by: Darrick Wong <darrick.wong@oracle.com>
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
    • xfs: fix log block underflow during recovery cycle verification · 9f2a4505
      Brian Foster authored
      It is possible for mkfs to format very small filesystems with too
      small of an internal log with respect to the various minimum size
      and block count requirements. If this occurs when the log happens to
      be smaller than the scan window used for cycle verification and the
      scan wraps the end of the log, the start_blk calculation in
      xlog_find_head() underflows and leads to an attempt to scan an
      invalid range of log blocks. This results in log recovery failure
      and a failed mount.
      
      Since there may be filesystems out in the wild with this kind of
      geometry, we cannot simply refuse to mount. Instead, cap the scan
      window for cycle verification to the size of the physical log. This
      ensures that the cycle verification proceeds as expected when the
      scan wraps the end of the log.
      Reported-by: Zorro Lang <zlang@redhat.com>
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
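      The underflow and the cap can be illustrated with simple integer
      arithmetic. This is a sketch under assumptions, not the body of
      xlog_find_head(): demo_scan_start_blk() and its parameters are
      hypothetical names for the log size in blocks, the head block, and the
      requested scan window.

```c
/*
 * Compute the block at which a backwards scan of scan_window blocks
 * ending at head_blk would start, in a circular log of log_blocks
 * blocks. The window is capped to the log size so that the wrapped
 * case cannot produce a negative start block.
 */
static int demo_scan_start_blk(int log_blocks, int head_blk, int scan_window)
{
    int num_scan = scan_window < log_blocks ? scan_window : log_blocks;

    if (head_blk >= num_scan)
        return head_blk - num_scan;     /* no wrap needed */

    /* Scan wraps the end of the log; safe because num_scan <= log_blocks. */
    return log_blocks - (num_scan - head_blk);
}
```

      Without the cap, a 100-block log with a 256-block window and the head at
      block 10 would give 100 - (256 - 10) = -146. With the cap the window
      shrinks to 100 blocks and the start block is 10.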
    • xfs: more robust recovery xlog buffer validation · 99c26595
      Brian Foster authored
      mkfs has a historical problem where it can format very small
      filesystems with too small of a physical log. Under certain
      conditions, log recovery of an associated filesystem can end up
      passing garbage parameter values to some of the cycle and log record
      verification functions due to bugs in log recovery not dealing with
      such filesystems properly. This results in attempts to read from
      bogus/underflowed log block addresses.
      
      Since the buffer read may ultimately succeed, log recovery can
      proceed with bogus data and otherwise go off the rails and crash.
      One example of this is a negative last_blk being passed to
      xlog_find_verify_log_record() causing us to skip the loop, pass a
      NULL head pointer to xlog_header_check_mount() and crash.
      
      Improve the xlog buffer verification to address this problem. We
      already verify xlog buffer length, so update this mechanism to also
      sanity check for a valid log relative block address and otherwise
      return an error. Pass a fixed, valid log block address from
      xlog_get_bp() since the target address will be validated when the
      buffer is read. This ensures that any bogus log block address/length
      calculations lead to graceful mount failure rather than risking a
      crash or worse if recovery proceeds with bogus data.
      Reported-by: Zorro Lang <zlang@redhat.com>
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
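      The combined length and address check described above might look like
      the following. It is a minimal sketch assuming a log of log_blocks
      blocks addressed from 0; demo_log_range_ok() is a hypothetical name and
      the real verification in the log recovery code may differ in detail.

```c
#include <stdbool.h>

/*
 * True if [blk_no, blk_no + count) is a non-empty range that lies
 * entirely inside a log of log_blocks blocks.
 */
static bool demo_log_range_ok(long long log_blocks, long long blk_no,
                              long long count)
{
    if (blk_no < 0 || blk_no >= log_blocks)
        return false;           /* bogus or underflowed block address */
    if (count <= 0 || count > log_blocks - blk_no)
        return false;           /* bogus length, or range runs off the end */
    return true;
}
```

      A negative block address, such as the negative last_blk mentioned in the
      message, fails this check and can be turned into an error return instead
      of a read from a bogus location.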
    • xfs: add a new xfs_iext_lookup_extent_before helper · dc56015f
      Christoph Hellwig authored
      This helper looks up the last extent that covers space before the
      passed-in block number.  This is useful for truncate and similar
      operations that operate backwards over the extent list.  For xfs_bunmapi
      it is also a slight optimization, as we can return early if there are no
      extents at or below the end of the to-be-truncated range.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
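      The lookup described above can be sketched over a plain sorted array.
      This is an illustration only and does not reflect the in-core extent
      list or the signature of xfs_iext_lookup_extent_before; struct
      demo_extent and demo_lookup_extent_before() are hypothetical, and
      "covers space before bno" is read here as "starts below bno".

```c
/* Hypothetical extent record: blocks [start, start + len). */
struct demo_extent {
    unsigned long long start;
    unsigned long long len;
};

/*
 * Return the index of the last extent that covers any block below bno,
 * or -1 if no extent starts before bno. 'ext' must be sorted by start
 * and non-overlapping.
 */
static int demo_lookup_extent_before(const struct demo_extent *ext, int nr,
                                     unsigned long long bno)
{
    for (int i = nr - 1; i >= 0; i--) {
        if (ext[i].start < bno)
            return i;
    }
    return -1;  /* nothing below bno: a caller can return early */
}
```

      A backwards walk for truncate would start at the returned index and
      decrement it, and a return of -1 corresponds to the early-return case
      mentioned for xfs_bunmapi.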
    • xfs: merge xfs_bmap_read_extents into xfs_iread_extents · 211e95bb
      Christoph Hellwig authored
      xfs_iread_extents is just a trivial wrapper; there is no good reason
      to keep the two separate.
      
      [darrick: minor fixups having left xfs_bmbt_validate_extent intact]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>