1. 21 Nov, 2016 4 commits
    • ext4: avoid lockdep warning when inheriting encryption context · 2f8f5e76
      Eric Biggers authored
      On a lockdep-enabled kernel, xfstests generic/027 fails due to a lockdep
      warning when run on ext4 mounted with -o test_dummy_encryption:
      
          xfs_io/4594 is trying to acquire lock:
           (jbd2_handle){++++.+}, at:
          [<ffffffff813096ef>] jbd2_log_wait_commit+0x5/0x11b
      
          but task is already holding lock:
           (jbd2_handle){++++.+}, at:
          [<ffffffff813000de>] start_this_handle+0x354/0x3d8
      
      The abbreviated call stack is:
      
       [<ffffffff813096ef>] ? jbd2_log_wait_commit+0x5/0x11b
       [<ffffffff8130972a>] jbd2_log_wait_commit+0x40/0x11b
       [<ffffffff813096ef>] ? jbd2_log_wait_commit+0x5/0x11b
       [<ffffffff8130987b>] ? __jbd2_journal_force_commit+0x76/0xa6
       [<ffffffff81309896>] __jbd2_journal_force_commit+0x91/0xa6
       [<ffffffff813098b9>] jbd2_journal_force_commit_nested+0xe/0x18
       [<ffffffff812a6049>] ext4_should_retry_alloc+0x72/0x79
       [<ffffffff812f0c1f>] ext4_xattr_set+0xef/0x11f
       [<ffffffff812cc35b>] ext4_set_context+0x3a/0x16b
       [<ffffffff81258123>] fscrypt_inherit_context+0xe3/0x103
       [<ffffffff812ab611>] __ext4_new_inode+0x12dc/0x153a
       [<ffffffff812bd371>] ext4_create+0xb7/0x161
      
      When a file is created in an encrypted directory, ext4_set_context() is
      called to set an encryption context on the new file.  This calls
      ext4_xattr_set(), which contains a retry loop where the journal is
      forced to commit if an ENOSPC error is encountered.
      
      If the task actually were to wait for the journal to commit in this
      case, then it would deadlock because a handle remains open from
      __ext4_new_inode(), so the running transaction can't be committed yet.
      Fortunately, __jbd2_journal_force_commit() avoids the deadlock by not
      allowing the running transaction to be committed while the current task
      has it open.  However, the above lockdep warning is still triggered.
      
      This is a false positive, introduced by commit 1eaa566d ("jbd2:
      track more dependencies on transaction commit").
      
      Fix the problem by passing the handle through the 'fs_data' argument to
      ext4_set_context(), then using ext4_xattr_set_handle() instead of
      ext4_xattr_set().  And in the case where no journal handle is specified
      and ext4_set_context() has to open one, add an ENOSPC retry loop since
      in that case it is the outermost transaction.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
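The fix described in the ext4 commit above can be pictured with a small user-space simulation. This is only a sketch under stated assumptions: `struct handle`, `xattr_set_handle()`, `should_retry_alloc()` and `set_context()` are hypothetical stand-ins, not the kernel's ext4 or jbd2 functions, free space is a plain counter, and the retry limit of 3 is arbitrary. It shows the two paths: with a caller-supplied handle the write reuses that handle and never forces a commit; without one, the function opens its own handle and retries on ENOSPC, since it is then the outermost transaction.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a journal handle (not the kernel's handle_t). */
struct handle { int open; };

static int free_blocks;     /* simulated free space */
static int commits_forced;  /* number of simulated forced commits */

/* Simulated xattr write on an already-open handle. It never forces a
 * commit, so it is safe to call while an outer handle is held. */
static int xattr_set_handle(struct handle *h, const char *value)
{
    (void)value;
    assert(h != NULL && h->open);
    if (free_blocks == 0)
        return -ENOSPC;
    free_blocks--;
    return 0;
}

/* Simulated "force a commit and try again" decision. Pretends that a
 * commit frees one block. Must only be called with no handle open. */
static int should_retry_alloc(int *retries)
{
    if ((*retries)++ >= 3)
        return 0;
    commits_forced++;
    free_blocks++;
    return 1;
}

/* Set a context value. If the caller already holds a handle, reuse it
 * and report ENOSPC upward; otherwise open our own handle and retry. */
static int set_context(const char *ctx, struct handle *outer)
{
    int retries = 0;
    int err;

    if (outer != NULL)
        return xattr_set_handle(outer, ctx);

    do {
        struct handle h = { .open = 1 };

        err = xattr_set_handle(&h, ctx);
        h.open = 0;  /* close the handle before any forced commit */
    } while (err == -ENOSPC && should_retry_alloc(&retries));

    return err;
}
```

With `free_blocks` at zero, calling `set_context()` with an open outer handle returns `-ENOSPC` without forcing a commit, while calling it with `NULL` forces one simulated commit and then succeeds.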
    • ext4: remove unused function ext4_aligned_io() · d086630e
      Ross Zwisler authored
      The last user of ext4_aligned_io() was the DAX path in
      ext4_direct_IO_write().  This usage was removed by Jan Kara's patch
      entitled "ext4: Rip out DAX handling from direct IO path".
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • dax: rip out get_block based IO support · dd936e43
      Jan Kara authored
      No one uses functions using the get_block callback anymore. Rip them
      out and update documentation.
      Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext2: use iomap_zero_range() for zeroing truncated page in DAX path · 00697eed
      Jan Kara authored
      Currently the last user of ext2_get_blocks() for DAX inodes is
      dax_truncate_page(). Convert that to iomap_zero_range() so that all DAX
      IO uses the iomap path.
      Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
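The ext2 commit above zeroes the tail of a truncated page. The arithmetic behind "the part of the page past the new end of file" can be sketched as follows. This is a minimal illustration, assuming a 4096-byte page; `PAGE_SIZE_BYTES` and `tail_zero_len()` are hypothetical names and are not part of ext2 or the iomap API.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u  /* assumed page size for this sketch */

/* Return how many bytes lie between new_size and the end of the page
 * that contains it. A page-aligned size needs no zeroing. */
static uint64_t tail_zero_len(uint64_t new_size)
{
    uint64_t offset;

    /* The mask below only works for power-of-two page sizes. */
    assert((PAGE_SIZE_BYTES & (PAGE_SIZE_BYTES - 1)) == 0);

    offset = new_size & (PAGE_SIZE_BYTES - 1);
    return offset ? PAGE_SIZE_BYTES - offset : 0;
}
```

For example, truncating to 5000 bytes leaves 3192 bytes to zero in the second page, and truncating to exactly 4096 bytes leaves nothing to zero.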
  2. 20 Nov, 2016 8 commits
  3. 18 Nov, 2016 4 commits
  4. 15 Nov, 2016 5 commits
  5. 14 Nov, 2016 9 commits
  6. 13 Nov, 2016 3 commits
  7. 09 Nov, 2016 1 commit
  8. 08 Nov, 2016 6 commits
    • dax: remove "depends on BROKEN" from FS_DAX_PMD · 190b5caa
      Ross Zwisler authored
      Now that DAX PMD faults are working again and participate in DAX's
      radix tree locking scheme, allow their config option to be enabled.
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
    • xfs: use struct iomap based DAX PMD fault path · 862f1b9d
      Ross Zwisler authored
      Switch xfs_filemap_pmd_fault() from using dax_pmd_fault() to the new and
      improved dax_iomap_pmd_fault().  Also, now that it has no more users,
      remove xfs_get_blocks_dax_fault().
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
    • dax: add struct iomap based DAX PMD support · 642261ac
      Ross Zwisler authored
      DAX PMDs have been disabled since Jan Kara introduced DAX radix tree based
      locking.  This patch allows DAX PMDs to participate in the DAX radix tree
      based locking scheme so that they can be re-enabled using the new struct
      iomap based fault handlers.
      
      There are currently three types of DAX 4k entries: 4k zero pages, 4k DAX
      mappings that have an associated block allocation, and 4k DAX empty
      entries.  The empty entries exist to provide locking for the duration of a
      given page fault.
      
      This patch adds three equivalent 2 MiB DAX entries: Huge Zero Page (HZP)
      entries, PMD DAX entries that have associated block allocations, and 2 MiB
      DAX empty entries.
      
      Unlike the 4k case where we insert a struct page* into the radix tree for
      4k zero pages, for HZP we insert a DAX exceptional entry with the new
      RADIX_DAX_HZP flag set.  This is because we use a single 2 MiB zero page in
      every 2 MiB hole mapping, and it doesn't make sense to have that same struct
      page* with multiple entries in multiple trees.  This would cause contention
      on the single page lock for the one Huge Zero Page, and it would break the
      page->index and page->mapping associations that are assumed to be valid in
      many other places in the kernel.
      
      One difficult use case is when one thread is trying to use 4k entries in
      the radix tree for a given offset, and another thread is using 2 MiB
      entries for that same offset.  The current code handles this by making
      the 2 MiB user fall back to 4k entries in most cases.  This was done
      because it is the simplest solution, and because the use of 2 MiB pages
      is already opportunistic.
      
      If we were to try to upgrade from 4k pages to 2 MiB pages for a given
      range, we would run into the problem of how to lock out 4k page faults
      for the entire 2 MiB range while we clean out the radix tree so we can
      insert the 2 MiB entry.  We can solve this problem if we need to, but I
      think that the cases where both 2 MiB entries and 4k entries are being
      used for the same range will be rare enough and the gain small enough
      that it probably won't be worth the complexity.
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
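The entry types and the fallback rule described in the PMD commit above can be sketched with a flag-encoded value. This is an illustration only: the `ENTRY_*` names, bit positions and `ENTRY_SHIFT` are assumptions chosen for the sketch, not the kernel's RADIX_DAX_* definitions, and `pmd_fault_size()` models only the simple case where a 2 MiB fault meets an existing 4k entry and falls back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits; the values are assumptions, not the kernel's. */
enum {
    ENTRY_EXCEPTIONAL = 1u << 0, /* value is a DAX entry, not a page pointer */
    ENTRY_LOCK        = 1u << 1, /* entry is locked for the current fault */
    ENTRY_PMD         = 1u << 2, /* entry covers 2 MiB instead of 4k */
    ENTRY_HZP         = 1u << 3, /* 2 MiB hole backed by the huge zero page */
    ENTRY_EMPTY       = 1u << 4, /* placeholder that only provides locking */
};
#define ENTRY_SHIFT 5

/* Pack a sector number and flags into one tree value. */
static uint64_t make_entry(uint64_t sector, unsigned int flags)
{
    assert((flags >> ENTRY_SHIFT) == 0);  /* flags must fit below the shift */
    return (sector << ENTRY_SHIFT) | flags | ENTRY_EXCEPTIONAL;
}

static uint64_t entry_sector(uint64_t entry)
{
    return entry >> ENTRY_SHIFT;
}

static bool entry_is_pmd(uint64_t entry)
{
    return (entry & ENTRY_PMD) != 0;
}

enum fault_size { FAULT_4K, FAULT_2M };

/* Decide which size a 2 MiB fault should use, given the value already
 * in the tree for that offset (0 means no entry is present). */
static enum fault_size pmd_fault_size(uint64_t existing)
{
    if (existing != 0 && !entry_is_pmd(existing))
        return FAULT_4K;  /* a 4k entry is already in use: fall back */
    return FAULT_2M;
}
```

Encoding the huge zero page as a flagged entry rather than a page pointer mirrors the reasoning in the commit message: the same zero page would otherwise have to appear in many trees at once.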
    • dax: move put_(un)locked_mapping_entry() in dax.c · 422476c4
      Ross Zwisler authored
      No functional change.
      
      The static functions put_locked_mapping_entry() and
      put_unlocked_mapping_entry() will soon be used in error cases in
      grab_mapping_entry(), so move their definitions above this function.
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
    • dax: move RADIX_DAX_* defines to dax.h · fa28f729
      Ross Zwisler authored
      The RADIX_DAX_* defines currently mostly live in fs/dax.c, with just
      RADIX_DAX_ENTRY_LOCK being in include/linux/dax.h so it can be used in
      mm/filemap.c.  When we add PMD support, though, mm/filemap.c will also need
      access to the RADIX_DAX_PTE type so it can properly construct a 4k sized
      empty entry.
      
      Instead of shifting the defines between dax.c and dax.h as they are
      individually used in other code, just move them wholesale to dax.h so
      they'll be available when we need them.
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
    • dax: dax_iomap_fault() needs to call iomap_end() · 1550290b
      Ross Zwisler authored
      Currently iomap_end() doesn't do anything for DAX page faults for either
      ext2 or XFS.  ext2_iomap_end() just checks for a write underrun, and
      xfs_file_iomap_end() checks to see if it needs to finish a delayed
      allocation.  However, in the future iomap_end() calls might be needed to
      make sure we have balanced allocations, locks, etc.  So, add calls to
      iomap_end() with appropriate error handling to dax_iomap_fault().
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Suggested-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
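The balanced begin/end pattern that the last commit above asks for can be sketched in user space. This is a simplified illustration under stated assumptions: `struct mapping_ops`, `do_fault()` and the `demo_*`/`install_*` helpers are hypothetical, and their signatures are much smaller than the kernel's real iomap operations. The sketch shows one reasonable error-handling policy: once begin() has succeeded, end() is always called, it is told that nothing was done if the middle step failed, and the first error is the one reported.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified ops table (not the kernel's struct iomap_ops). */
struct mapping_ops {
    int (*begin)(void *ctx, long pos, long len);
    int (*end)(void *ctx, long pos, long len, long done);
};

/* Run one fault: begin(), install the mapping, then always end(). */
static int do_fault(const struct mapping_ops *ops, void *ctx, long pos,
                    long len, int (*install)(void *ctx, long pos))
{
    int err, end_err;

    assert(ops != NULL && ops->begin != NULL);

    err = ops->begin(ctx, pos, len);
    if (err)
        return err;  /* begin() failed, so there is nothing to balance */

    err = install(ctx, pos);

    if (ops->end != NULL) {
        /* Report zero bytes done on failure so end() can clean up. */
        end_err = ops->end(ctx, pos, len, err ? 0 : len);
        if (!err)
            err = end_err;  /* keep the first error if there was one */
    }
    return err;
}

/* Demo helpers that count calls so the pairing can be checked. */
struct counts { int begun; int ended; long last_done; };

static int demo_begin(void *ctx, long pos, long len)
{
    (void)pos; (void)len;
    ((struct counts *)ctx)->begun++;
    return 0;
}

static int demo_end(void *ctx, long pos, long len, long done)
{
    struct counts *c = ctx;

    (void)pos; (void)len;
    c->ended++;
    c->last_done = done;
    return 0;
}

static int install_ok(void *ctx, long pos)   { (void)ctx; (void)pos; return 0; }
static int install_fail(void *ctx, long pos) { (void)ctx; (void)pos; return -EIO; }
```

Running `do_fault()` once with `install_ok` and once with `install_fail` leaves the begin and end counters equal, which is the balance property the commit message is concerned with.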