1. 03 Oct, 2008 2 commits
    • generic block based fiemap implementation · 68c9d702
      Josef Bacik authored
      Any block-based fs (this patch includes ext3) just has to declare its own
      fiemap() function and then call this generic function with its own
      get_block_t. This works well for block-based filesystems that map
      multiple contiguous blocks at one time, and it also works for filesystems
      that map only one block at a time; you will just end up with an "extent"
      for each block. One gotcha is that this will not play nicely where there is
      hole+data after the EOF. This function assumes it has hit the end of the
      data as soon as it hits a hole after the EOF, so it will not pick up any
      data past that point. AFAIK no block-based fs does this, but it's noted in
      the comments of the function just in case.
      Signed-off-by: Josef Bacik <jbacik@redhat.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: linux-fsdevel@vger.kernel.org
    • ocfs2: fiemap support · 00dc417f
      Mark Fasheh authored
      Plug ocfs2 into ->fiemap. Some portions of ocfs2_get_clusters() had to be
      refactored so that the extent cache can be skipped in favor of going
      directly to the on-disk records. This makes it easier for us to determine
      which extent is the last one in the btree. Also, I'm not sure we want to be
      caching fiemap lookups anyway as they're not directly related to data
      read/write.
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: ocfs2-devel@oss.oracle.com
      Cc: linux-fsdevel@vger.kernel.org
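The "generic block based fiemap implementation" commit (68c9d702) describes building extents from a per-block mapping callback. The following is a minimal userspace sketch of that idea, not the kernel code: `map_block_fn` is a hypothetical stand-in for a filesystem's get_block_t, `HOLE` is an invented marker for an unmapped block, and "a hole after the EOF" is assumed to mean an unmapped block at or beyond the first block past EOF. It merges physically contiguous blocks into one extent and, like the commit message warns, stops at the first hole past EOF, so any data beyond that hole is missed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HOLE UINT64_MAX /* hypothetical marker: logical block is unmapped */

struct extent {
    uint64_t logical;  /* first logical block of the extent */
    uint64_t physical; /* first physical block of the extent */
    uint64_t len;      /* length in blocks */
};

/* Hypothetical stand-in for get_block_t: maps one logical block to a
 * physical block, or returns HOLE if nothing is mapped there. */
typedef uint64_t (*map_block_fn)(uint64_t logical);

/* Walk `count` blocks starting at `start`, one block at a time, and merge
 * physically contiguous blocks into extents. `eof_block` is the first block
 * past EOF. Returns the number of extents written to `out` (at most `max`). */
size_t blocks_to_extents(map_block_fn map, uint64_t start, uint64_t count,
                         uint64_t eof_block, struct extent *out, size_t max)
{
    size_t n = 0;

    assert(map != NULL && out != NULL);
    for (uint64_t i = 0; i < count; i++) {
        uint64_t lblk = start + i;
        uint64_t pblk = map(lblk);

        if (pblk == HOLE) {
            if (lblk >= eof_block)
                break;    /* hole past EOF: assume end of data */
            continue;     /* hole inside the file: keep scanning */
        }
        if (n > 0 && out[n - 1].logical + out[n - 1].len == lblk &&
            out[n - 1].physical + out[n - 1].len == pblk) {
            out[n - 1].len++; /* contiguous: extend the current extent */
            continue;
        }
        if (n == max)
            break;        /* caller's array is full */
        out[n].logical = lblk;
        out[n].physical = pblk;
        out[n].len = 1;
        n++;
    }
    return n;
}

/* Demo layout (invented): blocks 0-2 and 4-5 are mapped, 3 is a hole,
 * EOF is at block 6, and block 8 holds data past EOF. */
uint64_t demo_map(uint64_t lblk)
{
    if (lblk <= 2)
        return 100 + lblk;
    if (lblk == 4 || lblk == 5)
        return 200 + (lblk - 4);
    if (lblk == 8)
        return 300;
    return HOLE;
}
```

With `demo_map`, scanning ten blocks with EOF at block 6 yields two extents; the data at block 8 is never reported, which is the gotcha the commit message describes. A filesystem that maps only one block at a time and scatters them physically would simply produce one extent per block.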
  2. 08 Oct, 2008 1 commit
    • vfs: vfs-level fiemap interface · c4b929b8
      Mark Fasheh authored
      Basic vfs-level fiemap infrastructure, which sets up a new ->fiemap
      inode operation.
      
      Userspace can get extent information on a file via the fiemap ioctl. As
      input, the fiemap ioctl takes a struct fiemap which includes an array of
      struct fiemap_extent (fm_extents). The size of the extent array is passed as
      fm_extent_count, and the number of extents returned is written into
      fm_mapped_extents. The offset and length fields of the fiemap structure
      (fm_start, fm_length) describe a logical range which will be searched for
      extents. All extents returned will at least partially overlap this range.
      The actual extent offsets and ranges returned will be unmodified from their
      offset and range on-disk.
      
      The fiemap ioctl returns '0' on success. On error, -1 is returned and errno
      is set. If errno is equal to EBADR, then fm_flags will contain those flags
      which were passed in but which the kernel did not understand. On all other
      errors, the contents of fm_extents are undefined.
      
      As fiemap evolved, there have been many authors of the vfs patch. As far as
      I can tell, the list includes:
      Kalpak Shah <kalpak.shah@sun.com>
      Andreas Dilger <adilger@sun.com>
      Eric Sandeen <sandeen@redhat.com>
      Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
      Cc: linux-api@vger.kernel.org
      Cc: linux-fsdevel@vger.kernel.org
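The interface described in the "vfs-level fiemap interface" commit (c4b929b8) can be exercised from userspace roughly as follows. This is a sketch assuming a Linux system whose headers provide `FS_IOC_FIEMAP` in `<linux/fs.h>` and the structures in `<linux/fiemap.h>`; the helper names `fiemap_request_new` and `fiemap_query` are invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/fs.h>     /* FS_IOC_FIEMAP */
#include <linux/fiemap.h> /* struct fiemap, struct fiemap_extent */

/* Allocate a zeroed request that covers the whole file and has room for
 * `count` entries in fm_extents. The caller releases it with free(). */
struct fiemap *fiemap_request_new(uint32_t count)
{
    size_t size = sizeof(struct fiemap) +
                  (size_t)count * sizeof(struct fiemap_extent);
    struct fiemap *fm = calloc(1, size);

    if (fm == NULL)
        return NULL;
    fm->fm_start = 0;                   /* logical start of the search range */
    fm->fm_length = FIEMAP_MAX_OFFSET;  /* search to the end of the file */
    fm->fm_flags = 0;                   /* no optional behaviour requested */
    fm->fm_extent_count = count;        /* capacity of fm_extents */
    return fm;
}

/* Issue the ioctl. Returns the number of extents written to fm_extents
 * (fm_mapped_extents), or -1 with errno set. On EBADR, fm_flags holds the
 * flags the kernel did not understand; on any other error the contents of
 * fm_extents must not be used. */
int fiemap_query(int fd, struct fiemap *fm)
{
    assert(fm != NULL);
    if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0)
        return -1;
    return (int)fm->fm_mapped_extents;
}
```

A caller would open the file, build a request with `fiemap_request_new(32)`, call `fiemap_query(fd, fm)`, and then read `fm->fm_extents[i].fe_logical`, `fe_physical` and `fe_length` for each returned extent. Whether the call succeeds depends on the underlying filesystem implementing ->fiemap.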
  3. 09 Oct, 2008 1 commit
    • ext4: fix xattr deadlock · 4d20c685
      Kalpak Shah authored
      ext4_xattr_set_handle() eventually ends up calling
      ext4_mark_inode_dirty(), which tries to expand the inode by shifting
      the EAs.  This causes the xattr_sem to be downed a second time, which
      leads to a deadlock.
      
      This patch makes sure that if ext4_xattr_set_handle() is in the
      call-chain, ext4_mark_inode_dirty() will not expand the inode.
      Signed-off-by: Kalpak Shah <kalpak.shah@sun.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
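The shape of the xattr deadlock fix (4d20c685) can be illustrated with a toy model. Everything here is hypothetical: `toy_inode`, the `no_expand` flag and the function names are invented, and the non-recursive xattr_sem is modelled by a counter that asserts if it is taken twice. The sketch only shows the pattern of suppressing inode expansion while the xattr-set path is on the call chain; it does not reproduce the actual patch.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_inode {
    int  sem_holders; /* models a non-recursive xattr_sem; must stay <= 1 */
    bool no_expand;   /* hypothetical "do not expand the inode" flag */
    int  expansions;  /* how many times the inode was expanded */
};

static void sem_down(struct toy_inode *i)
{
    assert(i->sem_holders == 0); /* a second down would deadlock */
    i->sem_holders++;
}

static void sem_up(struct toy_inode *i)
{
    assert(i->sem_holders == 1);
    i->sem_holders--;
}

/* Expanding the inode shifts EAs, so it needs the semaphore. */
void toy_expand_inode(struct toy_inode *i)
{
    sem_down(i);
    i->expansions++;
    sem_up(i);
}

/* Marking the inode dirty expands it unless expansion is suppressed. */
void toy_mark_inode_dirty(struct toy_inode *i)
{
    if (!i->no_expand)
        toy_expand_inode(i);
}

/* The xattr-set path holds the semaphore and marks the inode dirty; the
 * flag keeps the nested call from trying to take the semaphore again. */
void toy_xattr_set(struct toy_inode *i)
{
    sem_down(i);
    i->no_expand = true;
    toy_mark_inode_dirty(i);
    i->no_expand = false;
    sem_up(i);
}
```

Without the flag, `toy_xattr_set()` would reach `sem_down()` a second time and trip the assertion, which is the toy equivalent of the deadlock.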
  4. 06 Oct, 2008 2 commits
  5. 09 Oct, 2008 1 commit
  6. 10 Oct, 2008 4 commits
    • ext4: fix initialization of UNINIT bitmap blocks · c806e68f
      Frederic Bohe authored
      This fixes a bug which caused on-line resizing of filesystems with a
      1k blocksize to fail.  The root cause of this bug was the fact that if
      an uninitialized bitmap block gets read in by userspace (which
      e2fsprogs does try to avoid, but which can happen when the blocksize is
      less than the pagesize and an adjacent block is read into memory),
      ext4_read_block_bitmap() was erroneously depending on the buffer
      uptodate flag to decide whether it needed to initialize the bitmap
      block in memory --- i.e., to set the standard set of blocks in use by
      a block group (superblock, bitmaps, inode table, etc.).  Essentially,
      ext4_read_block_bitmap() assumed it was the only routine that might
      try to read a block containing a block bitmap, which is simply not
      true.
      
      To fix this, ext4_read_block_bitmap() and ext4_read_inode_bitmap()
      must always initialize uninitialized bitmap blocks.  Once a block or
      inode is allocated out of that bitmap, it will be marked as
      initialized in the block group descriptor, so in general this won't
      result in any unnecessary extra work.
      Signed-off-by: Frederic Bohe <frederic.bohe@bull.net>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • ext4: Remove old legacy block allocator · c2ea3fde
      Theodore Ts'o authored
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • ext4: Use readahead when reading an inode from the inode table · 240799cd
      Theodore Ts'o authored
      With modern hard drives, reading 64k takes roughly the same time as
      reading a 4k block.  So request readahead for adjacent inode table
      blocks to reduce the time it takes when iterating over directories
      (especially when doing this in htree sort order) in a cold cache case.
      With this patch, the time it takes to run "git status" on a kernel
      tree after flushing the caches via "echo 3 > /proc/sys/vm/drop_caches"
      is reduced by 21%.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • ext4: Improve the documentation for ext4's /proc tunables · 37515fac
      Theodore Ts'o authored
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Alex Tomas <bzzz@sun.com>
      Cc: Andreas Dilger <adilger@sun.com>
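The inode-table readahead commit (240799cd) rests on simple arithmetic: 64k of readahead with 4k blocks is 16 blocks. The sketch below computes which blocks to request around the one that holds the wanted inode. The alignment to a power-of-two window and the clamping to the inode table's bounds are assumptions made for illustration, not details taken from the commit message, and the names `ra_window` and `inode_readahead_window` are invented.

```c
#include <assert.h>
#include <stdint.h>

struct ra_window {
    uint64_t first; /* first block to read ahead */
    uint64_t end;   /* one past the last block to read ahead */
};

/* Compute the readahead window for `block`, which must lie inside the
 * inode table [table_first, table_end). `ra_blks` is the window size in
 * blocks and must be a power of two (e.g. 64k / 4k = 16). */
struct ra_window inode_readahead_window(uint64_t block, uint64_t ra_blks,
                                        uint64_t table_first,
                                        uint64_t table_end)
{
    struct ra_window w;

    assert(ra_blks != 0 && (ra_blks & (ra_blks - 1)) == 0);
    assert(table_first <= block && block < table_end);

    w.first = block & ~(ra_blks - 1); /* round down to a window boundary */
    w.end = w.first + ra_blks;
    if (w.first < table_first)        /* never read before the table */
        w.first = table_first;
    if (w.end > table_end)            /* never read past the table */
        w.end = table_end;
    return w;
}
```

For an inode table spanning blocks 1030 to 1099, a lookup in block 1037 with a 16-block window requests blocks 1030 through 1039, so later lookups of neighbouring inodes are likely to hit already-cached blocks.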
  7. 23 Sep, 2008 2 commits
  8. 22 Sep, 2008 1 commit
  9. 07 Oct, 2008 1 commit
  10. 13 Sep, 2008 2 commits
  11. 09 Oct, 2008 1 commit
  12. 13 Sep, 2008 1 commit
  13. 08 Sep, 2008 1 commit
  14. 16 Sep, 2008 1 commit
    • jbd2: clean up how the journal device name is printed · 05496769
      Theodore Ts'o authored
      Calculate the journal device name once and stash it away in the
      journal_s structure.  This avoids needing to call bdevname()
      everywhere and reduces stack usage by not needing to allocate an
      on-stack buffer.  In addition, we eliminate the '/' that can appear in
      device names (e.g. "cciss/c0d0p9" --- see kernel bugzilla #11321) that
      can cause problems when creating proc directory names, and include the
      inode number to support ocfs2 which creates multiple journals with
      different inode numbers.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
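The name handling described in the jbd2 commit (05496769) amounts to building the name once, removing any '/', and appending the inode number for inode-backed journals. A userspace sketch follows; the replacement character '!', the ":<inode>" suffix format, the use of 0 to mean "no inode", and the function name `journal_devname` are all assumptions for illustration rather than details confirmed by the commit message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build a journal name that is safe to use as a proc directory name.
 * `bdev` is the block device name (it may contain '/'); `ino` is the
 * journal inode number, or 0 if the journal is not inode-backed.
 * Returns 0 on success, -1 if the result does not fit in `out`. */
int journal_devname(char *out, size_t outlen, const char *bdev,
                    unsigned long ino)
{
    int n;

    assert(out != NULL && bdev != NULL);
    if (ino != 0)
        n = snprintf(out, outlen, "%s:%lu", bdev, ino);
    else
        n = snprintf(out, outlen, "%s", bdev);
    if (n < 0 || (size_t)n >= outlen)
        return -1; /* encoding error or truncated output */

    /* A '/' would be read as a path separator; replace each one. */
    for (char *p = out; (p = strchr(p, '/')) != NULL; p++)
        *p = '!';
    return 0;
}
```

Computing the string once and storing it alongside the journal means later log messages can print it directly instead of formatting the device name into a fresh on-stack buffer each time.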
  15. 14 Sep, 2008 1 commit
  16. 08 Sep, 2008 1 commit
  17. 09 Oct, 2008 1 commit
    • ext4: Avoid printk floods in the face of directory corruption · 9d9f1775
      Eric Sandeen authored
      Note: some people think this represents a security bug, since it
      might make the system go away while it is printing a large number of
      console messages, especially if a serial console is involved.  Hence,
      it has been assigned CVE-2008-3528, but it requires that the attacker
      either has physical access to your machine to insert a USB disk with a
      corrupted filesystem image (at which point why not just hit the power
      button), or is otherwise able to convince the system administrator to
      mount an arbitrary filesystem image (at which point why not just
      include a setuid shell or world-writable hard disk device file or some
      such).  Me, I think they're just being silly. --tytso
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: linux-ext4@vger.kernel.org
      Cc: Eugene Teo <eugeneteo@kernel.sg>
  18. 13 Sep, 2008 2 commits
  19. 09 Sep, 2008 3 commits
  20. 09 Oct, 2008 2 commits
  21. 10 Oct, 2008 1 commit
  22. 09 Sep, 2008 1 commit
  23. 09 Oct, 2008 1 commit
    • ext4: Make sure all the block allocation paths reserve blocks · a30d542a
      Aneesh Kumar K.V authored
      With delayed allocation we need to make sure blocks are reserved before
      we attempt to allocate them. Otherwise we get a block allocation failure
      (ENOSPC) during writepages, which cannot be handled. This would mean
      silent data loss (we do a printk stating that data will be lost). This
      patch updates the DIO and fallocate code paths to do block reservation
      before block allocation. This is needed to make sure parallel DIO and
      fallocate requests don't take blocks out of the delayed reserve space.
      
      When the free blocks count goes below a threshold we switch to a slow path
      which looks at other CPUs' accumulated percpu counter values.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  24. 20 Aug, 2008 1 commit
  25. 09 Sep, 2008 3 commits
  26. 09 Oct, 2008 2 commits