1. 24 Nov, 2009 3 commits
    • Akira Fujita's avatar
      ext4: move_extent_per_page() cleanup · ac48b0a1
      Akira Fujita authored
      Integrate duplicate lines (acquire/release semaphore and invalidate
      extent cache in move_extent_per_page()) into mext_replace_branches(),
      to reduce source and object code size.
      Signed-off-by: default avatarAkira Fujita <a-fujita@rs.jp.nec.com>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      ac48b0a1
    • Kazuya Mio's avatar
      ext4: initialize moved_len before calling ext4_move_extents() · 446aaa6e
      Kazuya Mio authored
      The move_extent.moved_len is used to pass back the number of exchanged
      blocks count to user space.  Currently the caller must clear this
      field; but we spend more code space checking for this requirement than
      simply zeroing the field ourselves, so let's just make life easier for
      everyone all around.
      Signed-off-by: default avatarKazuya Mio <k-mio@sx.jp.nec.com>
      Signed-off-by: default avatarAkira Fujita <a-fujita@rs.jp.nec.com>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      446aaa6e
    • Akira Fujita's avatar
      ext4: Fix double-free of blocks with EXT4_IOC_MOVE_EXT · 94d7c16c
      Akira Fujita authored
      At the beginning of ext4_move_extent(), we call
      ext4_discard_preallocations() to discard inode PAs of orig and donor
      inodes.  But in the following case, blocks can be double freed, so
      move ext4_discard_preallocations() to the end of ext4_move_extents().
      
      1. Discard inode PAs of orig and donor inodes with
         ext4_discard_preallocations() in ext4_move_extents().
      
         orig : [ DATA1 ]
         donor: [ DATA2 ]
      
      2. While data blocks are exchanging between orig and donor inodes, new
         inode PAs is created to orig by other process's block allocation.
         (Since there are semaphore gaps in ext4_move_extents().)  And new
         inode PAs is used partially (2-1).
      
         2-1 Create new inode PAs to orig inode
         orig : [ DATA1 | used PA1 | free PA1 ]
         donor: [ DATA2 ]
      
      3. Donor inode which has old orig inode's blocks is deleted after
         EXT4_IOC_MOVE_EXT finished (3-1, 3-2).  So the block bitmap
         corresponds to old orig inode's blocks are freed.
      
         3-1 After EXT4_IOC_MOVE_EXT finished
         orig : [ DATA2 |  free PA1 ]
         donor: [ DATA1 |  used PA1 ]
      
         3-2 Delete donor inode
         orig : [ DATA2 |  free PA1 ]
         donor: [ FREE SPACE(DATA1) | FREE SPACE(used PA1) ]
      
      4. The double-free of blocks is occurred, when close() is called to
         orig inode.  Because ext4_discard_preallocations() for orig inode
         frees used PA1 and free PA1, though used PA1 is already freed in 3.
      
         4-1 Double-free of blocks is occurred
         orig : [ DATA2 |  FREE SPACE(free PA1) ]
         donor: [ FREE SPACE(DATA1) | DOUBLE FREE(used PA1) ]
      Signed-off-by: default avatarAkira Fujita <a-fujita@rs.jp.nec.com>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      94d7c16c
  2. 23 Nov, 2009 4 commits
  3. 22 Nov, 2009 1 commit
  4. 23 Nov, 2009 1 commit
  5. 24 Nov, 2009 1 commit
  6. 23 Nov, 2009 1 commit
    • Theodore Ts'o's avatar
      ext4: move ext4_forget() to ext4_jbd2.c · d6797d14
      Theodore Ts'o authored
      The ext4_forget() function better belongs in ext4_jbd2.c.  This will
      allow us to do some cleanup of the ext4_journal_revoke() and
      ext4_journal_forget() functions, as well as giving us better error
      reporting since we can report the caller of ext4_forget() when things
      go wrong.
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      d6797d14
  7. 19 Nov, 2009 2 commits
  8. 23 Nov, 2009 2 commits
  9. 15 Nov, 2009 1 commit
  10. 23 Nov, 2009 2 commits
  11. 15 Nov, 2009 2 commits
  12. 23 Nov, 2009 3 commits
    • Theodore Ts'o's avatar
      ext4: make sure directory and symlink blocks are revoked · 50689696
      Theodore Ts'o authored
      When an inode gets unlinked, the functions ext4_clear_blocks() and
      ext4_remove_blocks() call ext4_forget() for all the buffer heads
      corresponding to the deleted inode's data blocks.  If the inode is a
      directory or a symlink, the is_metadata parameter must be non-zero so
      ext4_forget() will revoke them via jbd2_journal_revoke().  Otherwise,
      if these blocks are reused for a data file, and the system crashes
      before a journal checkpoint, the journal replay could end up
      corrupting these data blocks.
      
      Thanks to Curt Wohlgemuth for pointing out potential problems in this
      area.
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
      50689696
    • Theodore Ts'o's avatar
      ext4: add tracepoint for ext4_forget() · beac2da7
      Theodore Ts'o authored
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      beac2da7
    • Theodore Ts'o's avatar
      ext4: remove failed journal checksum check · cf40db13
      Theodore Ts'o authored
      Now that we are checking for failed journal checksums in the jbd2
      layer, we don't need to check in the ext4 mount path --- since a
      checksum fail will result in ext4_load_journal() returning an error,
      causing the file system to refuse to be mounted until e2fsck can deal
      with the problem.
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      cf40db13
  13. 15 Nov, 2009 1 commit
    • Theodore Ts'o's avatar
      jbd2: don't wipe the journal on a failed journal checksum · e6a47428
      Theodore Ts'o authored
      If there is a failed journal checksum, don't reset the journal.  This
      allows for userspace programs to decide how to recover from this
      situation.  It may be that ignoring the journal checksum failure might
      be a better way of recovering the file system.  Once we add per-block
      checksums, we can definitely do better.  Until then, a system
      administrator can try backing up the file system image (or taking a
      snapshot) and and trying to determine experimentally whether ignoring
      the checksum failure or aborting the journal replay results in less
      data loss.
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Cc: stable@kernel.org
      e6a47428
  14. 14 Nov, 2009 1 commit
  15. 23 Nov, 2009 6 commits
  16. 13 Nov, 2009 1 commit
  17. 12 Nov, 2009 8 commits