1. 01 Jan, 2010 2 commits
    • ext4: Calculate metadata requirements more accurately · 9d0be502
      Theodore Ts'o authored
      In the past, ext4_calc_metadata_amount(), and its sub-functions
      ext4_ext_calc_metadata_amount() and ext4_indirect_calc_metadata_amount()
      badly over-estimated the number of metadata blocks that might be
      required for delayed allocation blocks.  This didn't matter as much
      when functions which managed the reserved metadata blocks were more
      aggressive about dropping reserved metadata blocks as delayed
      allocation blocks were written, but unfortunately they were too
      aggressive.  This was fixed in commit 0637c6f4, but as a result the
      over-estimation by ext4_calc_metadata_amount() would lead to reserving
      2-3 times the number of pending delayed allocation blocks as
      potentially required metadata blocks.  So if there is 1 megabyte of
      blocks which have not yet been allocated, up to 3 megabytes of
      space would get reserved out of the user's quota and from the file
      system free space pool until all of the inode's data blocks have been
      allocated.
      
      This commit addresses this problem by much more accurately estimating
      the number of metadata blocks that will be required.  It will still
      somewhat over-estimate the number of blocks needed, since it must make
      a worst case estimate not knowing which physical blocks will be
      needed, but it is much more accurate than before.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
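The figures in the commit message above can be checked with a little arithmetic. The sketch below is only an illustration, not the kernel's estimator: the 4 KiB block size and the helper names `toy_bytes_to_blocks()` and `toy_reserved_metadata()` are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration only: 4 KiB file system blocks. */
#define TOY_BLOCK_SIZE 4096u

/* Convert a byte count to whole blocks, rounding up. */
static uint64_t toy_bytes_to_blocks(uint64_t bytes)
{
    return (bytes + TOY_BLOCK_SIZE - 1) / TOY_BLOCK_SIZE;
}

/*
 * Metadata blocks reserved for `pending` delayed allocation blocks when
 * the estimator reserves `factor` metadata blocks per pending data block.
 * The commit message describes the old behaviour as a factor of 2-3.
 */
static uint64_t toy_reserved_metadata(uint64_t pending, unsigned factor)
{
    assert(factor >= 1);
    return pending * factor;
}
```

With these assumptions, 1 megabyte of pending data is 256 blocks, and a factor of 3 reserves 768 metadata blocks, which is the 3 megabytes mentioned in the message.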
    • ext4: Fix accounting of reserved metadata blocks · ee5f4d9c
      Theodore Ts'o authored
      Commit 0637c6f4 had a typo which caused the reserved metadata blocks to
      not be released correctly.   Fix this.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  2. 30 Dec, 2009 2 commits
    • ext4: Patch up how we claim metadata blocks for quota purposes · 0637c6f4
      Theodore Ts'o authored
      As reported in Kernel Bugzilla #14936, commit d21cd8f1 triggered a BUG
      in the function ext4_da_update_reserve_space() found in
      fs/ext4/inode.c.  The root cause of this BUG() was the fact
      that ext4_calc_metadata_amount() can severely over-estimate how many
      metadata blocks will be needed, especially when using direct
      block-mapped files.
      
      In addition, it can also badly *under* estimate how much space is
      needed, since ext4_calc_metadata_amount() assumes that the blocks are
      contiguous, and this is not always true.  If the application is
      writing blocks to a sparse file, the number of metadata blocks
      necessary can be severely underestimated by the functions
      ext4_da_reserve_space(), ext4_da_update_reserve_space() and
      ext4_da_release_space().  This was the cause of the dq_claim_space
      reports found on kerneloops.org.
      
      Unfortunately, doing this right means that we need to massively
      over-estimate the amount of free space needed.  So in some cases we
      may need to force the inode to be written to disk asynchronously
      to avoid spurious quota failures.
      
      http://bugzilla.kernel.org/show_bug.cgi?id=14936
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • ext4: Ensure zeroout blocks have no dirty metadata · 515f41c3
      Aneesh Kumar K.V authored
      This fixes a bug (found by Curt Wohlgemuth) in which new blocks
      returned from an extent created with ext4_ext_zeroout() can have dirty
      metadata still associated with them.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Curt Wohlgemuth <curtw@google.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  3. 25 Dec, 2009 1 commit
  4. 24 Dec, 2009 1 commit
  5. 23 Dec, 2009 6 commits
    • jbd2: don't use __GFP_NOFAIL in journal_init_common() · 3ebfdf88
      Andrew Morton authored
      It triggers the warning in get_page_from_freelist(), and it isn't
      appropriate to use __GFP_NOFAIL here anyway.
      
      Addresses http://bugzilla.kernel.org/show_bug.cgi?id=14843
      Reported-by: Christian Casteyde <casteyde.christian@free.fr>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • ext4: flush delalloc blocks when space is low · c8afb446
      Eric Sandeen authored
      Creating many small files in rapid succession on a small
      filesystem can lead to spurious ENOSPC; on a 104MB filesystem:
      
      for i in `seq 1 22500`; do
          echo -n > $SCRATCH_MNT/$i
          echo XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX > $SCRATCH_MNT/$i
      done
      
      leads to ENOSPC even though after a sync, 40% of the fs is free
      again.
      
      This is because we reserve worst-case metadata for delalloc writes,
      and when data is allocated that worst-case reservation is not
      usually needed.
      
      When freespace is low, kicking off an async writeback will start
      converting that worst-case space usage into something more realistic,
      almost always freeing up space to continue.
      
      This resolves the testcase for me, and survives all 4 generic
      ENOSPC tests in xfstests.
      
      We'll still need a hard synchronous sync to squeeze out the last bit,
      but this fixes things up to a large degree.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • fs-writeback: Add helper function to start writeback if idle · 17bd55d0
      Eric Sandeen authored
      ext4, at least, would like to start pushing on writeback if it starts
      to get close to ENOSPC when reserving worst-case blocks for delalloc
      writes.  Writing out delalloc data will convert those worst-case
      predictions into usually smaller actual usage, freeing up space
      before we hit ENOSPC based on this speculation.
      
      Thanks to Jens for the suggestion for the helper function,
      & the naming help.
      
      I've made the helper return status on whether writeback was
      started even though I don't plan to use it in the ext4 patch;
      it seems like it would be potentially useful to test this
      in some cases.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Acked-by: Jan Kara <jack@suse.cz>
    • ext4: Eliminate potential double free on error path · d3533d72
      Julia Lawall authored
      b_entry_name and buffer are initially NULL, are initialized within a loop
      to the result of calling kmalloc, and are freed at the bottom of this loop.
      The loop contains gotos to cleanup, which also frees b_entry_name and
      buffer.  Some of these gotos are before the reinitializations of
      b_entry_name and buffer.  To maintain the invariant that b_entry_name and
      buffer are NULL at the top of the loop, and thus acceptable arguments to
      kfree, these variables are now set to NULL after the kfrees.
      
      This seems to be the simplest solution.  A more complicated solution
      would be to introduce more labels in the error handling code at the end of
      the function.
      
      A simplified version of the semantic match that finds this problem is as
      follows: (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @r@
      identifier E;
      expression E1;
      iterator I;
      statement S;
      @@
      
      *kfree(E);
      ... when != E = E1
          when != I(E,...) S
          when != &E
      *kfree(E);
      // </smpl>
      Signed-off-by: Julia Lawall <julia@diku.dk>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
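The pattern described in the commit message above can be reproduced in userspace with malloc() and free(). The sketch below is an analogue, not the ext4 code: the function `toy_process_entries()`, its parameters, and the buffer sizes are made up for illustration. Resetting the pointers to NULL after freeing keeps the shared cleanup label safe, since free(NULL) is a no-op.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Process `count` entries; simulate an error at iteration `fail_at`
 * (pass -1 for no failure). Returns 0 on success, -1 on error.
 */
static int toy_process_entries(int count, int fail_at)
{
    char *buffer = NULL;
    char *name = NULL;
    int err = 0;

    assert(count >= 0);
    for (int i = 0; i < count; i++) {
        /* This error exit happens before the pointers are re-assigned. */
        if (i == fail_at) {
            err = -1;
            goto cleanup;
        }
        buffer = malloc(64);
        name = malloc(16);
        if (!buffer || !name) {
            err = -1;
            goto cleanup;
        }
        memset(buffer, 0, 64);
        memset(name, 0, 16);

        /* Free at the bottom of the loop and restore the invariant. */
        free(name);
        name = NULL;
        free(buffer);
        buffer = NULL;
    }
cleanup:
    /* Safe on every path: the pointers are NULL or own live memory. */
    free(name);
    free(buffer);
    return err;
}
```

Without the two NULL assignments, a failure on the second iteration would jump to the cleanup label with stale pointers and free the same memory twice.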
    • ext4: fix unsigned long long printk warning in super.c · a6b43e38
      Andrew Morton authored
      sparc64 allmodconfig:
      
      fs/ext4/super.c: In function `lifetime_write_kbytes_show':
      fs/ext4/super.c:2174: warning: long long unsigned int format, long unsigned int arg (arg 4)
      fs/ext4/super.c:2174: warning: long long unsigned int format, long unsigned int arg (arg 4)
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
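The warnings above come from passing an unsigned long argument to a format that expects unsigned long long. One common way to silence this class of warning is an explicit cast at the call site, shown here in a userspace sketch with snprintf(); the function `toy_format_kbytes()` is a hypothetical name, and the sketch does not claim to reproduce the exact change made in super.c.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format a kilobyte count with a "%llu" conversion. The argument is
 * unsigned long, so it is cast to match the format on every
 * architecture, including those where the two types differ in width.
 */
static int toy_format_kbytes(char *buf, size_t len, unsigned long kbytes)
{
    assert(buf != NULL);
    return snprintf(buf, len, "%llu\n", (unsigned long long)kbytes);
}
```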
    • ext4, jbd2: Add barriers for file systems with external journals · cc3e1bea
      Theodore Ts'o authored
      This is a bit complicated because we are trying to optimize when we
      send barriers to the fs data disk.  We could just throw in an extra
      barrier to the data disk whenever we send a barrier to the journal
      disk, but that's not always strictly necessary.
      
      We only need to send a barrier during a commit when there are data
      blocks which must be written out due to an inode written in
      ordered mode, or if fsync() depends on the commit to force data blocks
      to disk.  Finally, before we drop transactions from the beginning of
      the journal during a checkpoint operation, we need to guarantee that
      any blocks that were flushed out to the data disk are firmly on the
      rust platter before we drop the transaction from the journal.
      
      Thanks to Oleg Drokin for pointing out this flaw in ext3/ext4.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  6. 14 Dec, 2009 1 commit
  7. 21 Dec, 2009 2 commits
  8. 14 Dec, 2009 1 commit
  9. 24 Dec, 2009 24 commits