Commits · f6d2f6b327ceef5c689581529a852dc6ec3b74a6 · Kirill Smelkov / linux

23 May, 2011 3 commits

ext4: fix unbalanced up_write() in ext4_ext_truncate() error path · f6d2f6b3

Eric Gouriou authored May 22, 2011

ext4_ext_truncate() should not invoke up_write(&EXT4_I(inode)->i_data_sem)
when ext4_orphan_add() returns an error, as it hasn't performed a
down_write() yet. This trivial patch fixes this by moving the up_write()
invocation above the out_stop label.
Signed-off-by: Eric Gouriou <egouriou@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

f6d2f6b3

ext4: count hits/misses of extent cache and expose in sysfs · 77f4135f

Vivek Haldar authored May 22, 2011

The number of hits and misses for each filesystem is exposed in
/sys/fs/ext4/<dev>/extent_cache_{hits, misses}.

Tested: fsstress, manual checks.
Signed-off-by: Vivek Haldar <haldar@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

77f4135f

ext4: make ext4_split_extent() handle error correctly · 93917411

Yongqiang Yang authored May 22, 2011

Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Mingming Cao <cmm@us.ibm.com>

93917411

22 May, 2011 1 commit

ext4: don't show mount options in /proc/mounts if there is no journal · 373cd5c5

Theodore Ts'o authored May 22, 2011

After creating an ext4 file system without a journal:

  # mke2fs -t ext4 -O ^has_journal /dev/sda
  # mount -t ext4 /dev/sda /test

the /proc/mounts will show:
"/dev/sda /test ext4 rw,relatime,user_xattr,acl,barrier=1,data=writeback 0 0"
which can fool users into thinking that the fs is using writeback mode.

So don't set the writeback option when the journal has not been
enabled; we don't depend on the writeback option being set, since
ext4_should_writeback_data() in ext4_jbd2.h tests to see if the
journal is not present before returning true.
Reported-by: Robin Dong <sanbai@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

373cd5c5

20 May, 2011 4 commits

ext4: fix possible use-after-free in ext4_remove_li_request() · 1bb933fb

Lukas Czerner authored May 20, 2011

We need to take reference to the s_li_request after we take a mutex,
because it might be freed since then, hence result in accessing old
already freed memory. Also we should protect the whole
ext4_remove_li_request() because ext4_li_info might be in the process of
being freed in ext4_lazyinit_thread().
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>

1bb933fb

ext4: fix the mount option "init_itable=n" to work as expected for n=0 · 51ce6511

Lukas Czerner authored May 20, 2011

For some reason, when we set the mount option "init_itable=0" it
behaves as we would set init_itable=20 which is not right at all.
Basically when we set it to zero we are saying to lazyinit thread not
to wait between zeroing the inode table (except of cond_resched()) so
this commit fixes that and removes the unnecessary condition.  The 'n'
should be also properly used on remount.

When the n is not set at all, it means that the default miltiplier
EXT4_DEF_LI_WAIT_MULT is set instead.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Eric Sandeen <sandeen@redhat.com>

51ce6511

ext4: Remove unnecessary wait_event ext4_run_lazyinit_thread() · e1290b3e

Lukas Czerner authored May 20, 2011

For some reason we have been waiting for lazyinit thread to start in the
ext4_run_lazyinit_thread() but it is not needed since it was jus
unnecessary complexity, so get rid of it. We can also remove li_task and
li_wait_task since it is not used anymore.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>

e1290b3e

ext4: Use schedule_timeout_interruptible() for waiting in lazyinit thread · 4ed5c033

Lukas Czerner authored May 20, 2011

In order to make lazyinit eat approx. 10% of io bandwidth at max, we
are sleeping between zeroing each single inode table. For that purpose
we are using timer which wakes up thread when it expires. It is set
via add_timer() and this may cause troubles in the case that thread
has been woken up earlier and in next iteration we call add_timer() on
still running timer hence hitting BUG_ON in add_timer(). We could fix
that by using mod_timer() instead however we can use
schedule_timeout_interruptible() for waiting and hence simplifying
things a lot.

This commit exchange the old "waiting mechanism" with simple
schedule_timeout_interruptible(), setting the time to sleep. Hence we
do not longer need li_wait_daemon waiting queue and others, so get rid
of it.

Addresses-Red-Hat-Bugzilla: #699708
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>

4ed5c033

18 May, 2011 3 commits

ext4: wait for writeback to complete while making pages writable · 0e499890

Darrick J. Wong authored May 18, 2011

In order to stabilize pages during disk writes, ext4_page_mkwrite must
wait for writeback operations to complete before making a page
writable.  Furthermore, the function must return locked pages, and
recheck the writeback status if the page lock is ever dropped.  The
"someone could wander in" part of this patch was suggested by Chris
Mason.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

0e499890

ext4: clean up some wait_on_page_writeback calls · 7cb1a535

Darrick J. Wong authored May 18, 2011

wait_on_page_writeback already checks the writeback bit, so callers of it
needn't do that test.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

7cb1a535

ext4: don't warn about mnt_count if it has been disabled · ed3ce80a

Tao Ma authored May 18, 2011

Currently, if we mkfs a new ext4 volume with s_max_mnt_count set to
zero, and mount it for the first time, we will get the warning:

	maximal mount count reached, running e2fsck is recommended

It is really misleading. So change the check so that it won't warn in
that case.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ed3ce80a

16 May, 2011 2 commits

ext4: ext4_ext_convert_to_initialized bug found in extended FSX testing · 9b940f8e

Allison Henderson authored May 16, 2011

This patch addresses bugs found while testing punch hole 
with the fsx test.  The patch corrects the number of blocks
that are zeroed out while splitting an extent, and also corrects
the return value to return the number of blocks split out, instead
of the number of blocks zeroed out.

This patch has been tested in addition to the following patches: 
[Ext4 punch hole v7]
[XFS Tests Punch Hole 1/1 v2] Add Punch Hole Testing to FSX

The test ran successfully for 24 hours.
Signed-off-by: Allison Henderson <achender@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

9b940f8e

ext4: fix oops in ext4_quota_off() · 0b268590

Amir Goldstein authored May 16, 2011

If quota is not enabled when ext4_quota_off() is called, we must not
dereference quota file inode since it is NULL.  Check properly for
this.

This fixes a bug in commit 21f97697 (ext4: remove unnecessary
[cm]time update of quota file), which was merged for 2.6.39-rc3.
Reported-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

0b268590

15 May, 2011 1 commit

ext4: don't dereference null pointer when make_indexed_dir() fails · 6976a6f2

Allison Henderson authored May 15, 2011

Fix for a null pointer bug found while running punch hole tests
Signed-off-by: Allison Henderson <achender@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

6976a6f2

10 May, 2011 4 commits

ext4: remove alloc_semp · 44183d42

Amir Goldstein authored May 09, 2011

After taking care of all group init races, all that remains is to
remove alloc_semp from ext4_allocation_context and ext4_buddy structs.
Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

44183d42

ext4: teach ext4_mb_init_cache() to skip uptodate buddy caches · 9b8b7d35

Amir Goldstein authored May 09, 2011

After online resize which adds new groups, some of the groups
in a buddy page may be initialized and uptodate, while other
(new ones) may be uninitialized.

The indication for init of new block groups is when ext4_mb_init_cache()
is called with an uptodate buddy page. In this case, initialized groups
on that buddy page must be skipped when initializing the buddy cache.
Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

9b8b7d35

ext4: synchronize ext4_mb_init_group() with buddy page lock · 2de8807b

Amir Goldstein authored May 09, 2011

The old routines ext4_mb_[get|put]_buddy_cache_lock(), which used
to take grp->alloc_sem for all groups on the buddy page have been
replaced with the routines ext4_mb_[get|put]_buddy_page_lock().

The new routines take both buddy and bitmap page locks to protect
against concurrent init of groups on the same buddy page.

The GROUP_NEED_INIT flag is tested again under page lock to check
if the group was initialized by another caller.
Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

2de8807b

ext4: implement ext4_add_groupblocks() by freeing blocks · e73a347b

Amir Goldstein authored May 09, 2011

The old imlementation used to take grp->alloc_sem and set the
GROUP_NEED_INIT flag, so that the buddy cache would be reloaded.

The new implementation updates the buddy cache by freeing the added
blocks and making them available for use, so there is no need to
reload the buddy cache and there is no need to take grp->alloc_sem.
Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e73a347b

09 May, 2011 6 commits

ext4: remove unneeded ext4_journal_get_undo_access · 2cd05cc3

Theodore Ts'o authored May 09, 2011

The block allocation code used to use jbd2_journal_get_undo_access as
a way to make changes that wouldn't show up until the commit took
place. The new multi-block allocation code has a its own way of
preventing newly freed blocks from getting reused until the commit
takes place (it avoids updating the buddy bitmaps until the commit is
done), so we don't need to use jbd2_journal_get_undo_access(), which
has extra overhead compared to jbd2_journal_get_write_access().

There was one last vestigal use of ext4_journal_get_undo_access() in
ext4_add_groupblocks(); change it to use ext4_journal_get_write_access()
and then remove the ext4_journal_get_undo_access() support.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

2cd05cc3

ext4: move ext4_add_groupblocks() to mballoc.c · 2846e820

Amir Goldstein authored May 09, 2011

In preparation for the next patch, the function ext4_add_groupblocks()
is moved to mballoc.c, where it could use some static functions.
Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

2846e820

ext4: remove redundant #ifdef in super.c · 66bb8279

Amerigo Wang authored May 09, 2011

There is already an #ifdef CONFIG_QUOTA some lines above,
so this one is totally useless.
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

66bb8279

ext4: remove redundant check for first_not_zeroed in ext4_register_li_request · 55ff3840

Tao Ma authored May 09, 2011

We have checked first_not_zeroed == ngroups already above, so remove
this redundant check.

sbi->s_li_request = NULL above is also removed since it is NULL
already.

Cc: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

55ff3840

ext4: use s_inodes_per_block directly in __ext4_get_inode_loc · 00d09882

Tao Ma authored May 09, 2011

In __ext4_get_inode_loc, we calculate inodes_per_block every time by
EXT4_BLOCK_SIZE(sb) / EXT4_INODE_SIZE(sb).  AFAICS, this function is a
hot path for ext4, so we'd better use s_inodes_per_block directly
instead of calculating every time.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

00d09882

ext4: use EXT4FS_DEBUG instead of EXT4_DEBUG in fsync.c · e8bbe8c4

Tao Ma authored May 09, 2011

We have EXT4FS_DEBUG for some old debug and CONFIG_EXT4_DEBUG
for the new mballoc debug, but there isn't any EXT4_DEBUG.

As CONFIG_EXT4_DEBUG seems to be only used in mballoc, use
EXT4FS_DEBUG in fsync.c.

[ It doesn't really matter; although I'm including this commit for
  consistency's sake.  The whole point of the #ifdef's is to disable
  the debugging code.  In general you're not going to want to enable
  all of the code protected by EXT4FS_DEBUG at the same time.  -- Ted ]
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

e8bbe8c4

08 May, 2011 2 commits

jbd2: only print the debugging information for tid wraparound once · 1be2add6

Theodore Ts'o authored May 08, 2011

If we somehow wrap, we don't want to keep printing the warning message
over and over again.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

1be2add6

jbd2: Fix forever sleeping process in do_get_write_access() · 229309ca

Jan Kara authored May 08, 2011

In do_get_write_access() we wait on BH_Unshadow bit for buffer to get
from shadow state. The waking code in journal_commit_transaction() has
a bug because it does not issue a memory barrier after the buffer is
moved from the shadow state and before wake_up_bit() is called. Thus a
waitqueue check can happen before the buffer is actually moved from
the shadow state and waiting process may never be woken. Fix the
problem by issuing proper barrier.
Reported-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

229309ca

03 May, 2011 6 commits

ext4: reimplement convert and split_unwritten · 667eff35

Yongqiang Yang authored May 03, 2011

Reimplement ext4_ext_convert_to_initialized() and
ext4_split_unwritten_extents() using ext4_split_extent()
Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Tested-by: Allison Henderson <achender@linux.vnet.ibm.com>

667eff35

ext4: add ext4_split_extent_at() and ext4_split_extent() · 47ea3bb5

Yongqiang Yang authored May 03, 2011

Add two functions: ext4_split_extent_at(), which splits an extent into
two extents at given logical block, and ext4_split_extent() which
splits an extent into three extents.
Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Tested-by: Allison Henderson <achender@linux.vnet.ibm.com>

47ea3bb5

ext4: add a function merging extents right and left · 197217a5

Yongqiang Yang authored May 03, 2011

1) Rename ext4_ext_try_to_merge() to ext4_ext_try_to_merge_right().

2) Add a new function ext4_ext_try_to_merge() which tries to merge
   an extent both left and right.

3) Use the new function in ext4_ext_convert_unwritten_endio() and
   ext4_ext_insert_extent().
Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Tested-by: Allison Henderson <achender@linux.vnet.ibm.com>

197217a5

ext4: fix deadlock in ext4_symlink() in ENOSPC conditions · df5e6223

Jan Kara authored May 03, 2011

ext4_symlink() cannot call __page_symlink() with transaction open.
__page_symlink() calls ext4_write_begin() which can wait for
transaction commit if we are running out of space thus causing a
deadlock. Also error recovery in ext4_truncate_failed_write() does not
count with the transaction being already started (although I'm not
aware of any particular deadlock here).

Fix the problem by stopping a transaction before calling
__page_symlink() (we have to be careful and put inode to orphan list
so that it gets deleted in case of crash) and starting another one
after __page_symlink() returns for addition of symlink into a
directory.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

df5e6223

ext4: Fix fs corruption when make_indexed_dir() fails · 7ad8e4e6

Jan Kara authored May 03, 2011

When make_indexed_dir() fails (e.g. because of ENOSPC) after it has
allocated block for index tree root, we did not properly mark all
changed buffers dirty.  This lead to only some of these buffers being
written out and thus effectively corrupting the directory.

Fix the issue by marking all changed data dirty even in the error
failure case.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

7ad8e4e6

ext4: set extents flag when migrating file to use extents · 74e4e6db

Theodore Ts'o authored May 03, 2011

Fix a typo that was introduced in commit 07a03824 (in 2.6.36) which
caused the extents flag not to be set at the conclusion of converting
an inode to use extents.
Reported-by: Peter Uchno <peter.uchno@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

74e4e6db

01 May, 2011 3 commits

jbd2: fix fsync() tid wraparound bug · deeeaf13

Theodore Ts'o authored May 01, 2011

If an application program does not make any changes to the indirect
blocks or extent tree, i_datasync_tid will not get updated.  If there
are enough commits (i.e., 2**31) such that tid_geq()'s calculations
wrap, and there isn't a currently active transaction at the time of
the fdatasync() call, this can end up triggering a BUG_ON in
fs/jbd2/commit.c:

	J_ASSERT(journal->j_running_transaction != NULL);

It's pretty rare that this can happen, since it requires the use of
fdatasync() plus *very* frequent and excessive use of fsync().  But
with the right workload, it can.

We fix this by replacing the use of tid_geq() with an equality test,
since there's only one valid transaction id that we is valid for us to
wait until it is commited: namely, the currently running transaction
(if it exists).
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

deeeaf13

ext4: remove obsolete mount options from ext4's documentation · 59802db0
Theodore Ts'o authored May 01, 2011
```
The block reservation code from ext3 was removed long ago...
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
```
59802db0

ext4: remove dead code in ext4_has_free_blocks() · dc2070a2

Shaohua Li authored May 01, 2011

percpu_counter_sum_positive() never returns a negative value.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

dc2070a2

30 Apr, 2011 2 commits

ext4: ignore errors when issuing discards · d9f34504

Theodore Ts'o authored Apr 30, 2011

This is an effective revert of commit a30eec2a: "ext4: stop issuing
discards if not supported by device".  The problem is that there are
some devices that may return errors in response to a discard request
some times but not others.  (One example would be a hybrid dm device
which concatenates an SSD and an HDD device).

By this logic, I also removed the error checking from ext4's FITRIM
code; so that an error from a discard will not stop the FITRIM from
trying to trim the rest of the file system.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

d9f34504

ext4: don't set PageUptodate in ext4_end_bio() · 39db00f1

Curt Wohlgemuth authored Apr 30, 2011

In the bio completion routine, we should not be setting
PageUptodate at all -- it's set at sys_write() time, and is
unaffected by success/failure of the write to disk.

This can cause a page corruption bug when the file system's
block size is less than the architecture's VM page size.

if we have only written a single block -- we might end up
setting the page's PageUptodate flag, indicating that page
is completely read into memory, which may not be true.
This could cause subsequent reads to get bad data.

This commit also takes the opportunity to clean up error
handling in ext4_end_bio(), and remove some extraneous code:

   - fixes ext4_end_bio() to set AS_EIO in the
     page->mapping->flags on error, which was left out by
     mistake.  This is needed so that fsync() will
     return an error if there was an I/O error.
   - remove the clear_buffer_dirty() call on unmapped
     buffers for each page.
   - consolidate page/buffer error handling in a single
     section.
Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Jim Meyering <jim@meyering.net>
Reported-by: Hugh Dickins <hughd@google.com>
Cc: Mingming Cao <cmm@us.ibm.com>

39db00f1

18 Apr, 2011 1 commit

ext4: check for ext[23] file system features when mounting as ext[23] · 2035e776

Theodore Ts'o authored Apr 18, 2011

Provide better emulation for ext[23] mode by enforcing that the file
system does not have any unsupported file system features as defined
by ext[23] when emulating the ext[23] file system driver when
CONFIG_EXT4_USE_FOR_EXT23 is defined.

This causes the file system type information in /proc/mounts to be
correct for the automatically mounted root file system.  This also
means that "mount -t ext2 /dev/sda /mnt" will fail if /dev/sda
contains an ext3 or ext4 file system, just as one would expect if the
original ext2 file system driver were in use.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

2035e776

16 Apr, 2011 1 commit

ext4: release page cache in ext4_mb_load_buddy error path · 26626f11

Yang Ruirui authored Apr 16, 2011

Add missing page_cache_release in the error path of ext4_mb_load_buddy
Signed-off-by: Yang Ruirui <ruirui.r.yang@tieto.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org

26626f11

12 Apr, 2011 1 commit
- Linux 2.6.39-rc3 · a6360dd3
  Linus Torvalds authored Apr 11, 2011
  
  a6360dd3