- 23 May, 2011 3 commits
-
-
Eric Gouriou authored
ext4_ext_truncate() should not invoke up_write(&EXT4_I(inode)->i_data_sem) when ext4_orphan_add() returns an error, as it hasn't performed a down_write() yet. This trivial patch fixes this by moving the up_write() invocation above the out_stop label. Signed-off-by: Eric Gouriou <egouriou@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Vivek Haldar authored
The number of hits and misses for each filesystem is exposed in /sys/fs/ext4/<dev>/extent_cache_{hits, misses}. Tested: fsstress, manual checks. Signed-off-by: Vivek Haldar <haldar@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Yongqiang Yang authored
Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Mingming Cao <cmm@us.ibm.com>
-
- 22 May, 2011 1 commit
-
-
Theodore Ts'o authored
After creating an ext4 file system without a journal: # mke2fs -t ext4 -O ^has_journal /dev/sda # mount -t ext4 /dev/sda /test the /proc/mounts will show: "/dev/sda /test ext4 rw,relatime,user_xattr,acl,barrier=1,data=writeback 0 0" which can fool users into thinking that the fs is using writeback mode. So don't set the writeback option when the journal has not been enabled; we don't depend on the writeback option being set, since ext4_should_writeback_data() in ext4_jbd2.h tests to see if the journal is not present before returning true. Reported-by: Robin Dong <sanbai@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 20 May, 2011 4 commits
-
-
Lukas Czerner authored
We need to take reference to the s_li_request after we take a mutex, because it might be freed since then, hence result in accessing old already freed memory. Also we should protect the whole ext4_remove_li_request() because ext4_li_info might be in the process of being freed in ext4_lazyinit_thread(). Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
-
Lukas Czerner authored
For some reason, when we set the mount option "init_itable=0" it behaves as we would set init_itable=20 which is not right at all. Basically when we set it to zero we are saying to lazyinit thread not to wait between zeroing the inode table (except of cond_resched()) so this commit fixes that and removes the unnecessary condition. The 'n' should be also properly used on remount. When the n is not set at all, it means that the default miltiplier EXT4_DEF_LI_WAIT_MULT is set instead. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reported-by: Eric Sandeen <sandeen@redhat.com>
-
Lukas Czerner authored
For some reason we have been waiting for lazyinit thread to start in the ext4_run_lazyinit_thread() but it is not needed since it was jus unnecessary complexity, so get rid of it. We can also remove li_task and li_wait_task since it is not used anymore. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
-
Lukas Czerner authored
In order to make lazyinit eat approx. 10% of io bandwidth at max, we are sleeping between zeroing each single inode table. For that purpose we are using timer which wakes up thread when it expires. It is set via add_timer() and this may cause troubles in the case that thread has been woken up earlier and in next iteration we call add_timer() on still running timer hence hitting BUG_ON in add_timer(). We could fix that by using mod_timer() instead however we can use schedule_timeout_interruptible() for waiting and hence simplifying things a lot. This commit exchange the old "waiting mechanism" with simple schedule_timeout_interruptible(), setting the time to sleep. Hence we do not longer need li_wait_daemon waiting queue and others, so get rid of it. Addresses-Red-Hat-Bugzilla: #699708 Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
-
- 18 May, 2011 3 commits
-
-
Darrick J. Wong authored
In order to stabilize pages during disk writes, ext4_page_mkwrite must wait for writeback operations to complete before making a page writable. Furthermore, the function must return locked pages, and recheck the writeback status if the page lock is ever dropped. The "someone could wander in" part of this patch was suggested by Chris Mason. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Darrick J. Wong authored
wait_on_page_writeback already checks the writeback bit, so callers of it needn't do that test. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Tao Ma authored
Currently, if we mkfs a new ext4 volume with s_max_mnt_count set to zero, and mount it for the first time, we will get the warning: maximal mount count reached, running e2fsck is recommended It is really misleading. So change the check so that it won't warn in that case. Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 16 May, 2011 2 commits
-
-
Allison Henderson authored
This patch addresses bugs found while testing punch hole with the fsx test. The patch corrects the number of blocks that are zeroed out while splitting an extent, and also corrects the return value to return the number of blocks split out, instead of the number of blocks zeroed out. This patch has been tested in addition to the following patches: [Ext4 punch hole v7] [XFS Tests Punch Hole 1/1 v2] Add Punch Hole Testing to FSX The test ran successfully for 24 hours. Signed-off-by: Allison Henderson <achender@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Amir Goldstein authored
If quota is not enabled when ext4_quota_off() is called, we must not dereference quota file inode since it is NULL. Check properly for this. This fixes a bug in commit 21f97697 (ext4: remove unnecessary [cm]time update of quota file), which was merged for 2.6.39-rc3. Reported-by: Amir Goldstein <amir73il@users.sf.net> Signed-off-by: Amir Goldstein <amir73il@users.sf.net> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 15 May, 2011 1 commit
-
-
Allison Henderson authored
Fix for a null pointer bug found while running punch hole tests Signed-off-by: Allison Henderson <achender@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 10 May, 2011 4 commits
-
-
Amir Goldstein authored
After taking care of all group init races, all that remains is to remove alloc_semp from ext4_allocation_context and ext4_buddy structs. Signed-off-by: Amir Goldstein <amir73il@users.sf.net> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Amir Goldstein authored
After online resize which adds new groups, some of the groups in a buddy page may be initialized and uptodate, while other (new ones) may be uninitialized. The indication for init of new block groups is when ext4_mb_init_cache() is called with an uptodate buddy page. In this case, initialized groups on that buddy page must be skipped when initializing the buddy cache. Signed-off-by: Amir Goldstein <amir73il@users.sf.net> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Amir Goldstein authored
The old routines ext4_mb_[get|put]_buddy_cache_lock(), which used to take grp->alloc_sem for all groups on the buddy page have been replaced with the routines ext4_mb_[get|put]_buddy_page_lock(). The new routines take both buddy and bitmap page locks to protect against concurrent init of groups on the same buddy page. The GROUP_NEED_INIT flag is tested again under page lock to check if the group was initialized by another caller. Signed-off-by: Amir Goldstein <amir73il@users.sf.net> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Amir Goldstein authored
The old imlementation used to take grp->alloc_sem and set the GROUP_NEED_INIT flag, so that the buddy cache would be reloaded. The new implementation updates the buddy cache by freeing the added blocks and making them available for use, so there is no need to reload the buddy cache and there is no need to take grp->alloc_sem. Signed-off-by: Amir Goldstein <amir73il@users.sf.net> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 09 May, 2011 6 commits
-
-
Theodore Ts'o authored
The block allocation code used to use jbd2_journal_get_undo_access as a way to make changes that wouldn't show up until the commit took place. The new multi-block allocation code has a its own way of preventing newly freed blocks from getting reused until the commit takes place (it avoids updating the buddy bitmaps until the commit is done), so we don't need to use jbd2_journal_get_undo_access(), which has extra overhead compared to jbd2_journal_get_write_access(). There was one last vestigal use of ext4_journal_get_undo_access() in ext4_add_groupblocks(); change it to use ext4_journal_get_write_access() and then remove the ext4_journal_get_undo_access() support. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Amir Goldstein authored
In preparation for the next patch, the function ext4_add_groupblocks() is moved to mballoc.c, where it could use some static functions. Signed-off-by: Amir Goldstein <amir73il@users.sf.net> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Amerigo Wang authored
There is already an #ifdef CONFIG_QUOTA some lines above, so this one is totally useless. Signed-off-by: WANG Cong <amwang@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Tao Ma authored
We have checked first_not_zeroed == ngroups already above, so remove this redundant check. sbi->s_li_request = NULL above is also removed since it is NULL already. Cc: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Tao Ma authored
In __ext4_get_inode_loc, we calculate inodes_per_block every time by EXT4_BLOCK_SIZE(sb) / EXT4_INODE_SIZE(sb). AFAICS, this function is a hot path for ext4, so we'd better use s_inodes_per_block directly instead of calculating every time. Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Tao Ma authored
We have EXT4FS_DEBUG for some old debug and CONFIG_EXT4_DEBUG for the new mballoc debug, but there isn't any EXT4_DEBUG. As CONFIG_EXT4_DEBUG seems to be only used in mballoc, use EXT4FS_DEBUG in fsync.c. [ It doesn't really matter; although I'm including this commit for consistency's sake. The whole point of the #ifdef's is to disable the debugging code. In general you're not going to want to enable all of the code protected by EXT4FS_DEBUG at the same time. -- Ted ] Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 08 May, 2011 2 commits
-
-
Theodore Ts'o authored
If we somehow wrap, we don't want to keep printing the warning message over and over again. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Jan Kara authored
In do_get_write_access() we wait on BH_Unshadow bit for buffer to get from shadow state. The waking code in journal_commit_transaction() has a bug because it does not issue a memory barrier after the buffer is moved from the shadow state and before wake_up_bit() is called. Thus a waitqueue check can happen before the buffer is actually moved from the shadow state and waiting process may never be woken. Fix the problem by issuing proper barrier. Reported-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 03 May, 2011 6 commits
-
-
Yongqiang Yang authored
Reimplement ext4_ext_convert_to_initialized() and ext4_split_unwritten_extents() using ext4_split_extent() Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Tested-by: Allison Henderson <achender@linux.vnet.ibm.com>
-
Yongqiang Yang authored
Add two functions: ext4_split_extent_at(), which splits an extent into two extents at given logical block, and ext4_split_extent() which splits an extent into three extents. Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Tested-by: Allison Henderson <achender@linux.vnet.ibm.com>
-
Yongqiang Yang authored
1) Rename ext4_ext_try_to_merge() to ext4_ext_try_to_merge_right(). 2) Add a new function ext4_ext_try_to_merge() which tries to merge an extent both left and right. 3) Use the new function in ext4_ext_convert_unwritten_endio() and ext4_ext_insert_extent(). Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Tested-by: Allison Henderson <achender@linux.vnet.ibm.com>
-
Jan Kara authored
ext4_symlink() cannot call __page_symlink() with transaction open. __page_symlink() calls ext4_write_begin() which can wait for transaction commit if we are running out of space thus causing a deadlock. Also error recovery in ext4_truncate_failed_write() does not count with the transaction being already started (although I'm not aware of any particular deadlock here). Fix the problem by stopping a transaction before calling __page_symlink() (we have to be careful and put inode to orphan list so that it gets deleted in case of crash) and starting another one after __page_symlink() returns for addition of symlink into a directory. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Jan Kara authored
When make_indexed_dir() fails (e.g. because of ENOSPC) after it has allocated block for index tree root, we did not properly mark all changed buffers dirty. This lead to only some of these buffers being written out and thus effectively corrupting the directory. Fix the issue by marking all changed data dirty even in the error failure case. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
Fix a typo that was introduced in commit 07a03824 (in 2.6.36) which caused the extents flag not to be set at the conclusion of converting an inode to use extents. Reported-by: Peter Uchno <peter.uchno@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 01 May, 2011 3 commits
-
-
Theodore Ts'o authored
If an application program does not make any changes to the indirect blocks or extent tree, i_datasync_tid will not get updated. If there are enough commits (i.e., 2**31) such that tid_geq()'s calculations wrap, and there isn't a currently active transaction at the time of the fdatasync() call, this can end up triggering a BUG_ON in fs/jbd2/commit.c: J_ASSERT(journal->j_running_transaction != NULL); It's pretty rare that this can happen, since it requires the use of fdatasync() plus *very* frequent and excessive use of fsync(). But with the right workload, it can. We fix this by replacing the use of tid_geq() with an equality test, since there's only one valid transaction id that we is valid for us to wait until it is commited: namely, the currently running transaction (if it exists). Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Theodore Ts'o authored
The block reservation code from ext3 was removed long ago... Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Shaohua Li authored
percpu_counter_sum_positive() never returns a negative value. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 30 Apr, 2011 2 commits
-
-
Theodore Ts'o authored
This is an effective revert of commit a30eec2a: "ext4: stop issuing discards if not supported by device". The problem is that there are some devices that may return errors in response to a discard request some times but not others. (One example would be a hybrid dm device which concatenates an SSD and an HDD device). By this logic, I also removed the error checking from ext4's FITRIM code; so that an error from a discard will not stop the FITRIM from trying to trim the rest of the file system. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Curt Wohlgemuth authored
In the bio completion routine, we should not be setting PageUptodate at all -- it's set at sys_write() time, and is unaffected by success/failure of the write to disk. This can cause a page corruption bug when the file system's block size is less than the architecture's VM page size. if we have only written a single block -- we might end up setting the page's PageUptodate flag, indicating that page is completely read into memory, which may not be true. This could cause subsequent reads to get bad data. This commit also takes the opportunity to clean up error handling in ext4_end_bio(), and remove some extraneous code: - fixes ext4_end_bio() to set AS_EIO in the page->mapping->flags on error, which was left out by mistake. This is needed so that fsync() will return an error if there was an I/O error. - remove the clear_buffer_dirty() call on unmapped buffers for each page. - consolidate page/buffer error handling in a single section. Signed-off-by: Curt Wohlgemuth <curtw@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reported-by: Jim Meyering <jim@meyering.net> Reported-by: Hugh Dickins <hughd@google.com> Cc: Mingming Cao <cmm@us.ibm.com>
-
- 18 Apr, 2011 1 commit
-
-
Theodore Ts'o authored
Provide better emulation for ext[23] mode by enforcing that the file system does not have any unsupported file system features as defined by ext[23] when emulating the ext[23] file system driver when CONFIG_EXT4_USE_FOR_EXT23 is defined. This causes the file system type information in /proc/mounts to be correct for the automatically mounted root file system. This also means that "mount -t ext2 /dev/sda /mnt" will fail if /dev/sda contains an ext3 or ext4 file system, just as one would expect if the original ext2 file system driver were in use. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 16 Apr, 2011 1 commit
-
-
Yang Ruirui authored
Add missing page_cache_release in the error path of ext4_mb_load_buddy Signed-off-by: Yang Ruirui <ruirui.r.yang@tieto.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
-
- 12 Apr, 2011 1 commit
-
-
Linus Torvalds authored
-