- 02 Aug, 2010 40 commits
-
Dmitry Monakhov authored
commit 84061e07 upstream (as of v2.6.34-git13) Currently the block/inode/dir counters are initialized before the journal is recovered. In fact, after journal recovery this info will probably change. And freeblocks is critical for correct delalloc mode accounting. https://bugzilla.kernel.org/show_bug.cgi?id=15768 Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Acked-by:
Jan Kara <jack@suse.cz> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dmitry Monakhov authored
commit d17413c0 upstream (as of v2.6.34-git13) - Reorganize the locking scheme to batch two atomic operations into one. This also allows us to state that a healthy group must obey the following rule: ext4_free_inodes_count(sb, gdp) == ext4_count_free(inode_bitmap, NUM); - Fix a possible undefined pointer dereference. - Even if group descriptor stats aren't accessible we have to update inode bitmaps. - Move the update of non-group members out of group_lock. Note: this commit has been observed to fix fs corruption problems under heavy fs load. Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
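The rule stated in the commit above can be illustrated in user-space C. This is only a sketch under assumptions: `count_free_bits` and `group_is_healthy` are hypothetical names, a zero bit is taken to mean "inode free", and none of this is the kernel's ext4_count_free() implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count the zero bits in the first nbytes of a bitmap.
 * Assumption: a zero bit means the corresponding inode is free. */
unsigned int count_free_bits(const uint8_t *bitmap, size_t nbytes)
{
    unsigned int free_bits = 0;

    assert(bitmap != NULL);
    for (size_t i = 0; i < nbytes; i++)
        for (int bit = 0; bit < 8; bit++)
            if (!(bitmap[i] & (1u << bit)))
                free_bits++;
    return free_bits;
}

/* Hypothetical check of the rule: the cached free-inode count of a group
 * must equal the number of free bits in its inode bitmap. */
int group_is_healthy(unsigned int cached_free, const uint8_t *bitmap,
                     size_t nbytes)
{
    return cached_free == count_free_bits(bitmap, nbytes);
}
```

Updating the counter and the bitmap under one lock is what lets such a check hold at any time the lock is not held.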
-
Dmitry Monakhov authored
commit 21ca087a upstream (as of v2.6.34-git13) The extents code will sometimes zero out blocks and mark them as initialized instead of splitting an extent into several smaller ones. This optimization, however, causes problems if the extent is beyond i_size because fsck will complain if there are uninitialized blocks after i_size as this can not be distinguished from an inode that has an incorrect i_size field. https://bugzilla.kernel.org/show_bug.cgi?id=15742 Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Eric Sandeen authored
commit c445e3e0 upstream (as of v2.6.34-git13) There was a bug reported on RHEL5 that a 10G dd on a 12G box had a very, very slow sync after that. At issue was the loop in write_cache_pages scanning all the way to the end of the 10G file, even though the subsequent call to mpage_da_submit_io would only actually write a smallish amount; then we went back to the write_cache_pages loop ... wasting tons of time in calling __mpage_da_writepage for thousands of pages we would just revisit (many times) later. Upstream it's not such a big issue for sys_sync because we get to the loop with a much smaller nr_to_write, which limits the loop. However, talking with Aneesh he realized that fsync upstream still gets here with a very large nr_to_write and we face the same problem. This patch makes mpage_add_bh_to_extent stop the loop after we've accumulated 2048 pages, by setting mpd->io_done = 1; which ultimately causes the write_cache_pages loop to break. Repeating the test with a dirty_ratio of 80 (to leave something for fsync to do), I don't see huge IO performance gains, but the reduction in cpu usage is striking: 80% usage with stock, and 2% with the below patch. Instrumenting the loop in write_cache_pages clearly shows that we are wasting time here. Eventually we need to change mpage_da_map_pages() to also submit its I/O to the block layer, subsuming mpage_da_submit_io(), and then change it to call ext4_get_blocks() multiple times. Signed-off-by:
Eric Sandeen <sandeen@redhat.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
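The effect of capping the accumulation at 2048 pages can be sketched in user-space C. The struct and function names below are hypothetical stand-ins, not the kernel's mpage_da_data or write_cache_pages; only the 2048 cap and the io_done flag come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PAGES_PER_EXTENT 2048  /* cap quoted in the commit message */

/* Hypothetical stand-in for the state updated while building an extent. */
struct extent_accumulator {
    size_t pages_accumulated;
    int io_done;               /* nonzero tells the scanning loop to stop */
};

/* Add one page to the extent; raise io_done once the cap is reached. */
void accumulate_page(struct extent_accumulator *acc)
{
    assert(acc != NULL);
    acc->pages_accumulated++;
    if (acc->pages_accumulated >= MAX_PAGES_PER_EXTENT)
        acc->io_done = 1;
}

/* Scan up to total_dirty pages, breaking out as soon as io_done is set.
 * Returns the number of pages visited. */
size_t scan_dirty_pages(struct extent_accumulator *acc, size_t total_dirty)
{
    size_t visited = 0;

    while (visited < total_dirty && !acc->io_done) {
        accumulate_page(acc);
        visited++;
    }
    return visited;
}
```

With the cap in place, a scan over a very large dirty range visits at most 2048 pages per pass instead of walking to the end of the file each time.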
-
Eric Sandeen authored
commit a30eec2a upstream (as of v2.6.34-git13) Turn off issuance of discard requests if the device does not support it - similar to the action we take for barriers. This will save a little computation time if a non-discardable device is mounted with -o discard, and also makes it obvious that it's not doing what was asked at mount time ... Signed-off-by:
Eric Sandeen <sandeen@redhat.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Eric Sandeen authored
commit 6b0310fb upstream (as of v2.6.34-git13) ext4_freeze() used jbd2_journal_lock_updates() which takes the j_barrier mutex, and then returns to userspace. The kernel does not like this: ================================================ [ BUG: lock held when returning to user space! ] ------------------------------------------------ lvcreate/1075 is leaving the kernel with locks still held! 1 lock held by lvcreate/1075: #0: (&journal->j_barrier){+.+...}, at: [<ffffffff811c6214>] jbd2_journal_lock_updates+0xe1/0xf0 Use vfs_check_frozen() added to ext4_journal_start_sb() and ext4_force_commit() instead. Addresses-Red-Hat-Bugzilla: #568503 Signed-off-by:
Eric Sandeen <sandeen@redhat.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Eric Sandeen authored
commit 42007efd upstream (as of v2.6.34-git13) If groups_per_flex < 2, sbi->s_flex_groups[] doesn't get filled out, and every other access to this first tests s_log_groups_per_flex; same thing needs to happen in resize or we'll wander off into a null pointer when doing an online resize of the file system. Thanks to Christoph Biedl, who came up with the trivial testcase:
    # truncate --size 128M fsfile
    # mkfs.ext3 -F fsfile
    # tune2fs -O extents,uninit_bg,dir_index,flex_bg,huge_file,dir_nlink,extra_isize fsfile
    # e2fsck -yDf -C0 fsfile
    # truncate --size 132M fsfile
    # losetup /dev/loop0 fsfile
    # mount /dev/loop0 mnt
    # resize2fs -p /dev/loop0
https://bugzilla.kernel.org/show_bug.cgi?id=13549 Reported-by:
Alessandro Polverini <alex@nibbles.it> Test-case-by:
Christoph Biedl <bugzilla.kernel.bpeb@manchmal.in-ulm.de> Signed-off-by:
Eric Sandeen <sandeen@redhat.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dmitry Monakhov authored
commit 35121c98 upstream (as of v2.6.34-git13) allocated_meta_data is already included in 'used' variable. Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Christian Borntraeger authored
commit b684b2ee upstream (as of v2.6.34-git13) I have an x86_64 kernel with i386 userspace. e4defrag fails on the EXT4_IOC_MOVE_EXT ioctl because it is not wired up for the compat case. It seems that struct move_extent is compat safe, since only types with fixed widths are used:
    {
        __u32 reserved;     /* should be zero */
        __u32 donor_fd;     /* donor file descriptor */
        __u64 orig_start;   /* logical start offset in block for orig */
        __u64 donor_start;  /* logical start offset in block for donor */
        __u64 len;          /* block length to be moved */
        __u64 moved_len;    /* moved block length */
    };
Let's just wire up EXT4_IOC_MOVE_EXT for the compat case. Signed-off-by:
Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Reviewed-by:
Eric Sandeen <sandeen@redhat.com> CC: Akira Fujita <a-fujita@rs.jp.nec.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
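The compat-safety argument above can be checked with a user-space mirror of the structure. This is a sketch under assumptions: `struct move_extent_sketch` and `move_extent_layout_is_fixed` are hypothetical names, and uint32_t/uint64_t stand in for the kernel's __u32/__u64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space mirror of the structure quoted in the commit message. */
struct move_extent_sketch {
    uint32_t reserved;     /* should be zero */
    uint32_t donor_fd;     /* donor file descriptor */
    uint64_t orig_start;   /* logical start offset in block for orig */
    uint64_t donor_start;  /* logical start offset in block for donor */
    uint64_t len;          /* block length to be moved */
    uint64_t moved_len;    /* moved block length */
};

/* Two 32-bit fields followed by four 64-bit fields leave no room for
 * padding, so the size should be 2*4 + 4*8 = 40 bytes. */
static_assert(sizeof(struct move_extent_sketch) == 40,
              "unexpected padding in move_extent_sketch");

/* Returns 1 when every 64-bit member sits at the expected offset. */
int move_extent_layout_is_fixed(void)
{
    return offsetof(struct move_extent_sketch, donor_fd) == 4 &&
           offsetof(struct move_extent_sketch, orig_start) == 8 &&
           offsetof(struct move_extent_sketch, donor_start) == 16 &&
           offsetof(struct move_extent_sketch, len) == 24 &&
           offsetof(struct move_extent_sketch, moved_len) == 32;
}
```

Because the two leading 32-bit fields already place the first 64-bit member on an 8-byte boundary, the layout does not depend on whether the ABI aligns 64-bit integers to 4 or 8 bytes.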
-
Jing Zhang authored
commit e39e07fd upstream (as of v2.6.34-git13) This function cleans up after ext4_mb_load_buddy(), so the renaming makes the code clearer. Signed-off-by:
Jing Zhang <zj.barak@gmail.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Jing Zhang authored
commit 62e823a2 upstream (as of v2.6.34-git13) Signed-off-by:
Jing Zhang <zj.barak@gmail.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Jing Zhang authored
commit b720303d upstream (as of v2.6.34-git13) When EIO occurs after bio is submitted, there is no memory free operation for bio, which results in memory leakage. And there is also no check against bio_alloc() for bio. Acked-by:
Dave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by:
Jing Zhang <zj.barak@gmail.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dmitry Monakhov authored
commit 0671e704 upstream (as of v2.6.34-git13) Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Theodore Ts'o authored
commit b90f6870 upstream (as of v2.6.34-rc6) Otherwise, we can end up having data corruption because the blocks could get reused and then discarded! https://bugzilla.kernel.org/show_bug.cgi?id=15579 Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Curt Wohlgemuth authored
commit fd2dd9fb upstream (as of v2.6.34-rc6) Calls to ext4_get_inode_loc() return with a reference to a buffer head in iloc->bh. The callers of this function in ext4_write_inode() when in no journal mode and in ext4_xattr_fiemap() don't release the buffer head after using it. Addresses-Google-Bug: #2548165 Signed-off-by:
Curt Wohlgemuth <curtw@google.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Curt Wohlgemuth authored
commit 8b472d73 upstream (as of v2.6.34-rc6) In the no-journal case, ext4_write_inode() will fetch the bh and call sync_dirty_buffer() on it. However, if the bh has already been written and the bh reclaimed for some other purpose, AND if the inode is the only one in the inode table block in use, then ext4_get_inode_loc() will not read the inode table block from disk, but as an optimization, fill the block with zeros assuming that its caller will copy in the on-disk version of the inode. This is not done by ext4_write_inode(), so the contents of the inode can simply get lost. The fix is to use __ext4_get_inode_loc() with in_mem set to 0, instead of ext4_get_inode_loc(). Long term the API needs to be fixed so it's obvious why the latter is not safe. Addresses-Google-Bug: #2526446 Signed-off-by:
Curt Wohlgemuth <curtw@google.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Eric Sandeen authored
commit c4caae25 upstream (as of v2.6.34-rc3) When used_dirs was introduced for the flex_groups struct, it looks like the accounting was not put into place properly, in some places manipulating free_inodes rather than used_dirs. Signed-off-by:
Eric Sandeen <sandeen@redhat.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Jan Kara authored
commit d330a5be upstream (as of v2.6.34-rc3) http://bugzilla.kernel.org/show_bug.cgi?id=15420 Signed-off-by:
Jan Kara <jack@suse.cz> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Akira Fujita authored
commit c437b273 upstream (as of v2.6.33-git11) a) Fix sparse warning in ext4_ioctl() b) Remove unneeded variable in mext_leaf_block() c) Fix spelling typo in mext_check_arguments() Signed-off-by:
Akira Fujita <a-fujita@rs.jp.nec.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Akira Fujita authored
commit 7247c0ca upstream (as of v2.6.33-git11) If EXT4_IOC_MOVE_EXT ioctl is called with NULL donor_fd, fget() in ext4_ioctl() gets inappropriate file structure for donor; so we need to do this check earlier, before calling double_down_write_data_sem(). Signed-off-by:
Akira Fujita <a-fujita@rs.jp.nec.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Akira Fujita authored
commit 5fd5249a upstream (as of v2.6.33-git11) If the leaf node has space for 2 extents or fewer and the EXT4_IOC_MOVE_EXT ioctl is called with a file offset beyond the range the 2nd extent covers, mext_insert_across_blocks() always tries to insert the extent into the first extent. As a result, the file gets corrupted because of the wrong extent order. The patch fixes this problem. Signed-off-by:
Akira Fujita <a-fujita@rs.jp.nec.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Toshiyuki Okajima authored
commit b8b8afe2 upstream (as of v2.6.33-git11) The callers of ext4_check_dir_entry() usually pass in the "file offset" (ext4_readdir, htree_dirblock_to_tree, search_dirblock, ext4_dx_find_entry, empty_dir), but a few callers (add_dirent_to_buf, ext4_delete_entry) only pass in the buffer offset. To accommodate those last two (which would be hard to fix otherwise), this patch changes ext4_check_dir_entry() to print the physical block number and the relative offset as well as the passed-in offset. Signed-off-by:
Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dmitry Monakhov authored
commit 6e3617e5 upstream (as of v2.6.33-git11) In case of truncate errors we explicitly remove the inode from the in-core orphan list via orphan_del(NULL, inode) without modifying the on-disk list. But later on, the same inode may be inserted in the orphan list again, which will result in the on-disk linked list getting corrupted. If the inode's i_dtime contains a valid value, then skip the on-disk list modification. Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dmitry Monakhov authored
commit da1dafca upstream (as of v2.6.33-git11) Otherwise non-empty orphan list will be triggered on umount. Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Dmitry Monakhov authored
commit f39490bc upstream (as of v2.6.33-git11) Set i_nlink to zero for the temporary inode from the very beginning; otherwise we may fail to start a new journal handle and this inode will be unreferenced but with i_nlink == 1. Since we hold an inode reference it can not be pruned. Also add the missed journal_start retval check. Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Tao Ma authored
commit cc483f10 upstream (as of v2.6.33-git11) The ext4 multiblock allocator decides whether to use group or file preallocation based on the file size. When the file size reaches s_mb_stream_request (default is 16 blocks), it changes to use a file-specific preallocation. This is cool, but it has a tiny problem. See a simple script:
    mkfs.ext4 -b 1024 /dev/sda8 1000000
    mount -t ext4 -o nodelalloc /dev/sda8 /mnt/ext4
    for((i=0;i<5;i++))
    do
    cat /mnt/4096>>/mnt/ext4/a #4096 is a file with 4096 characters.
    cat /mnt/4096>>/mnt/ext4/b
    done
    debuge4fs -R 'stat a' /dev/sda8|grep BLOCKS -A 1
And you get
    BLOCKS: (0-14):8705-8719, (15):2356, (16-19):8465-8468
So there are 3 extents, a bit strange for the lonely 15th logical block. As we write to the 16 blocks, we choose file preallocation in ext4_mb_group_or_file, but in ext4_mb_normalize_request, we meet with the 16*1024 range, so no preallocation will be carried out. File b then reserves the space after '2356', so when we write 16, we start from another part. This patch just changes the check in ext4_mb_group_or_file, so that for the lonely 15 we will still use group preallocation. After the patch, we will get:
    debuge4fs -R 'stat a' /dev/sda8|grep BLOCKS -A 1
    BLOCKS: (0-15):8705-8720, (16-19):8465-8468
Looks more sane. Thanks. Signed-off-by:
Tao Ma <tao.ma@oracle.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Jiaying Zhang authored
commit c8d46e41 upstream (as of v2.6.33-git11) fallocate() may potentially instantiate blocks past EOF, depending on the flags used when it is called. e2fsck currently has a test for blocks past i_size, and it sometimes trips up, noticeably on xfstests 013 which runs fsstress. This patch from Jiaying does fix it up: it (along with e2fsprogs updates and other patches recently from Aneesh) has survived many fsstress runs in a row. Signed-off-by:
Eric Sandeen <sandeen@redhat.com> Signed-off-by:
Jiaying Zhang <jiayingz@google.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Curt Wohlgemuth authored
commit 73b50c1c upstream (as of v2.6.33-git11) Calls to ext4_handle_dirty_metadata should only pass in an inode pointer for inode-specific metadata, and not for shared metadata blocks such as inode table blocks, block group descriptors, the superblock, etc. The BUG_ON can get tripped when updating a special device (such as a block device) that is opened (so that i_mapping is set in fs/block_dev.c) and the file system is mounted in no journal mode. Addresses-Google-Bug: #2404870 Signed-off-by:
Curt Wohlgemuth <curtw@google.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Theodore Ts'o authored
commit 19f5fb7a upstream (as of v2.6.33-git11) At several places we modify EXT4_I(inode)->i_state without holding i_mutex (ext4_release_file, ext4_bmap, ext4_journalled_writepage, ext4_do_update_inode, ...). These modifications are racy and we can lose updates to i_state. So convert handling of i_state to use bitops which are atomic. Cc: Jan Kara <jack@suse.cz> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
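The move from plain read-modify-write to atomic bit operations can be sketched with C11 atomics. This is an illustration only: the kernel uses its own set_bit/clear_bit/test_bit helpers, and the flag numbers and function names below are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical flag bit numbers, for illustration only. */
enum { STATE_NEW = 0, STATE_XATTR = 1 };

/* Stand-in for a per-inode state word that several threads may update. */
struct inode_state {
    atomic_ulong bits;
};

/* Atomically set one bit; concurrent updates to other bits are not lost. */
void state_set(struct inode_state *s, int bit)
{
    assert(bit >= 0 && bit < 32);
    atomic_fetch_or(&s->bits, 1UL << bit);
}

/* Atomically clear one bit. */
void state_clear(struct inode_state *s, int bit)
{
    assert(bit >= 0 && bit < 32);
    atomic_fetch_and(&s->bits, ~(1UL << bit));
}

/* Return 1 if the bit is currently set, 0 otherwise. */
int state_test(struct inode_state *s, int bit)
{
    assert(bit >= 0 && bit < 32);
    return (int)((atomic_load(&s->bits) >> bit) & 1UL);
}
```

A non-atomic `state |= flag` is a separate load, OR, and store, so two racing writers can overwrite each other's update; the fetch-or and fetch-and forms above perform the whole update as one indivisible step.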
-
Aneesh Kumar K.V authored
commit 1296cc85 upstream (as of v2.6.33-rc6) We should update the reserved space if it is a delalloc buffer, and that is indicated by the EXT4_GET_BLOCKS_DELALLOC_RESERVE flag. So use EXT4_GET_BLOCKS_DELALLOC_RESERVE in place of EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE [ Stable note: This fixes a corruption caused by the following reproduction case:
    rm -f $TEST_FN
    touch $TEST_FN
    fallocate -n -o 656712 -l 858907 $TEST_FN
    dd if=/dev/zero of=$TEST_FN conv=notrunc bs=1 seek=1011020 count=36983
    sync
    dd if=/dev/zero of=$TEST_FN conv=notrunc bs=1 seek=332121 count=24005
    dd if=/dev/zero of=$TEST_FN conv=notrunc bs=1 seek=1040179 count=93319
If the filesystem is then unmounted and e2fsck run forced, the i_blocks field for the file $TEST_FN will be found to be incorrect. ] Signed-off-by:
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Aneesh Kumar K.V authored
commit 5f634d06 upstream (as of v2.6.33-rc6) When we fallocate a region of the file which we had recently written, and which is still in the page cache marked as delayed allocated blocks, we need to make sure we don't do the quota update on the writepage path. This is because the needed quota update would have already been done by fallocate. Signed-off-by:
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Aneesh Kumar K.V authored
commit 1db91382 upstream (as of v2.6.33-rc6) We need to release the journal before we do a write_inode. Otherwise we could deadlock. Signed-off-by:
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Theodore Ts'o authored
commit 9d0be502 upstream (as of v2.6.33-rc3) In the past, ext4_calc_metadata_amount(), and its sub-functions ext4_ext_calc_metadata_amount() and ext4_indirect_calc_metadata_amount() badly over-estimated the number of metadata blocks that might be required for delayed allocation blocks. This didn't matter as much when functions which managed the reserved metadata blocks were more aggressive about dropping reserved metadata blocks as delayed allocation blocks were written, but unfortunately they were too aggressive. This was fixed in commit 0637c6f4, but as a result the over-estimation by ext4_calc_metadata_amount() would lead to reserving 2-3 times the number of pending delayed allocation blocks as potentially required metadata blocks. So if there is 1 megabyte of blocks which have not yet been allocated, up to 3 megabytes of space would get reserved out of the user's quota and from the file system free space pool until all of the inode's data blocks have been allocated. This commit addresses this problem by much more accurately estimating the number of metadata blocks that will be required. It will still somewhat over-estimate the number of blocks needed, since it must make a worst case estimate not knowing which physical blocks will be needed, but it is much more accurate than before. Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Theodore Ts'o authored
commit ee5f4d9c upstream (as of v2.6.33-rc3) Commit 0637c6f4 had a typo which caused the reserved metadata blocks to not be released correctly. Fix this. Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Theodore Ts'o authored
commit 0637c6f4 upstream (as of v2.6.33-rc3) As reported in Kernel Bugzilla #14936, commit d21cd8f1 triggered a BUG in the function ext4_da_update_reserve_space() found in fs/ext4/inode.c. The root cause of this BUG() was the fact that ext4_calc_metadata_amount() can severely over-estimate how many metadata blocks will be needed, especially when using direct block-mapped files. In addition, it can also badly *under* estimate how much space is needed, since ext4_calc_metadata_amount() assumes that the blocks are contiguous, and this is not always true. If the application is writing blocks to a sparse file, the number of metadata blocks necessary can be severely underestimated by the functions ext4_da_reserve_space(), ext4_da_update_reserve_space() and ext4_da_release_space(). This was the cause of the dq_claim_space reports found on kerneloops.org. Unfortunately, doing this right means that we need to massively over-estimate the amount of free space needed. So in some cases we may need to force the inode to be written to disk asynchronously in order to avoid spurious quota failures. http://bugzilla.kernel.org/show_bug.cgi?id=14936 Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Aneesh Kumar K.V authored
commit 515f41c3 upstream (as of v2.6.33-rc3) This fixes a bug (found by Curt Wohlgemuth) in which new blocks returned from an extent created with ext4_ext_zeroout() can have dirty metadata still associated with them. Signed-off-by:
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by:
Curt Wohlgemuth <curtw@google.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Richard Kennedy authored
commit 2faf2e19 upstream (as of v2.6.33-rc3) When ext4_da_writepages increases the nr_to_write in writeback_control then it must always re-base the return value. Originally there was a (misguided) attempt to prevent wbc.nr_to_write from going negative. In fact, it's necessary to allow nr_to_write to be negative so that wb_writeback() can correctly calculate how many pages were actually written. Signed-off-by:
Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Julia Lawall authored
commit d3533d72 upstream (as of v2.6.33-rc3) b_entry_name and buffer are initially NULL, are initialized within a loop to the result of calling kmalloc, and are freed at the bottom of this loop. The loop contains gotos to cleanup, which also frees b_entry_name and buffer. Some of these gotos are before the reinitializations of b_entry_name and buffer. To maintain the invariant that b_entry_name and buffer are NULL at the top of the loop, and thus acceptable arguments to kfree, these variables are now set to NULL after the kfrees. This seems to be the simplest solution. A more complicated solution would be to introduce more labels in the error handling code at the end of the function. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r@ identifier E; expression E1; iterator I; statement S; @@ *kfree(E); ... when != E = E1 when != I(E,...) S when != &E *kfree(E); // </smpl> Signed-off-by:
Julia Lawall <julia@diku.dk> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
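The invariant described above (pointers are NULL at the top of the loop, so the shared cleanup path can always free them) can be sketched in user-space C. The function below is hypothetical, with malloc/free standing in for kmalloc/kfree and `fail_at` simulating an error that jumps to cleanup before the buffer is reallocated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Run `rounds` iterations. If fail_at is a valid iteration index, that
 * iteration fails before allocating. Returns 0 on success, -1 on error. */
int process_entries(int rounds, int fail_at)
{
    char *buffer = NULL;
    int ret = 0;

    assert(rounds >= 0);
    for (int i = 0; i < rounds; i++) {
        if (i == fail_at) {
            /* Error before buffer is reassigned: buffer must be NULL here,
             * or cleanup would free the previous iteration's pointer again. */
            ret = -1;
            goto cleanup;
        }

        buffer = malloc(16);
        if (!buffer) {
            ret = -1;
            goto cleanup;
        }
        memset(buffer, 0, 16);

        free(buffer);
        buffer = NULL;  /* restore the invariant for the next iteration */
    }

cleanup:
    free(buffer);       /* free(NULL) is a no-op, so no double free */
    return ret;
}
```

Without the `buffer = NULL` assignment, a failure at the top of the second iteration would reach cleanup with a dangling pointer and free it twice.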
-
Theodore Ts'o authored
commit cc3e1bea upstream (as of v2.6.33-rc3) This is a bit complicated because we are trying to optimize when we send barriers to the fs data disk. We could just throw in an extra barrier to the data disk whenever we send a barrier to the journal disk, but that's not always strictly necessary. We only need to send a barrier during a commit when there are data blocks which must be written out due to an inode written in ordered mode, or if fsync() depends on the commit to force data blocks to disk. Finally, before we drop transactions from the beginning of the journal during a checkpoint operation, we need to guarantee that any blocks that were flushed out to the data disk are firmly on the rust platter before we drop the transaction from the journal. Thanks to Oleg Drokin for pointing out this flaw in ext3/ext4. Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Surbhi Palande authored
commit 034fb4c9 upstream (as of v2.6.33-rc3) This patch fixes the Kernel BZ #14286. When the address of an extent corresponding to a valid block is corrupted, a -EIO should be reported instead of a BUG(). This situation should not normally occur except in the case of a corrupted filesystem. If however it does, then the system should not panic directly; instead, appropriate action should be taken depending on the mount-time options. If the mount options so permit, the I/O should be gracefully aborted by returning a -EIO. http://bugzilla.kernel.org/show_bug.cgi?id=14286 Signed-off-by:
Surbhi Palande <surbhi.palande@canonical.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
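The "return -EIO instead of BUG()" approach can be sketched as a range check in user-space C. The function name and the exact validity rule are assumptions made for illustration; the kernel's own extent validation is more involved.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical validity rule: the extent [start, start + len) must be
 * non-empty and lie entirely inside a filesystem of fs_blocks blocks.
 * A corrupted extent yields -EIO so the caller can abort the I/O
 * gracefully instead of crashing. */
int check_extent_range(uint64_t start, uint64_t len, uint64_t fs_blocks)
{
    assert(fs_blocks > 0);
    /* Written as a subtraction to avoid overflow in start + len. */
    if (len == 0 || start >= fs_blocks || len > fs_blocks - start)
        return -EIO;
    return 0;
}
```

A caller would propagate the negative error code upward, leaving the policy decision (continue, remount read-only, or panic) to the mount options.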