Commit 793736b1 authored by Jan Kara, committed by Ben Hutchings

ext4: fix data corruption for mmap writes

commit a056bdaa upstream.

mpage_submit_page() can race with another process that grows i_size and
writes data via mmap to the page being written back. Because
mpage_submit_page() samples i_size too early, ext4_bio_write_page() may
zero out too large a tail of the page and thus corrupt user data.

Fix the problem by sampling i_size only after the page has been
write-protected in the page tables by the clear_page_dirty_for_io() call.

Reported-by: Michael Zimmer <michael@swarm64.com>
Fixes: cb20d518
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
[bwh: Backported to 3.2: The writeback path is very different here and
 it needs to read i_size long before calling clear_page_dirty_for_io().
 So read it twice and skip the page if it changed.]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
parent 69615f19
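
The corruption described in the commit message can be illustrated with a small userspace model. This is only a sketch, not kernel code: `MODEL_PAGE_SIZE`, `valid_len()` and `zero_tail()` are hypothetical names invented for the illustration, and the page size of 4096 is an assumption. The model zeroes everything in a page past the file size, which is roughly what writeback of the last page does. If the size used is a stale, smaller value, bytes written through mmap after the file grew are wiped.

```c
#include <assert.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed page size for this model */

/*
 * Hypothetical helper: number of bytes of page `index` that lie
 * below a file size of `size` bytes.
 */
static unsigned long valid_len(unsigned long long size, unsigned long index)
{
	unsigned long long start = (unsigned long long)index * MODEL_PAGE_SIZE;

	if (size <= start)
		return 0;
	if (size - start >= MODEL_PAGE_SIZE)
		return MODEL_PAGE_SIZE;
	return (unsigned long)(size - start);
}

/*
 * Hypothetical helper: zero the part of the page beyond `len`,
 * modelling the tail zeroing done when the page is written out.
 */
static void zero_tail(unsigned char *page, unsigned long len)
{
	assert(len <= MODEL_PAGE_SIZE);
	memset(page + len, 0, MODEL_PAGE_SIZE - len);
}
```

With a page full of user data, calling `zero_tail(page, valid_len(100, 0))` after the file has already grown to 300 bytes clears bytes 100..299, while using the current size of 300 leaves them intact.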
@@ -1344,7 +1344,6 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd,
 	int ret = 0, err, nr_pages, i;
 	struct inode *inode = mpd->inode;
 	struct address_space *mapping = inode->i_mapping;
-	loff_t size = i_size_read(inode);
 	unsigned int len, block_start;
 	struct buffer_head *bh, *page_bufs = NULL;
 	int journal_data = ext4_should_journal_data(inode);
@@ -1370,6 +1369,7 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd,
 		for (i = 0; i < nr_pages; i++) {
 			int commit_write = 0, skip_page = 0;
 			struct page *page = pvec.pages[i];
+			loff_t size = i_size_read(inode);
 			index = page->index;
 			if (index > end)
@@ -1443,11 +1443,31 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd,
 			if (skip_page)
 				goto skip_page;
+			clear_page_dirty_for_io(page);
+			/*
+			 * We have to be very careful here! Nothing protects
+			 * writeback path against i_size changes and the page
+			 * can be writeably mapped into page tables. So an
+			 * application can be growing i_size and writing data
+			 * through mmap while writeback runs.
+			 * clear_page_dirty_for_io() write-protects our page in
+			 * page tables and the page cannot get written to again
+			 * until we release page lock. So only after
+			 * clear_page_dirty_for_io() we are safe to sample
+			 * i_size for ext4_bio_write_page() to zero-out tail of
+			 * the written page. We rely on the barrier provided by
+			 * TestClearPageDirty in clear_page_dirty_for_io() to
+			 * make sure i_size is really sampled only after page
+			 * tables are updated.
+			 */
+			if (size != i_size_read(inode)) {
+				set_page_dirty(page);
+				goto skip_page;
+			}
 			if (commit_write)
 				/* mark the buffer_heads as dirty & uptodate */
 				block_commit_write(page, 0, len);
-			clear_page_dirty_for_io(page);
 			/*
 			 * Delalloc doesn't support data journalling,
 			 * but eventually maybe we'll lift this
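
The backport's "read it twice and skip the page if it changed" approach can be sketched as a userspace model. This is an illustration under stated assumptions, not the kernel implementation: `struct model_inode`, `struct model_page`, `model_clear_dirty_for_io()` and `model_try_submit()` are hypothetical stand-ins, and the model ignores locking and memory barriers, which the real code relies on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects involved. */
struct model_inode {
	long long i_size;
};

struct model_page {
	bool dirty;
	bool write_protected;	/* in the kernel this lives in the page tables */
};

/* Models clear_page_dirty_for_io(): clear dirty, write-protect the page. */
static void model_clear_dirty_for_io(struct model_page *page)
{
	page->dirty = false;
	page->write_protected = true;
}

/*
 * `early_size` is the size sampled before the page was write-protected.
 * Returns true if the page may be submitted using that size, or false
 * if the size changed in the meantime and the page was redirtied so a
 * later writeback pass can retry it.
 */
static bool model_try_submit(const struct model_inode *inode,
			     struct model_page *page, long long early_size)
{
	assert(inode != NULL && page != NULL);

	model_clear_dirty_for_io(page);
	if (early_size != inode->i_size) {
		page->dirty = true;	/* models set_page_dirty() */
		return false;
	}
	return true;
}
```

The design choice modelled here is that the early sample is never trusted on its own: it is used only if a second read, taken after the page can no longer be modified through mmap, returns the same value.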