Commit d50bdd5a authored by Curt Wohlgemuth's avatar Curt Wohlgemuth Committed by Theodore Ts'o

ext4: Fix data corruption with multi-block writepages support

This fixes a corruption problem with the multi-block
writepages submittal change for ext4, from commit
bd2d0210 ("ext4: use bio
layer instead of buffer layer in mpage_da_submit_io").

(Note that this corruption is not present in 2.6.37 on
ext4, because the corruption was detected after the
feature was merged in 2.6.37-rc1, and so it was turned
off by adding a non-default mount option,
mblk_io_submit.  With this commit, which hopefully
fixes the last of the bugs with this feature, we'll be
able to turn on this performance feature by default in
2.6.38, and remove the mblk_io_submit option.)

The ext4 code path to bundle multiple pages for
writeback in ext4_bio_write_page() had a bug: we should
be clearing buffer head dirty flags *before* we submit
the bio, not in the completion routine.

The patch below was tested on 2.6.37 under KVM with the
postgresql script which was submitted by Jon Nelson as
documented in commit 1449032b.

Without the patch, I'd hit the corruption problem about
50-70% of the time.  With the patch, I executed the
script > 100 times with no corruption seen.

I also fixed a bug to make sure ext4_end_bio() doesn't
dereference the bio after the bio_put() call.
Reported-by: default avatarJon Nelson <jnelson@jamponi.net>
Reported-by: default avatarMatthias Bayer <jackdachef@gmail.com>
Signed-off-by: default avatarCurt Wohlgemuth <curtw@google.com>
Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
parent dd68314c
...@@ -190,6 +190,7 @@ static void ext4_end_bio(struct bio *bio, int error) ...@@ -190,6 +190,7 @@ static void ext4_end_bio(struct bio *bio, int error)
struct inode *inode; struct inode *inode;
unsigned long flags; unsigned long flags;
int i; int i;
sector_t bi_sector = bio->bi_sector;
BUG_ON(!io_end); BUG_ON(!io_end);
bio->bi_private = NULL; bio->bi_private = NULL;
...@@ -207,9 +208,7 @@ static void ext4_end_bio(struct bio *bio, int error) ...@@ -207,9 +208,7 @@ static void ext4_end_bio(struct bio *bio, int error)
if (error) if (error)
SetPageError(page); SetPageError(page);
BUG_ON(!head); BUG_ON(!head);
if (head->b_size == PAGE_CACHE_SIZE) if (head->b_size != PAGE_CACHE_SIZE) {
clear_buffer_dirty(head);
else {
loff_t offset; loff_t offset;
loff_t io_end_offset = io_end->offset + io_end->size; loff_t io_end_offset = io_end->offset + io_end->size;
...@@ -221,7 +220,6 @@ static void ext4_end_bio(struct bio *bio, int error) ...@@ -221,7 +220,6 @@ static void ext4_end_bio(struct bio *bio, int error)
if (error) if (error)
buffer_io_error(bh); buffer_io_error(bh);
clear_buffer_dirty(bh);
} }
if (buffer_delay(bh)) if (buffer_delay(bh))
partial_write = 1; partial_write = 1;
...@@ -257,7 +255,7 @@ static void ext4_end_bio(struct bio *bio, int error) ...@@ -257,7 +255,7 @@ static void ext4_end_bio(struct bio *bio, int error)
(unsigned long long) io_end->offset, (unsigned long long) io_end->offset,
(long) io_end->size, (long) io_end->size,
(unsigned long long) (unsigned long long)
bio->bi_sector >> (inode->i_blkbits - 9)); bi_sector >> (inode->i_blkbits - 9));
} }
/* Add the io_end to per-inode completed io list*/ /* Add the io_end to per-inode completed io list*/
...@@ -380,6 +378,7 @@ int ext4_bio_write_page(struct ext4_io_submit *io, ...@@ -380,6 +378,7 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
blocksize = 1 << inode->i_blkbits; blocksize = 1 << inode->i_blkbits;
BUG_ON(!PageLocked(page));
BUG_ON(PageWriteback(page)); BUG_ON(PageWriteback(page));
set_page_writeback(page); set_page_writeback(page);
ClearPageError(page); ClearPageError(page);
...@@ -397,12 +396,14 @@ int ext4_bio_write_page(struct ext4_io_submit *io, ...@@ -397,12 +396,14 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
for (bh = head = page_buffers(page), block_start = 0; for (bh = head = page_buffers(page), block_start = 0;
bh != head || !block_start; bh != head || !block_start;
block_start = block_end, bh = bh->b_this_page) { block_start = block_end, bh = bh->b_this_page) {
block_end = block_start + blocksize; block_end = block_start + blocksize;
if (block_start >= len) { if (block_start >= len) {
clear_buffer_dirty(bh); clear_buffer_dirty(bh);
set_buffer_uptodate(bh); set_buffer_uptodate(bh);
continue; continue;
} }
clear_buffer_dirty(bh);
ret = io_submit_add_bh(io, io_page, inode, wbc, bh); ret = io_submit_add_bh(io, io_page, inode, wbc, bh);
if (ret) { if (ret) {
/* /*
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment