Commit ffb5387e authored by Eric Sandeen, committed by Theodore Ts'o

ext4: fix unjournaled inode bitmap modification

commit 119c0d44 changed
ext4_new_inode() such that the inode bitmap was being modified
outside a transaction, which could lead to corruption, and was
discovered when journal_checksum found a bad checksum in the
journal during log replay.

Nix ran into this when using the journal_async_commit mount
option, which enables journal checksumming.  The ensuing
journal replay failures due to the bad checksums led to
filesystem corruption reported as the now infamous
"Apparent serious progressive ext4 data corruption bug"

[ Changed by tytso to call ext4_journal_get_write_access() only
  when we're fairly certain that we're going to allocate the inode. ]

I've tested this by mounting with journal_checksum and
running fsstress then dropping power; I've also tested by
hacking DM to create snapshots w/o first quiescing, which
allows me to test journal replay repeatedly w/o actually
power-cycling the box.  Without the patch I hit a journal
checksum error every time.  With this fix it survives
many iterations.

Reported-by: Nix <nix@esperi.org.uk>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
parent 8f0d8163
@@ -725,6 +725,10 @@ struct inode *ext4_new_inode(handle_t *handle, struct inode *dir, umode_t mode,
 				   "inode=%lu", ino + 1);
 			continue;
 		}
+		BUFFER_TRACE(inode_bitmap_bh, "get_write_access");
+		err = ext4_journal_get_write_access(handle, inode_bitmap_bh);
+		if (err)
+			goto fail;
 		ext4_lock_group(sb, group);
 		ret2 = ext4_test_and_set_bit(ino, inode_bitmap_bh->b_data);
 		ext4_unlock_group(sb, group);
@@ -738,6 +742,11 @@ struct inode *ext4_new_inode(handle_t *handle, struct inode *dir, umode_t mode,
 	goto out;
 
 got:
+	BUFFER_TRACE(inode_bitmap_bh, "call ext4_handle_dirty_metadata");
+	err = ext4_handle_dirty_metadata(handle, NULL, inode_bitmap_bh);
+	if (err)
+		goto fail;
+
 	/* We may have to initialize the block bitmap if it isn't already */
 	if (ext4_has_group_desc_csum(sb) &&
 	    gdp->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) {
@@ -771,11 +780,6 @@ struct inode *ext4_new_inode(handle_t *handle, struct inode *dir, umode_t mode,
 			goto fail;
 	}
 
-	BUFFER_TRACE(inode_bitmap_bh, "get_write_access");
-	err = ext4_journal_get_write_access(handle, inode_bitmap_bh);
-	if (err)
-		goto fail;
-
 	BUFFER_TRACE(group_desc_bh, "get_write_access");
 	err = ext4_journal_get_write_access(handle, group_desc_bh);
 	if (err)
@@ -823,11 +827,6 @@ struct inode *ext4_new_inode(handle_t *handle, struct inode *dir, umode_t mode,
 	}
 	ext4_unlock_group(sb, group);
 
-	BUFFER_TRACE(inode_bitmap_bh, "call ext4_handle_dirty_metadata");
-	err = ext4_handle_dirty_metadata(handle, NULL, inode_bitmap_bh);
-	if (err)
-		goto fail;
-
 	BUFFER_TRACE(group_desc_bh, "call ext4_handle_dirty_metadata");
 	err = ext4_handle_dirty_metadata(handle, NULL, group_desc_bh);
 	if (err)