• Jan Kara's avatar
    ext4: fix data exposure after a crash · 06bd3c36
    Jan Kara authored
    Huang has reported that in his powerfail testing he is seeing stale
    block contents in some of recently allocated blocks although he mounts
    ext4 in data=ordered mode. After some investigation I have found out
    that indeed when delayed allocation is used, we don't add inode to
    transaction's list of inodes needing flushing before commit. Originally
    we were doing that but commit f3b59291 removed the logic with a
    flawed argument that it is not needed.
    
    The problem is that although for delayed allocated blocks we write their
    contents immediately after allocating them, there is no guarantee that
    the IO scheduler or device doesn't reorder things and thus transaction
    allocating blocks and attaching them to inode can reach stable storage
    before actual block contents. Actually whenever we attach freshly
    allocated blocks to inode using a written extent, we should add inode to
    transaction's ordered inode list to make sure we properly wait for block
    contents to be written before committing the transaction. So that is
    what we do in this patch. This also handles other cases where stale data
    exposure was possible - like filling hole via mmap in
    data=ordered,nodelalloc mode.
    
    The only exception to the above rule are extending direct IO writes where
    blkdev_direct_IO() waits for IO to complete before increasing i_size and
    thus stale data exposure is not possible. For now we don't complicate
    the code with optimizing this special case since the overhead is pretty
    low. In case this is observed to be a performance problem we can always
    handle it using a special flag to ext4_map_blocks().
    
    CC: stable@vger.kernel.org
    Fixes: f3b59291Reported-by: default avatar"HUANG Weller (CM/ESW12-CN)" <Weller.Huang@cn.bosch.com>
    Tested-by: default avatar"HUANG Weller (CM/ESW12-CN)" <Weller.Huang@cn.bosch.com>
    Signed-off-by: default avatarJan Kara <jack@suse.cz>
    Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
    06bd3c36
inode.c 164 KB