    jbd2: fix potential data lost in recovering journal raced with synchronizing fs bdev · 61187fce
    Zhihao Cheng authored
    JBD2 uses sync_blockdev() to make sure that journal data has reached the fs
    device. However, another process can intercept the EIO information from the
    bdev's mapping, so journal recovery appears to succeed even though EIO
    occurred while the data was being written back to the fs device.
    
    We found this problem in our product, where iscsi + multipath is used as
    the block device for ext4. An unstable network may trigger kpartx to rescan
    partitions in the device mapper layer. The detailed process is as follows:
    
      mount          kpartx          irq
    jbd2_journal_recover
     do_one_pass
      memcpy(nbh->b_data, obh->b_data) // copy data to fs dev from journal
      mark_buffer_dirty // mark bh dirty
             vfs_read
    	  generic_file_read_iter // dio
    	   filemap_write_and_wait_range
    	    __filemap_fdatawrite_range
    	     do_writepages
    	      block_write_full_folio
    	       submit_bh_wbc
    	            >>  EIO occurs in disk  <<
    	                     end_buffer_async_write
    			      mark_buffer_write_io_error
    			       mapping_set_error
    			        set_bit(AS_EIO, &mapping->flags) // set!
    	    filemap_check_errors
    	     test_and_clear_bit(AS_EIO, &mapping->flags) // clear!
     err2 = sync_blockdev
      filemap_write_and_wait
       filemap_check_errors
        test_and_clear_bit(AS_EIO, &mapping->flags) // false
     err2 = 0
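    
    The interleaving above can be sketched in user space. This is a toy
    model, not kernel code: struct toy_mapping, toy_set_error(),
    toy_check_errors() and TOY_EIO are hypothetical stand-ins for
    mapping->flags, mapping_set_error(), filemap_check_errors() and EIO.
    It only illustrates why a test-and-clear flag can be consumed by the
    first checker, assuming the order of events shown in the trace.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_EIO 5 /* stand-in for EIO; the value is illustrative only */

/* Toy stand-in for the AS_EIO bit in mapping->flags. */
struct toy_mapping {
    bool as_eio;
};

/* Models mapping_set_error(): a failed writeback records the error. */
static void toy_set_error(struct toy_mapping *m)
{
    m->as_eio = true;
}

/* Models filemap_check_errors(): report the error once, then clear it. */
static int toy_check_errors(struct toy_mapping *m)
{
    bool was_set = m->as_eio;

    m->as_eio = false;
    return was_set ? -TOY_EIO : 0;
}

/* Replays the trace in order and returns what the mount path observes. */
static int toy_mount_result_with_flag(void)
{
    struct toy_mapping m = { .as_eio = false };

    /* irq column: writeback of the recovered buffer fails. */
    toy_set_error(&m);

    /* kpartx column: the direct-IO read checks first and consumes the error. */
    int reader_err = toy_check_errors(&m);
    assert(reader_err == -TOY_EIO);

    /* mount column: sync_blockdev() checks second and finds nothing. */
    return toy_check_errors(&m);
}
```

    In this model toy_mount_result_with_flag() returns 0, which corresponds
    to the "err2 = 0" line at the end of the trace.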
    
    The filesystem is mounted successfully even though data from the journal
    failed to be written to disk, and ext4/ocfs2 could become corrupted.
    
    Fix it by comparing the wb_err state of the fs block device before and
    after recovery.
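    
    The before/after comparison can be illustrated with another user-space
    toy model. It is a simplification and not the patch itself: struct
    toy_wb_err, toy_record(), toy_sample() and toy_check_and_advance() are
    hypothetical names, and the kernel's real wb_err tracking is more
    involved than a plain counter. The point of the sketch is only that each
    observer keeps its own cursor, so one observer reading the error does
    not hide it from another.

```c
#include <assert.h>

#define TOY_EIO 5 /* stand-in for EIO; the value is illustrative only */

/* Toy error record: the latest error plus a counter bumped on each error. */
struct toy_wb_err {
    int err;
    unsigned int seq;
};

/* Record a writeback error and advance the counter. */
static void toy_record(struct toy_wb_err *w, int err)
{
    w->err = err;
    w->seq++;
}

/* Take a private snapshot of the counter (the "before" state). */
static unsigned int toy_sample(const struct toy_wb_err *w)
{
    return w->seq;
}

/*
 * Compare the caller's snapshot with the current counter. If an error was
 * recorded since the snapshot, return it and move the snapshot forward.
 * The shared record is never cleared.
 */
static int toy_check_and_advance(const struct toy_wb_err *w,
                                 unsigned int *since)
{
    if (*since == w->seq)
        return 0;
    *since = w->seq;
    return w->err;
}

/* Same order of events as the race, but with per-observer snapshots. */
static int toy_mount_result_with_snapshot(void)
{
    struct toy_wb_err w = { .err = 0, .seq = 0 };

    unsigned int mount_since = toy_sample(&w);  /* before recovery */
    unsigned int reader_since = toy_sample(&w); /* the other process */

    toy_record(&w, -TOY_EIO);                   /* writeback fails */

    /* The other process checks first; the record stays intact. */
    int reader_err = toy_check_and_advance(&w, &reader_since);
    assert(reader_err == -TOY_EIO);

    /* After recovery, the mount path still observes the error. */
    return toy_check_and_advance(&w, &mount_since);
}
```

    In this model toy_mount_result_with_snapshot() returns -TOY_EIO even
    though the other observer checked first, which is the behaviour the fix
    relies on.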
    
    A reproducer can be found in the kernel bugzilla referenced below.
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=217888
    Cc: stable@vger.kernel.org
    Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
    Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Link: https://lore.kernel.org/r/20230919012525.1783108-1-chengzhihao1@huawei.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
recovery.c 24.2 KB