• Zhang Yi's avatar
    ext4: ext4_read_bh_lock() should submit IO if the buffer isn't uptodate · 0b73284c
    Zhang Yi authored
    Recently we notice that ext4 filesystem would occasionally fail to read
    metadata from disk and report error message, but the disk and block
    layer looks fine. After analyse, we lockon commit 88dbcbb3
    ("blkdev: avoid migration stalls for blkdev pages"). It provide a
    migration method for the bdev, we could move page that has buffers
    without extra users now, but it lock the buffers on the page, which
    breaks the fragile metadata read operation on ext4 filesystem,
    ext4_read_bh_lock() was copied from ll_rw_block(), it depends on the
    assumption of that locked buffer means it is under IO. So it just
    trylock the buffer and skip submit IO if it lock failed, after
    wait_on_buffer() we conclude IO error because the buffer is not
    uptodate.
    
    This issue could be easily reproduced by add some delay just after
    buffer_migrate_lock_buffers() in __buffer_migrate_folio() and do
    fsstress on ext4 filesystem.
    
      EXT4-fs error (device pmem1): __ext4_find_entry:1658: inode #73193:
      comm fsstress: reading directory lblock 0
      EXT4-fs error (device pmem1): __ext4_find_entry:1658: inode #75334:
      comm fsstress: reading directory lblock 0
    
    Fix it by removing the trylock logic in ext4_read_bh_lock(), just lock
    the buffer and submit IO if it's not uptodate, and also leave over
    readahead helper.
    
    Cc: stable@kernel.org
    Signed-off-by: default avatarZhang Yi <yi.zhang@huawei.com>
    Reviewed-by: default avatarJan Kara <jack@suse.cz>
    Link: https://lore.kernel.org/r/20220831074629.3755110-1-yi.zhang@huawei.comSigned-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
    0b73284c
super.c 199 KB