• Dmitry Monakhov's avatar
    ext4: clean up online defrag bugs in move_extent_per_page() · bb557488
    Dmitry Monakhov authored
    Non-full list of bugs:
    1) uninitialized extent optimization does not hold page's lock,
       and simply replace brunches after that writeback code goes
       crazy because block mapping changed under it's feets
       kernel BUG at fs/ext4/inode.c:1434!  ( 288'th xfstress)
    
    2) uninitialized extent may became initialized right after we
       drop i_data_sem, so extent state must be rechecked
    
    3) Locked pages goes uptodate via following sequence:
       ->readpage(page); lock_page(page); use_that_page(page)
       But after readpage() one may invalidate it because it is
       uptodate and unlocked (reclaimer does that)
       As result kernel bug at include/linux/buffer_head.c:133!
    
    4) We call write_begin() with already opened stansaction which
       result in following deadlock:
    ->move_extent_per_page()
      ->ext4_journal_start()-> hold journal transaction
      ->write_begin()
        ->ext4_da_write_begin()
          ->ext4_nonda_switch()
            ->writeback_inodes_sb_if_idle()  --> will wait for journal_stop()
    
    5) try_to_release_page() may fail and it does fail if one of page's bh was
       pinned by journal
    
    6) If we about to change page's mapping we MUST hold it's lock during entire
       remapping procedure, this is true for both pages(original and donor one)
    
    Fixes:
    
    - Avoid (1) and (2) simply by temproraly drop uninitialized extent handling
      optimization, this will be reimplemented later.
    
    - Fix (3) by manually forcing page to uptodate state w/o dropping it's lock
    
    - Fix (4) by rearranging existing locking:
      from: journal_start(); ->write_begin
      to: write_begin(); journal_extend()
    - Fix (5) simply by checking retvalue
    - Fix (6) by locking both (original and donor one) pages during extent swap
      with help of mext_page_double_lock()
    Signed-off-by: default avatarDmitry Monakhov <dmonakhov@openvz.org>
    Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
    bb557488
move_extent.c 42.3 KB