• Yan, Zheng's avatar
    ceph: introduce i_truncate_mutex · b0d7c223
    Yan, Zheng authored
    I encountered below deadlock when running fsstress
    
    wmtruncate work      truncate                 MDS
    ---------------  ------------------  --------------------------
                       lock i_mutex
                                          <- truncate file
    lock i_mutex (blocked)
                                          <- revoking Fcb (filelock to MIX)
                       send request ->
                                             handle request (xlock filelock)
    
    At the initial time, there are some dirty pages in the page cache.
    When the kclient receives the truncate message, it reduces inode size
    and creates some 'out of i_size' dirty pages. wmtruncate work can't
    truncate these dirty pages because it's blocked by the i_mutex. Later
    when the kclient receives the cap message that revokes Fcb caps, It
    can't flush all dirty pages because writepages() only flushes dirty
    pages within the inode size.
    
    When the MDS handles the 'truncate' request from kclient, it waits
    for the filelock to become stable. But the filelock is stuck in
    unstable state because it can't finish revoking kclient's Fcb caps.
    
    The truncate pagecache locking has already caused lots of trouble
    for use. I think it's time simplify it by introducing a new mutex.
    We use the new mutex to prevent concurrent truncate_inode_pages().
    There is no need to worry about race between buffered write and
    truncate_inode_pages(), because our "get caps" mechanism prevents
    them from concurrent execution.
    Reviewed-by: default avatarSage Weil <sage@inktank.com>
    Signed-off-by: default avatarYan, Zheng <zheng.z.yan@intel.com>
    b0d7c223
inode.c 50.8 KB