    fuse: fix deadlock between atomic O_TRUNC and page invalidation · 2fdbb8dd
    Miklos Szeredi authored
    fuse_finish_open() is called with FUSE_NOWRITE set in the case of an atomic
    O_TRUNC open(), so commit 76224355 ("fuse: truncate pagecache on
    atomic_o_trunc") replaced invalidate_inode_pages2() with truncate_pagecache()
    in that case to avoid an A-A deadlock.  However, we found another A-B-B-A
    deadlock related to the case above, which causes the xfstests generic/464
    testcase to hang in our virtio-fs test environment.
    
    For example, consider two processes concurrently opening the same file, one
    with O_TRUNC and the other without.  The deadlock is described below: if
    open(O_TRUNC) has already called fuse_set_nowrite() (acquired A) and is
    trying to lock a page (acquiring B), while open() already holds the page
    lock (acquired B) and is waiting on the page writeback (acquiring A), the
    two processes deadlock.
    
    open(O_TRUNC)
    ----------------------------------------------------------------
    fuse_open_common
      inode_lock            [C acquire]
      fuse_set_nowrite      [A acquire]
    
      fuse_finish_open
        truncate_pagecache
          lock_page         [B acquire]
          truncate_inode_page
          unlock_page       [B release]
    
      fuse_release_nowrite  [A release]
      inode_unlock          [C release]
    ----------------------------------------------------------------
    
    open()
    ----------------------------------------------------------------
    fuse_open_common
      fuse_finish_open
        invalidate_inode_pages2
          lock_page         [B acquire]
            fuse_launder_page
              fuse_wait_on_page_writeback [A acquire & release]
          unlock_page       [B release]
    ----------------------------------------------------------------
    
    Besides this case, all calls to invalidate_inode_pages2() and
    invalidate_inode_pages2_range() in the fuse code can also deadlock with
    open(O_TRUNC).
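
The two call chains above can be checked mechanically for an ordering inversion. The following is a minimal userspace sketch, not kernel code: it treats A (nowrite / writeback wait), B (page lock) and C (inode lock) as abstract resources, and the names `struct step`, `add_path()` and `has_inversion()` are hypothetical helpers introduced only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Abstract resources from the diagrams: A = nowrite, B = page lock, C = inode lock. */
enum res { RES_A, RES_B, RES_C, RES_COUNT };

/* One step of a call chain: acquire (true) or release (false) a resource. */
struct step {
	enum res r;
	bool acquire;
};

/* open(O_TRUNC) as drawn above: C, then A, then B nested inside A. */
static const struct step open_trunc_path[] = {
	{ RES_C, true }, { RES_A, true }, { RES_B, true },
	{ RES_B, false }, { RES_A, false }, { RES_C, false },
};

/* Plain open() as drawn above: B, then A nested inside B. */
static const struct step open_plain_path[] = {
	{ RES_B, true }, { RES_A, true }, { RES_A, false }, { RES_B, false },
};

/* edge[x][y] is true if some path acquired y while already holding x. */
static bool edge[RES_COUNT][RES_COUNT];

/* Walk one path and record every "held -> newly acquired" ordering. */
static void add_path(const struct step *steps, size_t n)
{
	bool held[RES_COUNT] = { false };

	for (size_t i = 0; i < n; i++) {
		enum res r = steps[i].r;

		if (!steps[i].acquire) {
			held[r] = false;
			continue;
		}
		for (int h = 0; h < RES_COUNT; h++)
			if (held[h])
				edge[h][r] = true;
		held[r] = true;
	}
}

/* Two resources taken in both orders by different paths can deadlock. */
static bool has_inversion(enum res x, enum res y)
{
	return edge[x][y] && edge[y][x];
}
```

Feeding both paths to `add_path()` records A before B from open(O_TRUNC) and B before A from open(), so `has_inversion(RES_A, RES_B)` reports the A-B-B-A pattern, while C is never involved in an inversion.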
    
    Fix by moving the truncate_pagecache() call outside the nowrite protected
    region.  The nowrite protection is only needed for the delayed writeback
    (writeback_cache) case, where the inode lock does not protect against
    truncation racing with writes on the server.  Write syscalls racing with
    page cache truncation still get the inode lock protection.
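
The effect of the reordering can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the actual fuse code: `struct model_inode` and the `model_*` functions are hypothetical stand-ins that track whether the page lock (B) is ever taken while nowrite (A) is held, and whether truncation still happens under the inode lock (C).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the inode state relevant to this fix. */
struct model_inode {
	bool inode_locked;                 /* C in the diagrams */
	bool nowrite;                      /* A in the diagrams */
	bool page_locked_under_nowrite;    /* B was taken while A was held */
	bool truncated_without_inode_lock; /* truncation ran without C */
};

/* Models truncate_pagecache(): lock_page() .. unlock_page() would happen here. */
static void model_truncate_pagecache(struct model_inode *mi)
{
	if (mi->nowrite)
		mi->page_locked_under_nowrite = true;
	if (!mi->inode_locked)
		mi->truncated_without_inode_lock = true;
}

/* Old ordering: truncation sits inside the nowrite protected region. */
static void model_open_trunc_old(struct model_inode *mi)
{
	mi->inode_locked = true;
	mi->nowrite = true;
	model_truncate_pagecache(mi);
	mi->nowrite = false;
	mi->inode_locked = false;
}

/* New ordering: nowrite is released first, truncation stays under the inode lock. */
static void model_open_trunc_new(struct model_inode *mi)
{
	mi->inode_locked = true;
	mi->nowrite = true;
	/* The open request would be sent to the server here. */
	mi->nowrite = false;
	model_truncate_pagecache(mi);
	mi->inode_locked = false;
}
```

In this model the old ordering sets `page_locked_under_nowrite`, which is the A-then-B nesting that deadlocks against fuse_launder_page(), while the new ordering never sets it and still truncates with the inode lock held.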
    
    This patch also changes the order of filemap_invalidate_lock()
    vs. fuse_set_nowrite() in fuse_open_common().  This new order matches the
    order found in fuse_file_fallocate() and fuse_do_setattr().

    Reported-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
    Tested-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
    Fixes: e4648309 ("fuse: truncate pending writes on O_TRUNC")
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>