[PATCH] Reduce i_sem usage during file sync operations
We hold i_sem during the various sync() operations to prevent livelocks: if another thread is dirtying the file, a sync() may never return. Or at least, that used to be true when we were using the per-address_space page lists. Since writeback has used radix tree traversal it is not possible to livelock the sync() operations, because they only visit each page a single time. sync_page_range() (used by O_SYNC writes) has not been holding i_sem for quite some time, for the above reasons. The patch converts fsync(), fdatasync() and msync() to also not hold i_sem during the radix-tree-based writeback. Now, we _do_ still need to hold i_sem across the file->f_op->fsync() call, because that is still based on a list_head walk, and is still livelockable. But in the case of msync() I deliberately left i_sem untaken. This is because we're currently deadlockable in msync, because mmap_sem is already held, and mmap_sem nexts inside i_sem, due to direct-io.c. And yes, the ranking of down_read() veruss down() does matter: Task A Task B Task C down_read(rwsem) down(sem) down_write(rwsem) down(sem) down_read(rwsem) C's down_write() will cause B's down_read to block. B holds `sem', so A will never release `rwsem'. So the patch fixes a hard-to-hit triple-task deadlock, but adds a possible livelock in msync(). It is possible to fix sys_msync() so that it takes i_sem outside i_mmap_sem. Later. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Showing
Please register or sign in to comment