Commit 50d49a05 authored by Andrew Morton's avatar Andrew Morton Committed by Linus Torvalds

[PATCH] Fix SMP race betwen __sync_single_inode and

Patch from Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>

there's a SMP race condition between __sync_single_inode (or __sync_one on
2.4.20) and __mark_inode_dirty. __mark_inode_dirty doesn't take inode
spinlock. As we know -- unless you take a spinlock or use barrier,
processor can change order of instructions.

CPU 1

modify inode
(but modifications are in cpu-local
buffer and do not go to bus)

calls
__mark_inode_dirty
it sees I_DIRTY and exits immediatelly
					CPU 2
					takes spinlock
					calls __sync_single_inode
					inode->i_state &= ~I_DIRTY
					writes the inode (but does not see
					modifications by CPU 1 yet)

CPU 1 flushes its write buffer to the bus
inode is already written, clean, modifications
done by CPU1 are lost

The easiest fix would be to move the test inside spinlock in
__mark_inode_dirty; if you do not want to suffer from performance loss,
use the attached patches that use memory barriers to ensure ordering of
reads and writes.
parent 0b316620
...@@ -61,6 +61,12 @@ void __mark_inode_dirty(struct inode *inode, int flags) ...@@ -61,6 +61,12 @@ void __mark_inode_dirty(struct inode *inode, int flags)
sb->s_op->dirty_inode(inode); sb->s_op->dirty_inode(inode);
} }
/*
* make sure that changes are seen by all cpus before we test i_state
* -- mikulas
*/
smp_mb();
/* avoid the locking if we can */ /* avoid the locking if we can */
if ((inode->i_state & flags) == flags) if ((inode->i_state & flags) == flags)
return; return;
...@@ -137,6 +143,12 @@ __sync_single_inode(struct inode *inode, struct writeback_control *wbc) ...@@ -137,6 +143,12 @@ __sync_single_inode(struct inode *inode, struct writeback_control *wbc)
inode->i_state |= I_LOCK; inode->i_state |= I_LOCK;
inode->i_state &= ~I_DIRTY; inode->i_state &= ~I_DIRTY;
/*
* smp_rmb(); note: if you remove write_lock below, you must add this.
* mark_inode_dirty doesn't take spinlock, make sure that inode is not
* read speculatively by this cpu before &= ~I_DIRTY -- mikulas
*/
write_lock(&mapping->page_lock); write_lock(&mapping->page_lock);
if (wait || !wbc->for_kupdate || list_empty(&mapping->io_pages)) if (wait || !wbc->for_kupdate || list_empty(&mapping->io_pages))
list_splice_init(&mapping->dirty_pages, &mapping->io_pages); list_splice_init(&mapping->dirty_pages, &mapping->io_pages);
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment