Commit eebef30c authored by Andrew Morton, committed by Linus Torvalds

[PATCH] dirty inode writeback fix

Both sys_sync() and the kupdate function need to precalculate the number of
pages which they are prepared to write, mainly for livelock avoidance.

But they also must write inodes, and dirty inodes do not contribute to dirty
page accounting (oops).  Net effect: when there are lots of dirty inodes and
few dirty pages, we forget to write inodes.
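The failure mode can be illustrated with a small user-space model. This is only a sketch: the struct and function names below are hypothetical stand-ins, not kernel code, and the loop merely mimics a writeback pass that stops once its precalculated budget is used up.

```c
/* Hypothetical stand-in for the kernel's dirty page statistics. */
struct page_stats_model {
	long nr_dirty;
	long nr_unstable;
};

/* Old-style budget: counts dirty and unstable pages only, ignoring inodes. */
long page_only_budget(const struct page_stats_model *ps)
{
	return ps->nr_dirty + ps->nr_unstable;
}

/*
 * Model of a budget-driven writeback pass. Each iteration writes at most
 * `chunk` inodes, never more than the remaining budget. Returns the number
 * of inodes written before the budget or the dirty inodes ran out.
 */
long model_writeback_pass(long budget, long dirty_inodes, long chunk)
{
	long written = 0;

	while (budget > 0 && dirty_inodes > 0) {
		long n = chunk;

		if (n > budget)
			n = budget;
		if (n > dirty_inodes)
			n = dirty_inodes;
		written += n;
		dirty_inodes -= n;
		budget -= n;
	}
	return written;
}
```

In this model, zero dirty pages yields a budget of zero, so the loop body never runs and none of the dirty inodes are written, however many there are.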

This mainly affects atime updates, because most other inode-dirtying activity
will generate dirty pages too.

It mainly affects ext2.


Now, writing an ext2 inode will just dirty the underlying blockdev pagecache
page.  So the patch assumes that writing one inode will dirty up to one
pagecache page, and adds (inodes_stat.nr_inodes - inodes_stat.nr_unused),
the number of in-use inodes, into the number of pages to be written.
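As a worked example of the arithmetic, the two budgets computed by the patch can be sketched in user space. The structs and function names here are hypothetical models; only the field names and the formulas are taken from the diff.

```c
/* Hypothetical models of the page and inode counters read by the patch. */
struct wb_page_counts {
	long nr_dirty;
	long nr_unstable;
};

struct wb_inode_counts {
	long nr_inodes;	/* all allocated inodes */
	long nr_unused;	/* inodes not in use, so certainly not dirty */
};

/* Upper bound on dirty inodes: every in-use inode might be dirty. */
long potentially_dirty_inodes(const struct wb_inode_counts *is)
{
	return is->nr_inodes - is->nr_unused;
}

/* Budget as computed in the wb_kupdate() hunk: pages plus in-use inodes. */
long kupdate_budget(const struct wb_page_counts *ps,
		    const struct wb_inode_counts *is)
{
	return ps->nr_dirty + ps->nr_unstable + potentially_dirty_inodes(is);
}

/*
 * Budget as computed in the sync_inodes_sb() hunk: the page count is added
 * twice, the in-use inode count once, and the total is then raised by half.
 */
long sync_budget(const struct wb_page_counts *ps,
		 const struct wb_inode_counts *is)
{
	long pages = ps->nr_dirty + ps->nr_unstable;
	long n = pages + potentially_dirty_inodes(is) + pages;

	n += n / 2;	/* "Bit more for luck" */
	return n;
}
```

With 10 dirty pages, 2 unstable pages, 1000 inodes and 900 unused inodes, the kupdate budget is 12 + 100 = 112, and the sync budget is (12 + 100 + 12) * 3 / 2 = 186. With no dirty pages at all, both budgets are still nonzero as long as any inode is in use, which is the point of the fix.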

I considered creating inodes_stat.nr_dirty.  It looks fairly messy, needing
to know not to account for memory-backed inodes, etc.  But it is probably a
better thing to do.
parent 9d54df6e
@@ -369,6 +369,9 @@ writeback_inodes(struct writeback_control *wbc)
  *
  * A finite limit is set on the number of pages which will be written.
  * To prevent infinite livelock of sys_sync().
+ *
+ * We add in the number of potentially dirty inodes, because each inode write
+ * can dirty pagecache in the underlying blockdev.
  */
 void sync_inodes_sb(struct super_block *sb, int wait)
 {
@@ -382,7 +385,9 @@ void sync_inodes_sb(struct super_block *sb, int wait)
 	get_page_state(&ps);
 	wbc.nr_to_write = ps.nr_dirty + ps.nr_unstable +
-			(ps.nr_dirty + ps.nr_unstable) / 4;
+			(inodes_stat.nr_inodes - inodes_stat.nr_unused) +
+			ps.nr_dirty + ps.nr_unstable;
+	wbc.nr_to_write += wbc.nr_to_write / 2;		/* Bit more for luck */
 	spin_lock(&inode_lock);
 	sync_sb_inodes(sb, &wbc);
 	spin_unlock(&inode_lock);

@@ -323,7 +323,8 @@ static void wb_kupdate(unsigned long arg)
 	oldest_jif = jiffies - (dirty_expire_centisecs * HZ) / 100;
 	start_jif = jiffies;
 	next_jif = start_jif + (dirty_writeback_centisecs * HZ) / 100;
-	nr_to_write = ps.nr_dirty + ps.nr_unstable;
+	nr_to_write = ps.nr_dirty + ps.nr_unstable +
+			(inodes_stat.nr_inodes - inodes_stat.nr_unused);
 	while (nr_to_write > 0) {
 		wbc.encountered_congestion = 0;
 		wbc.nr_to_write = MAX_WRITEBACK_PAGES;