Commit 357f5a5e authored by Andrew Morton, committed by Linus Torvalds

[PATCH] mark swapout pages PageWriteback()

Pages which are under writeout to swap are locked, and not
PageWriteback().  So page allocators do not throttle against them in
shrink_caches().

This causes enormous list scans and general coma under really heavy
swapout loads.

One fix would be to teach shrink_cache() to wait on PG_locked for swap
pages.  The other approach is to set both PG_locked and PG_writeback
for swap pages so they can be handled in the same manner as file-backed
pages in shrink_cache().

This patch takes the latter approach.
parent bd052817
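
For context, here is a minimal sketch of the throttling behaviour this change enables. This is simplified user-space C, not kernel code: struct page and the helpers below are stand-ins for the kernel's page-flag API, and scan_for_reclaim() is a hypothetical loop in the spirit of shrink_cache().

#include <stdbool.h>

/* Simplified stand-ins for the kernel's page flags. */
struct page {
	bool locked;		/* PG_locked: someone owns the page */
	bool writeback;		/* PG_writeback: write I/O in flight */
};

static void set_page_writeback(struct page *page)
{
	page->writeback = true;
}

static void end_page_writeback(struct page *page)
{
	page->writeback = false;	/* the kernel also wakes waiters here */
}

/*
 * Hypothetical scanner in the spirit of shrink_cache().  Before this
 * patch, a swap page under writeout was locked but not writeback, so
 * the test below never fired for it: the scanner skipped the page and
 * kept rescanning the same busy pages.  With PG_writeback also set,
 * the allocator throttles (waits for the I/O) instead.
 */
static void scan_for_reclaim(struct page *page)
{
	if (page->writeback) {
		while (page->writeback)
			;	/* the kernel sleeps on a waitqueue here */
		return;
	}
	/* ...otherwise the page is a candidate for reclaim... */
}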
@@ -544,6 +544,14 @@ static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
 	 */
 	if (page_uptodate && !PageError(page))
 		SetPageUptodate(page);
+
+	/*
+	 * swap page handling is a bit hacky.  A standalone completion handler
+	 * for swapout pages would fix that up.  swapin can use this function.
+	 */
+	if (PageSwapCache(page) && PageWriteback(page))
+		end_page_writeback(page);
+
 	unlock_page(page);
 	return;
@@ -2271,6 +2279,9 @@ int brw_kiovec(int rw, int nr, struct kiobuf *iovec[],
  * calls block_flushpage() under spinlock and hits a locked buffer, and
  * schedules under spinlock.  Another approach would be to teach
  * find_trylock_page() to also trylock the page's writeback flags.
+ *
+ * Swap pages are also marked PageWriteback when they are being written
+ * so that memory allocators will throttle on them.
  */
 int brw_page(int rw, struct page *page,
 		struct block_device *bdev, sector_t b[], int size)
@@ -2301,6 +2312,11 @@ int brw_page(int rw, struct page *page,
 		bh = bh->b_this_page;
 	} while (bh != head);
 
+	if (rw == WRITE) {
+		BUG_ON(PageWriteback(page));
+		SetPageWriteback(page);
+	}
+
 	/* Stage 2: start the IO */
 	do {
 		struct buffer_head *next = bh->b_this_page;
...
@@ -36,10 +36,8 @@ static int swap_writepage(struct page *page)
  * swapper_space doesn't have a real inode, so it gets a special vm_writeback()
  * so we don't need swap special cases in generic_vm_writeback().
  *
- * FIXME: swap pages are locked, but not PageWriteback while under writeout.
- * This will confuse throttling in shrink_cache().  It may be advantageous to
- * set PG_writeback against swap pages while they're also locked.  Either that,
- * or special-case swap pages in shrink_cache().
+ * Swap pages are PageLocked and PageWriteback while under writeout so that
+ * memory allocators will throttle against them.
  */
 static int swap_vm_writeback(struct page *page, int *nr_to_write)
 {
...
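
Taken together, the brw_page() and end_buffer_async_read() hunks above are two halves of one invariant: a PG_writeback flag set at write submission must be cleared exactly once at I/O completion, or throttled allocators would wait forever. Here is a sketch of that pairing, reusing the stand-ins from the earlier example (swap_write_submit() and swap_write_complete() are hypothetical names, not functions from this patch):

/* Submission side, mirroring the brw_page() hunk. */
static void swap_write_submit(struct page *page)
{
	/* the caller already holds PG_locked; double writeback is a bug */
	set_page_writeback(page);
	/* ...queue the page's buffers for disk I/O... */
}

/* Completion side, mirroring the end_buffer_async_read() hunk. */
static void swap_write_complete(struct page *page)
{
	end_page_writeback(page);	/* allocators may stop throttling */
	page->locked = false;		/* unlock_page() */
}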