mm: fix s390 BUG by __set_page_dirty_no_writeback on swap
Mel reports a BUG_ON(slot == NULL) in radix_tree_tag_set() on s390 3.0.13: called from __set_page_dirty_nobuffers() when page_remove_rmap() tries to transfer dirty flag from s390 storage key to struct page and radix_tree. That would be because of reclaim's shrink_page_list() calling add_to_swap() on this page at the same time: first PageSwapCache is set (causing page_mapping(page) to appear as &swapper_space), then page->private set, then tree_lock taken, then page inserted into radix_tree - so there's an interval before taking the lock when the radix_tree slot is empty. We could fix this by moving __add_to_swap_cache()'s spin_lock_irq up before the SetPageSwapCache. But a better fix is simply to do what's five years overdue: Ken Chen introduced __set_page_dirty_no_writeback() (if !PageDirty TestSetPageDirty) for tmpfs to skip all the radix_tree overhead, and swap is just the same - it ignores the radix_tree tag, and does not participate in dirty page accounting, so should be using __set_page_dirty_no_writeback() too. s390 testing now confirms that this does indeed fix the problem. Reported-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Rik van Riel <riel@redhat.com> Cc: Ken Chen <kenchen@google.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Showing
Please register or sign in to comment