[PATCH] always update page->flags atomically
move_from_swap_cache() and move_to_swap_cache() are playing with page->flags nonatomically. The page is on the LRU at the time and another CPU could be altering page->flags concurrently. The patch converts those functions to use atomic operations. It also rationalises the number of bits which are cleared. It's not really clear to me what page flags we really want to set to a known state in there. It had no right to go clearing PG_arch_1. I'm now clearing PG_arch_1 inside rmqueue() which is still a bit presumptious. btw: shmem uses PAGE_CACHE_SIZE and swapper_space uses PAGE_SIZE. I've been carefully maintaining the distinction, but it looks like shmem will break if we ever do make these values different. Also, __add_to_page_cache() was performing a non-atomic RMW against page->flags, under the assumption that it was a newly allocated page which no other CPU would look at. Not true - this function is used for moving anon pages into swapcache. Those anon pages are on the LRU - other CPUs can be performing operations against page->flags while __add_to_swap_cache is stomping on them. This had me running around in circles for two days. So let's move the initialisation of the page state into rmqueue(), where the page really is new (could do it in page_cache_alloc, perhaps). The SetPageLocked() in __add_to_page_cache() is also rather curious. Seems OK for both pagecache and swapcache so I covered that with a comment. 2.4 has the same problem. Basically, add_to_swap_cache() can stomp on another CPU's manipulation of page->flags. After a quick review of the code there, it is barely conceivable that a concurrent refill_inactve() could get its PG_referenced and PG_active bits scribbled on. Rather unlikely because swap_out() will probably see PageActive() and bale out. Also, mark_dirty_kiobuf() could have its PG_dirty bit accidentally cleared (but try_to_swap_out() sets it again later). But there may be other code paths. Really, I think this needs fixing in 2.4 - it's horrid.
Showing
Please register or sign in to comment