Commit 3c776e64 authored by Daisuke Nishimura's avatar Daisuke Nishimura Committed by Linus Torvalds

memcg: charge swapcache to proper memcg

memcg_test.txt says at 4.1:

	This swap-in is one of the most complicated work. In do_swap_page(),
	following events occur when pte is unchanged.

	(1) the page (SwapCache) is looked up.
	(2) lock_page()
	(3) try_charge_swapin()
	(4) reuse_swap_page() (may call delete_swap_cache())
	(5) commit_charge_swapin()
	(6) swap_free().

	Considering following situation for example.

	(A) The page has not been charged before (2) and reuse_swap_page()
	    doesn't call delete_from_swap_cache().
	(B) The page has not been charged before (2) and reuse_swap_page()
	    calls delete_from_swap_cache().
	(C) The page has been charged before (2) and reuse_swap_page() doesn't
	    call delete_from_swap_cache().
	(D) The page has been charged before (2) and reuse_swap_page() calls
	    delete_from_swap_cache().

	    memory.usage/memsw.usage changes to this page/swp_entry will be
	 Case          (A)      (B)       (C)     (D)
         Event
       Before (2)     0/ 1     0/ 1      1/ 1    1/ 1
          ===========================================
          (3)        +1/+1    +1/+1     +1/+1   +1/+1
          (4)          -       0/ 0       -     -1/ 0
          (5)         0/-1     0/ 0     -1/-1    0/ 0
          (6)          -       0/-1       -      0/-1
          ===========================================
       Result         1/ 1     1/ 1      1/ 1    1/ 1

       In any cases, charges to this page should be 1/ 1.

In case of (D), mem_cgroup_try_get_from_swapcache() returns NULL
(because lookup_swap_cgroup() returns NULL), so "+1/+1" at (3) means
charges to the memcg("foo") to which the "current" belongs.
OTOH, "-1/0" at (4) and "0/-1" at (6) means uncharges from the memcg("baa")
to which the page has been charged.

So, if the "foo" and "baa" is different(for example because of task move),
this charge will be moved from "baa" to "foo".

I think this is an unexpected behavior.

This patch fixes this by modifying mem_cgroup_try_get_from_swapcache()
to return the memcg to which the swapcache has been charged if PCG_USED bit
is set.
IIUC, checking PCG_USED bit of swapcache is safe under page lock.
Signed-off-by: default avatarDaisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
parent 3918b96e
...@@ -994,13 +994,24 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm, ...@@ -994,13 +994,24 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm,
static struct mem_cgroup *try_get_mem_cgroup_from_swapcache(struct page *page) static struct mem_cgroup *try_get_mem_cgroup_from_swapcache(struct page *page)
{ {
struct mem_cgroup *mem; struct mem_cgroup *mem;
struct page_cgroup *pc;
swp_entry_t ent; swp_entry_t ent;
VM_BUG_ON(!PageLocked(page));
if (!PageSwapCache(page)) if (!PageSwapCache(page))
return NULL; return NULL;
pc = lookup_page_cgroup(page);
/*
* Used bit of swapcache is solid under page lock.
*/
if (PageCgroupUsed(pc))
mem = pc->mem_cgroup;
else {
ent.val = page_private(page); ent.val = page_private(page);
mem = lookup_swap_cgroup(ent); mem = lookup_swap_cgroup(ent);
}
if (!mem) if (!mem)
return NULL; return NULL;
if (!css_tryget(&mem->css)) if (!css_tryget(&mem->css))
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment