    memcg: reparent charges of children before processing parent · 4fb1a86f
    Filipe Brandenburger authored
    Sometimes the cleanup after memcg hierarchy testing gets stuck in
    mem_cgroup_reparent_charges(), unable to bring non-kmem usage down to 0.
    
    There may turn out to be several causes, but a major cause is this: the
    work item to offline the parent can run before the work item to offline
    the child; the parent's mem_cgroup_reparent_charges() loops waiting for
    the child's pages to be reparented to its LRUs, but it is holding
    cgroup_mutex, which prevents the child from reaching its own
    mem_cgroup_reparent_charges().
    
    Further testing showed that an ordered workqueue for cgroup_destroy_wq
    is not always good enough: the call_rcu_sched() stage of
    percpu_ref_kill_and_confirm() can change the order before the work
    items reach the workqueue.
    
    Instead, when offlining a memcg, call mem_cgroup_reparent_charges() on
    all its children (and grandchildren, in the correct order) to have their
    charges reparented first.
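
    The ordering described above can be illustrated with a small user-space
    model. This is only a sketch, not the kernel change itself: struct group,
    group_add_child(), reparent_charges() and offline_group() are hypothetical
    names invented for illustration, and a plain counter stands in for a
    memcg's charges. It shows a post-order walk in which every descendant
    hands its charges to its parent before the group being offlined does.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 4

/* Hypothetical stand-in for a memcg: only the fields this sketch needs. */
struct group {
	struct group *parent;
	struct group *children[MAX_CHILDREN];
	size_t nr_children;
	long usage;	/* pages charged directly to this group */
};

/* Link child under parent in the model hierarchy. */
static void group_add_child(struct group *parent, struct group *child)
{
	assert(parent->nr_children < MAX_CHILDREN);
	child->parent = parent;
	parent->children[parent->nr_children++] = child;
}

/* Move every charge held by g up to its parent, leaving g empty. */
static void reparent_charges(struct group *g)
{
	if (!g->parent)
		return;	/* the root has nowhere to move its charges */
	g->parent->usage += g->usage;
	g->usage = 0;
}

/*
 * Post-order walk: children and grandchildren are drained first, so by
 * the time g itself is processed no descendant still holds a charge.
 */
static void offline_group(struct group *g)
{
	for (size_t i = 0; i < g->nr_children; i++)
		offline_group(g->children[i]);
	reparent_charges(g);
}
```

    In this model, offlining a group that has a child and a grandchild
    leaves all three with zero usage and moves the sum of their charges to
    the nearest surviving ancestor, regardless of the order in which the
    individual offline requests would otherwise have been processed.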
    
    Fixes: e5fca243 ("cgroup: use a dedicated workqueue for cgroup destruction")
    Signed-off-by: Filipe Brandenburger <filbranden@google.com>
    Signed-off-by: Hugh Dickins <hughd@google.com>
    Reviewed-by: Tejun Heo <tj@kernel.org>
    Acked-by: Michal Hocko <mhocko@suse.cz>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: <stable@vger.kernel.org>	[v3.10+]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
memcontrol.c 194 KB