    mm: memcontrol: fix recursive statistics correctness & scalability · 42a30035
    Johannes Weiner authored
    Right now, when somebody needs to know the recursive memory statistics
    and events of a cgroup subtree, they need to walk the entire subtree and
    sum up the counters manually.
    
    There are two issues with this:
    
    1. When a cgroup gets deleted, its stats are lost. The state counters
       should all be 0 at that point, of course, but the events are not.
       When this happens, the event counters, which are supposed to be
       monotonic, can go backwards in the parent cgroups.
    
    2. During regular operation, we always have a certain number of lazily
       freed cgroups sitting around that have been deleted, have no tasks,
       but have a few cache pages remaining. These groups' statistics do not
       change until we eventually hit memory pressure, but somebody
       watching, say, memory.stat on an ancestor has to iterate those every
       time.
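To make the two issues concrete, here is a minimal userspace sketch of a read-side subtree walk. It is not kernel code: `struct node`, `subtree_events()` and `remove_last_child()` are hypothetical names invented for illustration. Every read visits every descendant, and once a child is removed its event count vanishes from the ancestor's total.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 4  /* hypothetical fan-out limit for the sketch */

/* Hypothetical stand-in for a cgroup with one local event counter. */
struct node {
    struct node *child[MAX_CHILDREN];
    int nr_children;
    unsigned long events;  /* local counter, only ever incremented */
};

/* Read side: compute the recursive total by walking the whole subtree. */
static unsigned long subtree_events(const struct node *n)
{
    unsigned long sum = n->events;

    for (int i = 0; i < n->nr_children; i++)
        sum += subtree_events(n->child[i]);
    return sum;
}

/* Deleting a child drops its counters together with the node. */
static void remove_last_child(struct node *parent)
{
    assert(parent->nr_children > 0);
    parent->child[--parent->nr_children] = NULL;
}
```

With a root holding 1 event and two children holding 5 and 7, the walk reports 13. After the second child is removed it reports 6, so the supposedly monotonic total has gone backwards.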
    
    This patch addresses both issues by introducing recursive counters at
    each level that are propagated from the write side when stats change.
    
    Upward propagation happens when the per-cpu caches spill over into the
    local atomic counter.  This is the same thing we do during charge and
    uncharge, except that the latter uses atomic RMWs, which are more
    expensive; stat changes happen at around the same rate.  In a sparse
    file test (page faults and reclaim at maximum CPU speed) with 5 cgroup
    nesting levels, perf shows __mod_memcg_page state at ~1%.
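The write-side scheme described above can be sketched in userspace C roughly as follows. This is an illustration under stated assumptions, not the code of the patch: `struct group`, `mod_state()`, `NR_CPUS` and `BATCH` are hypothetical, the per-CPU cache is modelled as a plain array indexed by a caller-supplied CPU number, and C11 atomics stand in for the kernel's atomic counters.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define NR_CPUS 4   /* hypothetical CPU count for the sketch */
#define BATCH   32  /* hypothetical spill threshold */

/* Hypothetical stand-in for a cgroup tracking a single statistic. */
struct group {
    struct group *parent;
    long percpu[NR_CPUS];   /* per-CPU deltas not yet spilled */
    atomic_long local;      /* this group's own total */
    atomic_long recursive;  /* this group plus all of its descendants */
};

/*
 * Write side: accumulate the delta in the per-CPU cache. Only when the
 * cached value exceeds BATCH in magnitude is it spilled into the local
 * atomic counter and propagated to the recursive counter of this group
 * and of every ancestor.
 */
static void mod_state(struct group *g, int cpu, long delta)
{
    assert(cpu >= 0 && cpu < NR_CPUS);

    long x = g->percpu[cpu] + delta;

    if (labs(x) > BATCH) {
        atomic_fetch_add(&g->local, x);
        for (struct group *p = g; p; p = p->parent)
            atomic_fetch_add(&p->recursive, x);
        x = 0;
    }
    g->percpu[cpu] = x;
}
```

In this sketch the walk up the hierarchy runs once per spill rather than once per update, and a reader of an ancestor's `recursive` counter no longer visits any descendant. Because the ancestors hold their own copy of the propagated total, removing a child does not take that total away from them.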
    
    Link: http://lkml.kernel.org/r/20190412151507.2769-4-hannes@cmpxchg.org
    Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
    Reviewed-by: Shakeel Butt <shakeelb@google.com>
    Reviewed-by: Roman Gushchin <guro@fb.com>
    Cc: Michal Hocko <mhocko@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>