    memcg: use __this_cpu_sub() to dec stats to avoid incorrect subtrahend casting · 5e8cfc3c
    Greg Thelen authored
    As of commit 3ea67d06 ("memcg: add per cgroup writeback pages
    accounting") memcg counter errors are possible when moving charged
    memory to a different memcg.  Charge movement occurs when processing
    writes to memory.force_empty, moving tasks to a memcg with
    memory.move_charge_at_immigrate=1, or memcg deletion.
    
    An example showing error after memory.force_empty:
    
      $ cd /sys/fs/cgroup/memory
      $ mkdir x
      $ rm /data/tmp/file
      $ (echo $BASHPID >> x/tasks && exec mmap_writer /data/tmp/file 1M) &
      [1] 13600
      $ grep ^mapped x/memory.stat
      mapped_file 1048576
      $ echo 13600 > tasks
      $ echo 1 > x/memory.force_empty
      $ grep ^mapped x/memory.stat
      mapped_file 4503599627370496
    
    mapped_file should end with 0.
      4503599627370496 == 0x10,0000,0000,0000 == 0x100,0000,0000 pages
      1048576          == 0x10,0000           == 0x100 pages
    
    This issue only affects the source memcg on 64-bit machines; the
    destination memcg counters are correct.  So the rmdir case is not too
    important because such counters soon disappear along with the entire
    memcg.  But the memory.force_empty and memory.move_charge_at_immigrate=1
    cases are larger problems as the bogus counters are visible for the
    (possibly long) remaining life of the source memcg.
    
    The problem is due to memcg use of __this_cpu_add(.., -nr_pages), which
    is subtly wrong because it adds the negation of the unsigned int
    nr_pages (either 1 or 512 for THP) to a signed long percpu counter.
    When nr_pages=1, -nr_pages=0xffffffff.  On 64-bit machines
    stat->count[idx] is signed 64 bit.  So memcg's attempt to simply
    decrement a count (e.g. from 1 to 0) boils down to:
    
      long count = 1
      unsigned int nr_pages = 1
      count += -nr_pages  /* -nr_pages == 0xffff,ffff */
      count is now 0x1,0000,0000 instead of 0
    
    The fix is to subtract the unsigned page count rather than adding its
    negation.  This only works once "percpu: fix this_cpu_sub() subtrahend
    casting for unsigneds" is applied to fix this_cpu_sub().
    
    Signed-off-by: Greg Thelen <gthelen@google.com>
    Acked-by: Tejun Heo <tj@kernel.org>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>