• KAMEZAWA Hiroyuki's avatar
    memcg: coalesce uncharge during unmap/truncate · 569b846d
    KAMEZAWA Hiroyuki authored
    In massive parallel enviroment, res_counter can be a performance
    bottleneck.  One strong techinque to reduce lock contention is reducing
    calls by coalescing some amount of calls into one.
    
    Considering charge/uncharge chatacteristic,
    	- charge is done one by one via demand-paging.
    	- uncharge is done by
    		- in chunk at munmap, truncate, exit, execve...
    		- one by one via vmscan/paging.
    
    It seems we have a chance to coalesce uncharges for improving scalability
    at unmap/truncation.
    
    This patch is a for coalescing uncharge.  For avoiding scattering memcg's
    structure to functions under /mm, this patch adds memcg batch uncharge
    information to the task.  A reason for per-task batching is for making use
    of caller's context information.  We do batched uncharge (deleyed
    uncharge) when truncation/unmap occurs but do direct uncharge when
    uncharge is called by memory reclaim (vmscan.c).
    
    The degree of coalescing depends on callers
      - at invalidate/trucate... pagevec size
      - at unmap ....ZAP_BLOCK_SIZE
    (memory itself will be freed in this degree.)
    Then, we'll not coalescing too much.
    
    On x86-64 8cpu server, I tested overheads of memcg at page fault by
    running a program which does map/fault/unmap in a loop. Running
    a task per a cpu by taskset and see sum of the number of page faults
    in 60secs.
    
    [without memcg config]
      40156968  page-faults              #      0.085 M/sec   ( +-   0.046% )
      27.67 cache-miss/faults
    [root cgroup]
      36659599  page-faults              #      0.077 M/sec   ( +-   0.247% )
      31.58 miss/faults
    [in a child cgroup]
      18444157  page-faults              #      0.039 M/sec   ( +-   0.133% )
      69.96 miss/faults
    [child with this patch]
      27133719  page-faults              #      0.057 M/sec   ( +-   0.155% )
      47.16 miss/faults
    
    We can see some amounts of improvement.
    (root cgroup doesn't affected by this patch)
    Another patch for "charge" will follow this and above will be improved more.
    
    Changelog(since 2009/10/02):
     - renamed filed of memcg_batch (as pages to bytes, memsw to memsw_bytes)
     - some clean up and commentary/description updates.
     - added initialize code to copy_process(). (possible bug fix)
    
    Changelog(old):
     - fixed !CONFIG_MEM_CGROUP case.
     - rebased onto the latest mmotm + softlimit fix patches.
     - unified patch for callers
     - added commetns.
     - make ->do_batch as bool.
     - removed css_get() at el. We don't need it.
    Signed-off-by: default avatarKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
    Cc: Balbir Singh <balbir@in.ibm.com>
    Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    569b846d
memcontrol.c 81.9 KB