• Namhyung Kim's avatar
    perf core: Add a kmem_cache for struct perf_event · bdacfaf2
    Namhyung Kim authored
    The kernel can allocate a lot of struct perf_event when profiling. For
    example, 256 cpu x 8 events x 20 cgroups = 40K instances of the struct
    would be allocated on a large system.
    
    The size of struct perf_event in my setup is 1152 byte. As it's
    allocated by kmalloc, the actual allocation size would be rounded up
    to 2K.
    
    Then there's 896 byte (~43%) of waste per instance resulting in total
    ~35MB with 40K instances. We can create a dedicated kmem_cache to
    avoid such a big unnecessary memory consumption.
    
    With this change, I can see below (note this machine has 112 cpus).
    
      # grep perf_event /proc/slabinfo
      perf_event    224    784   1152    7    2 : tunables   24   12    8 : slabdata    112    112      0
    
    The sixth column is pages-per-slab which is 2, and the fifth column is
    obj-per-slab which is 7.  Thus actually it can use 1152 x 7 = 8064
    byte in the 8K, and wasted memory is (8192 - 8064) / 7 = ~18 byte per
    instance.
    Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
    Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lkml.kernel.org/r/20210311115413.444407-1-namhyung@kernel.org
    bdacfaf2
core.c 315 KB