    perf/core: Use kmem_cache to allocate the PMU specific data · 217c2a63
    Kan Liang authored
    Currently, the PMU specific data task_ctx_data is allocated by
    kzalloc() in the perf generic code. This works as long as the
    task_ctx_data has no specific alignment requirement. However, it
    becomes a problem once a future feature introduces one, e.g., the
    Architecture LBR XSAVE feature requires 64-byte alignment. If the
    alignment requirement is not fulfilled, the XSAVE family of
    instructions will fail to save/restore the xstate to/from the
    task_ctx_data.
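
The alignment condition can be illustrated with a small user-space C sketch. It is not kernel code and not part of this patch: `XSAVE_ALIGN`, `is_aligned()` and `alloc_xsave_buf()` are hypothetical names used only here, and C11 `aligned_alloc()` stands in for an allocator that honors the request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical constant: the 64-byte alignment mentioned for XSAVE. */
#define XSAVE_ALIGN ((size_t)64)

/* Return true if ptr is a multiple of align (align must be a power of two). */
static bool is_aligned(const void *ptr, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((uintptr_t)ptr & (align - 1)) == 0;
}

/*
 * Allocate a zeroed buffer whose address is a multiple of XSAVE_ALIGN.
 * aligned_alloc() expects the size to be a multiple of the alignment,
 * so the requested size is rounded up first.
 */
static void *alloc_xsave_buf(size_t size)
{
	size_t rounded = (size + XSAVE_ALIGN - 1) & ~(XSAVE_ALIGN - 1);
	void *buf = aligned_alloc(XSAVE_ALIGN, rounded);

	if (buf)
		memset(buf, 0, rounded);
	return buf;
}
```

A buffer returned by `alloc_xsave_buf()` passes `is_aligned(buf, XSAVE_ALIGN)`, whereas a plain `malloc()` result is only guaranteed to satisfy the platform's natural alignment. Release the buffer with `free()`.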
    
    The function kzalloc() itself only guarantees natural alignment. A
    new method to allocate the task_ctx_data has to be introduced, which
    must meet the following requirements:
    - must be a generic method that can be used by different
      architectures, because the allocation of the task_ctx_data is
      implemented in the perf generic code;
    - must guarantee the alignment (the alignment requirement does not
      change after boot);
    - must be able to allocate/free a buffer (smaller than a page)
      dynamically;
    - should not cause extra CPU overhead or space overhead.
    
    Several options were considered:
    - One option is to allocate a larger buffer for task_ctx_data. E.g.,
        ptr = kmalloc(size + alignment, GFP_KERNEL);
        ptr = PTR_ALIGN(ptr, alignment);
      This option causes space overhead.
    - Another option is to allocate the task_ctx_data in the PMU specific
      code. To do so, several function pointers have to be added. As a
      result, both the generic structure and the PMU specific structure
      will become bigger. Besides, extra function calls are added when
      allocating/freeing the buffer. This option will increase both the
      space overhead and CPU overhead.
    - The third option is to use a kmem_cache to allocate a buffer for the
      task_ctx_data. The kmem_cache can be created with a specific alignment
      requirement by the PMU at boot time. A new pointer for kmem_cache has
      to be added in the generic struct pmu, which would be used to
      dynamically allocate a buffer for the task_ctx_data at run time.
      Although the new pointer is added to the struct pmu, the existing
      variable task_ctx_size is not required anymore. The size of the
      generic structure is kept the same.
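
The difference between the first and the third option can be sketched in user-space C. This is a toy model under stated assumptions, not the patch itself: `padded_alloc()` and the `toy_cache_*` helpers are hypothetical names, and `aligned_alloc()` stands in for the slab allocator. In the kernel, the corresponding calls would be `kmem_cache_create()`, `kmem_cache_zalloc()` and `kmem_cache_free()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Option 1: over-allocate, then round the address up. The caller has to
 * keep the raw pointer for free(), and every buffer requests `align`
 * extra bytes.
 */
struct padded_buf {
	void *raw;      /* what malloc() returned; pass this to free() */
	void *aligned;  /* first address inside raw that is a multiple of align */
	size_t bytes;   /* bytes actually requested from malloc() */
};

static struct padded_buf padded_alloc(size_t size, size_t align)
{
	struct padded_buf b = { NULL, NULL, 0 };

	assert(align != 0 && (align & (align - 1)) == 0);
	b.bytes = size + align;
	b.raw = malloc(b.bytes);
	if (b.raw)
		b.aligned = (void *)(((uintptr_t)b.raw + align - 1) &
				     ~(uintptr_t)(align - 1));
	return b;
}

/*
 * Option 3 (toy model): size and alignment are fixed once when the cache
 * is created; every later allocation only needs the cache handle.
 */
struct toy_cache {
	size_t obj_size; /* object size, rounded up to a multiple of align */
	size_t align;    /* power-of-two alignment of every object */
};

static struct toy_cache toy_cache_create(size_t size, size_t align)
{
	struct toy_cache c;

	assert(size != 0 && align != 0 && (align & (align - 1)) == 0);
	c.align = align;
	c.obj_size = (size + align - 1) & ~(align - 1);
	return c;
}

/* Return a zeroed, aligned object, or NULL on allocation failure. */
static void *toy_cache_zalloc(const struct toy_cache *c)
{
	void *obj = aligned_alloc(c->align, c->obj_size);

	if (obj)
		memset(obj, 0, c->obj_size);
	return obj;
}

static void toy_cache_free(const struct toy_cache *c, void *obj)
{
	(void)c; /* a real cache would return the object to its own pool */
	free(obj);
}
```

For a 200-byte object with 64-byte alignment, `padded_alloc()` requests 264 bytes and needs two pointers per buffer, while the toy cache hands out 256-byte objects and the caller tracks a single pointer.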
    
    The third option, which meets all the aforementioned requirements, is
    used to replace kzalloc() for the PMU specific data allocation. A later
    patch will remove the kzalloc() method and the related variables.
    Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lkml.kernel.org/r/1593780569-62993-17-git-send-email-kan.liang@linux.intel.com