    mm: vmalloc: avoid calling __find_vmap_area() twice in __vunmap() · edd89818
    Uladzislau Rezki (Sony) authored
    Currently the __vunmap() path calls __find_vmap_area() twice.  Once on
    entry to check that the area exists, then inside the remove_vm_area()
    function which also performs a new search for the VA.
    
    To improve performance, split remove_vm_area() into two new parts:
      - find_unlink_vmap_area() that does a search and unlink from tree;
      - __remove_vm_area() that removes without searching.
    
    There is no functional change for remove_vm_area(), whereas
    vm_remove_mappings(), where the second search happens, switches to the
    __remove_vm_area() variant, which takes the already detached VA as a
    parameter, so there is no need to find it again.
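    
    The split described above can be illustrated with a small user-space
    sketch. This is a toy model, not the kernel code: a singly linked list
    stands in for the vmap area tree, locking is omitted, and every name
    (toy_va, toy_find_unlink(), toy_remove_detached(), toy_remove(),
    toy_searches) is hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a vmap area; all names here are hypothetical. */
struct toy_va {
    unsigned long addr;
    int linked;              /* 1 while the area is on the list */
    struct toy_va *next;
};

static struct toy_va *toy_list;  /* stands in for the vmap area tree */
static int toy_searches;         /* number of lookups performed so far */

static void toy_insert(struct toy_va *va)
{
    va->linked = 1;
    va->next = toy_list;
    toy_list = va;
}

/* Part 1: search for the area and unlink it in the same pass. */
static struct toy_va *toy_find_unlink(unsigned long addr)
{
    struct toy_va **pp;

    toy_searches++;
    for (pp = &toy_list; *pp; pp = &(*pp)->next) {
        if ((*pp)->addr == addr) {
            struct toy_va *va = *pp;

            *pp = va->next;
            va->next = NULL;
            va->linked = 0;
            return va;
        }
    }
    return NULL;
}

/* Part 2: tear down an already detached area; no search is done here. */
static void toy_remove_detached(struct toy_va *va)
{
    assert(!va->linked);
    va->addr = 0;            /* stands in for the real teardown work */
}

/* Wrapper with the old behaviour: one search, then the teardown. */
static int toy_remove(unsigned long addr)
{
    struct toy_va *va = toy_find_unlink(addr);

    if (!va)
        return -1;
    toy_remove_detached(va);
    return 0;
}
```

    A caller that already holds the detached area, as vm_remove_mappings()
    does in this patch, would call the second part directly and skip the
    lookup.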
    
    Performance-wise, I used test_vmalloc.sh with 32 threads doing alloc/free
    on a 64-CPU x86_64 box:
    
    perf without this patch:
    -   31.41%     0.50%  vmalloc_test/10  [kernel.vmlinux]    [k] __vunmap
       - 30.92% __vunmap
          - 17.67% _raw_spin_lock
               native_queued_spin_lock_slowpath
          - 12.33% remove_vm_area
             - 11.79% free_vmap_area_noflush
                - 11.18% _raw_spin_lock
                     native_queued_spin_lock_slowpath
            0.76% free_unref_page
    
    perf with this patch:
    -   11.35%     0.13%  vmalloc_test/14  [kernel.vmlinux]    [k] __vunmap
       - 11.23% __vunmap
          - 8.28% find_unlink_vmap_area
             - 7.95% _raw_spin_lock
                  7.44% native_queued_spin_lock_slowpath
          - 1.93% free_vmap_area_noflush
             - 0.56% _raw_spin_lock
                  0.53% native_queued_spin_lock_slowpath
            0.60% __vunmap_range_noflush
    
    __vunmap() consumes ~20% fewer CPU cycles on this test.
    
    Also, switch vm_unmap_ram() from find_vmap_area() to
    find_unlink_vmap_area() to avoid taking the vmap_area_lock twice: once to
    find the area and a second time to unlink it from the tree.
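    
    The difference in lock traffic can be shown with a minimal user-space
    sketch. It assumes a plain counter in place of the real spinlock and a
    small array in place of the tree; the names (toy_lock(),
    find_then_unlink(), find_unlink(), lock_acquisitions) are hypothetical
    and do not correspond to kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

#define NSLOTS 8

static unsigned long slots[NSLOTS];  /* 0 means empty; stands in for the tree */
static int lock_held;                /* simulated lock state, not a real spinlock */
static int lock_acquisitions;        /* how many times the lock was taken */

static void toy_lock(void)
{
    assert(!lock_held);
    lock_held = 1;
    lock_acquisitions++;
}

static void toy_unlock(void)
{
    assert(lock_held);
    lock_held = 0;
}

/* Caller must hold the lock. Returns the slot index or -1. */
static int lookup_locked(unsigned long addr)
{
    int i;

    assert(lock_held);
    if (addr == 0)
        return -1;           /* 0 marks an empty slot in this model */
    for (i = 0; i < NSLOTS; i++)
        if (slots[i] == addr)
            return i;
    return -1;
}

/* Old pattern: one lock round trip to find, another to unlink. */
static int find_then_unlink(unsigned long addr)
{
    int i;

    toy_lock();
    i = lookup_locked(addr);
    toy_unlock();
    if (i < 0)
        return -1;

    toy_lock();
    slots[i] = 0;
    toy_unlock();
    return 0;
}

/* New pattern: find and unlink under a single lock acquisition. */
static int find_unlink(unsigned long addr)
{
    int i;

    toy_lock();
    i = lookup_locked(addr);
    if (i >= 0)
        slots[i] = 0;
    toy_unlock();
    return i < 0 ? -1 : 0;
}
```

    In this model the old pattern takes the lock twice per successful
    removal and the combined helper takes it once.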
    
    [urezki@gmail.com: switch to find_unlink_vmap_area() in vm_unmap_ram()]
      Link: https://lkml.kernel.org/r/20221222190022.134380-2-urezki@gmail.com
    Link: https://lkml.kernel.org/r/20221222190022.134380-1-urezki@gmail.com
    Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
    Reported-by: Roman Gushchin <roman.gushchin@linux.dev>
    Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
    Cc: Baoquan He <bhe@redhat.com>
    Cc: Christoph Hellwig <hch@infradead.org>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Nicholas Piggin <npiggin@gmail.com>
    Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>