    mm/page_alloc: protect PCP lists with a spinlock · 4b23a68f
    Mel Gorman authored
    Currently the PCP lists are protected with local_lock_irqsave to
    prevent migration and IRQ reentrancy, but this is inconvenient.  Remote
    draining of the lists is impossible, so a workqueue is required, and
    every allocation/free must disable and then re-enable interrupts, which
    is expensive.
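
    For reference, the pre-patch fast path is shaped roughly like the
    sketch below (illustrative only; variable and field names are
    simplified from 5.19-rc3 mm/page_alloc.c):

    	unsigned long flags;
    	struct per_cpu_pages *pcp;

    	/* Disables IRQs and prevents migration off this CPU. */
    	local_lock_irqsave(&pagesets.lock, flags);
    	pcp = this_cpu_ptr(zone->per_cpu_pageset);
    	/* Safe to touch the lists: IRQs are off on this CPU. */
    	list_add(&page->pcp_list, &pcp->lists[pindex]);
    	pcp->count += 1 << order;
    	local_unlock_irqrestore(&pagesets.lock, flags);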
    
    As preparation for dealing with both of those problems, protect the
    lists with a spinlock.  The IRQ-unsafe version of the lock is used
    because IRQs are already disabled by local_lock_irqsave.  spin_trylock
    is used in combination with local_lock_irqsave() for now, but it will
    later be replaced with spin_trylock_irqsave when the local_lock is
    removed.
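
    The transitional pattern introduced here looks roughly like this (a
    sketch; the patch wraps the details in helpers, and the contention
    fallback shown is illustrative):

    	local_lock_irqsave(&pagesets.lock, flags);
    	pcp = this_cpu_ptr(zone->per_cpu_pageset);

    	/*
    	 * The IRQ-unsafe spin_lock variant is safe because IRQs are
    	 * already disabled by local_lock_irqsave.  The trylock
    	 * anticipates the later removal of the local_lock; note that
    	 * the contention path must drop the local_lock as well (see
    	 * the akpm fix noted below).
    	 */
    	if (spin_trylock(&pcp->lock)) {
    		list_add(&page->pcp_list, &pcp->lists[pindex]);
    		pcp->count += 1 << order;
    		spin_unlock(&pcp->lock);
    		local_unlock_irqrestore(&pagesets.lock, flags);
    	} else {
    		local_unlock_irqrestore(&pagesets.lock, flags);
    		/* Fall back to freeing directly to the buddy lists. */
    		free_one_page(zone, page, pfn, order, migratetype,
    			      FPI_NONE);
    	}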
    
    struct per_cpu_pages still fits within the same number of cache lines
    after this patch as it did before the series:
    
    struct per_cpu_pages {
            spinlock_t                 lock;                 /*     0     4 */
            int                        count;                /*     4     4 */
            int                        high;                 /*     8     4 */
            int                        batch;                /*    12     4 */
            short int                  free_factor;          /*    16     2 */
            short int                  expire;               /*    18     2 */
    
            /* XXX 4 bytes hole, try to pack */
    
            struct list_head           lists[13];            /*    24   208 */
    
            /* size: 256, cachelines: 4, members: 7 */
            /* sum members: 228, holes: 1, sum holes: 4 */
            /* padding: 24 */
    } __attribute__((__aligned__(64)));
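
    The layout above is pahole output.  To regenerate it against a kernel
    built with debug info (assuming pahole is installed):

    	$ pahole -C per_cpu_pages vmlinux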
    
    There is overhead in the fast path due to acquiring the spinlock, even
    though the spinlock is per-CPU and uncontended in the common case.
    Page Fault Test (PFT) reported the following results on a 1-socket
    machine.
    
                                         5.19.0-rc3               5.19.0-rc3
                                            vanilla      mm-pcpspinirq-v5r16
    Hmean     faults/sec-1   869275.7381 (   0.00%)   874597.5167 *   0.61%*
    Hmean     faults/sec-3  2370266.6681 (   0.00%)  2379802.0362 *   0.40%*
    Hmean     faults/sec-5  2701099.7019 (   0.00%)  2664889.7003 *  -1.34%*
    Hmean     faults/sec-7  3517170.9157 (   0.00%)  3491122.8242 *  -0.74%*
    Hmean     faults/sec-8  3965729.6187 (   0.00%)  3939727.0243 *  -0.66%*
    
    There is a small hit in the number of faults per second but given that the
    results are more stable, it's borderline noise.
    
    [akpm@linux-foundation.org: add missing local_unlock_irqrestore() on contention path]
    Link: https://lkml.kernel.org/r/20220624125423.6126-6-mgorman@techsingularity.net
    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Tested-by: Yu Zhao <yuzhao@google.com>
    Reviewed-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
    Tested-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Marcelo Tosatti <mtosatti@redhat.com>
    Cc: Marek Szyprowski <m.szyprowski@samsung.com>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Minchan Kim <minchan@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>