    mm/vmscan.c: don't round up scan size for online memory cgroup
    Commit 68600f62 ("mm: don't miss the last page because of round-off
    error") makes the division by @denominator round the scan size up,
    regardless of the memory cgroup's state, online or offline.  This
    affects the overall reclaim behavior: with the former (truncating)
    formula, the corresponding LRU list is eligible for reclaim only when
    its size, logically right shifted by @sc->priority, is bigger than
    zero.
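    
    As an illustration, here is a minimal sketch of the two variants of
    the scan-size computation (not the literal code; variable names are
    illustrative, while div64_u64() and DIV64_U64_ROUND_UP() are the
    kernel's 64-bit division helpers):
    
        /* Before 68600f62: truncating division.  The list contributes
         * nothing unless (lruvec_size >> sc->priority) is big enough
         * for the product to reach @denominator.
         */
        scan = lruvec_size >> sc->priority;
        scan = div64_u64(scan * fraction[file], denominator);
    
        /* After 68600f62: unconditional round-up.  Any non-zero
         * (lruvec_size >> sc->priority) now yields at least one page.
         */
        scan = lruvec_size >> sc->priority;
        scan = DIV64_U64_ROUND_UP(scan * fraction[file], denominator);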
    
    For example, the inactive anonymous LRU list must have at least 0x4000
    pages to be eligible for reclaim with 60/12 for swappiness/priority,
    not taking the scan/rotation ratio into account.  After the roundup is
    applied, the same list becomes eligible for reclaim once its size
    reaches 0x1000 pages:
    
        (0x4000 >> 12) * 60 / (60 + 140 + 1) = 1            (truncating)
        ((0x1000 >> 12) * 60 + 200) / (60 + 140 + 1) = 1    (rounding up)
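    
    The thresholds are easy to check with a few lines of plain userspace C
    (a standalone sanity check, not kernel code; 60/140 are the anon/file
    priorities and 12 is @sc->priority, as above):
    
        #include <stdio.h>
    
        int main(void)
        {
            unsigned long denominator = 60 + 140 + 1;
    
            /* Truncating division: 0x4000 pages is the smallest eligible size. */
            printf("%lu\n", (0x4000UL >> 12) * 60 / denominator);         /* 1 */
            printf("%lu\n", (0x3fffUL >> 12) * 60 / denominator);         /* 0 */
    
            /* Round-up division: 0x1000 pages is already enough. */
            printf("%lu\n", ((0x1000UL >> 12) * 60 + 200) / denominator); /* 1 */
            return 0;
        }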
    
    aarch64 has a 512MB huge page size when the base page size is 64KB.
    Such a huge page spans 0x2000 base pages, above the 0x1000 threshold,
    so a memory cgroup that holds even one huge page is always eligible
    for reclaim in that case.
    
    Reclaim is likely to stop right after the huge page is reclaimed,
    meaning the further iterations on @sc->priority and on the sibling and
    child memory cgroups will be skipped.  The overall behavior has been
    changed.  Fix the issue by applying the roundup to offlined memory
    cgroups only, giving preference to reclaiming memory from offlined
    memory cgroups.  This seems reasonable, as that memory is unlikely to
    be used by anyone.
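    
    A minimal sketch of that approach, at the point where the scan target
    is divided (context abbreviated; mem_cgroup_online() is the existing
    helper for querying the cgroup's state):
    
        /* Round up only for offlined memory cgroups, so their last
         * page is not missed; online cgroups keep the truncating
         * division and its higher eligibility threshold.
         */
        scan = mem_cgroup_online(memcg) ?
               div64_u64(scan * fraction[file], denominator) :
               DIV64_U64_ROUND_UP(scan * fraction[file], denominator);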
    
    The issue was found by starting up 8 VMs on an Ampere Mustang machine,
    which has 8 CPUs and 16GB of memory.  Each VM is given 2 vCPUs and 2GB
    of memory.  It took 264 seconds for all VMs to come completely up, and
    784MB of swap had been consumed by then.  With this patch applied, it
    took 236 seconds and 60MB of swap to do the same thing, a roughly 10%
    improvement in my case.  Note that KSM was disabled while THP was
    enabled in the testing.
    
       Without the patch:
             total     used    free   shared  buff/cache   available
       Mem:  16196    10065    2049       16        4081        3749
       Swap:  8175      784    7391
    
       With the patch:
             total     used    free   shared  buff/cache   available
       Mem:  16196    11324    3656       24        1215        2936
       Swap:  8175       60    8115
    
    Link: http://lkml.kernel.org/r/20200211024514.8730-1-gshan@redhat.com
    Fixes: 68600f62 ("mm: don't miss the last page because of round-off error")
    Signed-off-by: Gavin Shan <gshan@redhat.com>
    Acked-by: Roman Gushchin <guro@fb.com>
    Cc: <stable@vger.kernel.org>	[4.20+]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>