• Vlastimil Babka's avatar
    mm, compaction: periodically drop lock and restore IRQs in scanners · 8b44d279
    Vlastimil Babka authored
    Compaction scanners regularly check for lock contention and need_resched()
    through the compact_checklock_irqsave() function.  However, if there is no
    contention, the lock can be held and IRQ disabled for potentially long
    time.
    
    This has been addressed by commit b2eef8c0 ("mm: compaction: minimise
    the time IRQs are disabled while isolating pages for migration") for the
    migration scanner.  However, the refactoring done by commit 2a1402aa
    ("mm: compaction: acquire the zone->lru_lock as late as possible") has
    changed the conditions so that the lock is dropped only when there's
    contention on the lock or need_resched() is true.  Also, need_resched() is
    checked only when the lock is already held.  The comment "give a chance to
    irqs before checking need_resched" is therefore misleading, as IRQs remain
    disabled when the check is done.
    
    This patch restores the behavior intended by commit b2eef8c0 and also
    tries to better balance and make more deterministic the time spent by
    checking for contention vs the time the scanners might run between the
    checks.  It also avoids situations where checking has not been done often
    enough before.  The result should be avoiding both too frequent and too
    infrequent contention checking, and especially the potentially
    long-running scans with IRQs disabled and no checking of need_resched() or
    for fatal signal pending, which can happen when many consecutive pages or
    pageblocks fail the preliminary tests and do not reach the later call site
    to compact_checklock_irqsave(), as explained below.
    
    Before the patch:
    
    In the migration scanner, compact_checklock_irqsave() was called each
    loop, if reached.  If not reached, some lower-frequency checking could
    still be done if the lock was already held, but this would not result in
    aborting contended async compaction until reaching
    compact_checklock_irqsave() or end of pageblock.  In the free scanner, it
    was similar but completely without the periodical checking, so lock can be
    potentially held until reaching the end of pageblock.
    
    After the patch, in both scanners:
    
    The periodical check is done as the first thing in the loop on each
    SWAP_CLUSTER_MAX aligned pfn, using the new compact_unlock_should_abort()
    function, which always unlocks the lock (if locked) and aborts async
    compaction if scheduling is needed.  It also aborts any type of compaction
    when a fatal signal is pending.
    
    The compact_checklock_irqsave() function is replaced with a slightly
    different compact_trylock_irqsave().  The biggest difference is that the
    function is not called at all if the lock is already held.  The periodical
    need_resched() checking is left solely to compact_unlock_should_abort().
    The lock contention avoidance for async compaction is achieved by the
    periodical unlock by compact_unlock_should_abort() and by using trylock in
    compact_trylock_irqsave() and aborting when trylock fails.  Sync
    compaction does not use trylock.
    Signed-off-by: default avatarVlastimil Babka <vbabka@suse.cz>
    Reviewed-by: default avatarZhang Yanfei <zhangyanfei@cn.fujitsu.com>
    Acked-by: default avatarMinchan Kim <minchan@kernel.org>
    Acked-by: default avatarMel Gorman <mgorman@suse.de>
    Cc: Michal Nazarewicz <mina86@mina86.com>
    Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: Rik van Riel <riel@redhat.com>
    Acked-by: default avatarDavid Rientjes <rientjes@google.com>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    8b44d279
compaction.c 39.4 KB