    mm: limit boost_watermark on small zones · 14f69140
    Henry Willard authored
    Commit 1c30844d ("mm: reclaim small amounts of memory when an
    external fragmentation event occurs") adds a boost_watermark() function
    which increases the min watermark in a zone by at least
    pageblock_nr_pages, i.e. the number of pages in a pageblock.
    
    On Arm64, with 64K pages and 512M huge pages, this is 8192 pages or
    512M.  It does this regardless of the number of managed pages in the
    zone or the likelihood of success.
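
    For reference, the boost logic added by 1c30844d looks roughly like
    the sketch below (paraphrased from mm/page_alloc.c of that era; the
    mult_frac() helper and the _watermark[] array are as in the kernel
    tree, but treat this as an approximation rather than the verbatim
    source):

        static void boost_watermark(struct zone *zone)
        {
                unsigned long max_boost;

                if (!watermark_boost_factor)
                        return;

                /* Cap the boost relative to the high watermark. */
                max_boost = mult_frac(zone->_watermark[WMARK_HIGH],
                                      watermark_boost_factor, 10000);
                if (!max_boost)
                        return;

                /*
                 * The boost is never less than one pageblock.  With 64K
                 * base pages and 512M huge pages on Arm64 this floor is
                 * 8192 pages, i.e. 512M.
                 */
                max_boost = max(pageblock_nr_pages, max_boost);

                zone->watermark_boost = min(zone->watermark_boost +
                                            pageblock_nr_pages, max_boost);
        }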
    
    This can immediately put the zone under water for page allocation and
    cause a small machine to fail with OOM.  Unlike
    set_recommended_min_free_kbytes(), which
    substantially increases min_free_kbytes and is tied to THP,
    boost_watermark() can be called even if THP is not active.
    
    The problem is most likely to appear on architectures such as Arm64
    where pageblock_nr_pages is very large.
    
    It is desirable to run the kdump capture kernel in as small a space as
    possible to avoid wasting memory.  On some architectures, such as Arm64,
    there are restrictions on where the capture kernel can run, and
    therefore on the space available.  A capture kernel running in 768M can
    fail due to OOM immediately after boost_watermark() sets the min in zone
    DMA32, where most of the memory is, to 512M.  It fails even though there
    is over 500M of free memory.  With boost_watermark() suppressed, the
    capture kernel can run successfully in 448M.
    
    This patch limits boost_watermark() to boosting a zone's min watermark
    only when the zone has enough managed pages for the boost to produce
    positive results, estimated here as at least four times
    pageblock_nr_pages.
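
    The change amounts to an early bail-out at the top of boost_watermark()
    when the zone is too small for boosting to help, roughly along these
    lines (a sketch of the idea rather than the exact hunk;
    zone_managed_pages() is the existing helper returning a zone's managed
    page count):

        /*
         * Don't bother in zones that are unlikely to produce results.
         * On small machines, including kdump capture kernels running
         * in a small area, boosting the watermark can cause an out of
         * memory situation immediately.
         */
        if ((pageblock_nr_pages * 4) > zone_managed_pages(zone))
                return;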
    
    Mel said:
    
    : There is no harm in marking it stable.  Clearly it does not happen very
    : often but it's not impossible.  32-bit x86 is a lot less common now
    : which would previously have been vulnerable to triggering this easily.
    : ppc64 has a larger base page size but typically only has one zone.
    : arm64 is likely the most vulnerable, particularly when CMA is
    : configured with a small movable zone.
    
    Fixes: 1c30844d ("mm: reclaim small amounts of memory when an external fragmentation event occurs")
    Signed-off-by: Henry Willard <henry.willard@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Acked-by: Mel Gorman <mgorman@techsingularity.net>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: <stable@vger.kernel.org>
    Link: http://lkml.kernel.org/r/1588294148-6586-1-git-send-email-henry.willard@oracle.com
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>