Commit cfa54a0f authored by Andrew Barry, committed by Linus Torvalds

mm/page_alloc.c: prevent unending loop in __alloc_pages_slowpath()

I believe I found a problem in __alloc_pages_slowpath, which allows a
process to get stuck endlessly looping, even when lots of memory is
available.

Running an I/O and memory-intensive stress-test, I see a 0-order page
allocation with __GFP_IO and __GFP_WAIT, running on a system with very
little free memory.  Right about the same time that the stress-test gets
killed by the OOM-killer, the utility trying to allocate memory gets stuck
in __alloc_pages_slowpath even though most of the system's memory was freed
by the oom-kill of the stress-test.

The utility ends up looping from the rebalance label down through the
wait_iff_congested continuously.  Because order=0,
__alloc_pages_direct_compact skips the call to get_page_from_freelist.
Because all of the reclaimable memory on the system has already been
reclaimed, __alloc_pages_direct_reclaim skips the call to
get_page_from_freelist.  Since there is no __GFP_FS flag, the block with
__alloc_pages_may_oom is skipped.  The loop hits the wait_iff_congested,
then jumps back to rebalance without ever trying to
get_page_from_freelist.  This loop repeats infinitely.
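
The control flow described above can be illustrated with a toy model. This
is only a sketch, not kernel code: simulate_slowpath(), try_freelist(),
enum label_pos and the pass counter are all hypothetical names made up for
illustration, and the model assumes that every other path in the loop is
skipped exactly as described, with memory being freed by an unrelated
OOM-kill on a chosen pass.

```c
#include <stdbool.h>

/* Hypothetical toy model of the retry loop; none of this is kernel code. */
enum label_pos {
	LABEL_BELOW_FREELIST_TRY,	/* label placed after the freelist attempt */
	LABEL_ABOVE_FREELIST_TRY	/* label placed before the freelist attempt */
};

/* Stand-in for get_page_from_freelist(): succeeds only if a page is free. */
static bool try_freelist(int *free_pages)
{
	if (*free_pages > 0) {
		(*free_pages)--;
		return true;
	}
	return false;
}

/*
 * Returns the pass on which the allocation succeeds (0 means before the
 * loop), or -1 if max_passes is exhausted. freed_on_pass is the pass during
 * which an unrelated OOM-kill frees memory.
 */
int simulate_slowpath(enum label_pos pos, int freed_on_pass, int max_passes)
{
	int free_pages = 0;	/* the system starts out with no free pages */

	/* Label below: the freelist is tried once, before the label. */
	if (pos == LABEL_BELOW_FREELIST_TRY && try_freelist(&free_pages))
		return 0;

	for (int pass = 1; pass <= max_passes; pass++) {
		/* rebalance: */
		if (pos == LABEL_ABOVE_FREELIST_TRY && try_freelist(&free_pages))
			return pass;

		/*
		 * order == 0: compaction does not retry the freelist.
		 * Nothing reclaimable: direct reclaim does not retry either.
		 * No __GFP_FS: the OOM block is skipped.
		 */
		if (pass == freed_on_pass)
			free_pages += 1000;	/* someone else got OOM-killed */

		/* wait_iff_congested(), then jump back to rebalance */
	}
	return -1;	/* never noticed the freed memory */
}
```

With the label below the freelist attempt, simulate_slowpath() never
re-checks the freelist and runs until max_passes, however much memory was
freed.  With the label above it, the call returns on the pass after the
memory is freed.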

The test case is pretty pathological.  Running a mix of I/O stress-tests
that do a lot of fork() and consume all of the system memory, I can pretty
reliably hit this on 600 nodes, in about 12 hours.  32GB/node.

Signed-off-by: Andrew Barry <abarry@cray.com>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent a539f353
@@ -2106,6 +2106,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	first_zones_zonelist(zonelist, high_zoneidx, NULL,
 					&preferred_zone);
 
+rebalance:
 	/* This is the last chance, in general, before the goto nopage. */
 	page = get_page_from_freelist(gfp_mask, nodemask, order, zonelist,
 			high_zoneidx, alloc_flags & ~ALLOC_NO_WATERMARKS,
@@ -2113,7 +2114,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (page)
 		goto got_pg;
 
-rebalance:
 	/* Allocate without watermarks if the context allows */
 	if (alloc_flags & ALLOC_NO_WATERMARKS) {
 		page = __alloc_pages_high_priority(gfp_mask, order,