Commit 483a40e4 authored by Andrew Morton, committed by Christoph Hellwig

[PATCH] fix a bogus OOM condition for __GFP_NOFS allocations

If a GFP_NOFS allocation is made when the ZONE_NORMAL inactive list is
full of dirty or under-writeback pages, there is nothing the caller can
do to force some page reclaim.  The caller ends up getting oom-killed.

- In mempool_alloc(), don't try to perform page reclaim again.  Just
  go to sleep and wait for some elements to be returned to the pool.

- In try_to_free_pages(): perform a single, short scan of the LRU and
  if that doesn't work, fail the allocation.  GFP_NOFS allocators know
  how to handle that; a sketch of such a caller follows below.
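
To make the second point concrete, here is a minimal sketch of the retry
pattern a __GFP_NOFS allocator can use once try_to_free_pages() is allowed to
fail.  This is illustrative only, not part of the patch: the function name
nofs_alloc and the HZ/10 backoff are assumptions, and the explicit bdflush
kick is left to a comment because its interface varied across kernel versions
of this era.

#include <linux/sched.h>
#include <linux/slab.h>

/* Illustrative sketch; nofs_alloc() and the backoff period are assumptions. */
static void *nofs_alloc(size_t size)
{
	void *p;

	for (;;) {
		p = kmalloc(size, GFP_NOFS);
		if (p != NULL)
			return p;
		/*
		 * Reclaim bailed out early (the zone may be full of dirty
		 * or under-writeback pages).  Wait for writeback (bdflush
		 * in this era) to clean some pages, then retry.  An
		 * explicit bdflush kick could go here; its interface
		 * varied from version to version.
		 */
		set_current_state(TASK_UNINTERRUPTIBLE);
		schedule_timeout(HZ / 10);
	}
}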
parent f3b3dc81
mm/mempool.c
@@ -196,10 +196,11 @@ void * mempool_alloc(mempool_t *pool, int gfp_mask)
 		return element;
 
 	/*
-	 * If the pool is less than 50% full then try harder
-	 * to allocate an element:
+	 * If the pool is less than 50% full and we can perform effective
+	 * page reclaim then try harder to allocate an element.
 	 */
-	if ((gfp_mask != gfp_nowait) && (pool->curr_nr <= pool->min_nr/2)) {
+	if ((gfp_mask & __GFP_FS) && (gfp_mask != gfp_nowait) &&
+			(pool->curr_nr <= pool->min_nr/2)) {
 		element = pool->alloc(gfp_mask, pool->pool_data);
 		if (likely(element != NULL))
 			return element;
...
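
For context on the mempool side, a usage sketch: with the __GFP_FS test in
place, a GFP_NOFS mempool_alloc() skips the "try harder" allocation above and
instead sleeps until another user returns an element to the pool.  All names
below (io_cache, io_pool, example_*) are assumptions for illustration; the
callbacks follow the 2.5-era mempool interface, where the alloc/free hooks
receive pool_data.

#include <linux/errno.h>
#include <linux/init.h>
#include <linux/mempool.h>
#include <linux/slab.h>

static kmem_cache_t *io_cache;	/* assumed names throughout this example */
static mempool_t *io_pool;

static void *example_pool_alloc(int gfp_mask, void *pool_data)
{
	return kmem_cache_alloc((kmem_cache_t *)pool_data, gfp_mask);
}

static void example_pool_free(void *element, void *pool_data)
{
	kmem_cache_free((kmem_cache_t *)pool_data, element);
}

static int __init example_init(void)
{
	io_cache = kmem_cache_create("example-io", 256, 0, 0, NULL, NULL);
	if (!io_cache)
		return -ENOMEM;
	io_pool = mempool_create(16, example_pool_alloc, example_pool_free,
				io_cache);
	return io_pool ? 0 : -ENOMEM;
}

static void *grab_io_element(void)
{
	/*
	 * With this patch, a GFP_NOFS allocation from a depleted pool no
	 * longer re-enters page reclaim: mempool_alloc() sleeps until
	 * mempool_free() returns an element, rather than the caller
	 * being OOM-killed.
	 */
	return mempool_alloc(io_pool, GFP_NOFS);
}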
mm/vmscan.c
@@ -536,6 +536,20 @@ shrink_caches(struct zone *classzone, int priority,
 
 /*
  * This is the main entry point to page reclaim.
+ *
+ * If a full scan of the inactive list fails to free enough memory then we
+ * are "out of memory" and something needs to be killed.
+ *
+ * If the caller is !__GFP_FS then the probability of a failure is reasonably
+ * high - the zone may be full of dirty or under-writeback pages, which this
+ * caller can't do much about.  So for !__GFP_FS callers, we just perform a
+ * small LRU walk and if that didn't work out, fail the allocation back to the
+ * caller.  GFP_NOFS allocators need to know how to deal with it.  Kicking
+ * bdflush, waiting and retrying will work.
+ *
+ * This is a fairly lame algorithm - it can result in excessive CPU burning and
+ * excessive rotation of the inactive list, which is _supposed_ to be an LRU,
+ * yes?
  */
 int
 try_to_free_pages(struct zone *classzone,
@@ -546,13 +560,16 @@ try_to_free_pages(struct zone *classzone,
 
 	KERNEL_STAT_INC(pageoutrun);
 
-	do {
+	for (priority = DEF_PRIORITY; priority; priority--) {
 		nr_pages = shrink_caches(classzone, priority,
 					gfp_mask, nr_pages);
 		if (nr_pages <= 0)
 			return 1;
-	} while (--priority);
-	out_of_memory();
+		if (!(gfp_mask & __GFP_FS))
+			break;
+	}
+	if (gfp_mask & __GFP_FS)
+		out_of_memory();
 	return 0;
 }
 
...
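
For reference, the new comment's "!__GFP_FS" cases map onto the gfp masks
like this: GFP_KERNEL includes __GFP_FS, while GFP_NOFS, GFP_NOIO and
GFP_ATOMIC do not.  The sketch below is illustrative only; the helper name
is an assumption (the patch open-codes the test), and the exact bit layouts
moved around across 2.4/2.5 releases.

#include <linux/mm.h>	/* the gfp flags lived here in this era */

/* Illustrative predicate; reclaim_may_enter_fs() is an assumed name. */
static inline int reclaim_may_enter_fs(int gfp_mask)
{
	return (gfp_mask & __GFP_FS) != 0;
}

try_to_free_pages() now runs the full DEF_PRIORITY..1 scan, and possibly
out_of_memory(), only when this test is true; for everyone else it makes a
single pass at DEF_PRIORITY and returns 0, handing the failure back to the
caller to kick writeback and retry.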