Commit 6b4f7799 authored by Johannes Weiner, committed by Linus Torvalds

mm: vmscan: invoke slab shrinkers from shrink_zone()

The slab shrinkers are currently invoked from the zonelist walkers in
kswapd, direct reclaim, and zone reclaim, all of which roughly gauge the
eligible LRU pages and assemble a nodemask to pass to NUMA-aware
shrinkers, which then again have to walk over the nodemask.  This is
redundant code, extra runtime work, and fairly inaccurate when it comes to
the estimation of actually scannable LRU pages.  The code duplication will
only get worse when making the shrinkers cgroup-aware and requiring them
to have out-of-band cgroup hierarchy walks as well.

Instead, invoke the shrinkers from shrink_zone(), which is where all
reclaimers end up, to avoid this duplication.

Take the count of eligible LRU pages from get_scan_count(), which
considers many more factors than zone_reclaimable_pages(), which
currently looks only at the availability of swap space.  Accumulate the
number over all visited lruvecs to get the per-zone value.
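
The accumulation described above can be sketched in userspace C.  This is
only an illustration: zone_eligible_pages() and its array argument are
hypothetical stand-ins, and in the kernel each per-lruvec number would come
out of get_scan_count() rather than from a plain array.

```c
#include <stddef.h>

/*
 * Hypothetical helper: sum the eligible LRU page counts reported for
 * every lruvec visited in one zone.  The result is the per-zone value
 * that would later be handed to the slab shrinkers.
 */
static unsigned long zone_eligible_pages(const unsigned long *lruvec_eligible,
					 size_t nr_lruvecs)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < nr_lruvecs; i++)
		total += lruvec_eligible[i];

	return total;
}
```

With three lruvecs reporting 100, 250 and 50 eligible pages, the zone total
is 400.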

Some nodes have multiple zones due to memory addressing restrictions.  To
avoid putting too much pressure on the shrinkers, only invoke them once
for each such node, using the class zone of the allocation as the pivot
zone.
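
A minimal sketch of the pivot-zone rule, assuming a simplified zone
structure.  struct zone_sketch, pick_pivot_zone() and should_shrink_slab()
are hypothetical names for illustration, not kernel interfaces.  The
fallback to a lower populated zone mirrors the "assure class zone is
populated" note below, but how the kernel does it is an assumption here.

```c
#include <stdbool.h>

/* Hypothetical, simplified zone: only the fields this sketch needs. */
struct zone_sketch {
	int index;			/* position within the node, 0 = lowest */
	unsigned long present_pages;	/* 0 means the zone is unpopulated */
};

/*
 * Start from the zone index requested by the allocation and walk down
 * until a populated zone is found.  That zone acts as the pivot.
 */
static int pick_pivot_zone(const struct zone_sketch *zones, int requested_idx)
{
	int idx = requested_idx;

	while (idx > 0 && zones[idx].present_pages == 0)
		idx--;

	return idx;
}

/*
 * The shrinkers run only while reclaiming the pivot zone, so a node
 * with several zones sees one invocation per pass, not one per zone.
 */
static bool should_shrink_slab(const struct zone_sketch *zone, int pivot_idx)
{
	return zone->index == pivot_idx;
}
```

A caller would compute the pivot once per node and then check
should_shrink_slab() for each zone it reclaims from.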

For now, this integrates the slab shrinking better into the reclaim logic
and gets rid of duplicative invocations from kswapd, direct reclaim, and
zone reclaim.  It also prepares for cgroup-awareness, allowing
memcg-capable shrinkers to be added at the lruvec level without much
duplication of either code or runtime work.

This changes kswapd behavior, which used to invoke the shrinkers for each
zone, but with scan ratios gathered from the entire node, resulting in
meaningless pressure quantities on multi-zone nodes.

Zone reclaim behavior also changes.  It used to shrink slabs until the
same number of pages had been shrunk as were reclaimed from the LRUs.  Now it
merely invokes the shrinkers once with the zone's scan ratio, which makes
the shrinkers go easier on caches that implement aging and would prefer
feeding back pressure from recently used slab objects to unused LRU pages.
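
To illustrate what "the zone's scan ratio" means for a single shrinker,
here is a rough sketch.  It assumes the proportional formula used by the
shrinker core around the time of this commit; slab_scan_target() and its
parameter names are hypothetical, and the exact arithmetic in mm/vmscan.c
may differ.

```c
#include <stdint.h>

/*
 * Assumed pressure formula: scan a share of the cache's freeable
 * objects proportional to nr_scanned / nr_eligible, weighted by the
 * cost of recreating an object (seeks, which must be non-zero).
 */
static uint64_t slab_scan_target(unsigned long nr_scanned,
				 unsigned long nr_eligible,
				 unsigned long freeable,
				 unsigned int seeks)
{
	uint64_t delta = (4ULL * nr_scanned) / seeks;

	delta *= freeable;
	delta /= (uint64_t)nr_eligible + 1;	/* +1 avoids dividing by zero */

	return delta;
}
```

For example, scanning 100 of 9999 eligible LRU pages against a cache with
5000 freeable objects and seeks = 2 gives a target of 100 objects.  The
1000/1000 pair passed by drop_slab() and shake_page() in the diff below
asks for roughly twice the freeable count per call under this formula.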

[vdavydov@parallels.com: assure class zone is populated]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent f5f302e2
@@ -418,7 +418,7 @@ static int ashmem_mmap(struct file *file, struct vm_area_struct *vma)
 }
 /*
- * ashmem_shrink - our cache shrinker, called from mm/vmscan.c :: shrink_slab
+ * ashmem_shrink - our cache shrinker, called from mm/vmscan.c
  *
  * 'nr_to_scan' is the number of objects to scan for freeing.
  *
@@ -785,7 +785,6 @@ static long ashmem_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 				.nr_to_scan = LONG_MAX,
 			};
 			ret = ashmem_shrink_count(&ashmem_shrinker, &sc);
-			nodes_setall(sc.nodes_to_scan);
 			ashmem_shrink_scan(&ashmem_shrinker, &sc);
 		}
 		break;
......
@@ -40,13 +40,14 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused)
 static void drop_slab(void)
 {
 	int nr_objects;
-	struct shrink_control shrink = {
-		.gfp_mask = GFP_KERNEL,
-	};
-	nodes_setall(shrink.nodes_to_scan);
 	do {
-		nr_objects = shrink_slab(&shrink, 1000, 1000);
+		int nid;
+		nr_objects = 0;
+		for_each_online_node(nid)
+			nr_objects += shrink_node_slabs(GFP_KERNEL, nid,
+							1000, 1000);
 	} while (nr_objects > 10);
 }
......
@@ -2110,9 +2110,9 @@ int drop_caches_sysctl_handler(struct ctl_table *, int,
 					void __user *, size_t *, loff_t *);
 #endif
-unsigned long shrink_slab(struct shrink_control *shrink,
-			  unsigned long nr_pages_scanned,
-			  unsigned long lru_pages);
+unsigned long shrink_node_slabs(gfp_t gfp_mask, int nid,
+				unsigned long nr_scanned,
+				unsigned long nr_eligible);
 #ifndef CONFIG_MMU
 #define randomize_va_space 0
......
@@ -18,8 +18,6 @@ struct shrink_control {
 	 */
 	unsigned long nr_to_scan;
-	/* shrink from these nodes */
-	nodemask_t nodes_to_scan;
 	/* current node being shrunk (for NUMA aware shrinkers) */
 	int nid;
 };
......
@@ -239,19 +239,14 @@ void shake_page(struct page *p, int access)
 	}
 	/*
-	 * Only call shrink_slab here (which would also shrink other caches) if
-	 * access is not potentially fatal.
+	 * Only call shrink_node_slabs here (which would also shrink
+	 * other caches) if access is not potentially fatal.
 	 */
 	if (access) {
 		int nr;
 		int nid = page_to_nid(p);
 		do {
-			struct shrink_control shrink = {
-				.gfp_mask = GFP_KERNEL,
-			};
-			node_set(nid, shrink.nodes_to_scan);
-			nr = shrink_slab(&shrink, 1000, 1000);
+			nr = shrink_node_slabs(GFP_KERNEL, nid, 1000, 1000);
 			if (page_count(p) == 1)
 				break;
 		} while (nr > 10);
......
@@ -6284,9 +6284,9 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
 		if (!PageLRU(page))
 			found++;
 		/*
-		 * If there are RECLAIMABLE pages, we need to check it.
-		 * But now, memory offline itself doesn't call shrink_slab()
-		 * and it still to be fixed.
+		 * If there are RECLAIMABLE pages, we need to check
+		 * it. But now, memory offline itself doesn't call
+		 * shrink_node_slabs() and it still to be fixed.
 		 */
 		/*
 		 * If the page is not RAM, page_count()should be 0.
......
This diff is collapsed.