- 09 Jun, 2023 40 commits
-
Peng Zhang authored
Add comment for mas_wr_append(), move mas_update_gap() into mas_wr_append(), and other cleanups to make mas_wr_modify() cleaner.

Link: https://lkml.kernel.org/r/20230524031247.65949-8-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Peng Zhang authored
The previous new_end calculation is inaccurate: it assumes that two new pivots must always be added, which is not always the case, so the write sometimes misses the fast path and falls into the slow path. Add mas_wr_new_end() to calculate new_end accurately, so the conditions for entering the fast path are exact.

Link: https://lkml.kernel.org/r/20230524031247.65949-7-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
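To make the calculation concrete, a sketch of how such a helper can count post-write slot usage. This is reconstructed from the description above, not the literal patch; the ma_wr_state field names (node_end, offset_end, r_min, end_piv) are assumed from the maple tree write state:

static inline unsigned char mas_wr_new_end(struct ma_wr_state *wr_mas)
{
    struct ma_state *mas = wr_mas->mas;
    /* Worst case: both ends of the write need a new pivot. */
    unsigned char new_end = wr_mas->node_end + 2;

    /* Slots fully covered by the write are replaced, not added. */
    new_end -= wr_mas->offset_end - mas->offset;
    /* If an end of the range lines up with an existing pivot,
     * that pivot is reused and no new one is needed. */
    if (wr_mas->r_min == mas->index)
        new_end--;
    if (wr_mas->end_piv == mas->last)
        new_end--;

    return new_end;
}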
-
Peng Zhang authored
Just make the code symmetrical to improve readability.

Link: https://lkml.kernel.org/r/20230524031247.65949-6-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Peng Zhang authored
Make the code for detecting spanning writes more concise.

Link: https://lkml.kernel.org/r/20230524031247.65949-5-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Peng Zhang authored
Fix the arguments to __must_hold() to make sparse work.

Link: https://lkml.kernel.org/r/20230524031247.65949-4-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Peng Zhang authored
mas_{rev_}alloc() and mas_fill_gap() are no longer used, so delete them.

Link: https://lkml.kernel.org/r/20230524031247.65949-3-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Peng Zhang authored
Patch series "Clean ups for maple tree", v4. Some clean ups, mainly to make the code of maple tree more concise. This patchset has passed the self-test. This patch (of 10): Use mas_empty_area{_rev}() to refactor mtree_alloc_{range,rrange}() Link: https://lkml.kernel.org/r/20230524031247.65949-2-zhangpeng.00@bytedance.comSigned-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Lars R. Damerow authored
This patch is similar to commit 8e20d4b3 ("mm/memcontrol: export memcg->watermark via sysfs for v2 memcg"), but exports the swap counter's watermark.

We allocate jobs to our compute farm using heuristics determined by memory and swap usage from previous jobs. Tracking the peak swap usage for new jobs is important for determining when jobs are exceeding their expected bounds, or when our baseline metrics are getting outdated.

Our toolset was written to use the "memory.memsw.max_usage_in_bytes" file in cgroups v1, and altering it to poll cgroups v2's "memory.swap.current" would give less accurate results as well as add complication to the code. Having this watermark exposed in sysfs is much preferred.

Link: https://lkml.kernel.org/r/20230524181734.125696-1-lars@pixar.com
Signed-off-by: Lars R. Damerow <lars@pixar.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Tu Jinjiang authored
shmem_show_options() uses sbinfo->mpol without taking a reference on it, which can race with replacement of the mpol by remount. The execution sequence is as follows:

CPU0                                    CPU1
shmem_show_options()                    shmem_reconfigure()
  shmem_show_mpol(seq, sbinfo->mpol)      mpol = sbinfo->mpol
                                          mpol_put(mpol)
    mpol->mode   <- use-after-free

The KASAN report is as follows:

BUG: KASAN: slab-use-after-free in shmem_show_options+0x21b/0x340
Read of size 2 at addr ffff888124324004 by task mount/2388

CPU: 2 PID: 2388 Comm: mount Not tainted 6.4.0-rc3-00017-g9d646009-dirty #8
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x37/0x50
 print_report+0xd0/0x620
 ? shmem_show_options+0x21b/0x340
 ? __virt_addr_valid+0xf4/0x180
 ? shmem_show_options+0x21b/0x340
 kasan_report+0xb8/0xe0
 ? shmem_show_options+0x21b/0x340
 shmem_show_options+0x21b/0x340
 ? __pfx_shmem_show_options+0x10/0x10
 ? strchr+0x2c/0x50
 ? strlen+0x23/0x40
 ? seq_puts+0x7d/0x90
 show_vfsmnt+0x1e6/0x260
 ? __pfx_show_vfsmnt+0x10/0x10
 ? __kasan_kmalloc+0x7f/0x90
 seq_read_iter+0x57a/0x740
 vfs_read+0x2e2/0x4a0
 ? __pfx_vfs_read+0x10/0x10
 ? down_write_killable+0xb8/0x140
 ? __pfx_down_write_killable+0x10/0x10
 ? __fget_light+0xa9/0x1e0
 ? up_write+0x3f/0x80
 ksys_read+0xb8/0x150
 ? __pfx_ksys_read+0x10/0x10
 ? fpregs_assert_state_consistent+0x55/0x60
 ? exit_to_user_mode_prepare+0x2d/0x120
 do_syscall_64+0x3c/0x90
 entry_SYSCALL_64_after_hwframe+0x72/0xdc
 </TASK>

Allocated by task 2387:
 kasan_save_stack+0x22/0x50
 kasan_set_track+0x25/0x30
 __kasan_slab_alloc+0x59/0x70
 kmem_cache_alloc+0xdd/0x220
 mpol_new+0x83/0x150
 mpol_parse_str+0x280/0x4a0
 shmem_parse_one+0x364/0x520
 vfs_parse_fs_param+0xf8/0x1a0
 vfs_parse_fs_string+0xc9/0x130
 shmem_parse_options+0xb2/0x110
 path_mount+0x597/0xdf0
 do_mount+0xcd/0xf0
 __x64_sys_mount+0xbd/0x100
 do_syscall_64+0x3c/0x90
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

Freed by task 2389:
 kasan_save_stack+0x22/0x50
 kasan_set_track+0x25/0x30
 kasan_save_free_info+0x2e/0x50
 __kasan_slab_free+0x10e/0x1a0
 kmem_cache_free+0x9c/0x350
 shmem_reconfigure+0x278/0x370
 reconfigure_super+0x383/0x450
 path_mount+0xcc5/0xdf0
 do_mount+0xcd/0xf0
 __x64_sys_mount+0xbd/0x100
 do_syscall_64+0x3c/0x90
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

The buggy address belongs to the object at ffff888124324000, which belongs to the cache numa_policy of size 32. The buggy address is located 4 bytes inside of the freed 32-byte region [ffff888124324000, ffff888124324020).
==================================================================

To fix the bug, call shmem_get_sbmpol() to take a reference on the mpol before shmem_show_mpol(), and mpol_put() to drop it afterwards.

Link: https://lkml.kernel.org/r/20230525031640.593733-1-tujinjiang@huawei.com
Signed-off-by: Tu Jinjiang <tujinjiang@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Nanyong Sun <sunnanyong@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
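The shape of the fix, sketched (shmem_get_sbmpol() takes the reference under the superblock's lock; the option printing around it is elided):

static int shmem_show_options(struct seq_file *seq, struct dentry *root)
{
    struct shmem_sb_info *sbinfo = SHMEM_SB(root->d_sb);
    struct mempolicy *mpol;

    /* ... print size, inodes, mode, uid/gid options ... */

    /*
     * Take a reference so a concurrent remount cannot free the
     * policy underneath us while we dereference it.
     */
    mpol = shmem_get_sbmpol(sbinfo);
    shmem_show_mpol(seq, mpol);
    mpol_put(mpol);
    return 0;
}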
-
Baolin Wang authored
I've observed that fast isolation often isolates more pages than cc->migratepages, and the excess freepages will be released back to the buddy system. So skip fast freepages isolation if enough freepages are isolated, to save some CPU cycles.

Link: https://lkml.kernel.org/r/f39c2c07f2dba2732fd9c0843572e5bef96f7f67.1685018752.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Baolin Wang authored
fast_isolate_freepages() can also isolate freepages, but we currently have no visibility into how efficient that fast isolation is, and thus no way to gauge the fast isolation pressure. Add a trace event reporting some numbers to help understand the efficiency of fast freepages isolation.

Link: https://lkml.kernel.org/r/78d2932d0160d122c15372aceb3f2c45460a17fc.1685018752.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Baolin Wang authored
To keep the same logic as test_and_set_skip(), only set the skip flag if cc->no_set_skip_hint is false, which makes the code more consistent.

Link: https://lkml.kernel.org/r/0eb2cd2407ffb259ae6e3071e10f70f2d41d0f3e.1685018752.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Baolin Wang authored
In fast_isolate_around(), the code assumes the pageblock has been fully scanned if cc->nr_freepages < cc->nr_migratepages after trying to isolate some free pages, and sets the skip flag to avoid rescanning it in the future. However, this misses setting the skip flag for a pageblock that really was fully scanned (the returned 'start_pfn' equals 'end_pfn') in the case where cc->nr_freepages is larger than cc->nr_migratepages. So it makes more sense to use the 'start_pfn' returned by isolate_freepages_block() together with 'end_pfn' to decide whether a pageblock was fully scanned, as shown in the sketch below. This also covers the cc->nr_freepages < cc->nr_migratepages case, where 'start_pfn' is usually equal to 'end_pfn' unless some uncommon fatal error occurs after non-strict mode isolation.

Link: https://lkml.kernel.org/r/f4efd2fa08735794a6d809da3249b6715ba6ad38.1685018752.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
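A sketch of the end-of-scan check this describes (field and helper names follow mm/compaction.c; the surrounding fast_isolate_around() code is elided):

/* Scan after the target page; isolate_freepages_block() advances
 * start_pfn to wherever the scan stopped. */
if (start_pfn < end_pfn)
    isolate_freepages_block(cc, &start_pfn, end_pfn,
                            &cc->freepages, 1, false);

/* Fully scanned (start_pfn reached end_pfn): skip it in the future. */
if (start_pfn == end_pfn && !cc->no_set_skip_hint)
    set_pageblock_skip(page);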
-
Baolin Wang authored
No caller cares about the return value of fast_isolate_freepages(), so make it void.

Link: https://lkml.kernel.org/r/759fca20b22ebf4c81afa30496837b9e0fb2e53b.1685018752.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Baolin Wang authored
Patch series "Misc cleanups and improvements for compaction". This series cantains some cleanups and improvements for compaction. This patch (of 6): The caller has validated the page before calling update_pageblock_skip(), thus drop the redundant page validation in update_pageblock_skip(). Link: https://lkml.kernel.org/r/5142e15b9295fe8c447dbb39b7907a20177a1413.1685018752.git.baolin.wang@linux.alibaba.comSigned-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Thomas Gleixner authored
Purging fragmented blocks is done unconditionally in several contexts:

 1) From drain_vmap_area_work(), when the number of lazy-to-be-freed vmap_areas reaches the threshold
 2) Reclaiming vmalloc address space from pcpu_get_vm_areas()
 3) _vm_unmap_aliases()

#1 There is no reason to zap fragmented vmap blocks unconditionally, simply because reclaiming all lazy areas drains at least 32MB * fls(num_online_cpus()) per invocation, which is plenty.

#2 Reclaiming when running out of space or due to memory pressure makes a lot of sense.

#3 _vm_unmap_aliases() has to touch everything because the caller has no clue which vmap_area used a particular page last, and the vmap_area lost that information too. The exception is the vfree + VM_FLUSH_RESET_PERMS case, which removes the vmap area first and then cares about the flush. That in turn requires a full walk of _all_ vmap areas, including the one which was just added to the purge list. But as this has to be flushed anyway, it is an opportunity to combine outstanding TLB flushes with the housekeeping of purging freed areas. Still, like #1, there is no good reason to zap usable vmap blocks unconditionally.

Add a @force_purge argument to the newly split out block purge function and, if it is false, only purge fragmented blocks which have less than 1/4 of their capacity left.

Rename purge_vmap_area_lazy() to reclaim_and_purge_vmap_areas() to make it clear what the function does.

[lstoakes@gmail.com: correct VMAP_PURGE_THRESHOLD check]
Link: https://lkml.kernel.org/r/3e92ef61-b910-4576-88e7-cf43211fd4e7@lucifer.local
Link: https://lkml.kernel.org/r/20230525124504.864005691@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
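A sketch of the resulting gate, reconstructed from the description above (constants and fields follow mm/vmalloc.c; treat it as illustrative, not the literal patch):

#define VMAP_PURGE_THRESHOLD	(VMAP_BBMAP_BITS / 4)

static bool purge_fragmented_block(struct vmap_block *vb,
        struct vmap_block_queue *vbq, struct list_head *purge_list,
        bool force_purge)
{
    /* Only blocks whose space is entirely free or dirty, and which
     * still have dirty space to reclaim, are candidates. */
    if (vb->free + vb->dirty != VMAP_BBMAP_BITS ||
        vb->dirty == VMAP_BBMAP_BITS)
        return false;

    /* Don't overeagerly purge usable blocks unless requested. */
    if (!(force_purge || vb->free < VMAP_PURGE_THRESHOLD))
        return false;

    /* Prevent further allocs/frees and repeated purging. */
    WRITE_ONCE(vb->free, 0);
    WRITE_ONCE(vb->dirty, VMAP_BBMAP_BITS);

    spin_lock(&vbq->lock);
    list_del_rcu(&vb->free_list);
    spin_unlock(&vbq->lock);
    list_add_tail(&vb->purge, purge_list);
    return true;
}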
-
Thomas Gleixner authored
purge_fragmented_blocks() accesses vmap_block::free and vmap_block::dirty locklessly for a quick check. Add the missing READ/WRITE_ONCE() annotations.

Link: https://lkml.kernel.org/r/20230525124504.807356682@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Baoquan He <bhe@redhat.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
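The quick-check pattern, sketched: the racy reads are marked with READ_ONCE() and the result is re-validated under vb->lock before acting on it (names assume mm/vmalloc.c):

rcu_read_lock();
list_for_each_entry_rcu(vb, &vbq->free, free_list) {
    unsigned long free = READ_ONCE(vb->free);
    unsigned long dirty = READ_ONCE(vb->dirty);

    /* Lockless pre-filter; racy, so only a hint. */
    if (free + dirty != VMAP_BBMAP_BITS ||
        dirty == VMAP_BBMAP_BITS)
        continue;

    spin_lock(&vb->lock);
    /* ... recheck the same condition and purge under the lock ... */
    spin_unlock(&vb->lock);
}
rcu_read_unlock();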
-
Thomas Gleixner authored
vb_alloc() unconditionally locks a vmap_block on the free list to check the free space. This can be done locklessly because vmap_block::free never increases; it is only decreased on allocations. Check the free space locklessly and, only if that succeeds, recheck under the lock.

Link: https://lkml.kernel.org/r/20230525124504.750481992@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
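A sketch of the check-then-recheck (allocation bookkeeping elided). Because vb->free is monotonically decreasing, a failed lockless check can never be a false negative:

list_for_each_entry_rcu(vb, &vbq->free, free_list) {
    /* Lockless check: if the block is too small now, it stays too small. */
    if (READ_ONCE(vb->free) < (1UL << order))
        continue;

    spin_lock(&vb->lock);
    if (vb->free < (1UL << order)) {
        /* Lost a race against another allocation. */
        spin_unlock(&vb->lock);
        continue;
    }
    /* ... carve (1UL << order) pages out of the block ... */
    spin_unlock(&vb->lock);
    break;
}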
-
Thomas Gleixner authored
vmap blocks which have active mappings cannot be purged. Allocations which have been freed are accounted for in vmap_block::dirty_min/max, so that they can be detected in _vm_unmap_aliases() as potentially stale TLBs. If there are several invocations of _vm_unmap_aliases(), each of them will flush the dirty range again. That's pointless and just increases the probability of full TLB flushes. Avoid that by resetting the flush range after accounting for it. This is safe against other invocations of _vm_unmap_aliases() because it is all serialized by vmap_purge_lock.

Link: https://lkml.kernel.org/r/20230525124504.692056496@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
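Per block, that amounts to something like the following sketch inside the _vm_unmap_aliases() walk (assuming the dirty_min/max semantics above; the "empty range" reset is min = VMAP_BBMAP_BITS, max = 0):

spin_lock(&vb->lock);
if (vb->dirty_max && vb->dirty != VMAP_BBMAP_BITS) {
    unsigned long va_start = vb->va->va_start;
    unsigned long s, e;

    s = va_start + (vb->dirty_min << PAGE_SHIFT);
    e = va_start + (vb->dirty_max << PAGE_SHIFT);

    start = min(s, start);
    end   = max(e, end);

    /* Accounted for; don't flush this range again next time. */
    vb->dirty_min = VMAP_BBMAP_BITS;
    vb->dirty_max = 0;

    flush = true;
}
spin_unlock(&vb->lock);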
-
Thomas Gleixner authored
_vm_unmap_aliases() walks the per-CPU xarrays to find partially unmapped blocks and then walks the per-CPU free lists to purge fragmented blocks. Arguably that's a waste of CPU cycles and cache lines, as the full xarray walk already touches every block. Avoid this double iteration:

 - Split out the code to purge one block and the code to free the local purge list into helper functions.
 - Try to purge the fragmented blocks in the xarray walk before looking at their dirty space.

Link: https://lkml.kernel.org/r/20230525124504.633469722@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Baoquan He <bhe@redhat.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Thomas Gleixner authored
Patch series "mm/vmalloc: Assorted fixes and improvements", v2. this series addresses the following issues: 1) Prevent the stale TLB problem related to fully utilized vmap blocks 2) Avoid the double per CPU list walk in _vm_unmap_aliases() 3) Avoid flushing dirty space over and over 4) Add a lockless quickcheck in vb_alloc() and add missing READ/WRITE_ONCE() annotations 5) Prevent overeager purging of usable vmap_blocks if not under memory/address space pressure. This patch (of 6): _vm_unmap_aliases() is used to ensure that no unflushed TLB entries for a page are left in the system. This is required due to the lazy TLB flush mechanism in vmalloc. This is tried to achieve by walking the per CPU free lists, but those do not contain fully utilized vmap blocks because they are removed from the free list once the blocks free space became zero. When the block is not fully unmapped then it is not on the purge list either. So neither the per CPU list iteration nor the purge list walk find the block and if the page was mapped via such a block and the TLB has not yet been flushed, the guarantee of _vm_unmap_aliases() that there are no stale TLBs after returning is broken: x = vb_alloc() // Removes vmap_block from free list because vb->free became 0 vb_free(x) // Unmaps page and marks in dirty_min/max range // Block has still mappings and is not put on purge list // Page is reused vm_unmap_aliases() // Can't find vmap block with the dirty space -> FAIL So instead of walking the per CPU free lists, walk the per CPU xarrays which hold pointers to _all_ active blocks in the system including those removed from the free lists. Link: https://lkml.kernel.org/r/20230525122342.109672430@linutronix.de Link: https://lkml.kernel.org/r/20230525124504.573987880@linutronix.de Fixes: db64fe02 ("mm: rewrite vmap layer") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Jim Cromie authored
Drop the __init on kmemleak_test_init(). With it, the storage is reclaimed, but then the symbol isn't available for "%pS" rendering, and the backtrace gets a bare pointer where the actual leak happened.

unreferenced object 0xffff88800a2b0800 (size 1024):
  comm "modprobe", pid 413, jiffies 4294953430
  hex dump (first 32 bytes):
    73 02 00 00 75 01 00 68 02 00 00 01 00 00 00 04  s...u..h........
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000fabad728>] kmalloc_trace+0x26/0x90
    [<00000000ef738764>] 0xffffffffc02350a2
    [<00000000004e5795>] do_one_initcall+0x43/0x210
    [<00000000d768905e>] do_init_module+0x4a/0x210
    [<0000000087135ab5>] __do_sys_finit_module+0x93/0xf0
    [<000000004fcb1fa2>] do_syscall_64+0x34/0x80
    [<00000000c73c8d9d>] entry_SYSCALL_64_after_hwframe+0x46/0xb0

With __init gone, that trace entry renders like:

    [<00000000ef738764>] kmemleak_test_init+<offset>/<size>

Link: https://lkml.kernel.org/r/20230525174356.69711-1-jim.cromie@gmail.com
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
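A minimal sketch of the pattern (not the actual test module): keeping the init function out of .init.text means its symbol survives for later %pS resolution.

#include <linux/module.h>
#include <linux/slab.h>

/* No __init: the function stays resident, so kmemleak's %pS
 * backtraces can still resolve addresses inside it. */
static int kmemleak_test_init(void)
{
    /* Deliberate leak: the pointer goes out of scope untracked. */
    void *leak = kmalloc(1024, GFP_KERNEL);

    pr_info("kmemleak test: leaking object at %px\n", leak);
    return 0;
}
module_init(kmemleak_test_init);

MODULE_LICENSE("GPL");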
-
T.J. Alumbaugh authored
Avoid passing memcg* and pglist_data* to lru_gen_test_recent(), since we only use the lruvec anyway.

Link: https://lkml.kernel.org/r/20230522112058.2965866-4-talumbau@google.com
Signed-off-by: T.J. Alumbaugh <talumbau@google.com>
Reviewed-by: Yuanchu Xie <yuanchu@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
T.J. Alumbaugh authored
Add helpers to the page table walking code:

 - Clarifies intent via the names "should_walk_mmu" and "should_clear_pmd_young"
 - Avoids repeating the same logic in two places

Link: https://lkml.kernel.org/r/20230522112058.2965866-3-talumbau@google.com
Signed-off-by: T.J. Alumbaugh <talumbau@google.com>
Reviewed-by: Yuanchu Xie <yuanchu@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
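The helpers plausibly reduce to something like this (a sketch assuming MGLRU's existing arch_has_hw_*_young() and get_cap() helpers in mm/vmscan.c):

static bool should_walk_mmu(void)
{
    return arch_has_hw_pte_young() && get_cap(LRU_GEN_MM_WALK);
}

static bool should_clear_pmd_young(void)
{
    return arch_has_hw_nonleaf_pmd_young() &&
           get_cap(LRU_GEN_NONLEAF_YOUNG);
}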
-
T.J. Alumbaugh authored
lru_gen_soft_reclaim() gets the lruvec from the memcg and node ID to keep a cleaner interface on the caller side.

Link: https://lkml.kernel.org/r/20230522112058.2965866-2-talumbau@google.com
Signed-off-by: T.J. Alumbaugh <talumbau@google.com>
Reviewed-by: Yuanchu Xie <yuanchu@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
T.J. Alumbaugh authored
Use the DECLARE_BITMAP() macro when possible.

Link: https://lkml.kernel.org/r/20230522112058.2965866-1-talumbau@google.com
Signed-off-by: T.J. Alumbaugh <talumbau@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Yuanchu Xie <yuanchu@google.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
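For illustration (NR_ITEMS is a placeholder, not from the patch): DECLARE_BITMAP() expands to exactly the open-coded array, but states the number of bits rather than the number of longs.

#include <linux/bitmap.h>

#define NR_ITEMS 128	/* hypothetical bit count */

/* Open-coded: sized in longs, intent buried in the arithmetic. */
unsigned long bits_open_coded[BITS_TO_LONGS(NR_ITEMS)];

/* Equivalent storage, but self-describing. */
DECLARE_BITMAP(bits, NR_ITEMS);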
-
Haifeng Xu authored
Since commit f079a020 ("selftests: memcg: factor out common parts of memory.{low,min} tests"), the value used in the second alloc_anon call has changed from 148M to 170M. Because memory.low allows reclaiming page cache in child cgroups, memory.current ends up close to 30M instead of 50M. Therefore, adjust the expected value for the parent cgroup.

Link: https://lkml.kernel.org/r/20230522095233.4246-2-haifeng.xu@shopee.com
Fixes: f079a020 ("selftests: memcg: factor out common parts of memory.{low,min} tests")
Signed-off-by: Haifeng Xu <haifeng.xu@shopee.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Haifeng Xu authored
Replace 'then' with 'than'.

Link: https://lkml.kernel.org/r/20230522095233.4246-1-haifeng.xu@shopee.com
Signed-off-by: Haifeng Xu <haifeng.xu@shopee.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Andrew Morton authored
It is felt that the name mlock_future_check() is vague - it doesn't particularly convey the function's operation. mlock_future_ok() is a clearer name for a predicate function.

Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Lorenzo Stoakes authored
In all but one instance, mlock_future_check() is treated as a boolean function despite returning an error code. In the one remaining instance, the error code is ignored and replaced with -ENOMEM. This is confusing, and the inversion of true -> failure, false -> success is not warranted. Convert the function to a bool, lightly refactor it, and return true if the check passes, false if not.

Link: https://lkml.kernel.org/r/20230522082412.56685-1-lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
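The converted predicate plausibly looks like this (a sketch consistent with the description; the RLIMIT_MEMLOCK accounting is the pre-existing logic):

static bool mlock_future_ok(struct mm_struct *mm, unsigned long flags,
                            unsigned long bytes)
{
    unsigned long locked_pages, limit_pages;

    /* Not a locked mapping, or caller may exceed the limit anyway. */
    if (!(flags & VM_LOCKED) || capable(CAP_IPC_LOCK))
        return true;

    locked_pages = bytes >> PAGE_SHIFT;
    locked_pages += mm->locked_vm;

    limit_pages = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;

    return locked_pages <= limit_pages;
}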
-
David Hildenbrand authored
Similar to the COW selftests, also use io_uring fixed buffers to test if long-term page pinning works as expected.

Link: https://lkml.kernel.org/r/20230519102723.185721-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Hildenbrand authored
Let's add a new test for checking whether GUP long-term page pinning works as expected (R/O vs. R/W, MAP_PRIVATE vs. MAP_SHARED, GUP vs. GUP-fast). Note that COW handling with long-term R/O pinning in private mappings, and pinning of anonymous memory in general, is tested by the COW selftest. This test, therefore, focuses on page pinning in file mappings.

The most interesting case is probably the "local tmpfile" case, as that will likely end up on a "real" filesystem such as ext4 or xfs, not on a virtual one like tmpfs or hugetlb where any long-term page pinning is always expected to succeed.

For now, only add tests that use the "/sys/kernel/debug/gup_test" interface. We'll add tests based on liburing separately next.

[akpm@linux-foundation.org: update .gitignore for gup_longterm, per Peter]
Link: https://lkml.kernel.org/r/20230519102723.185721-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Hildenbrand authored
Patch series "selftests/mm: new test for FOLL_LONGTERM on file mappings". Let's add some selftests to make sure that: * R/O long-term pinning always works of file mappings * R/W long-term pinning always works in MAP_PRIVATE file mappings * R/W long-term pinning only works in MAP_SHARED mappings with special filesystems (shmem, hugetlb) and fails with other filesystems (ext4, btrfs, xfs). The tests make use of the gup_test kernel module to trigger ordinary GUP and GUP-fast, and liburing (similar to our COW selftests). Test with memfd, memfd hugetlb, tmpfile() and mkstemp(). The latter usually gives us a "real" filesystem (ext4, btrfs, xfs) where long-term pinning is expected to fail. Note that these selftests don't contain any actual reproducers for data corruptions in case R/W long-term pinning on problematic filesystems "would" work. Maybe we can later come up with a racy !FOLL_LONGTERM reproducer that can reuse an existing interface to trigger short-term pinning (I'll look into that next). On current mm/mm-unstable: # ./gup_longterm # [INFO] detected hugetlb page size: 2048 KiB # [INFO] detected hugetlb page size: 1048576 KiB TAP version 13 1..50 # [RUN] R/W longterm GUP pin in MAP_SHARED file mapping ... with memfd ok 1 Should have worked # [RUN] R/W longterm GUP pin in MAP_SHARED file mapping ... with tmpfile ok 2 Should have worked # [RUN] R/W longterm GUP pin in MAP_SHARED file mapping ... with local tmpfile ok 3 Should have failed # [RUN] R/W longterm GUP pin in MAP_SHARED file mapping ... with memfd hugetlb (2048 kB) ok 4 Should have worked # [RUN] R/W longterm GUP pin in MAP_SHARED file mapping ... with memfd hugetlb (1048576 kB) ok 5 Should have worked # [RUN] R/W longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd ok 6 Should have worked # [RUN] R/W longterm GUP-fast pin in MAP_SHARED file mapping ... with tmpfile ok 7 Should have worked # [RUN] R/W longterm GUP-fast pin in MAP_SHARED file mapping ... with local tmpfile ok 8 Should have failed # [RUN] R/W longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd hugetlb (2048 kB) ok 9 Should have worked # [RUN] R/W longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd hugetlb (1048576 kB) ok 10 Should have worked # [RUN] R/O longterm GUP pin in MAP_SHARED file mapping ... with memfd ok 11 Should have worked # [RUN] R/O longterm GUP pin in MAP_SHARED file mapping ... with tmpfile ok 12 Should have worked # [RUN] R/O longterm GUP pin in MAP_SHARED file mapping ... with local tmpfile ok 13 Should have worked # [RUN] R/O longterm GUP pin in MAP_SHARED file mapping ... with memfd hugetlb (2048 kB) ok 14 Should have worked # [RUN] R/O longterm GUP pin in MAP_SHARED file mapping ... with memfd hugetlb (1048576 kB) ok 15 Should have worked # [RUN] R/O longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd ok 16 Should have worked # [RUN] R/O longterm GUP-fast pin in MAP_SHARED file mapping ... with tmpfile ok 17 Should have worked # [RUN] R/O longterm GUP-fast pin in MAP_SHARED file mapping ... with local tmpfile ok 18 Should have worked # [RUN] R/O longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd hugetlb (2048 kB) ok 19 Should have worked # [RUN] R/O longterm GUP-fast pin in MAP_SHARED file mapping ... with memfd hugetlb (1048576 kB) ok 20 Should have worked # [RUN] R/W longterm GUP pin in MAP_PRIVATE file mapping ... with memfd ok 21 Should have worked # [RUN] R/W longterm GUP pin in MAP_PRIVATE file mapping ... 
with tmpfile ok 22 Should have worked # [RUN] R/W longterm GUP pin in MAP_PRIVATE file mapping ... with local tmpfile ok 23 Should have worked # [RUN] R/W longterm GUP pin in MAP_PRIVATE file mapping ... with memfd hugetlb (2048 kB) ok 24 Should have worked # [RUN] R/W longterm GUP pin in MAP_PRIVATE file mapping ... with memfd hugetlb (1048576 kB) ok 25 Should have worked # [RUN] R/W longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd ok 26 Should have worked # [RUN] R/W longterm GUP-fast pin in MAP_PRIVATE file mapping ... with tmpfile ok 27 Should have worked # [RUN] R/W longterm GUP-fast pin in MAP_PRIVATE file mapping ... with local tmpfile ok 28 Should have worked # [RUN] R/W longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd hugetlb (2048 kB) ok 29 Should have worked # [RUN] R/W longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd hugetlb (1048576 kB) ok 30 Should have worked # [RUN] R/O longterm GUP pin in MAP_PRIVATE file mapping ... with memfd ok 31 Should have worked # [RUN] R/O longterm GUP pin in MAP_PRIVATE file mapping ... with tmpfile ok 32 Should have worked # [RUN] R/O longterm GUP pin in MAP_PRIVATE file mapping ... with local tmpfile ok 33 Should have worked # [RUN] R/O longterm GUP pin in MAP_PRIVATE file mapping ... with memfd hugetlb (2048 kB) ok 34 Should have worked # [RUN] R/O longterm GUP pin in MAP_PRIVATE file mapping ... with memfd hugetlb (1048576 kB) ok 35 Should have worked # [RUN] R/O longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd ok 36 Should have worked # [RUN] R/O longterm GUP-fast pin in MAP_PRIVATE file mapping ... with tmpfile ok 37 Should have worked # [RUN] R/O longterm GUP-fast pin in MAP_PRIVATE file mapping ... with local tmpfile ok 38 Should have worked # [RUN] R/O longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd hugetlb (2048 kB) ok 39 Should have worked # [RUN] R/O longterm GUP-fast pin in MAP_PRIVATE file mapping ... with memfd hugetlb (1048576 kB) ok 40 Should have worked # [RUN] io_uring fixed buffer with MAP_SHARED file mapping ... with memfd ok 41 Should have worked # [RUN] io_uring fixed buffer with MAP_SHARED file mapping ... with tmpfile ok 42 Should have worked # [RUN] io_uring fixed buffer with MAP_SHARED file mapping ... with local tmpfile ok 43 Should have failed # [RUN] io_uring fixed buffer with MAP_SHARED file mapping ... with memfd hugetlb (2048 kB) ok 44 Should have worked # [RUN] io_uring fixed buffer with MAP_SHARED file mapping ... with memfd hugetlb (1048576 kB) ok 45 Should have worked # [RUN] io_uring fixed buffer with MAP_PRIVATE file mapping ... with memfd ok 46 Should have worked # [RUN] io_uring fixed buffer with MAP_PRIVATE file mapping ... with tmpfile ok 47 Should have worked # [RUN] io_uring fixed buffer with MAP_PRIVATE file mapping ... with local tmpfile ok 48 Should have worked # [RUN] io_uring fixed buffer with MAP_PRIVATE file mapping ... with memfd hugetlb (2048 kB) ok 49 Should have worked # [RUN] io_uring fixed buffer with MAP_PRIVATE file mapping ... with memfd hugetlb (1048576 kB) ok 50 Should have worked # Totals: pass:50 fail:0 xfail:0 xpass:0 skip:0 error:0 This patch (of 3): Let's factor detection out into vm_util, to be reused by a new test. 
Link: https://lkml.kernel.org/r/20230519102723.185721-1-david@redhat.com Link: https://lkml.kernel.org/r/20230519102723.185721-2-david@redhat.comSigned-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Jan Kara <jack@suse.cz> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Peter Xu <peterx@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Johannes Weiner authored
During stress testing with higher-order allocations, a deadlock scenario was observed in compaction: one GFP_NOFS allocation was sleeping on mm/compaction.c::too_many_isolated(), while all CPUs in the system were busy with compactors spinning on buffer locks held by the sleeping GFP_NOFS allocation.

Reclaim is susceptible to this same deadlock; there we fixed it by granting GFP_NOFS allocations additional LRU isolation headroom, to ensure they make forward progress while holding fs locks that other reclaimers might acquire. Do the same here.

This code has been like this since compaction was initially merged, and I only managed to trigger this with out-of-tree patches that dramatically increase the contexts that do GFP_NOFS compaction. While the issue is real, it seems theoretical in nature given existing allocation sites. Worth fixing now, but no Fixes tag or stable CC.

Link: https://lkml.kernel.org/r/20230519111359.40475-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
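The headroom trick can be sketched like this: tighten the isolation limit only for __GFP_FS allocations, so GFP_NOFS compactors keep room to make progress (a sketch consistent with the description; the throttling tail of the function is elided):

static bool too_many_isolated(struct compact_control *cc)
{
    pg_data_t *pgdat = cc->zone->zone_pgdat;
    unsigned long active, inactive, isolated;

    inactive = node_page_state(pgdat, NR_INACTIVE_FILE) +
               node_page_state(pgdat, NR_INACTIVE_ANON);
    active = node_page_state(pgdat, NR_ACTIVE_FILE) +
             node_page_state(pgdat, NR_ACTIVE_ANON);
    isolated = node_page_state(pgdat, NR_ISOLATED_FILE) +
               node_page_state(pgdat, NR_ISOLATED_ANON);

    /*
     * GFP_NOFS callers keep the full limit: they may hold fs locks
     * that other isolating compactors are spinning on, so they must
     * not be throttled behind those compactors.
     */
    if (cc->gfp_mask & __GFP_FS) {
        inactive >>= 3;
        active >>= 3;
    }

    return isolated > (inactive + active) / 2;
}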
-
Johannes Weiner authored
Since it only returns COMPACT_CONTINUE or COMPACT_SKIPPED now, a bool return value simplifies the callsites.

Link: https://lkml.kernel.org/r/20230602151204.GD161817@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Johannes Weiner authored
The watermark check in compaction_zonelist_suitable(), called from should_compact_retry(), is sandwiched between two watermark checks already: before, there are freelist attempts as part of direct reclaim and direct compaction; after, there is a last-minute freelist attempt in __alloc_pages_may_oom(). The check in compaction_zonelist_suitable() isn't necessary. Kill it.

Link: https://lkml.kernel.org/r/20230519123959.77335-6-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Johannes Weiner authored
Remove from all paths not reachable via /proc/sys/vm/compact_memory.

Link: https://lkml.kernel.org/r/20230519123959.77335-5-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Johannes Weiner authored
__compaction_suitable() is supposed to check for available migration targets. However, it also checks whether the operation was requested via /proc/sys/vm/compact_memory, and whether the original allocation request can already succeed. These don't apply to all callsites. Move the checks out to the callers, so that later patches can deal with them one by one. No functional change intended.

[hannes@cmpxchg.org: fix comment, per Vlastimil]
Link: https://lkml.kernel.org/r/20230602144942.GC161817@cmpxchg.org
Link: https://lkml.kernel.org/r/20230519123959.77335-4-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Johannes Weiner authored
The different branches for retry are unnecessarily complicated. There are really only three outcomes: progress (retry n times), skipped (retry if reclaim can help), failed (retry with higher priority). Rearrange the branches and the retry counter to make it simpler.

[hannes@cmpxchg.org: restore behavior when hitting max_retries]
Link: https://lkml.kernel.org/r/20230602144705.GB161817@cmpxchg.org
Link: https://lkml.kernel.org/r/20230519123959.77335-3-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Johannes Weiner authored
Patch series "mm: compaction: cleanups & simplifications". These compaction cleanups are split out from the huge page allocator series[1], as requested by reviewer feedback. [1] https://lore.kernel.org/linux-mm/20230418191313.268131-1-hannes@cmpxchg.org/ This patch (of 5): The compaction result helpers encode quirks that are specific to the allocator's retry logic. E.g. COMPACT_SUCCESS and COMPACT_COMPLETE actually represent failures that should be retried upon, and so on. I frequently found myself pulling up the helper implementation in order to understand and work on the retry logic. They're not quite clean abstractions; rather they split the retry logic into two locations. Remove the helpers and inline the checks. Then comment on the result interpretations directly where the decision making happens. Link: https://lkml.kernel.org/r/20230519123959.77335-1-hannes@cmpxchg.org Link: https://lkml.kernel.org/r/20230519123959.77335-2-hannes@cmpxchg.orgSigned-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-