    mm, vmscan: add cond_resched() into shrink_node_memcg() · bd041733
    Michal Hocko authored
    Boris Zhmurov has reported RCU stalls during the kswapd reclaim:
    
      INFO: rcu_sched detected stalls on CPUs/tasks:
       23-...: (22 ticks this GP) idle=92f/140000000000000/0 softirq=2638404/2638404 fqs=23
       (detected by 4, t=6389 jiffies, g=786259, c=786258, q=42115)
      Task dump for CPU 23:
      kswapd1         R  running task        0   148      2 0x00000008
      Call Trace:
        shrink_node+0xd2/0x2f0
        kswapd+0x2cb/0x6a0
        mem_cgroup_shrink_node+0x160/0x160
        kthread+0xbd/0xe0
        __switch_to+0x1fa/0x5c0
        ret_from_fork+0x1f/0x40
        kthread_create_on_node+0x180/0x180
    
    A closer code inspection has shown that we might indeed miss all the
    scheduling points in the reclaim path if no pages can be isolated from
    the LRU list.  This is a pathological case, but other reports from Donald
    Buczek have shown that we might indeed hit such a path:
    
            clusterd-989   [009] .... 118023.654491: mm_vmscan_direct_reclaim_end: nr_reclaimed=193
             kswapd1-86    [001] dN.. 118023.987475: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4239830 nr_taken=0 file=1
             kswapd1-86    [001] dN.. 118024.320968: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4239844 nr_taken=0 file=1
             kswapd1-86    [001] dN.. 118024.654375: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4239858 nr_taken=0 file=1
             kswapd1-86    [001] dN.. 118024.987036: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4239872 nr_taken=0 file=1
             kswapd1-86    [001] dN.. 118025.319651: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4239886 nr_taken=0 file=1
             kswapd1-86    [001] dN.. 118025.652248: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4239900 nr_taken=0 file=1
             kswapd1-86    [001] dN.. 118025.984870: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4239914 nr_taken=0 file=1
      [...]
             kswapd1-86    [001] dN.. 118084.274403: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4241133 nr_taken=0 file=1
    
    This is a minute-long snapshot which didn't take a single page from the
    LRU.  It is not entirely clear why only 1303 pages have been scanned
    during that time (maybe there was heavy IRQ activity interfering).
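
The figures quoted above can be rechecked from the first and last kswapd1 trace lines. The following is a small sketch, not part of the patch: the function names are hypothetical, and it follows the commit's reading of nr_scanned as a running count.

```c
#include <assert.h>

/* Values copied from the first and last kswapd1 mm_vmscan_lru_isolate lines. */
static const unsigned long first_nr_scanned = 4239830UL;
static const unsigned long last_nr_scanned  = 4241133UL;
static const double first_timestamp = 118023.987475; /* seconds */
static const double last_timestamp  = 118084.274403; /* seconds */

/* Pages scanned between the two trace lines (hypothetical helper). */
unsigned long pages_scanned(void)
{
    return last_nr_scanned - first_nr_scanned;
}

/* Wall-clock span of the snapshot in seconds (hypothetical helper). */
double elapsed_seconds(void)
{
    return last_timestamp - first_timestamp;
}

/* Average scan rate over the snapshot (hypothetical helper). */
double pages_per_second(void)
{
    return (double)pages_scanned() / elapsed_seconds();
}

/* Sanity check: the delta matches the 1303 pages quoted in the text. */
void snapshot_self_check(void)
{
    assert(pages_scanned() == 1303UL);
    assert(elapsed_seconds() > 60.0);
}
```

Under these assumptions the snapshot spans a little over 60 seconds, so the average is roughly 20 pages scanned per second with nothing isolated.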
    
    In any case it looks like we can really hit long periods without
    scheduling on non-preemptive kernels, so an explicit cond_resched() in
    shrink_node_memcg(), which is independent of the reclaim operation, is due.
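
The effect of an unconditional scheduling point can be illustrated with a userspace model. This is only a sketch under stated assumptions, not the kernel code or the actual diff: every name prefixed with model_ is hypothetical, and the model assumes, as the description above suggests, that the existing scheduling points are reached only while processing pages that were isolated from the LRU.

```c
#include <assert.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_stats {
    unsigned long nr_reclaimed;   /* pages "reclaimed" by the model */
    unsigned long resched_points; /* times the CPU would have been offered */
};

/* Stand-in for cond_resched(): only counts the scheduling opportunity. */
static void model_cond_resched(struct model_stats *st)
{
    st->resched_points++;
}

/*
 * Stand-in for one pass over an LRU list. Assumption: scheduling points
 * sit in the per-page processing loop, so nr_taken == 0 reaches none.
 */
static void model_shrink_list(struct model_stats *st, unsigned long nr_taken)
{
    for (unsigned long i = 0; i < nr_taken; i++) {
        model_cond_resched(st);
        st->nr_reclaimed++;
    }
}

/*
 * Stand-in for the reclaim loop. With with_fix != 0 the loop offers the
 * CPU once per round, regardless of whether any page was isolated.
 */
struct model_stats model_shrink_node_memcg(unsigned long rounds,
                                           unsigned long nr_taken,
                                           int with_fix)
{
    struct model_stats st = { 0, 0 };

    for (unsigned long r = 0; r < rounds; r++) {
        model_shrink_list(&st, nr_taken);
        if (with_fix)
            model_cond_resched(&st); /* independent of reclaim progress */
    }
    return st;
}

/* Sanity check of the pathological case: nothing isolated, no fix. */
void model_self_check(void)
{
    assert(model_shrink_node_memcg(100, 0, 0).resched_points == 0);
}
```

In this model, 100 rounds with nr_taken=0 hit no scheduling point without the extra call and exactly 100 with it, which mirrors the nr_taken=0 trace above on a non-preemptive kernel.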
    
    Link: http://lkml.kernel.org/r/20161202095841.16648-1-mhocko@kernel.org
    Signed-off-by: Michal Hocko <mhocko@suse.com>
    Reported-by: Boris Zhmurov <bb@kernelpanic.ru>
    Tested-by: Boris Zhmurov <bb@kernelpanic.ru>
    Reported-by: Donald Buczek <buczek@molgen.mpg.de>
    Reported-by: "Christopher S. Aker" <caker@theshore.net>
    Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>