• Vlastimil Babka's avatar
    mm, vmscan: guarantee drop_slab_node() termination · 1399af7e
    Vlastimil Babka authored
    drop_slab_node() is called as part of echo 2>/proc/sys/vm/drop_caches
    operation.  It iterates over all memcgs and calls shrink_slab() which in
    turn iterates over all slab shrinkers.  Freed objects are counted and as
    long as the total number of freed objects from all memcgs and shrinkers is
    higher than 10, drop_slab_node() loops for another full memcgs*shrinkers
    iteration.
    
    This arbitrary constant threshold of 10 can result in effectively an
    infinite loop on a system with large number of memcgs and/or parallel
    activity that allocates new objects.  This has been reported previously by
    Chunxin Zang [1] and recently by our customer.
    
    The previous report [1] has resulted in commit 069c411d ("mm/vmscan:
    fix infinite loop in drop_slab_node") which added a check for signals
    allowing the user to terminate the command writing to drop_caches.  At the
    time it was also considered to make the threshold grow with each iteration
    to guarantee termination, but such patch hasn't been formally proposed
    yet.
    
    This patch implements the dynamically growing threshold.  At first
    iteration it's enough to free one object to continue, and this threshold
    effectively doubles with each iteration.  Our customer's feedback was
    positive.
    
    There is always a risk that this change will result on some system in a
    previously terminating drop_caches operation to terminate sooner and free
    fewer objects.  Ideally the semantics would guarantee freeing all freeable
    objects that existed at the moment of starting the operation, while not
    looping forever for newly allocated objects, but that's not feasible to
    track.  In the less ideal solution based on thresholds, arguably the
    termination guarantee is more important than the exhaustiveness guarantee.
    If there are reports of large regression wrt being exhaustive, we can
    tune how fast the threshold grows.
    
    [1] https://lore.kernel.org/lkml/20200909152047.27905-1-zangchunxin@bytedance.com/T/#u
    
    [vbabka@suse.cz: avoid undefined shift behaviour]
      Link: https://lkml.kernel.org/r/2f034e6f-a753-550a-f374-e4e23899d3d5@suse.cz
    
    Link: https://lkml.kernel.org/r/20210818152239.25502-1-vbabka@suse.czSigned-off-by: default avatarVlastimil Babka <vbabka@suse.cz>
    Reported-by: default avatarChunxin Zang <zangchunxin@bytedance.com>
    Cc: Muchun Song <songmuchun@bytedance.com>
    Cc: Chris Down <chris@chrisdown.name>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    1399af7e
vmscan.c 133 KB