    mm: vmscan: fix do_try_to_free_pages() livelock · 6e543d57
    Lisa Du authored
    This patch is based on KOSAKI's work, with a little more description
    added.  Please refer to https://lkml.org/lkml/2012/6/14/74.
    
    Currently, the system can enter a state in which a zone has many free
    pages, but only order-0 and order-1 pages, which means the zone is
    heavily fragmented.  A high-order allocation can then stall in the direct
    reclaim path for a long time (e.g. 60 seconds), especially in an
    environment with no swap and no compaction.  This problem was seen on
    v3.4, but the issue still seems to exist in the current tree.  The
    reason is that do_try_to_free_pages() enters a livelock:
    
    kswapd goes to sleep if the zones have been fully scanned and are still
    not balanced, because kswapd thinks there is little point in trying all
    over again and wants to avoid an infinite loop.  Instead, it changes the
    order from high-order to order-0, because kswapd thinks order-0 is the
    most important.  See 73ce02e9 for details.  If the watermarks are OK,
    kswapd goes back to sleep and may leave zone->all_unreclaimable = 0.  It
    assumes that high-order users can still perform direct reclaim if they
    wish.
    
    Direct reclaim continues to reclaim for a high order that is not a
    COSTLY_ORDER, without invoking the oom-killer, until kswapd turns on
    zone->all_unreclaimable.  This is done to avoid a too-early oom-kill.
    So direct reclaim depends on kswapd to break this loop.
    
    In the worst case, direct reclaim may continue reclaiming pages forever
    while kswapd sleeps forever, until something like a watchdog detects the
    stall and finally kills the process.  As described in:
    http://thread.gmane.org/gmane.linux.kernel.mm/103737
    
    We can't turn on zone->all_unreclaimable from the direct reclaim path
    because the direct reclaim path doesn't take any lock, so doing so would
    be racy.  Thus this patch removes the zone->all_unreclaimable field
    completely and recalculates the zone's reclaimable state every time.
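
    As a rough illustration only (not the patch itself), recalculating the
    reclaimable state on every check could look like the userspace sketch
    below.  struct zone_model, SCAN_FACTOR and both helper functions are
    hypothetical stand-ins; the factor of 6 is assumed to mirror the
    existing "scanned six times the reclaimable pages" heuristic.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the kernel's struct zone. */
struct zone_model {
	unsigned long pages_scanned;     /* pages scanned since a page was last freed */
	unsigned long reclaimable_pages; /* pages currently considered reclaimable */
};

/* Assumed heuristic factor; treated here as an illustration, not a spec. */
#define SCAN_FACTOR 6UL

/*
 * Derive the reclaimable state from the counters on every call instead of
 * caching it in a flag that only kswapd is allowed to set.
 */
static bool zone_model_reclaimable(const struct zone_model *z)
{
	return z->pages_scanned < z->reclaimable_pages * SCAN_FACTOR;
}

/*
 * Model of a direct reclaim loop in which every pass scans all reclaimable
 * pages and frees nothing.  Returns the number of passes made before the
 * zone is judged unreclaimable (or max_passes is reached).
 */
static unsigned int direct_reclaim_passes(struct zone_model *z,
					  unsigned int max_passes)
{
	unsigned int passes = 0;

	while (passes < max_passes && zone_model_reclaimable(z)) {
		z->pages_scanned += z->reclaimable_pages;
		passes++;
	}
	return passes;
}
```

    Because the loop consults the counters itself, it terminates without
    waiting for a sleeping kswapd to flip a flag.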
    
    Note: we can't adopt the idea of having direct reclaim look at
    zone->pages_scanned directly while kswapd continues to use
    zone->all_unreclaimable, because that is racy.  Commit 929bea7c (vmscan:
    all_unreclaimable() use zone->all_unreclaimable as a name) describes the
    details.
    
    [akpm@linux-foundation.org: uninline zone_reclaimable_pages() and zone_reclaimable()]
    Cc: Aaditya Kumar <aaditya.kumar.30@gmail.com>
    Cc: Ying Han <yinghan@google.com>
    Cc: Nick Piggin <npiggin@gmail.com>
    Acked-by: Rik van Riel <riel@redhat.com>
    Cc: Mel Gorman <mel@csn.ul.ie>
    Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: Bob Liu <lliubbo@gmail.com>
    Cc: Neil Zhang <zhangwm@marvell.com>
    Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
    Reviewed-by: Michal Hocko <mhocko@suse.cz>
    Acked-by: Minchan Kim <minchan@kernel.org>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Signed-off-by: Lisa Du <cldu@marvell.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>