    [PATCH] vmscan: Fix temp_priority race · 3bb1a852
    Martin Bligh authored
    The temp_priority field in zone is racy, as we can walk through a reclaim
    path, and just before we copy it into prev_priority, it can be overwritten
    (say with DEF_PRIORITY) by another reclaimer.
    
    The same bug is contained in both try_to_free_pages and balance_pgdat, but
    it is fixed slightly differently.  In balance_pgdat, we keep a separate
    priority record per zone in a local array.  In try_to_free_pages there is
    no need to do this, as the priority level is the same for all zones that we
    reclaim from.
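
The difference between the racy and the fixed pattern can be sketched as below. This is a minimal illustration, not the patch itself: the two-field `struct zone` and both `finish_reclaim_*` helpers are hypothetical stand-ins, and the value of `DEF_PRIORITY` is assumed.

```c
#include <assert.h>

#define DEF_PRIORITY 12  /* assumed value; the least urgent priority */

/* Hypothetical, pared-down stand-in for struct zone. */
struct zone {
	int temp_priority;  /* shared scratch field, written by every reclaimer */
	int prev_priority;  /* priority the last reclaim pass finished at */
};

/* Racy pattern: publish whatever the shared field holds right now. */
static void finish_reclaim_racy(struct zone *z)
{
	z->prev_priority = z->temp_priority;
}

/* Fixed pattern: publish the caller's own, private priority value. */
static void finish_reclaim_fixed(struct zone *z, int priority)
{
	assert(priority >= 0 && priority <= DEF_PRIORITY);
	z->prev_priority = priority;
}
```

If a second reclaimer resets `temp_priority` to `DEF_PRIORITY` just before the first one finishes, the racy helper records `DEF_PRIORITY`, while the fixed helper still records the priority the first reclaimer actually reached.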
    
    Impact of this bug is that temp_priority is copied into prev_priority, and
    setting this artificially high causes reclaimers to set distress
    artificially low.  They then fail to reclaim mapped pages, when they are,
    in fact, under severe memory pressure (their priority may be as low as 0).
    This causes the OOM killer to fire incorrectly.
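
To see why a stale, high prev_priority matters, here is a simplified sketch of how distress might feed the decision to reclaim mapped pages. The formula is loosely modelled on the swap-tendency heuristic in mm/vmscan.c and should be read as an assumption; the function names, the `DEF_PRIORITY` value, and the sample inputs are illustrative only.

```c
#include <assert.h>

#define DEF_PRIORITY 12  /* assumed value; the least urgent priority */

/* distress rises as prev_priority falls: 0 at DEF_PRIORITY, 100 at 0. */
static int distress(int prev_priority)
{
	assert(prev_priority >= 0 && prev_priority <= DEF_PRIORITY);
	return 100 >> prev_priority;
}

/* Returns 1 when mapped pages would be considered for reclaim, else 0. */
static int should_reclaim_mapped(int prev_priority, int mapped_ratio,
				 int swappiness)
{
	int swap_tendency = mapped_ratio / 2 + distress(prev_priority) + swappiness;

	return swap_tendency >= 100;
}
```

With a mapped ratio of 20 and a swappiness of 60, a prev_priority of `DEF_PRIORITY` gives a swap tendency of 70, so mapped pages are left alone; a prev_priority of 0 gives 170, so they are reclaimed.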
    
    From: Andrew Morton <akpm@osdl.org>
    
    __zone_reclaim() isn't modifying zone->prev_priority.  But zone->prev_priority
    is used in the decision whether or not to bring mapped pages onto the inactive
    list.  Hence there's a risk here that __zone_reclaim() will fail because
    zone->prev_priority is large (ie: low urgency) and lots of mapped pages end up
    stuck on the active list.
    
    Fix that up by decreasing (ie making more urgent) zone->prev_priority as
    __zone_reclaim() scans the zone's pages.
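
A rough sketch of that idea follows. It is not the actual patch: `note_scanning_priority`, `shrink_stub`, and `zone_reclaim_sketch` are hypothetical names, the one-field `struct zone` is a stand-in, the stub pretends every pass frees 8 pages, and the value of `ZONE_RECLAIM_PRIORITY` is assumed.

```c
#include <assert.h>

#define ZONE_RECLAIM_PRIORITY 4  /* assumed starting priority */

/* Hypothetical, pared-down stand-in for struct zone. */
struct zone {
	int prev_priority;
};

/* Only ever lower prev_priority, i.e. only ever make it more urgent. */
static void note_scanning_priority(struct zone *z, int priority)
{
	if (priority < z->prev_priority)
		z->prev_priority = priority;
}

/* Hypothetical stub: pretend each scanning pass frees 8 pages. */
static unsigned long shrink_stub(int priority)
{
	(void)priority;
	return 8;
}

/* Scan with increasing urgency until enough pages have been freed. */
static unsigned long zone_reclaim_sketch(struct zone *z, unsigned long nr_pages)
{
	unsigned long nr_reclaimed = 0;
	int priority = ZONE_RECLAIM_PRIORITY;

	assert(nr_pages > 0);
	do {
		note_scanning_priority(z, priority);
		nr_reclaimed += shrink_stub(priority);
		priority--;
	} while (priority >= 0 && nr_reclaimed < nr_pages);

	return nr_reclaimed;
}
```

Because the helper only lowers the recorded value, a zone that is already marked as more urgent than the current pass is left unchanged.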
    
    This bug perhaps explains why ZONE_RECLAIM_PRIORITY was created.  It should be
    possible to remove that now, and to just start out at DEF_PRIORITY?
    
    Cc: Nick Piggin <nickpiggin@yahoo.com.au>
    Cc: Christoph Lameter <clameter@engr.sgi.com>
    Cc: <stable@kernel.org>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
vmstat.c 15.5 KB