    mm, oom: move GFP_NOFS check to out_of_memory · 3da88fb3
    Michal Hocko authored
    __alloc_pages_may_oom is the central place to decide when
    out_of_memory should be invoked.  This is a good approach for most
    checks there because they are page allocator specific and, in all of
    those cases, the allocation fails right afterwards.
    
    The notable exception is the GFP_NOFS context, which fakes
    did_some_progress and keeps the page allocator looping even though there
    couldn't have been any progress from the OOM killer.  This patch doesn't
    change this behavior because we are not ready to allow those allocation
    requests to fail yet (and maybe we will face the reality that we will
    never manage to safely fail these requests).  Instead the __GFP_FS check
    is moved down to out_of_memory, where it prevents OOM victim selection.
    There are two reasons for that:
    
    	- OOM notifiers might release some memory even from this context,
    	  as none of the registered notifiers seems to be FS related
    	- this might help a dying thread to get access to memory
    	  reserves and move on, which will make the behavior more
    	  consistent with the case when the task gets killed from a
    	  different context.
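
The ordering described above can be sketched as a small userspace model. This is an illustrative simplification, not the kernel code: the `MODEL_GFP_FS` bit, the `oom_model_input` struct, the `oom_outcome` enum and `model_out_of_memory` are all hypothetical names, and the real out_of_memory performs further checks that are omitted here. The only point shown is that the notifier and dying-task steps run before the __GFP_FS check, so a GFP_NOFS caller can still benefit from them while victim selection is skipped.

```c
#include <stdbool.h>

/* Hypothetical flag bit for this model only; not the kernel's value. */
#define MODEL_GFP_FS 0x1u

/* Possible results of the modelled decision. */
enum oom_outcome {
	OOM_NOTIFIER_FREED,  /* an OOM notifier released some memory */
	OOM_CURRENT_DYING,   /* dying caller is given access to reserves */
	OOM_SKIPPED_NOFS,    /* GFP_NOFS request: no victim is selected */
	OOM_VICTIM_SELECTED  /* regular victim selection would run */
};

/* Inputs the model needs; assumed fields, not a kernel structure. */
struct oom_model_input {
	unsigned int gfp_mask;        /* allocation context flags */
	unsigned long notifier_freed; /* amount reported freed by notifiers */
	bool current_dying;           /* caller is killed or exiting */
};

enum oom_outcome model_out_of_memory(const struct oom_model_input *in)
{
	/* Step 1: notifiers run first, even for GFP_NOFS requests. */
	if (in->notifier_freed > 0)
		return OOM_NOTIFIER_FREED;

	/* Step 2: a dying caller moves on regardless of its context. */
	if (in->current_dying)
		return OOM_CURRENT_DYING;

	/* Step 3: only now is the missing __GFP_FS bit checked. */
	if (!(in->gfp_mask & MODEL_GFP_FS))
		return OOM_SKIPPED_NOFS;

	return OOM_VICTIM_SELECTED;
}
```

Had the check stayed in the caller, the first two steps would never be reached for a GFP_NOFS request in this model.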
    
    Keep a comment in __alloc_pages_may_oom to make sure we do not forget
    how GFP_NOFS is special and that we really want to do something about
    it.
    
    Note to the current oom_notifier users:
    
    The observable difference for you is that oom notifiers cannot depend on
    any fs locks because we could deadlock.  Not that this would be allowed
    today, because that would just lock up the machine in most cases and
    rule out the OOM killer along the way.  Another difference is that
    callbacks might be invoked sooner now because GFP_NOFS is a weaker
    reclaim context and so there could be reclaimable memory which is just
    not reachable from that context.  That would require GFP_NOFS-only loads,
    which are really rare, and more importantly the observable result would
    be dropping of reconstructible objects and a potential performance drop,
    which is not such a big deal when we are struggling to fulfill other
    important allocation requests.
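
As a rough illustration of the kind of callback this note has in mind, the following userspace sketch drops reconstructible objects and reports how many it released through an out parameter. It is a model under stated assumptions, not kernel code: `model_cache` and `model_oom_notify` are hypothetical names, and a real notifier would be registered with the kernel's notifier infrastructure. What it shows is that the callback only touches its own private cache and takes no filesystem locks, so it stays safe when invoked from a GFP_NOFS context.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical cache of objects that can be rebuilt on demand. */
struct model_cache {
	void *objs[8];
	size_t count;
};

/*
 * Hypothetical notifier body: release every cached object and add the
 * number released to *freed. No locks are taken here; in particular
 * nothing filesystem related, which is the constraint the note stresses.
 */
int model_oom_notify(struct model_cache *cache, unsigned long *freed)
{
	while (cache->count > 0) {
		cache->count--;
		free(cache->objs[cache->count]);
		cache->objs[cache->count] = NULL;
		(*freed)++;
	}
	return 0;
}
```

The cost of such a callback firing early is the one named above: the dropped objects have to be rebuilt later, which shows up as a performance drop rather than a failure.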
    
    Signed-off-by: Michal Hocko <mhocko@suse.com>
    Cc: Raushaniya Maksudova <rmaksudova@parallels.com>
    Cc: Michael S. Tsirkin <mst@redhat.com>
    Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    Cc: Daniel Vetter <daniel.vetter@intel.com>
    Cc: Oleg Nesterov <oleg@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
oom_kill.c 25.6 KB