• Michal Hocko's avatar
    oom, suspend: fix oom_reaper vs. oom_killer_disable race · 74070542
    Michal Hocko authored
    Tetsuo has reported the following potential oom_killer_disable vs.
    oom_reaper race:
    
     (1) freeze_processes() starts freezing user space threads.
     (2) Somebody (maybe a kenrel thread) calls out_of_memory().
     (3) The OOM killer calls mark_oom_victim() on a user space thread
         P1 which is already in __refrigerator().
     (4) oom_killer_disable() sets oom_killer_disabled = true.
     (5) P1 leaves __refrigerator() and enters do_exit().
     (6) The OOM reaper calls exit_oom_victim(P1) before P1 can call
         exit_oom_victim(P1).
     (7) oom_killer_disable() returns while P1 not yet finished
     (8) P1 perform IO/interfere with the freezer.
    
    This situation is unfortunate.  We cannot move oom_killer_disable after
    all the freezable kernel threads are frozen because the oom victim might
    depend on some of those kthreads to make a forward progress to exit so
    we could deadlock.  It is also far from trivial to teach the oom_reaper
    to not call exit_oom_victim() because then we would lose a guarantee of
    the OOM killer and oom_killer_disable forward progress because
    exit_mm->mmput might block and never call exit_oom_victim.
    
    It seems the easiest way forward is to workaround this race by calling
    try_to_freeze_tasks again after oom_killer_disable.  This will make sure
    that all the tasks are frozen or it bails out.
    
    Fixes: 449d777d ("mm, oom_reaper: clear TIF_MEMDIE for all tasks queued for oom_reaper")
    Link: http://lkml.kernel.org/r/1466597634-16199-1-git-send-email-mhocko@kernel.orgSigned-off-by: default avatarMichal Hocko <mhocko@suse.com>
    Reported-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    74070542
process.c 5.61 KB