    wait: introduce EXIT_TRACE to avoid the racy EXIT_DEAD->EXIT_ZOMBIE transition · abd50b39
    Oleg Nesterov authored
    wait_task_zombie() first does EXIT_ZOMBIE->EXIT_DEAD transition and
    drops tasklist_lock.  If this task is not the natural child and it is
    traced, we change its state back to EXIT_ZOMBIE for ->real_parent.
    
    The last transition is racy; this is even documented in 50b8d257
    "ptrace: partially fix the do_wait(WEXITED) vs EXIT_DEAD->EXIT_ZOMBIE
    race".  wait_consider_task() tries to detect this transition and clear
    ->notask_error, but we can't rely on ptrace_reparented(): the debugger
    can exit and do ptrace_unlink() before its sub-thread sets EXIT_ZOMBIE.
    
    And there is another problem which was missed before: this transition
    can also race with reparent_leader(), which doesn't reset ->exit_signal
    if EXIT_DEAD, assuming that this task must be reaped by someone else.
    So the tracee can be re-parented with ->exit_signal != SIGCHLD, and if
    /sbin/init doesn't use __WALL it becomes unreapable.  This was fixed by
    the previous commit, but that was a temporary hack.
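
    To illustrate why such a tracee becomes unreapable, here is a minimal
    user-space sketch of the child-selection rule described in waitpid(2).
    It is a model only, not the kernel code: model_waiter_selects() and the
    MODEL_* flag values are hypothetical names invented for this example.

```c
#include <signal.h>
#include <stdbool.h>

/* Hypothetical flag values for this model; the real __WCLONE and __WALL
 * are defined by the system headers and have different values. */
#define MODEL_WCLONE 0x1u
#define MODEL_WALL   0x2u

/*
 * Assumed selection rule, following the waitpid(2) description:
 *  - a "clone" child is one whose exit_signal is not SIGCHLD;
 *  - without any flag, only non-clone children are waited for;
 *  - with MODEL_WCLONE, only clone children are waited for;
 *  - with MODEL_WALL, every child is waited for.
 */
bool model_waiter_selects(int exit_signal, unsigned int options)
{
	bool is_clone_child = (exit_signal != SIGCHLD);
	bool wants_clone = (options & MODEL_WCLONE) != 0;

	if (options & MODEL_WALL)
		return true;
	return is_clone_child == wants_clone;
}
```

    Under this model a plain wait (options == 0) never selects a child
    whose exit_signal is not SIGCHLD, which is the situation the paragraph
    above describes for an init that does not pass __WALL.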
    
    1. Add the new exit_state, EXIT_TRACE. It means that the task is a
       traced zombie and the debugger is going to detach and notify its
       natural parent.
    
       This new state is actually EXIT_ZOMBIE | EXIT_DEAD. This way we
       can avoid the changes in proc/kgdb code, get_task_state() still
       reports "X (dead)" in this case.
    
       Note: with or without this change userspace can see Z -> X -> Z
       transition. Not really bad, but probably makes sense to fix.
    
    2. Change wait_task_zombie() to use EXIT_TRACE instead of EXIT_DEAD
       if we need to notify the ->real_parent.
    
    3. Revert the previous hack in reparent_leader(): now that EXIT_DEAD
       is always the final state, we can safely ignore such a task.
    
    4. Change wait_consider_task() to check EXIT_TRACE separately and kill
       the racy and no longer needed ptrace_reparented() case.
    
       If ptrace == T an EXIT_TRACE thread should be simply ignored, the
       owner of this state is going to ptrace_unlink() this task. We can
       pretend that it was already removed from ->ptraced list.
    
       Otherwise we should skip this thread too but clear ->notask_error,
       we must be the natural parent and debugger is going to untrace and
       notify us. IOW, this doesn't differ from "EXIT_ZOMBIE && p->ptrace"
       even if the task was already untraced.
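
    The steps above can be sketched as a small user-space model. This is
    not the patch itself: struct model_task and the model_*() helpers are
    hypothetical, the numeric state values are assumed for illustration
    (only EXIT_TRACE == EXIT_ZOMBIE | EXIT_DEAD comes from the text), and
    the reporter is assumed to key off the EXIT_DEAD bit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative values; only the bitwise relation is taken from the text. */
enum {
	EXIT_ZOMBIE = 16,
	EXIT_DEAD   = 32,
	EXIT_TRACE  = EXIT_ZOMBIE | EXIT_DEAD,
};

static_assert((EXIT_TRACE & EXIT_DEAD) != 0,
	      "EXIT_TRACE must carry the EXIT_DEAD bit");

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct model_task {
	atomic_int exit_state;
};

/* Assumed reporter: any state with the EXIT_DEAD bit is shown as 'X'. */
char model_state_char(int exit_state)
{
	if (exit_state & EXIT_DEAD)
		return 'X';
	if (exit_state & EXIT_ZOMBIE)
		return 'Z';
	return 'R';
}

/*
 * Step 2: claim a zombie atomically. The claimer moves it to EXIT_TRACE
 * if ->real_parent still has to be notified, else to the final EXIT_DEAD.
 * Returns true only for the single caller that wins the claim.
 */
bool model_claim_zombie(struct model_task *p, bool must_notify_real_parent)
{
	int expected = EXIT_ZOMBIE;
	int next = must_notify_real_parent ? EXIT_TRACE : EXIT_DEAD;

	return atomic_compare_exchange_strong(&p->exit_state, &expected, next);
}

/*
 * Step 4: how a waiter treats an EXIT_TRACE task. Returns true if the
 * task must be skipped. A natural parent (ptrace == false) also clears
 * *notask_error, because the debugger is going to notify it later.
 */
bool model_skip_exit_trace(struct model_task *p, bool ptrace, int *notask_error)
{
	if (atomic_load(&p->exit_state) != EXIT_TRACE)
		return false;
	if (!ptrace)
		*notask_error = 0;
	return true;
}
```

    In this model the state never moves backwards: a second claim on the
    same task fails, so nobody has to detect a dead->zombie transition.
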
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Reported-by: Jan Kratochvil <jan.kratochvil@redhat.com>
    Reported-by: Michal Schmidt <mschmidt@redhat.com>
    Tested-by: Michal Schmidt <mschmidt@redhat.com>
    Cc: Al Viro <viro@ZenIV.linux.org.uk>
    Cc: Lennart Poettering <lpoetter@redhat.com>
    Cc: Roland McGrath <roland@hack.frob.com>
    Cc: Tejun Heo <tj@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>