• Oleg Nesterov's avatar
    [PATCH] fix do_coredump() vs SIGSTOP race · 788e05a6
    Oleg Nesterov authored
    Let's suppose we have 2 threads in thread group:
    	A - does coredump
    	B - has pending SIGSTOP
    
    thread A						thread B
    
    do_coredump:						get_signal_to_deliver:
    
      lock(->sighand)
      ->signal->flags = SIGNAL_GROUP_EXIT
      unlock(->sighand)
    
    							lock(->sighand)
    							signr = dequeue_signal()
    								->signal->flags |= SIGNAL_STOP_DEQUEUED
    								return SIGSTOP;
    
    							do_signal_stop:
    							    unlock(->sighand)
    
      coredump_wait:
    
          zap_threads:
              lock(tasklist_lock)
              send SIGKILL to B
                  // signal_wake_up() does nothing
              unlock(tasklist_lock)
    
    							    lock(tasklist_lock)
    							    lock(->sighand)
    							    re-check sig->flags & SIGNAL_STOP_DEQUEUED, yes
    							    set_current_state(TASK_STOPPED);
    							    finish_stop:
    							        schedule();
    							            // ->state == TASK_STOPPED
    
          wait_for_completion(&startup_done)
             // waits for complete() from B,
             // ->state == TASK_UNINTERRUPTIBLE
    
    We can't wake up 'B' in any way:
    
    	SIGCONT will be ignored because handle_stop_signal() sees
    	->signal->flags & SIGNAL_GROUP_EXIT.
    
    	sys_kill(SIGKILL)->__group_complete_signal() will choose
    	uninterruptible 'A', so it can't help.
    
    	sys_tkill(B, SIGKILL) will be ignored by specific_send_sig_info()
    	because B already has pending SIGKILL.
    
    This scenario is not possbile if 'A' does do_group_exit(), because
    it sets sig->flags = SIGNAL_GROUP_EXIT and delivers SIGKILL to
    subthreads atomically, holding both tasklist_lock and sighand->lock.
    That means that do_signal_stop() will notice !SIGNAL_STOP_DEQUEUED
    after re-locking ->sighand. And it is not possible to any other
    thread to re-add SIGNAL_STOP_DEQUEUED later, because dequeue_signal()
    can only return SIGKILL.
    
    I think it is better to change do_coredump() to do sigaddset(SIGKILL)
    and signal_wake_up() under sighand->lock, but this patch is much
    simpler.
    Signed-off-by: default avatarOleg Nesterov <oleg@tv-sign.ru>
    Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
    788e05a6
signal.c 68.8 KB