Commit ebf5ebe3 authored by Ingo Molnar, committed by Linus Torvalds

[PATCH] signal-fixes-2.5.59-A4

this is the current threading patchset, which accumulated during the
past two weeks. Its biggest part is a set of changes from Roland to
make threaded signals work. There were still tons of testcases and
boundary conditions (mostly in the signal/exit/ptrace area) that we did
not handle correctly.

Roland's thread-signal semantics/behavior/ptrace fixes:

 - fix signal delivery race with do_exit() => signals are re-queued to the
   'process' if do_exit() finds pending unhandled ones. This prevents
   signals getting lost upon thread-sys_exit().

 - a non-main thread has died on one processor and gone to TASK_ZOMBIE,
   but before it's gotten to release_task a sys_wait4 on the other
   processor reaps it.  It's only because it's ptraced that this gets
   through eligible_child.  Somewhere in there the main thread is also
   dying so it reparents the child thread to hit that case.  This means
   that there is a race where P might be totally invalid.

 - forget_original_parent is not doing the right thing when the group
   leader dies, i.e. reparenting threads to init when there is a zombie
   group leader.  Perhaps it doesn't matter for any practical purpose
   without ptrace, though it makes for ppid=1 for each thread in core
   dumps, which looks funny. Incidentally, SIGCHLD here really should be
   p->exit_signal.

 - one of the gdb tests makes a questionable assumption about what kill
   will do when it has some threads stopped by ptrace and others running.

exit races:

1. Processor A is in sys_wait4 case TASK_STOPPED considering task P.
   Processor B is about to resume P and then switch to it.

   While A is inside that case block, B starts running P and it clears
   P->exit_code, or takes a pending fatal signal and sets it to a new
   value. Depending on the interleaving, the possible failure modes are:
        a. A gets to its put_user after B has cleared P->exit_code
           => returns with WIFSTOPPED, WSTOPSIG==0
        b. A gets to its put_user after B has set P->exit_code anew
           => returns with e.g. WIFSTOPPED, WSTOPSIG==SIGKILL

   A can spend an arbitrarily long time in that case block, because
   there's getrusage and put_user that can take page faults, and
   write_lock'ing of the tasklist_lock that can block.  But even if it's
   short the race is there in principle.

2. This is new with NPTL, i.e. CLONE_THREAD.
   Two processors A and B are both in sys_wait4 case TASK_STOPPED
   considering task P.

   Both get through their tests and fetches of P->exit_code before either
   gets to P->exit_code = 0.  => two threads return the same pid from
   waitpid.

   In other interleavings where one processor gets to its put_user after
   the other has cleared P->exit_code, it's like case 1(a).


3. SMP races with stop/cont signals

   First, take:

        kill(pid, SIGSTOP);
        kill(pid, SIGCONT);

   or:

        kill(pid, SIGSTOP);
        kill(pid, SIGKILL);

   It's possible for this to leave the process stopped with a pending
   SIGCONT/SIGKILL.  That's a state that should never be possible.
   Moreover, kill(pid, SIGKILL) without any repetition should always be
   enough to kill a process.  (Likewise SIGCONT when you know it's
   sequenced after the last stop signal, must be sufficient to resume a
   process.)

4. take:

        kill(pid, SIGKILL);     // or any fatal signal
        kill(pid, SIGCONT);     // or SIGKILL

    it's possible for this to cause pid to be reaped with status 0
    instead of its true termination status.  The equivalent scenario
    happens when the process being killed is in an _exit call or a
    trap-induced fatal signal before the kills.
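
A minimal userspace sketch of how an atomic exchange closes races 1 and 2
above: every waiter swaps the stop code with 0, so only one of them sees a
non-zero value and reports the stop. The `stopped_task` struct and both
helper names are hypothetical, and C11 `atomic_exchange` only stands in for
the kernel's `xchg()`; the status word follows the `(exit_code << 8) | 0x7f`
layout that sys_wait4 passes to put_user.

```c
#include <assert.h>
#include <signal.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/wait.h>

/* Hypothetical stand-in for the one task field that matters here. */
struct stopped_task {
    atomic_int exit_code;   /* stop signal while stopped, 0 once claimed */
};

/*
 * Atomically take the pending stop code and leave 0 behind.
 * A return value of 0 means another waiter claimed it first
 * (or the task resumed), so the caller must not report a stop.
 */
int claim_stop_code(struct stopped_task *p)
{
    assert(p != NULL);
    return atomic_exchange(&p->exit_code, 0);
}

/* Build the wait status for a stopped task from the claimed code. */
int stop_status(int exit_code)
{
    return (exit_code << 8) | 0x7f;
}
```

Building the status from the claimed local value, rather than re-reading the
shared field, is what avoids failure mode 1(a): a status built from a cleared
field decodes as WIFSTOPPED with WSTOPSIG == 0.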

plus i've done stability fixes for bugs that popped up during
beta-testing, and minor tidying of Roland's changes:

 - a rare tasklist corruption during exec, causing some very spurious and
   colorful crashes.

 - a copy_process()-related dereference of an already freed thread
   structure if hit with a SIGKILL at the wrong moment.

 - SMP spinlock deadlocks in the signal code

this patchset has been tested quite well in the 2.4 backport of the
threading changes - and i've also done some stresstesting on 2.5.59 SMP,
plus an x86 UP testcompile + testboot.
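
A small userspace check of the end state demanded in race 3 above, written
as a sketch and not as the test that was actually run: a stopped process hit
by a single SIGKILL must die and be reaped with SIGKILL as its termination
signal. The function names are hypothetical, and this sequenced version does
not try to reproduce the SMP interleaving itself.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Kill and reap the child on an error path so no process is left behind. */
static int give_up(pid_t pid)
{
    kill(pid, SIGKILL);
    waitpid(pid, NULL, 0);
    return -1;
}

/* Returns the signal that terminated the child, or -1 on any failure. */
int stop_then_kill(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        for (;;)
            pause();            /* child: idle until a signal arrives */
    }

    /* Stop the child and wait until the stop has been reported. */
    if (kill(pid, SIGSTOP) < 0)
        return give_up(pid);
    if (waitpid(pid, &status, WUNTRACED) != pid || !WIFSTOPPED(status))
        return give_up(pid);

    /* One SIGKILL must be enough, even though the child is stopped. */
    if (kill(pid, SIGKILL) < 0)
        return give_up(pid);
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Running the function in a loop on an SMP machine would be one way to turn it
into a crude stress test.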
parent 44a5a59c
......@@ -587,7 +587,7 @@ static inline int de_thread(struct signal_struct *oldsig)
return -EAGAIN;
}
oldsig->group_exit = 1;
-__broadcast_thread_group(current, SIGKILL);
+zap_other_threads(current);
/*
* Account for the thread group leader hanging around:
......@@ -660,6 +660,7 @@ static inline int de_thread(struct signal_struct *oldsig)
__ptrace_link(current, parent);
}
list_del(&current->tasks);
list_add_tail(&current->tasks, &init_task.tasks);
current->exit_signal = SIGCHLD;
state = leader->state;
......@@ -680,6 +681,7 @@ static inline int de_thread(struct signal_struct *oldsig)
newsig->group_exit = 0;
newsig->group_exit_code = 0;
newsig->group_exit_task = NULL;
+newsig->group_stop_count = 0;
memcpy(newsig->action, current->sig->action, sizeof(newsig->action));
init_sigpending(&newsig->shared_pending);
......
......@@ -235,6 +235,9 @@ struct signal_struct {
int group_exit;
int group_exit_code;
struct task_struct *group_exit_task;
+/* thread group stop support, overloads group_exit_code too */
+int group_stop_count;
};
/*
......@@ -508,7 +511,6 @@ extern int in_egroup_p(gid_t);
extern void proc_caches_init(void);
extern void flush_signals(struct task_struct *);
extern void flush_signal_handlers(struct task_struct *);
-extern void sig_exit(int, int, struct siginfo *);
extern int dequeue_signal(sigset_t *mask, siginfo_t *info);
extern void block_all_signals(int (*notifier)(void *priv), void *priv,
sigset_t *mask);
......@@ -525,7 +527,7 @@ extern void do_notify_parent(struct task_struct *, int);
extern void force_sig(int, struct task_struct *);
extern void force_sig_specific(int, struct task_struct *);
extern int send_sig(int, struct task_struct *, int);
-extern int __broadcast_thread_group(struct task_struct *p, int sig);
+extern void zap_other_threads(struct task_struct *p);
extern int kill_pg(pid_t, int, int);
extern int kill_sl(pid_t, int, int);
extern int kill_proc(pid_t, int, int);
......@@ -590,6 +592,8 @@ extern void exit_files(struct task_struct *);
extern void exit_sighand(struct task_struct *);
extern void __exit_sighand(struct task_struct *);
+extern NORET_TYPE void do_group_exit(int);
extern void reparent_to_init(void);
extern void daemonize(void);
extern task_t *child_reaper;
......@@ -762,6 +766,8 @@ static inline void cond_resched_lock(spinlock_t * lock)
extern FASTCALL(void recalc_sigpending_tsk(struct task_struct *t));
extern void recalc_sigpending(void);
+extern void signal_wake_up(struct task_struct *t, int resume_stopped);
/*
* Wrappers for p->thread_info->cpu access. No-op on UP.
*/
......
......@@ -647,7 +647,7 @@ NORET_TYPE void do_exit(long code)
exit_namespace(tsk);
exit_thread();
-if (current->leader)
+if (tsk->leader)
disassociate_ctty(1);
module_put(tsk->thread_info->exec_domain->module);
......@@ -657,8 +657,31 @@ NORET_TYPE void do_exit(long code)
tsk->exit_code = code;
exit_notify();
preempt_disable();
-if (current->exit_signal == -1)
-release_task(current);
+if (signal_pending(tsk) && !tsk->sig->group_exit
+&& !thread_group_empty(tsk)) {
+/*
+* This occurs when there was a race between our exit
+* syscall and a group signal choosing us as the one to
+* wake up. It could be that we are the only thread
+* alerted to check for pending signals, but another thread
+* should be woken now to take the signal since we will not.
+* Now we'll wake all the threads in the group just to make
+* sure someone gets all the pending signals.
+*/
+struct task_struct *t;
+read_lock(&tasklist_lock);
+spin_lock_irq(&tsk->sig->siglock);
+for (t = next_thread(tsk); t != tsk; t = next_thread(t))
+if (!signal_pending(t) && !(t->flags & PF_EXITING)) {
+recalc_sigpending_tsk(t);
+if (signal_pending(t))
+signal_wake_up(t, 0);
+}
+spin_unlock_irq(&tsk->sig->siglock);
+read_unlock(&tasklist_lock);
+}
+if (tsk->exit_signal == -1)
+release_task(tsk);
schedule();
BUG();
/*
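
The wake-up loop added to do_exit() above can be pictured with a simplified
userspace model. Everything in it is hypothetical: `model_thread`, its three
flags and `rewake_group()` are illustrative names, the `blocked` flag stands
in for what recalc_sigpending_tsk() would work out, and the locking is left
out entirely.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified view of one thread in the group. */
struct model_thread {
    int exiting;     /* stands in for PF_EXITING */
    int blocked;     /* 1 if this thread blocks the pending group signal */
    int sigpending;  /* stands in for TIF_SIGPENDING */
};

/*
 * Called on behalf of the exiting thread `self`, which was picked to take
 * a group signal but will never handle it. Every other thread that is not
 * exiting, is not already flagged, and does not block the signal gets
 * flagged so that one of them takes it. Returns the number newly flagged.
 */
size_t rewake_group(struct model_thread *t, size_t n, size_t self)
{
    size_t woken = 0;

    for (size_t i = 0; i < n; i++) {
        if (i == self || t[i].sigpending || t[i].exiting)
            continue;
        if (!t[i].blocked) {
            t[i].sigpending = 1;   /* the kernel calls signal_wake_up() here */
            woken++;
        }
    }
    return woken;
}
```

Flagging every eligible thread instead of choosing exactly one matches the
comment in the hunk: the aim is only that someone picks up the signal.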
......@@ -710,31 +733,44 @@ task_t *next_thread(task_t *p)
}
/*
-* this kills every thread in the thread group. Note that any externally
-* wait4()-ing process will get the correct exit code - even if this
-* thread is not the thread group leader.
+* Take down every thread in the group. This is called by fatal signals
+* as well as by sys_exit_group (below).
*/
-asmlinkage long sys_exit_group(int error_code)
+NORET_TYPE void
+do_group_exit(int exit_code)
{
-unsigned int exit_code = (error_code & 0xff) << 8;
-if (!thread_group_empty(current)) {
-struct signal_struct *sig = current->sig;
+BUG_ON(exit_code & 0x80); /* core dumps don't get here */
+if (current->sig->group_exit)
+exit_code = current->sig->group_exit_code;
+else if (!thread_group_empty(current)) {
+struct signal_struct *const sig = current->sig;
read_lock(&tasklist_lock);
spin_lock_irq(&sig->siglock);
-if (sig->group_exit) {
-spin_unlock_irq(&sig->siglock);
-/* another thread was faster: */
-do_exit(sig->group_exit_code);
-}
+if (sig->group_exit)
+/* Another thread got here before we took the lock. */
+exit_code = sig->group_exit_code;
+else {
sig->group_exit = 1;
sig->group_exit_code = exit_code;
-__broadcast_thread_group(current, SIGKILL);
+zap_other_threads(current);
+}
spin_unlock_irq(&sig->siglock);
read_unlock(&tasklist_lock);
}
do_exit(exit_code);
+/* NOTREACHED */
+}
+/*
+* this kills every thread in the thread group. Note that any externally
+* wait4()-ing process will get the correct exit code - even if this
+* thread is not the thread group leader.
+*/
+asmlinkage long sys_exit_group(int error_code)
+{
+do_group_exit((error_code & 0xff) << 8);
}
static int eligible_child(pid_t pid, int options, task_t *p)
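
The rule that do_group_exit() enforces, that the first thread to start a
group exit decides the exit code for all of them, can be sketched in
userspace as follows. This is an illustration under stated assumptions, not
the kernel's method: `struct group`, `NO_GROUP_EXIT` and `group_exit_code()`
are hypothetical names, and a C11 compare-and-exchange replaces the
siglock-protected `group_exit` / `group_exit_code` pair.

```c
#include <assert.h>
#include <stdatomic.h>

#define NO_GROUP_EXIT (-1)   /* hypothetical sentinel: no group exit yet */

/* Hypothetical model of the exit state shared by a thread group. */
struct group {
    atomic_int exit_code;    /* NO_GROUP_EXIT until some thread exits the group */
};

/*
 * Returns the code the calling thread should exit with: its own request if
 * it is the first to start the group exit, otherwise the code recorded by
 * the thread that got there first.
 */
int group_exit_code(struct group *g, int requested)
{
    int expected = NO_GROUP_EXIT;

    /* Mirrors BUG_ON(exit_code & 0x80); it also rules out the sentinel. */
    assert(!(requested & 0x80));

    if (atomic_compare_exchange_strong(&g->exit_code, &expected, requested))
        return requested;    /* we started the group exit */
    return expected;         /* another thread was faster: adopt its code */
}
```

In the kernel the winning thread additionally calls zap_other_threads(); the
model leaves that step out and covers only the choice of the exit code.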
......@@ -800,6 +836,8 @@ asmlinkage long sys_wait4(pid_t pid,unsigned int * stat_addr, int options, struc
int ret;
list_for_each(_p,&tsk->children) {
+int exit_code;
p = list_entry(_p,struct task_struct,sibling);
ret = eligible_child(pid, options, p);
......@@ -813,20 +851,69 @@ asmlinkage long sys_wait4(pid_t pid,unsigned int * stat_addr, int options, struc
continue;
if (!(options & WUNTRACED) && !(p->ptrace & PT_PTRACED))
continue;
+if (ret == 2 && !(p->ptrace & PT_PTRACED) &&
+p->sig && p->sig->group_stop_count > 0)
+/*
+* A group stop is in progress and
+* we are the group leader. We won't
+* report until all threads have
+* stopped.
+*/
+continue;
read_unlock(&tasklist_lock);
/* move to end of parent's list to avoid starvation */
write_lock_irq(&tasklist_lock);
remove_parent(p);
add_parent(p, p->parent);
+/*
+* This uses xchg to be atomic with
+* the thread resuming and setting it.
+* It must also be done with the write
+* lock held to prevent a race with the
+* TASK_ZOMBIE case (below).
+*/
+exit_code = xchg(&p->exit_code, 0);
+if (unlikely(p->state > TASK_STOPPED)) {
+/*
+* The task resumed and then died.
+* Let the next iteration catch it
+* in TASK_ZOMBIE. Note that
+* exit_code might already be zero
+* here if it resumed and did
+* _exit(0). The task itself is
+* dead and won't touch exit_code
+* again; other processors in
+* this function are locked out.
+*/
+p->exit_code = exit_code;
+exit_code = 0;
+}
+if (unlikely(exit_code == 0)) {
+/*
+* Another thread in this function
+* got to it first, or it resumed,
+* or it resumed and then died.
+*/
+write_unlock_irq(&tasklist_lock);
+continue;
+}
+/*
+* Make sure this doesn't get reaped out from
+* under us while we are examining it below.
+* We don't want to keep holding onto the
+* tasklist_lock while we call getrusage and
+* possibly take page faults for user memory.
+*/
+get_task_struct(p);
write_unlock_irq(&tasklist_lock);
retval = ru ? getrusage(p, RUSAGE_BOTH, ru) : 0;
if (!retval && stat_addr)
-retval = put_user((p->exit_code << 8) | 0x7f, stat_addr);
-if (!retval) {
-p->exit_code = 0;
+retval = put_user((exit_code << 8) | 0x7f, stat_addr);
+if (!retval)
retval = p->pid;
-}
+put_task_struct(p);
goto end_wait4;
case TASK_ZOMBIE:
/*
......@@ -841,6 +928,13 @@ asmlinkage long sys_wait4(pid_t pid,unsigned int * stat_addr, int options, struc
state = xchg(&p->state, TASK_DEAD);
if (state != TASK_ZOMBIE)
continue;
+if (unlikely(p->exit_signal == -1))
+/*
+* This can only happen in a race with
+* a ptraced thread dying on another
+* processor.
+*/
+continue;
read_unlock(&tasklist_lock);
retval = ru ? getrusage(p, RUSAGE_BOTH, ru) : 0;
......@@ -857,11 +951,17 @@ asmlinkage long sys_wait4(pid_t pid,unsigned int * stat_addr, int options, struc
retval = p->pid;
if (p->real_parent != p->parent) {
write_lock_irq(&tasklist_lock);
+/* Double-check with lock held. */
+if (p->real_parent != p->parent) {
__ptrace_unlink(p);
-do_notify_parent(p, SIGCHLD);
+do_notify_parent(
+p, p->exit_signal);
p->state = TASK_ZOMBIE;
+p = NULL;
+}
write_unlock_irq(&tasklist_lock);
-} else
+}
+if (p != NULL)
release_task(p);
goto end_wait4;
default:
......
......@@ -680,6 +680,7 @@ static inline int copy_sighand(unsigned long clone_flags, struct task_struct * t
sig->group_exit = 0;
sig->group_exit_code = 0;
sig->group_exit_task = NULL;
+sig->group_stop_count = 0;
memcpy(sig->action, current->sig->action, sizeof(sig->action));
sig->curr_target = NULL;
init_sigpending(&sig->shared_pending);
......@@ -801,7 +802,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
spin_lock_init(&p->alloc_lock);
spin_lock_init(&p->switch_lock);
-clear_tsk_thread_flag(p,TIF_SIGPENDING);
+clear_tsk_thread_flag(p, TIF_SIGPENDING);
init_sigpending(&p->pending);
p->it_real_value = p->it_virt_value = p->it_prof_value = 0;
......@@ -910,6 +911,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
*/
if (sigismember(&current->pending.signal, SIGKILL)) {
write_unlock_irq(&tasklist_lock);
retval = -EINTR;
goto bad_fork_cleanup_namespace;
}
......@@ -934,6 +936,17 @@ static struct task_struct *copy_process(unsigned long clone_flags,
}
p->tgid = current->tgid;
p->group_leader = current->group_leader;
+if (current->sig->group_stop_count > 0) {
+/*
+* There is an all-stop in progress for the group.
+* We ourselves will stop as soon as we check signals.
+* Make the new thread part of that group stop too.
+*/
+current->sig->group_stop_count++;
+set_tsk_thread_flag(p, TIF_SIGPENDING);
+}
spin_unlock(&current->sig->siglock);
}
......@@ -1036,8 +1049,13 @@ struct task_struct *do_fork(unsigned long clone_flags,
init_completion(&vfork);
}
-if (p->ptrace & PT_PTRACED)
-send_sig(SIGSTOP, p, 1);
+if (p->ptrace & PT_PTRACED) {
+/*
+* We'll start up with an immediate SIGSTOP.
+*/
+sigaddset(&p->pending.signal, SIGSTOP);
+set_tsk_thread_flag(p, TIF_SIGPENDING);
+}
wake_up_forked_process(p); /* do this last */
++total_forks;
......
This diff is collapsed.
......@@ -65,7 +65,6 @@
#include <asm/pgtable.h>
#include <asm/io.h>
-extern void signal_wake_up(struct task_struct *t);
extern int sys_sync(void);
unsigned char software_suspend_enabled = 0;
......@@ -220,7 +219,7 @@ int freeze_processes(void)
without locking */
p->flags |= PF_FREEZE;
spin_lock_irqsave(&p->sig->siglock, flags);
-signal_wake_up(p);
+signal_wake_up(p, 0);
spin_unlock_irqrestore(&p->sig->siglock, flags);
todo++;
} while_each_thread(g, p);
......