Commit f3276a18 authored by Roland McGrath, committed by Linus Torvalds

[PATCH] fix for potential deadlock after posix-timers change

Ulrich has been working on the glibc code that uses posix-timers and
is stressing it more than it has been stressed before.  He ran into an
SMP deadlock on process exit when there are pending queued signals
from a timer.

The deadlock arises because in the path through exit_itimers, the
tasklist_lock is already held (for writing).  When a timer is being
deleted, sigqueue_free will try to take it (for reading) in the case
where that timer has a pending signal queued on somebody's queue.  This
patch avoids the problem by making sure the queues are flushed before
calling exit_itimers, thus ensuring its code path won't try to take
tasklist_lock.
parent b4389817
...
@@ -352,10 +352,8 @@ void __exit_signal(struct task_struct *tsk)
 		if (tsk == sig->curr_target)
 			sig->curr_target = next_thread(tsk);
 		tsk->signal = NULL;
-		exit_itimers(sig);
 		spin_unlock(&sighand->siglock);
 		flush_sigqueue(&sig->shared_pending);
-		kmem_cache_free(signal_cachep, sig);
 	} else {
 		/*
 		 * If there is any task waiting for the group exit
...
@@ -369,9 +367,28 @@ void __exit_signal(struct task_struct *tsk)
 			sig->curr_target = next_thread(tsk);
 		tsk->signal = NULL;
 		spin_unlock(&sighand->siglock);
+		sig = NULL;	/* Marker for below. */
 	}
 	clear_tsk_thread_flag(tsk,TIF_SIGPENDING);
 	flush_sigqueue(&tsk->pending);
+	if (sig) {
+		/*
+		 * We are cleaning up the signal_struct here. We delayed
+		 * calling exit_itimers until after flush_sigqueue, just in
+		 * case our thread-local pending queue contained a queued
+		 * timer signal that would have been cleared in
+		 * exit_itimers. When that called sigqueue_free, it would
+		 * attempt to re-take the tasklist_lock and deadlock. This
+		 * can never happen if we ensure that all queues the
+		 * timer's signal might be queued on have been flushed
+		 * first. The shared_pending queue, and our own pending
+		 * queue are the only queues the timer could be on, since
+		 * there are no other threads left in the group and timer
+		 * signals are constrained to threads inside the group.
+		 */
+		exit_itimers(sig);
+		kmem_cache_free(signal_cachep, sig);
+	}
 }
 void exit_signal(struct task_struct *tsk)
...
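
The ordering argument in the new comment can be illustrated with a small single-threaded model. This is a sketch under heavy simplification, not kernel code: `sigq_model`, `timer_model`, `flush_queue_model`, `delete_timer_model` and `teardown_model` are hypothetical names, and the counter `lock_attempts` merely records how often the deletion path would have needed tasklist_lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All types and names below are hypothetical simplifications. */
struct sigq_model {
	bool queued;		/* true while the signal sits on a pending queue */
};

struct timer_model {
	struct sigq_model sigq;	/* preallocated signal owned by the timer */
};

/* Counts how often timer deletion would have needed tasklist_lock. */
static int lock_attempts;

/* Models flush_sigqueue: remove every entry from a pending queue. */
static void flush_queue_model(struct sigq_model **queue, size_t n)
{
	assert(queue != NULL);
	for (size_t i = 0; i < n; i++) {
		if (queue[i]) {
			queue[i]->queued = false;
			queue[i] = NULL;
		}
	}
}

/* Models timer deletion: a still-queued signal forces a lock attempt. */
static void delete_timer_model(struct timer_model *t)
{
	if (t->sigq.queued) {
		lock_attempts++;	/* where the real code would deadlock */
		t->sigq.queued = false;
	}
}

/* Runs the teardown in either order; returns the lock attempts needed. */
int teardown_model(bool flush_first)
{
	struct timer_model timer = { .sigq = { .queued = true } };
	struct sigq_model *pending[1] = { &timer.sigq };

	lock_attempts = 0;
	if (flush_first) {
		flush_queue_model(pending, 1);	/* order after this patch */
		delete_timer_model(&timer);
	} else {
		delete_timer_model(&timer);	/* order before this patch */
		flush_queue_model(pending, 1);
	}
	return lock_attempts;
}
```

In this model `teardown_model(false)` needs one lock attempt and `teardown_model(true)` needs none, mirroring why the patch moves exit_itimers after both flush_sigqueue calls.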