Commit 3fd9952d authored by Linus Torvalds

Merge branch 'fixes-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

* 'fixes-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: fix deadlock in worker_maybe_bind_and_lock()
  workqueue: Document debugging tricks

Fix up trivial spelling conflict in kernel/workqueue.c
parents 1be6a1f8 5035b20f
@@ -12,6 +12,7 @@ CONTENTS
4. Application Programming Interface (API)
5. Example Execution Scenarios
6. Guidelines
7. Debugging
1. Introduction
@@ -379,3 +380,42 @@ If q1 has WQ_CPU_INTENSIVE set,
* Unless work items are expected to consume a huge amount of CPU
  cycles, using a bound wq is usually beneficial due to the increased
  level of locality in wq operations and work item execution.
7. Debugging
Because the work functions are executed by generic worker threads
there are a few tricks needed to shed some light on misbehaving
workqueue users.
Worker threads show up in the process list as:
root 5671 0.0 0.0 0 0 ? S 12:07 0:00 [kworker/0:1]
root 5672 0.0 0.0 0 0 ? S 12:07 0:00 [kworker/1:2]
root 5673 0.0 0.0 0 0 ? S 12:12 0:00 [kworker/0:0]
root 5674 0.0 0.0 0 0 ? S 12:13 0:00 [kworker/1:0]
If kworkers are going crazy (using too much cpu), there are two types
of possible problems:
1. Something being scheduled in rapid succession
2. A single work item that consumes lots of cpu cycles
The first one can be tracked using tracing:
$ echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
$ cat /sys/kernel/debug/tracing/trace_pipe > out.txt
(wait a few secs)
^C
If something is busy looping on work queueing, it would be dominating
the output and the offender can be determined with the work item
function.
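To see which work function dominates the capture, the trace can be summarized per function. The following is a minimal sketch, assuming each workqueue_queue_work event line carries a "function=<name>" field; the helper name count_work_functions is hypothetical and not part of the kernel documentation.

```shell
# Count how often each work function appears in a captured trace file.
# Assumption: every event line contains a "function=<name>" field.
count_work_functions() {
    grep -o 'function=[^ ]*' "$1" | sort | uniq -c | sort -rn
}
```

Running `count_work_functions out.txt | head` would then list the most frequently queued functions first.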
For the second type of problems it should be possible to just check
the stack trace of the offending worker thread.
$ cat /proc/THE_OFFENDING_KWORKER/stack
The work item's function should be trivially visible in the stack
trace.
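Finding the offending worker can be scripted as well. This is only a sketch, assuming a procps-style ps that supports `-o` and `--sort`; both helper names are hypothetical. Note that ps reports CPU usage averaged over the lifetime of the process, so a tool such as top may be a better guide for short bursts.

```shell
# Print the PID of the first kworker in a CPU-sorted process listing.
# Expects "PID %CPU COMMAND" rows on stdin, busiest first.
busiest_kworker() {
    awk '$3 ~ /^kworker/ { print $1; exit }'
}

# Hypothetical helper: dump the kernel stack of that worker.
# Reading /proc/<pid>/stack typically requires root.
dump_busiest_kworker_stack() {
    pid=$(ps -eo pid,pcpu,comm --sort=-pcpu | busiest_kworker)
    [ -n "$pid" ] && cat "/proc/$pid/stack"
}
```

Calling `dump_busiest_kworker_stack` as root prints the stack of the kworker that ps ranks highest, if one exists.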
@@ -1291,8 +1291,14 @@ __acquires(&gcwq->lock)
 			return true;
 		spin_unlock_irq(&gcwq->lock);
-		/* CPU has come up in between, retry migration */
+		/*
+		 * We've raced with CPU hot[un]plug. Give it a breather
+		 * and retry migration. cond_resched() is required here;
+		 * otherwise, we might deadlock against cpu_stop trying to
+		 * bring down the CPU on non-preemptive kernel.
+		 */
 		cpu_relax();
+		cond_resched();
 	}
 }