Commit d19720a9 authored by Paul E. McKenney, committed by Linus Torvalds

[PATCH] RCU documentation fixes (January 2006 update)

Updates to in-tree RCU documentation based on comments over the past few
months.

Signed-off-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
parent 53d8be5c
@@ -90,16 +90,20 @@ at OLS. The resulting abundance of RCU patches was presented the
 following year [McKenney02a], and use of RCU in dcache was first
 described that same year [Linder02a].

-Also in 2002, Michael [Michael02b,Michael02a] presented techniques
-that defer the destruction of data structures to simplify non-blocking
-synchronization (wait-free synchronization, lock-free synchronization,
-and obstruction-free synchronization are all examples of non-blocking
-synchronization). In particular, this technique eliminates locking,
-reduces contention, reduces memory latency for readers, and parallelizes
-pipeline stalls and memory latency for writers. However, these
-techniques still impose significant read-side overhead in the form of
-memory barriers. Researchers at Sun worked along similar lines in the
-same timeframe [HerlihyLM02,HerlihyLMS03].
+Also in 2002, Michael [Michael02b,Michael02a] presented "hazard-pointer"
+techniques that defer the destruction of data structures to simplify
+non-blocking synchronization (wait-free synchronization, lock-free
+synchronization, and obstruction-free synchronization are all examples of
+non-blocking synchronization). In particular, this technique eliminates
+locking, reduces contention, reduces memory latency for readers, and
+parallelizes pipeline stalls and memory latency for writers. However,
+these techniques still impose significant read-side overhead in the
+form of memory barriers. Researchers at Sun worked along similar lines
+in the same timeframe [HerlihyLM02,HerlihyLMS03]. These techniques
+can be thought of as inside-out reference counts, where the count is
+represented by the number of hazard pointers referencing a given data
+structure (rather than the more conventional counter field within the
+data structure itself).

 In 2003, the K42 group described how RCU could be used to create
 hot-pluggable implementations of operating-system functions. Later that
@@ -113,7 +117,6 @@ number of operating-system kernels [PaulEdwardMcKenneyPhD], a paper
 describing how to make RCU safe for soft-realtime applications [Sarma04c],
 and a paper describing SELinux performance with RCU [JamesMorris04b].

-
 2005 has seen further adaptation of RCU to realtime use, permitting
 preemption of RCU realtime critical sections [PaulMcKenney05a,
 PaulMcKenney05b].
......
@@ -177,3 +177,9 @@ over a rather long period of time, but improvements are always welcome!

 If you want to wait for some of these other things, you might
 instead need to use synchronize_irq() or synchronize_sched().
+
+12. Any lock acquired by an RCU callback must be acquired elsewhere
+with irq disabled, e.g., via spin_lock_irqsave(). Failing to
+disable irq on a given acquisition of that lock will result in
+deadlock as soon as the RCU callback happens to interrupt that
+acquisition's critical section.
@@ -232,7 +232,7 @@ entry does not exist. For this to be helpful, the search function must
 return holding the per-entry spinlock, as ipc_lock() does in fact do.

 Quick Quiz: Why does the search function need to return holding the
 per-entry lock for this deleted-flag technique to be helpful?

 If the system-call audit module were to ever need to reject stale data,
 one way to accomplish this would be to add a "deleted" flag and a "lock"
@@ -275,8 +275,8 @@ flag under the spinlock as follows:
 {
 	struct audit_entry *e;

-	/* Do not use the _rcu iterator here, since this is the only
-	 * deletion routine. */
+	/* Do not need to use the _rcu iterator here, since this
+	 * is the only deletion routine. */
 	list_for_each_entry(e, list, list) {
 		if (!audit_compare_rule(rule, &e->rule)) {
 			spin_lock(&e->lock);
@@ -304,9 +304,12 @@ function to reject newly deleted data.


 Answer to Quick Quiz
-
-If the search function drops the per-entry lock before returning, then
-the caller will be processing stale data in any case. If it is really
-OK to be processing stale data, then you don't need a "deleted" flag.
-If processing stale data really is a problem, then you need to hold the
-per-entry lock across all of the code that uses the value looked up.
+Why does the search function need to return holding the per-entry
+lock for this deleted-flag technique to be helpful?
+
+If the search function drops the per-entry lock before returning,
+then the caller will be processing stale data in any case. If it
+is really OK to be processing stale data, then you don't need a
+"deleted" flag. If processing stale data really is a problem,
+then you need to hold the per-entry lock across all of the code
+that uses the value that was returned.
@@ -111,6 +111,11 @@ o What are all these files in this directory?

 You are reading it!

+rcuref.txt
+
+Describes how to combine use of reference counts
+with RCU.
+
 whatisRCU.txt

 Overview of how the RCU implementation works. Along
......
-Refcounter design for elements of lists/arrays protected by RCU.
+Reference-count design for elements of lists/arrays protected by RCU.

-Refcounting on elements of lists which are protected by traditional
-reader/writer spinlocks or semaphores are straight forward as in:
+Reference counting on elements of lists which are protected by traditional
+reader/writer spinlocks or semaphores are straightforward:

 1.                                      2.
 add()                                   search_and_reference()
@@ -28,12 +28,12 @@ release_referenced() delete()
     ...                                     ...
 }                                       }

-If this list/array is made lock free using rcu as in changing the
-write_lock in add() and delete() to spin_lock and changing read_lock
+If this list/array is made lock free using RCU as in changing the
+write_lock() in add() and delete() to spin_lock and changing read_lock
 in search_and_reference to rcu_read_lock(), the atomic_get in
 search_and_reference could potentially hold reference to an element which
-has already been deleted from the list/array. atomic_inc_not_zero takes
-care of this scenario. search_and_reference should look as;
+has already been deleted from the list/array. Use atomic_inc_not_zero()
+in this scenario as follows:

 1.                                      2.
 add()                                   search_and_reference()
@@ -51,17 +51,16 @@ add() search_and_reference()
 release_referenced()                    delete()
 {                                       {
     ...                                     write_lock(&list_lock);
-    atomic_dec(&el->rc, relfunc)            ...
-    ...                                     delete_element
-}                                           write_unlock(&list_lock);
-                                            ...
+    if (atomic_dec_and_test(&el->rc))       ...
+        call_rcu(&el->head, el_free);       delete_element
+    ...                                     write_unlock(&list_lock);
+}                                           ...
                                             if (atomic_dec_and_test(&el->rc))
                                                 call_rcu(&el->head, el_free);
                                             ...
                                         }

-Sometimes, reference to the element need to be obtained in the
-update (write) stream. In such cases, atomic_inc_not_zero might be an
-overkill since the spinlock serialising list updates are held. atomic_inc
-is to be used in such cases.
+Sometimes, a reference to the element needs to be obtained in the
+update (write) stream. In such cases, atomic_inc_not_zero() might be
+overkill, since we hold the update-side spinlock. One might instead
+use atomic_inc() in such cases.
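
The reference-count rule described in the rcuref.txt changes above can be illustrated with a minimal userspace sketch using C11 atomics. This is not the kernel implementation: ref_inc_not_zero(), ref_dec_and_test(), and ref_demo() are hypothetical stand-ins for atomic_inc_not_zero() and atomic_dec_and_test(), and the sketch models only the counter, not the list, the spinlock, or call_rcu().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for atomic_inc_not_zero():
 * take a reference only if the count has not already reached zero.
 */
static bool ref_inc_not_zero(atomic_int *rc)
{
	int old = atomic_load(rc);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(rc, &old, old + 1))
			return true;
	}
	return false;	/* Element is already on its way to being freed. */
}

/*
 * Hypothetical stand-in for atomic_dec_and_test():
 * returns true when the caller dropped the last reference.
 */
static bool ref_dec_and_test(atomic_int *rc)
{
	return atomic_fetch_sub(rc, 1) == 1;
}

/* Walks one element through its reference-count lifetime. */
static void ref_demo(void)
{
	atomic_int rc = 1;			/* reference held by the list */

	assert(ref_inc_not_zero(&rc));		/* reader finds the element */
	assert(!ref_dec_and_test(&rc));		/* delete() drops the list's reference */
	assert(ref_dec_and_test(&rc));		/* reader drops the last reference */
	assert(!ref_inc_not_zero(&rc));		/* late reader must not revive it */
}
```

In this sketch, a late reader that still sees the element during an RCU read-side critical section gets false from ref_inc_not_zero() and must treat the element as absent, which is the scenario the updated text addresses.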
@@ -200,10 +200,11 @@ rcu_assign_pointer()
 the new value, and also executes any memory-barrier instructions
 required for a given CPU architecture.

-Perhaps more important, it serves to document which pointers
-are protected by RCU. That said, rcu_assign_pointer() is most
-frequently used indirectly, via the _rcu list-manipulation
-primitives such as list_add_rcu().
+Perhaps just as important, it serves to document (1) which
+pointers are protected by RCU and (2) the point at which a
+given structure becomes accessible to other CPUs. That said,
+rcu_assign_pointer() is most frequently used indirectly, via
+the _rcu list-manipulation primitives such as list_add_rcu().

 rcu_dereference()

@@ -258,9 +259,11 @@ rcu_dereference()
 locking.

 As with rcu_assign_pointer(), an important function of
-rcu_dereference() is to document which pointers are protected
-by RCU. And, again like rcu_assign_pointer(), rcu_dereference()
-is typically used indirectly, via the _rcu list-manipulation
+rcu_dereference() is to document which pointers are protected by
+RCU, in particular, flagging a pointer that is subject to changing
+at any time, including immediately after the rcu_dereference().
+And, again like rcu_assign_pointer(), rcu_dereference() is
+typically used indirectly, via the _rcu list-manipulation
 primitives, such as list_for_each_entry_rcu().

 The following diagram shows how each API communicates among the
@@ -327,7 +330,7 @@ for specialized uses, but are relatively uncommon.
 3. WHAT ARE SOME EXAMPLE USES OF CORE RCU API?

 This section shows a simple use of the core RCU API to protect a
-global pointer to a dynamically allocated structure. More typical
+global pointer to a dynamically allocated structure. More-typical
 uses of RCU may be found in listRCU.txt, arrayRCU.txt, and NMI-RCU.txt.

 struct foo {
@@ -410,6 +413,8 @@ o Use synchronize_rcu() -after- removing a data element from an
 data item.

 See checklist.txt for additional rules to follow when using RCU.
+And again, more-typical uses of RCU may be found in listRCU.txt,
+arrayRCU.txt, and NMI-RCU.txt.


 4. WHAT IF MY UPDATING THREAD CANNOT BLOCK?
@@ -513,7 +518,7 @@ production-quality implementation, and see:

 for papers describing the Linux kernel RCU implementation. The OLS'01
 and OLS'02 papers are a good introduction, and the dissertation provides
-more details on the current implementation.
+more details on the current implementation as of early 2004.


 5A. "TOY" IMPLEMENTATION #1: LOCKING
@@ -768,7 +773,6 @@ RCU pointer/list traversal:
 rcu_dereference
 list_for_each_rcu (to be deprecated in favor of
 list_for_each_entry_rcu)
-list_for_each_safe_rcu (deprecated, not used)
 list_for_each_entry_rcu
 list_for_each_continue_rcu (to be deprecated in favor of new
 list_for_each_entry_continue_rcu)
@@ -807,7 +811,8 @@ Quick Quiz #1: Why is this argument naive? How could a deadlock
 Answer: Consider the following sequence of events:

 1. CPU 0 acquires some unrelated lock, call it
-"problematic_lock".
+"problematic_lock", disabling irq via
+spin_lock_irqsave().

 2. CPU 1 enters synchronize_rcu(), write-acquiring
 rcu_gp_mutex.
@@ -894,7 +899,7 @@ Answer: Just as PREEMPT_RT permits preemption of spinlock
 ACKNOWLEDGEMENTS

 My thanks to the people who helped make this human-readable, including
-Jon Walpole, Josh Triplett, Serge Hallyn, and Suzanne Wood.
+Jon Walpole, Josh Triplett, Serge Hallyn, Suzanne Wood, and Alan Stern.


 For more information, see http://www.rdrop.com/users/paulmck/RCU.