Commit f556082d authored by Akira Yokosawa, committed by Paul E. McKenney

docs/memory-barriers.txt: Fixup long lines

Substitution of "data dependency barrier" with "address-dependency
barrier" left quite a lot of lines exceeding 80 columns.

Reflow those lines as well as a few short ones not related to
the substitution.

No changes in documentation text.
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Will Deacon <will@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Daniel Lustig <dlustig@nvidia.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
parent 203185f6
@@ -187,9 +187,9 @@ As a further example, consider this sequence of events:
 B = 4; Q = P;
 P = &B; D = *Q;

-There is an obvious address dependency here, as the value loaded into D depends on
-the address retrieved from P by CPU 2. At the end of the sequence, any of the
-following results are possible:
+There is an obvious address dependency here, as the value loaded into D depends
+on the address retrieved from P by CPU 2. At the end of the sequence, any of
+the following results are possible:

 (Q == &A) and (D == 1)
 (Q == &B) and (D == 2)
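The pointer-publication pattern in the hunk above can be sketched in userspace C. This is only an illustration under stated assumptions: it uses C11 <stdatomic.h> rather than the kernel's own primitives, the function names publish() and read_through_pointer() are hypothetical, and the release/acquire pair is deliberately stronger than a bare address dependency. With that pairing, a reader that observes P == &B is also guaranteed to observe B == 4, which rules out the (Q == &B) and (D == 2) outcome listed in the documentation text.

```c
#include <stdatomic.h>

/* Initial state borrowed from the documentation example: A == 1, B == 2, P == &A. */
static int A = 1, B = 2;
static _Atomic(int *) P = &A;

/* CPU 1's role: update B, then publish its address through P.
 * The release store keeps the write to B ordered before the pointer update. */
static void publish(void)
{
	B = 4;
	atomic_store_explicit(&P, &B, memory_order_release);
}

/* CPU 2's role: load the pointer (Q = P), then dereference it (D = *Q).
 * The acquire load is an assumption of this sketch; it is stronger than
 * the address-dependency ordering the document describes. */
static int read_through_pointer(int **q_out)
{
	int *q = atomic_load_explicit(&P, memory_order_acquire);

	*q_out = q;
	return *q;
}
```

Calling read_through_pointer() before and after publish() from a single thread only exercises the two legal "before" and "after" states; the ordering guarantees matter when the two functions run concurrently on different CPUs.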
@@ -397,25 +397,25 @@ Memory barriers come in four basic varieties:

 (2) Address-dependency barriers (historical).

-An address-dependency barrier is a weaker form of read barrier. In the case
-where two loads are performed such that the second depends on the result
-of the first (eg: the first load retrieves the address to which the second
-load will be directed), an address-dependency barrier would be required to
-make sure that the target of the second load is updated after the address
-obtained by the first load is accessed.
+An address-dependency barrier is a weaker form of read barrier. In the
+case where two loads are performed such that the second depends on the
+result of the first (eg: the first load retrieves the address to which
+the second load will be directed), an address-dependency barrier would
+be required to make sure that the target of the second load is updated
+after the address obtained by the first load is accessed.

-An address-dependency barrier is a partial ordering on interdependent loads
-only; it is not required to have any effect on stores, independent loads
-or overlapping loads.
+An address-dependency barrier is a partial ordering on interdependent
+loads only; it is not required to have any effect on stores, independent
+loads or overlapping loads.

 As mentioned in (1), the other CPUs in the system can be viewed as
 committing sequences of stores to the memory system that the CPU being
-considered can then perceive. An address-dependency barrier issued by the CPU
-under consideration guarantees that for any load preceding it, if that
-load touches one of a sequence of stores from another CPU, then by the
-time the barrier completes, the effects of all the stores prior to that
-touched by the load will be perceptible to any loads issued after the address-
-dependency barrier.
+considered can then perceive. An address-dependency barrier issued by
+the CPU under consideration guarantees that for any load preceding it,
+if that load touches one of a sequence of stores from another CPU, then
+by the time the barrier completes, the effects of all the stores prior to
+that touched by the load will be perceptible to any loads issued after
+the address-dependency barrier.

 See the "Examples of memory barrier sequences" subsection for diagrams
 showing the ordering constraints.
@@ -437,16 +437,16 @@ Memory barriers come in four basic varieties:

 (3) Read (or load) memory barriers.

-A read barrier is an address-dependency barrier plus a guarantee that all the
-LOAD operations specified before the barrier will appear to happen before
-all the LOAD operations specified after the barrier with respect to the
-other components of the system.
+A read barrier is an address-dependency barrier plus a guarantee that all
+the LOAD operations specified before the barrier will appear to happen
+before all the LOAD operations specified after the barrier with respect to
+the other components of the system.

 A read barrier is a partial ordering on loads only; it is not required to
 have any effect on stores.

-Read memory barriers imply address-dependency barriers, and so can substitute
-for them.
+Read memory barriers imply address-dependency barriers, and so can
+substitute for them.

 [!] Note that read barriers should normally be paired with write barriers;
 see the "SMP barrier pairing" subsection.
@@ -584,8 +584,8 @@ following sequence of events:
 [!] READ_ONCE_OLD() corresponds to READ_ONCE() of pre-4.15 kernel, which
 doesn't imply an address-dependency barrier.

-There's a clear address dependency here, and it would seem that by the end of the
-sequence, Q must be either &A or &B, and that:
+There's a clear address dependency here, and it would seem that by the end of
+the sequence, Q must be either &A or &B, and that:

 (Q == &A) implies (D == 1)
 (Q == &B) implies (D == 4)
@@ -599,8 +599,8 @@ While this may seem like a failure of coherency or causality maintenance, it
 isn't, and this behaviour can be observed on certain real CPUs (such as the DEC
 Alpha).

-To deal with this, READ_ONCE() provides an implicit address-dependency
-barrier since kernel release v4.15:
+To deal with this, READ_ONCE() provides an implicit address-dependency barrier
+since kernel release v4.15:

 CPU 1 CPU 2
 =============== ===============
@@ -627,12 +627,12 @@ but the old value of the variable B (2).


 An address-dependency barrier is not required to order dependent writes
-because the CPUs that the Linux kernel supports don't do writes
-until they are certain (1) that the write will actually happen, (2)
-of the location of the write, and (3) of the value to be written.
+because the CPUs that the Linux kernel supports don't do writes until they
+are certain (1) that the write will actually happen, (2) of the location of
+the write, and (3) of the value to be written.
 But please carefully read the "CONTROL DEPENDENCIES" section and the
-Documentation/RCU/rcu_dereference.rst file: The compiler can and does
-break dependencies in a great many highly creative ways.
+Documentation/RCU/rcu_dereference.rst file: The compiler can and does break
+dependencies in a great many highly creative ways.

 CPU 1 CPU 2
 =============== ===============
@@ -678,8 +678,8 @@ not understand them. The purpose of this section is to help you prevent
 the compiler's ignorance from breaking your code.

 A load-load control dependency requires a full read memory barrier, not
-simply an (implicit) address-dependency barrier to make it work correctly. Consider the
-following bit of code:
+simply an (implicit) address-dependency barrier to make it work correctly.
+Consider the following bit of code:

 q = READ_ONCE(a);
 <implicit address-dependency barrier>
@@ -691,8 +691,8 @@ following bit of code:
 This will not have the desired effect because there is no actual address
 dependency, but rather a control dependency that the CPU may short-circuit
 by attempting to predict the outcome in advance, so that other CPUs see
-the load from b as having happened before the load from a. In such a
-case what's actually required is:
+the load from b as having happened before the load from a. In such a case
+what's actually required is:

 q = READ_ONCE(a);
 if (q) {
@@ -980,8 +980,8 @@ Basically, the read barrier always has to be there, even though it can be of
 the "weaker" type.

 [!] Note that the stores before the write barrier would normally be expected to
-match the loads after the read barrier or the address-dependency barrier, and vice
-versa:
+match the loads after the read barrier or the address-dependency barrier, and
+vice versa:

 CPU 1 CPU 2
 =================== ===================
@@ -1033,8 +1033,8 @@ STORE B, STORE C } all occurring before the unordered set of { STORE D, STORE E
 V


-Secondly, address-dependency barriers act as partial orderings on address-dependent
-loads. Consider the following sequence of events:
+Secondly, address-dependency barriers act as partial orderings on address-
+dependent loads. Consider the following sequence of events:

 CPU 1 CPU 2
 ======================= =======================
@@ -1079,8 +1079,8 @@ effectively random order, despite the write barrier issued by CPU 1:
 In the above example, CPU 2 perceives that B is 7, despite the load of *C
 (which would be B) coming after the LOAD of C.

-If, however, an address-dependency barrier were to be placed between the load of C
-and the load of *C (ie: B) on CPU 2:
+If, however, an address-dependency barrier were to be placed between the load
+of C and the load of *C (ie: B) on CPU 2:

 CPU 1 CPU 2
 ======================= =======================
@@ -2761,7 +2761,8 @@ is discarded from the CPU's cache and reloaded. To deal with this, the
 appropriate part of the kernel must invalidate the overlapping bits of the
 cache on each CPU.

-See Documentation/core-api/cachetlb.rst for more information on cache management.
+See Documentation/core-api/cachetlb.rst for more information on cache
+management.


 CACHE COHERENCY VS MMIO
@@ -2901,8 +2902,8 @@ AND THEN THERE'S THE ALPHA
 The DEC Alpha CPU is one of the most relaxed CPUs there is. Not only that,
 some versions of the Alpha CPU have a split data cache, permitting them to have
 two semantically-related cache lines updated at separate times. This is where
-the address-dependency barrier really becomes necessary as this synchronises both
-caches with the memory coherence system, thus making it seem like pointer
+the address-dependency barrier really becomes necessary as this synchronises
+both caches with the memory coherence system, thus making it seem like pointer
 changes vs new data occur in the right order.

 The Alpha defines the Linux kernel's memory model, although as of v4.15