1. 07 Oct, 2021 3 commits
  2. 16 Sep, 2021 9 commits
  3. 15 Sep, 2021 12 commits
    • rcu-tasks: Update comments to cond_resched_tasks_rcu_qs() · 8af9e2c7
      Paul E. McKenney authored
      The cond_resched_rcu_qs() function no longer exists, despite being mentioned
      several times in kernel/rcu/tasks.h.  This commit therefore updates those
      mentions to refer to the current cond_resched_tasks_rcu_qs().
      Reported-by: Neeraj Upadhyay <neeraju@codeaurora.org>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu-tasks: Fix IPI failure handling in trc_wait_for_one_reader · 46aa886c
      Neeraj Upadhyay authored
      The trc_wait_for_one_reader() function is called at multiple stages
      of the trace rcu-tasks GP function, rcu_tasks_wait_gp():
      
      - First, it is called as part of the per-task function,
        rcu_tasks_trace_pertask(), for all non-idle tasks. As part of this
        per-task processing, the function adds the task to the holdout list
        and, if the task is currently running on a CPU, sends an IPI to that
        CPU. The IPI handler takes action depending on whether the task is in
        a trace rcu-tasks read-side critical section:
      
        - a. If the task is in a trace rcu-tasks read-side critical section
             (t->trc_reader_nesting != 0), the IPI handler sets the task's
             ->trc_reader_special.b.need_qs, so that the task notifies the
             GP handling function when it exits its outermost read-side
             critical section (by decrementing trc_n_readers_need_end).
             trc_wait_for_one_reader() also increments trc_n_readers_need_end,
             so that the trace rcu-tasks GP handler waits for this task's
             read-side exit notification. The IPI handler also sets
             t->trc_reader_checked to true, so no further IPIs are sent for
             this task during this trace rcu-tasks grace period, and the
             task can be removed from the holdout list.
      
        - b. If the task is in the process of exiting its trace rcu-tasks
             read-side critical section (t->trc_reader_nesting < 0), defer
             this task's processing to future calls to trc_wait_for_one_reader().
      
        - c. If the task is not in a trace rcu-tasks read-side critical section
             (t->trc_reader_nesting == 0), ->trc_reader_checked is set for the
             task, so that it is removed from the holdout list.
      
      - Second, trc_wait_for_one_reader() is called as part of the post scan,
        in rcu_tasks_trace_postscan(), for all idle tasks.
      
      - Third, in check_all_holdout_tasks_trace(), trc_wait_for_one_reader()
        is called for each task in the holdout list, but only if there isn't
        a pending IPI for the task (->trc_ipi_to_cpu == -1). This call removes
        the task from the holdout list if the IPI handler has completed the
        required work, ensuring that the current trace rcu-tasks grace period
        either waits for this task or this task is not in a trace rcu-tasks
        read-side critical section.
      
      Now consider the scenario where smp_call_function_single() fails in the
      first case, inside rcu_tasks_trace_pertask(). In this case,
      ->trc_ipi_to_cpu remains set to the CPU on which that task was running.
      This causes trc_wait_for_one_reader() to be skipped in the third case,
      inside check_all_holdout_tasks_trace(), for this task. As a result,
      ->trc_reader_checked never gets set for the task, and the task is never
      removed from the holdout list. This can cause the current trace
      rcu-tasks grace period to stall.
      
      Fix the above problem by resetting ->trc_ipi_to_cpu to -1 on
      smp_call_function_single() failure, so that future IPIs can be sent
      for this task.
      
      Note that all three of the trc_wait_for_one_reader() function's
      callers (rcu_tasks_trace_pertask(), rcu_tasks_trace_postscan(),
      check_all_holdout_tasks_trace()) hold cpus_read_lock().  This means
      that smp_call_function_single() cannot race with CPU hotplug, and thus
      should never fail.  Therefore, also add a warning in order to report
      any such failure in case smp_call_function_single() grows some other
      reason for failure.
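      
      In sketch form, the shape of this fix (not the verbatim patch; the field
      and function names are those used above):
      
      	// On IPI failure, reset ->trc_ipi_to_cpu so that
      	// check_all_holdout_tasks_trace() will retry this task.
      	if (smp_call_function_single(cpu, trc_read_check_handler, t, 0)) {
      		WARN_ONCE(1, "%s(): smp_call_function_single() failed for CPU: %d\n",
      			  __func__, cpu);
      		t->trc_ipi_to_cpu = -1;
      	}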
      Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu-tasks: Fix read-side primitives comment for call_rcu_tasks_trace · ed42c380
      Neeraj Upadhyay authored
      call_rcu_tasks_trace() does have read-side primitives, namely
      rcu_read_lock_trace() and rcu_read_unlock_trace(). Fix this information
      in the comments.
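      
      For reference, a minimal sketch of those read-side primitives in use
      (the protected pointer and the reader-side work are hypothetical):
      
      	rcu_read_lock_trace();
      	p = rcu_dereference(gp);   // "gp" is a hypothetical trace-RCU-protected pointer.
      	do_something_with(p);      // Hypothetical reader-side work.
      	rcu_read_unlock_trace();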
      Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu-tasks: Clarify read side section info for rcu_tasks_rude GP primitives · a6517e9c
      Neeraj Upadhyay authored
      The RCU tasks rude variant does not check whether the context currently
      running on a CPU is usermode. A read-side critical section ends on any
      transition to usermode execution, by virtue of usermode execution being
      schedulable. Clarify this in the comments for call_rcu_tasks_rude() and
      synchronize_rcu_tasks_rude().
      Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu-tasks: Correct comparisons for CPU numbers in show_stalled_task_trace · d39ec8f3
      Neeraj Upadhyay authored
      Valid CPU numbers can be zero or greater, but the checks of
      ->trc_ipi_to_cpu and of tick_nohz_full_cpu()'s argument are for strictly
      greater than zero.  This commit therefore corrects the nohz_full CPU
      check in show_stalled_task_trace() so as to include CPU 0.
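      
      A hedged sketch of the corrected comparison (illustrative, not the
      verbatim patch):
      
      	// CPU 0 is a valid CPU, so compare with ">= 0" rather than "> 0".
      	if (cpu >= 0 && tick_nohz_full_cpu(cpu))   // Was: cpu > 0 && ...
      		sched_show_task(t);                // Illustrative stall-report action.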
      Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu-tasks: Correct firstreport usage in check_all_holdout_tasks_trace · 89401176
      Neeraj Upadhyay authored
      In check_all_holdout_tasks_trace(), firstreport is a pointer argument,
      so check the dereferenced value instead of the pointer itself.
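      
      The shape of the fix, in sketch form (the message text is illustrative):
      
      	// Check the flag that firstreport points to, not the pointer itself.
      	if (*firstreport) {   // Was: if (firstreport)
      		pr_err("INFO: rcu_tasks_trace detected stalls on tasks:\n");  // Illustrative.
      		*firstreport = false;
      	}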
      Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu-tasks: Move RTGS_WAIT_CBS to beginning of rcu_tasks_kthread() loop · 0db7c32a
      Paul E. McKenney authored
      Early in debugging, it made some sense to differentiate the first
      iteration from subsequent iterations, but now this just causes confusion.
      This commit therefore moves the "set_tasks_gp_state(rtp, RTGS_WAIT_CBS)"
      statement to the beginning of the "for" loop in rcu_tasks_kthread().
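      
      A sketch of the resulting loop structure (body elided):
      
      	for (;;) {
      		set_tasks_gp_state(rtp, RTGS_WAIT_CBS);  // Now first in every iteration.
      		/* Wait for callbacks, run a grace period, invoke callbacks... */
      	}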
      Reported-by: Neeraj Upadhyay <neeraju@codeaurora.org>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu-tasks: Remove second argument of rcu_read_unlock_trace_special() · a5c071cc
      Paul E. McKenney authored
      The second argument of rcu_read_unlock_trace_special() is always zero.
      When called from exit_tasks_rcu_finish_trace(), it is the constant
      zero, and rcu_read_unlock_trace_special() doesn't get called from
      rcu_read_unlock_trace() unless the value of local variable "nesting"
      is zero, because otherwise the early return is taken instead.
      
      This commit therefore removes the "nesting" argument from the
      rcu_read_unlock_trace_special() function, substituting the constant
      zero within that function.  This commit also adds a WARN_ON_ONCE()
      to rcu_read_lock_trace_held() in case non-zeroness some day appears.
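      
      The resulting signature change, in sketch form:
      
      	/* Before: */
      	void rcu_read_unlock_trace_special(struct task_struct *t, int nesting);
      	/* After ("nesting" is now the constant zero within the function): */
      	void rcu_read_unlock_trace_special(struct task_struct *t);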
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu-tasks: Add trc_inspect_reader() checks for exiting critical section · 18f08e75
      Paul E. McKenney authored
      Currently, trc_inspect_reader() treats a task exiting its RCU Tasks
      Trace read-side critical section the same as being within that critical
      section.  However, this can fail because that task might have already
      checked its .need_qs field, which means that it might never decrement
      the all-important trc_n_readers_need_end counter.  Of course, for that
      to happen, the task would need to never again execute an RCU Tasks Trace
      read-side critical section, but this really could happen if the system's
      last trampoline was removed.  Note that exit from such a critical section
      cannot be treated as a quiescent state due to the possibility of nested
      critical sections.  This means that if trc_inspect_reader() sees a
      negative nesting value, it must set up to try again later.
      
      This commit therefore ignores tasks that are exiting their RCU Tasks
      Trace read-side critical sections so that they will be rechecked later.
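      
      A hedged sketch of the added check (names follow the commit text;
      surrounding logic elided):
      
      	int nesting = READ_ONCE(t->trc_reader_nesting);
      
      	// A negative value means the task is exiting its critical section;
      	// report failure so that the task is rechecked later.
      	if (nesting < 0)
      		return false;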
      
      [ paulmck: Apply feedback from Neeraj Upadhyay and Boqun Feng. ]
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu-tasks: Simplify trc_read_check_handler() atomic operations · 96017bf9
      Paul E. McKenney authored
      Currently, trc_wait_for_one_reader() atomically increments
      the trc_n_readers_need_end counter before sending the IPI
      invoking trc_read_check_handler().  All failure paths out of
      trc_read_check_handler() and also from the smp_call_function_single()
      within trc_wait_for_one_reader() must carefully atomically decrement
      this counter.  This is more complex than it needs to be.
      
      This commit therefore simplifies things and saves a few lines of
      code by dispensing with the atomic decrements in favor of having
      trc_read_check_handler() do the atomic increment only in the success case.
      In theory, this represents no change in functionality.
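      
      A hedged sketch of the simplified handler logic (illustrative, not the
      verbatim patch):
      
      	// Increment only in the success case, rather than incrementing
      	// beforehand and decrementing on every failure path.
      	if (READ_ONCE(t->trc_reader_nesting)) {
      		atomic_inc(&trc_n_readers_need_end);  // Reader present: GP must wait.
      		WRITE_ONCE(t->trc_reader_special.b.need_qs, true);
      	}
      	WRITE_ONCE(t->trc_reader_checked, true);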
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
  4. 13 Sep, 2021 16 commits
    • torture: Make torture.sh print the number of files to be compressed · b380b10b
      Paul E. McKenney authored
      Compressing gigabyte vmlinux files can take some time, and it can be a
      bit annoying to not know how many more batches of compression there will
      be.  This commit therefore makes torture.sh print the number of files to
      be compressed just before starting compression and just after compression
      completes.
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcutorture: Avoid problematic critical section nesting on PREEMPT_RT · 71921a96
      Scott Wood authored
      rcutorture generates some nesting scenarios that are not compatible
      with PREEMPT_RT.  For example:
      	preempt_disable();
      	rcu_read_lock_bh();
      	preempt_enable();
      	rcu_read_unlock_bh();
      
      The problem here is that on PREEMPT_RT the bottom halves have to be
      disabled and enabled in preemptible context.
      
      Reorder the locking: start with BH locking and only then disable
      preemption or interrupts. When unlocking, do the reverse: first enable
      interrupts and preemption, and release BH at the very end. Ensure that
      on PREEMPT_RT the BH locking remains unchanged if in non-preemptible
      context.
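      
      A sketch of the resulting ordering for the example above:
      
      	rcu_read_lock_bh();     // BH first, while still preemptible.
      	preempt_disable();
      	/* ... critical section ... */
      	preempt_enable();       // Innermost first on the way out.
      	rcu_read_unlock_bh();   // BH last, again in preemptible context.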
      
      Link: https://lkml.kernel.org/r/20190911165729.11178-6-swood@redhat.com
      Link: https://lkml.kernel.org/r/20210819182035.GF4126399@paulmck-ThinkPad-P17-Gen-1
      Signed-off-by: Scott Wood <swood@redhat.com>
      [bigeasy: Drop ATOM_BH, make it only about changing BH in atomic
      context. Allow enabling RCU in IRQ-off section. Reword commit message.]
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcutorture: Don't cpuhp_remove_state() if cpuhp_setup_state() failed · fd13fe16
      Paul E. McKenney authored
      Currently, in CONFIG_RCU_BOOST kernels, if the rcu_torture_init()
      function's call to cpuhp_setup_state() fails, rcu_torture_cleanup()
      gamely passes nonsense to cpuhp_remove_state().  This results in
      strange and misleading splats.  This commit therefore ensures that if
      the rcu_torture_init() function's call to cpuhp_setup_state() fails,
      rcu_torture_cleanup() avoids invoking cpuhp_remove_state().
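      
      A hedged sketch of the guard (the state-holding variable and the
      callback names are assumptions, not the verbatim patch):
      
      	static int rcutor_hp = -1;  // Assumed holder for the cpuhp state.
      
      	// In rcu_torture_init():
      	err = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "RCU_TORTURE",
      				rcutorture_booster_init, rcutorture_booster_cleanup);
      	if (err >= 0)
      		rcutor_hp = err;  // Remember the dynamically allocated state.
      
      	// In rcu_torture_cleanup(): remove only what was actually set up.
      	if (rcutor_hp >= 0)
      		cpuhp_remove_state(rcutor_hp);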
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcuscale: Warn on individual rcu_scale_init() error conditions · eb77abfd
      Paul E. McKenney authored
      When running rcuscale as a module, any rcu_scale_init() issues will be
      reflected in the error code from modprobe or insmod, as the case may be.
      However, these error codes are not available when running rcuscale
      built-in, for example, when using the kvm.sh script.  This commit
      therefore adds WARN_ON_ONCE() to allow distinguishing rcu_scale_init()
      errors when running rcuscale built-in.
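      
      A sketch of the pattern, which this and the following three commits
      apply to their respective init functions (the initialization site shown
      is illustrative):
      
      	// Warn at each failure site so that built-in runs report which
      	// initialization step failed.
      	firsterr = torture_create_kthread(rcu_scale_writer, NULL, writer_tasks[i]);
      	if (WARN_ON_ONCE(firsterr))
      		goto unwind;  // Unwind partially initialized state (label elided).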
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • refscale: Warn on individual ref_scale_init() error conditions · ed60ad73
      Paul E. McKenney authored
      When running refscale as a module, any ref_scale_init() issues will be
      reflected in the error code from modprobe or insmod, as the case may be.
      However, these error codes are not available when running refscale
      built-in, for example, when using the kvm.sh script.  This commit
      therefore adds WARN_ON_ONCE() to allow distinguishing ref_scale_init()
      errors when running refscale built-in.
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • locktorture: Warn on individual lock_torture_init() error conditions · b3b3cc61
      Paul E. McKenney authored
      When running locktorture as a module, any lock_torture_init() issues will be
      reflected in the error code from modprobe or insmod, as the case may be.
      However, these error codes are not available when running locktorture
      built-in, for example, when using the kvm.sh script.  This commit
      therefore adds WARN_ON_ONCE() to allow distinguishing lock_torture_init()
      errors when running locktorture built-in.
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcutorture: Warn on individual rcu_torture_init() error conditions · efeff6b3
      Paul E. McKenney authored
      When running rcutorture as a module, any rcu_torture_init() issues will be
      reflected in the error code from modprobe or insmod, as the case may be.
      However, these error codes are not available when running rcutorture
      built-in, for example, when using the kvm.sh script.  This commit
      therefore adds WARN_ON_ONCE() to allow distinguishing rcu_torture_init()
      errors when running rcutorture built-in.
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcutorture: Suppressing read-exit testing is not an error · fda84866
      Paul E. McKenney authored
      Currently, specifying the rcutorture.read_exit_burst=0 kernel boot
      parameter will result in a -EINVAL exit code that will stop the rcutorture
      test run before it has fully initialized.  This commit therefore uses a
      zero exit code in that case, thus allowing rcutorture.read_exit_burst=0
      to complete normally.
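      
      The shape of the fix, in sketch form:
      
      	// A zero burst size means read-exit testing is deliberately
      	// disabled, so report success rather than an error.
      	if (read_exit_burst <= 0)
      		return 0;  // Was: return -EINVAL;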
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu-tasks: Wait for trc_read_check_handler() IPIs · cbe0d8d9
      Paul E. McKenney authored
      Currently, RCU Tasks Trace initializes the trc_n_readers_need_end counter
      to the value one, increments it before each trc_read_check_handler()
      IPI, then decrements it within trc_read_check_handler() if the target
      task was in a quiescent state (or if the target task moved to some other
      CPU while the IPI was in flight), complaining if the new value was zero.
      The rationale for complaining is that the initial value of one must be
      decremented away before zero can be reached, and this decrement has not
      yet happened.
      
      Except that trc_read_check_handler() is initiated with an asynchronous
      smp_call_function_single(), which might be significantly delayed.  This
      can result in false-positive complaints about the counter reaching zero.
      
      This commit therefore waits for in-flight IPI handlers to complete before
      decrementing away the initial value of one from the trc_n_readers_need_end
      counter.
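      
      A heavily hedged sketch of the waiting step (the per-CPU in-flight flag
      and the polling loop are assumptions about the mechanism, not the
      verbatim patch):
      
      	int cpu;
      
      	// Poll until no trc_read_check_handler() IPIs remain in flight; only
      	// then is it safe to decrement away the initial count of one.
      	for_each_online_cpu(cpu)
      		while (READ_ONCE(per_cpu(trc_ipi_to_cpu, cpu)))  // Assumed per-CPU flag.
      			schedule_timeout_idle(1);
      	atomic_dec(&trc_n_readers_need_end);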
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu: Fix existing exp request check in sync_sched_exp_online_cleanup() · f0b2b2df
      Neeraj Upadhyay authored
      The sync_sched_exp_online_cleanup() function checks to see if RCU needs
      an expedited quiescent state from the incoming CPU, sending it an IPI
      if so. Before sending the IPI, it checks whether an expedited quiescent
      state has already been requested for the incoming CPU, by checking
      rcu_data.cpu_no_qs.b.exp for the current CPU, that is, the CPU on which
      sync_sched_exp_online_cleanup() is running. This works when the incoming
      CPU is the same as the current CPU. However, when the incoming CPU is
      different from the current CPU, the expedited request won't get marked,
      which can potentially delay reporting of the expedited quiescent state
      for the incoming CPU.
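      
      A hedged sketch of the corrected check (field names per the commit text;
      surrounding logic elided):
      
      	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
      
      	// Consult the incoming CPU's flag, not the current CPU's.
      	if (READ_ONCE(rdp->cpu_no_qs.b.exp))  // Was: __this_cpu_read(...)
      		return;  // Expedited QS already requested for the incoming CPU.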
      
      Fixes: e015a341 ("rcu: Avoid self-IPI in sync_sched_exp_online_cleanup()")
      Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu: Make rcu update module parameters world-readable · 1eac0075
      Juri Lelli authored
      The rcu update module's parameters currently don't appear in sysfs,
      which is a serviceability issue: it might be necessary to access their
      default values at runtime.
      
      Fix this issue by changing the permissions of the rcu update module
      parameters to world-readable.
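      
      A sketch of the change for one such parameter (0444 being the
      conventional world-readable mode; the exact parameter list is per the
      kernel source):
      
      	// World-readable: visible under /sys/module/rcupdate/parameters/.
      	module_param(rcu_expedited, int, 0444);  // Was: module_param(rcu_expedited, int, 0);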
      Suggested-by: Paul E. McKenney <paulmck@kernel.org>
      Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu: Make rcu_normal_after_boot writable again · ebb6d30d
      Juri Lelli authored
      Certain configurations (e.g., systems that make heavy use of netns)
      need to use synchronize_rcu_expedited() to service RCU grace periods
      even after boot.
      
      Even though synchronize_rcu_expedited() has traditionally been
      considered harmful for RT because of its heavy use of IPIs, it is
      perfectly usable under certain conditions (e.g. nohz_full).
      
      Make rcupdate.rcu_normal_after_boot= writable again on RT (if NO_HZ_FULL
      is defined), but keep its default value of 1 (enabled) to avoid
      regressions. Users who need synchronize_rcu_expedited() will boot with
      rcupdate.rcu_normal_after_boot=0 in the kernel cmdline.
      
      Reflect the change in synchronize_rcu_expedited_wait() by removing the
      WARN related to CONFIG_PREEMPT_RT.
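      
      A hedged sketch of the resulting permissions (the exact #if condition
      is an assumption based on the description above):
      
      	#if defined(CONFIG_PREEMPT_RT) && !defined(CONFIG_NO_HZ_FULL)
      	module_param(rcu_normal_after_boot, int, 0444);  // Read-only on RT without NO_HZ_FULL.
      	#else
      	module_param(rcu_normal_after_boot, int, 0644);  // Writable again otherwise.
      	#endif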
      Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu: Make rcutree_dying_cpu() use its "cpu" parameter · 4aa846f9
      Paul E. McKenney authored
      The CPU-hotplug functions take a "cpu" parameter, but rcutree_dying_cpu()
      ignores it in favor of this_cpu_ptr().  This works at the moment, but
      it would be better to be consistent.  This might also work better given
      some possible future changes.  This commit therefore uses per_cpu_ptr()
      to avoid ignoring the rcutree_dying_cpu() function's argument.
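      
      The shape of the change, in sketch form:
      
      	// Derive rcu_data from the "cpu" argument rather than the running CPU.
      	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);  // Was: this_cpu_ptr(&rcu_data)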
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu: Simplify rcu_report_dead() call to rcu_report_exp_rdp() · 768f5d50
      Paul E. McKenney authored
      Currently, rcu_report_dead() disables preemption across its call to
      rcu_report_exp_rdp(), but this is pointless because interrupts are
      already disabled by the caller.  In addition, rcu_report_dead() computes
      the address of the outgoing CPU's rcu_data structure, which is also
      pointless because this address is already present in local variable rdp.
      This commit therefore drops the preemption disabling and passes rdp
      to rcu_report_exp_rdp().
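      
      Before and after, in sketch form:
      
      	/* Before: */
      	preempt_disable();
      	rcu_report_exp_rdp(this_cpu_ptr(&rcu_data));
      	preempt_enable();
      
      	/* After (interrupts are already disabled by the caller): */
      	rcu_report_exp_rdp(rdp);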
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu: Move rcu_dynticks_eqs_online() to rcu_cpu_starting() · 2caebefb
      Paul E. McKenney authored
      The purpose of rcu_dynticks_eqs_online() is to adjust the ->dynticks
      counter of an incoming CPU when required.  It is currently invoked
      from rcutree_prepare_cpu(), which runs before the incoming CPU is
      running, and thus on some other CPU.  This makes the per-CPU accesses in
      rcu_dynticks_eqs_online() iffy at best, and it all "works" only because
      the running CPU cannot possibly be in dyntick-idle mode, which means
      that rcu_dynticks_eqs_online() never has any effect.
      
      It is currently OK for rcu_dynticks_eqs_online() to have no effect, but
      only because the CPU-offline process just happens to leave ->dynticks in
      the correct state.  After all, if ->dynticks were in the wrong state on a
      just-onlined CPU, rcutorture would complain bitterly the next time that
      CPU went idle, at least in kernels built with CONFIG_RCU_EQS_DEBUG=y,
      for example, those built by rcutorture scenario TREE04.  One could
      argue that this means that rcu_dynticks_eqs_online() is unnecessary,
      however, removing it would make the CPU-online process vulnerable to
      slight changes in the CPU-offline process.
      
      One could also ask why it is safe to move the rcu_dynticks_eqs_online()
      call so late in the CPU-online process.  Indeed, there was a time when it
      would not have been safe, which does much to explain its current location.
      However, the marking of a CPU as online from an RCU perspective has long
      since moved from rcutree_prepare_cpu() to rcu_cpu_starting(), and all
      that is required is that ->dynticks be set correctly by the time that
      the CPU is marked as online from an RCU perspective.  After all, the RCU
      grace-period kthread does not check to see if offline CPUs are also idle.
      (In case you were curious, this is one reason why there is quiescent-state
      reporting as part of the offlining process.)
      
      This commit therefore moves the call to rcu_dynticks_eqs_online() from
      rcutree_prepare_cpu() to rcu_cpu_starting(), this latter being guaranteed
      to be running on the incoming CPU.  The call to this function must of
      course be placed before this rcu_cpu_starting() announces this CPU's
      presence to RCU.
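      
      A sketch of the call sites after the move (function bodies elided):
      
      	int rcutree_prepare_cpu(unsigned int cpu)
      	{
      		/* ... rcu_dynticks_eqs_online() is no longer called here ... */
      		return 0;
      	}
      
      	void rcu_cpu_starting(unsigned int cpu)
      	{
      		rcu_dynticks_eqs_online();  // Runs on the incoming CPU, before
      					    // RCU marks that CPU as online.
      		/* ... then announce this CPU's presence to RCU ... */
      	}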
      Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu: Comment rcu_gp_init() code waiting for CPU-hotplug operations · ebc88ad4
      Paul E. McKenney authored
      Near the beginning of rcu_gp_init() is a per-rcu_node loop that waits
      for CPU-hotplug operations that might have started before the new
      grace period did.  This commit adds a comment explaining that this
      wait does not exclude CPU-hotplug operations.
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>