1. 07 Oct, 2021 3 commits
  2. 16 Sep, 2021 9 commits
  3. 15 Sep, 2021 12 commits
    • rcu-tasks: Update comments to cond_resched_tasks_rcu_qs() · 8af9e2c7
      Paul E. McKenney authored
      The cond_resched_rcu_qs() function no longer exists, despite being mentioned
      several times in kernel/rcu/tasks.h.  This commit therefore updates those
      mentions to refer to the current cond_resched_tasks_rcu_qs().
      Reported-by: Neeraj Upadhyay <neeraju@codeaurora.org>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu-tasks: Fix IPI failure handling in trc_wait_for_one_reader · 46aa886c
      Neeraj Upadhyay authored
      The trc_wait_for_one_reader() function is called at multiple stages
      of the trace rcu-tasks GP function, rcu_tasks_wait_gp():
      
      - First, it is called as part of the per-task function,
        rcu_tasks_trace_pertask(), for all non-idle tasks. As part of this
        per-task processing, the function adds the task to the holdout list
        and, if the task is currently running on a CPU, sends an IPI to that
        CPU. The IPI handler takes action depending on whether the task is in
        a trace rcu-tasks read-side critical section:
      
        - a. If the task is in a trace rcu-tasks read-side critical section
             (t->trc_reader_nesting != 0), the IPI handler sets the task's
             ->trc_reader_special.b.need_qs, so that the task notifies the
             GP handling function when it exits its outermost read-side
             critical section (by decrementing trc_n_readers_need_end).
             trc_wait_for_one_reader() also increments trc_n_readers_need_end,
             so that the trace rcu-tasks GP handler waits for this task's
             read-side exit notification. The IPI handler also sets
             t->trc_reader_checked to true, so no further IPIs are sent for
             this task during this trace rcu-tasks grace period, and the
             task can be removed from the holdout list.
      
        - b. If the task is in the process of exiting its trace rcu-tasks
             read-side critical section (t->trc_reader_nesting < 0), defer
             this task's processing to future calls to trc_wait_for_one_reader().
      
        - c. If the task is not in a trace rcu-tasks read-side critical section
             (t->trc_reader_nesting == 0), ->trc_reader_checked is set for the
             task, so that it is removed from the holdout list.
      
      - Second, trc_wait_for_one_reader() is called as part of the post scan,
        in rcu_tasks_trace_postscan(), for all idle tasks.
      
      - Third, in check_all_holdout_tasks_trace(), trc_wait_for_one_reader()
        is called for each task in the holdout list, but only if there isn't
        a pending IPI for the task (->trc_ipi_to_cpu == -1). This call removes
        the task from the holdout list if the IPI handler has completed the
        required work, ensuring that the current trace rcu-tasks grace period
        either waits for this task or this task is not in a trace rcu-tasks
        read-side critical section.
      
      Now consider the scenario where smp_call_function_single() fails in the
      first case, inside rcu_tasks_trace_pertask(). In this case,
      ->trc_ipi_to_cpu remains set to the CPU on which that task was running.
      This causes trc_wait_for_one_reader() to be skipped in the third case,
      inside check_all_holdout_tasks_trace(), for this task. As a result,
      ->trc_reader_checked never gets set for the task, and the task is never
      removed from the holdout list. This can cause the current trace
      rcu-tasks grace period to stall.
      
      Fix the above problem by resetting ->trc_ipi_to_cpu to -1 on
      smp_call_function_single() failure, so that future IPIs can be sent
      for this task.
      
      Note that all three of the trc_wait_for_one_reader() function's
      callers (rcu_tasks_trace_pertask(), rcu_tasks_trace_postscan(),
      check_all_holdout_tasks_trace()) hold cpus_read_lock().  This means
      that smp_call_function_single() cannot race with CPU hotplug, and thus
      should never fail.  Therefore, also add a warning in order to report
      any such failure in case smp_call_function_single() grows some other
      reason for failure.
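      
      In sketch form, the shape of this fix (not the verbatim patch; the field
      and function names are those used above):
      
      	// On IPI failure, reset ->trc_ipi_to_cpu so that
      	// check_all_holdout_tasks_trace() will retry this task.
      	if (smp_call_function_single(cpu, trc_read_check_handler, t, 0)) {
      		WARN_ONCE(1, "%s(): smp_call_function_single() failed for CPU: %d\n",
      			  __func__, cpu);
      		t->trc_ipi_to_cpu = -1;
      	}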
      Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu-tasks: Fix read-side primitives comment for call_rcu_tasks_trace · ed42c380
      Neeraj Upadhyay authored
      call_rcu_tasks_trace() does have read-side primitives, namely
      rcu_read_lock_trace() and rcu_read_unlock_trace(). Fix this information
      in the comments.
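      
      For reference, a minimal sketch of those read-side primitives in use
      (the protected pointer and the reader-side work are hypothetical):
      
      	rcu_read_lock_trace();
      	p = rcu_dereference(gp);   // "gp" is a hypothetical trace-RCU-protected pointer.
      	do_something_with(p);      // Hypothetical reader-side work.
      	rcu_read_unlock_trace();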
      Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu-tasks: Clarify read side section info for rcu_tasks_rude GP primitives · a6517e9c
      Neeraj Upadhyay authored
      The RCU tasks rude variant does not check whether the context currently
      running on a CPU is usermode. A read-side critical section ends on any
      transition to usermode execution, by virtue of usermode execution being
      schedulable. Clarify this in the comments for call_rcu_tasks_rude() and
      synchronize_rcu_tasks_rude().
      Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu-tasks: Correct comparisons for CPU numbers in show_stalled_task_trace · d39ec8f3
      Neeraj Upadhyay authored
      Valid CPU numbers can be zero or greater, but the checks of
      ->trc_ipi_to_cpu and of tick_nohz_full_cpu()'s argument are for strictly
      greater than zero.  This commit therefore corrects the nohz_full CPU
      check in show_stalled_task_trace() so as to include CPU 0.
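      
      A hedged sketch of the corrected comparison (illustrative, not the
      verbatim patch):
      
      	// CPU 0 is a valid CPU, so compare with ">= 0" rather than "> 0".
      	if (cpu >= 0 && tick_nohz_full_cpu(cpu))   // Was: cpu > 0 && ...
      		sched_show_task(t);                // Illustrative stall-report action.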
      Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu-tasks: Correct firstreport usage in check_all_holdout_tasks_trace · 89401176
      Neeraj Upadhyay authored
      In check_all_holdout_tasks_trace(), firstreport is a pointer argument,
      so check the dereferenced value instead of the pointer itself.
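      
      The shape of the fix, in sketch form (the message text is illustrative):
      
      	// Check the flag that firstreport points to, not the pointer itself.
      	if (*firstreport) {   // Was: if (firstreport)
      		pr_err("INFO: rcu_tasks_trace detected stalls on tasks:\n");  // Illustrative.
      		*firstreport = false;
      	}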
      Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu-tasks: Move RTGS_WAIT_CBS to beginning of rcu_tasks_kthread() loop · 0db7c32a
      Paul E. McKenney authored
      Early in debugging, it made some sense to differentiate the first
      iteration from subsequent iterations, but now this just causes confusion.
      This commit therefore moves the "set_tasks_gp_state(rtp, RTGS_WAIT_CBS)"
      statement to the beginning of the "for" loop in rcu_tasks_kthread().
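      
      A sketch of the resulting loop structure (body elided):
      
      	for (;;) {
      		set_tasks_gp_state(rtp, RTGS_WAIT_CBS);  // Now first in every iteration.
      		/* Wait for callbacks, run a grace period, invoke callbacks... */
      	}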
      Reported-by: Neeraj Upadhyay <neeraju@codeaurora.org>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu-tasks: Remove second argument of rcu_read_unlock_trace_special() · a5c071cc
      Paul E. McKenney authored
      The second argument of rcu_read_unlock_trace_special() is always zero.
      When called from exit_tasks_rcu_finish_trace(), it is the constant
      zero, and rcu_read_unlock_trace_special() doesn't get called from
      rcu_read_unlock_trace() unless the value of local variable "nesting"
      is zero, because otherwise the early return is taken instead.
      
      This commit therefore removes the "nesting" argument from the
      rcu_read_unlock_trace_special() function, substituting the constant
      zero within that function.  This commit also adds a WARN_ON_ONCE()
      to rcu_read_lock_trace_held() in case non-zeroness some day appears.
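      
      The resulting signature change, in sketch form:
      
      	/* Before: */
      	void rcu_read_unlock_trace_special(struct task_struct *t, int nesting);
      	/* After ("nesting" is now the constant zero within the function): */
      	void rcu_read_unlock_trace_special(struct task_struct *t);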
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu-tasks: Add trc_inspect_reader() checks for exiting critical section · 18f08e75
      Paul E. McKenney authored
      Currently, trc_inspect_reader() treats a task exiting its RCU Tasks
      Trace read-side critical section the same as being within that critical
      section.  However, this can fail because that task might have already
      checked its .need_qs field, which means that it might never decrement
      the all-important trc_n_readers_need_end counter.  Of course, for that
      to happen, the task would need to never again execute an RCU Tasks Trace
      read-side critical section, but this really could happen if the system's
      last trampoline was removed.  Note that exit from such a critical section
      cannot be treated as a quiescent state due to the possibility of nested
      critical sections.  This means that if trc_inspect_reader() sees a
      negative nesting value, it must set up to try again later.
      
      This commit therefore ignores tasks that are exiting their RCU Tasks
      Trace read-side critical sections so that they will be rechecked later.
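      
      A hedged sketch of the added check (names follow the commit text;
      surrounding logic elided):
      
      	int nesting = READ_ONCE(t->trc_reader_nesting);
      
      	// A negative value means the task is exiting its critical section;
      	// report failure so that the task is rechecked later.
      	if (nesting < 0)
      		return false;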
      
      [ paulmck: Apply feedback from Neeraj Upadhyay and Boqun Feng. ]
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu-tasks: Simplify trc_read_check_handler() atomic operations · 96017bf9
      Paul E. McKenney authored
      Currently, trc_wait_for_one_reader() atomically increments
      the trc_n_readers_need_end counter before sending the IPI
      invoking trc_read_check_handler().  All failure paths out of
      trc_read_check_handler() and also from the smp_call_function_single()
      within trc_wait_for_one_reader() must carefully atomically decrement
      this counter.  This is more complex than it needs to be.
      
      This commit therefore simplifies things and saves a few lines of
      code by dispensing with the atomic decrements in favor of having
      trc_read_check_handler() do the atomic increment only in the success case.
      In theory, this represents no change in functionality.
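      
      A hedged sketch of the simplified handler logic (illustrative, not the
      verbatim patch):
      
      	// Increment only in the success case, rather than incrementing
      	// beforehand and decrementing on every failure path.
      	if (READ_ONCE(t->trc_reader_nesting)) {
      		atomic_inc(&trc_n_readers_need_end);  // Reader present: GP must wait.
      		WRITE_ONCE(t->trc_reader_special.b.need_qs, true);
      	}
      	WRITE_ONCE(t->trc_reader_checked, true);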
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
  4. 13 Sep, 2021 16 commits
    • torture: Make torture.sh print the number of files to be compressed · b380b10b
      Paul E. McKenney authored
      Compressing gigabyte vmlinux files can take some time, and it can be a
      bit annoying to not know how many more batches of compression there will
      be.  This commit therefore makes torture.sh print the number of files to
      be compressed just before starting compression and just after compression
      completes.
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcutorture: Avoid problematic critical section nesting on PREEMPT_RT · 71921a96
      Scott Wood authored
      rcutorture generates some nesting scenarios that are not compatible
      with PREEMPT_RT.  For example:
      	preempt_disable();
      	rcu_read_lock_bh();
      	preempt_enable();
      	rcu_read_unlock_bh();
      
      The problem here is that on PREEMPT_RT the bottom halves have to be
      disabled and enabled in preemptible context.
      
      Reorder the locking: start with BH locking and only then disable
      preemption or interrupts. When unlocking, do the reverse: first enable
      interrupts and preemption, and release BH at the very end. Ensure that
      on PREEMPT_RT the BH locking remains unchanged if in non-preemptible
      context.
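      
      A sketch of the resulting ordering for the example above:
      
      	rcu_read_lock_bh();     // BH first, while still preemptible.
      	preempt_disable();
      	/* ... critical section ... */
      	preempt_enable();       // Innermost first on the way out.
      	rcu_read_unlock_bh();   // BH last, again in preemptible context.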
      
      Link: https://lkml.kernel.org/r/20190911165729.11178-6-swood@redhat.com
      Link: https://lkml.kernel.org/r/20210819182035.GF4126399@paulmck-ThinkPad-P17-Gen-1
      Signed-off-by: Scott Wood <swood@redhat.com>
      [bigeasy: Drop ATOM_BH, make it only about changing BH in atomic
      context. Allow enabling RCU in IRQ-off section. Reword commit message.]
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcutorture: Don't cpuhp_remove_state() if cpuhp_setup_state() failed · fd13fe16
      Paul E. McKenney authored
      Currently, in CONFIG_RCU_BOOST kernels, if the rcu_torture_init()
      function's call to cpuhp_setup_state() fails, rcu_torture_cleanup()
      gamely passes nonsense to cpuhp_remove_state().  This results in
      strange and misleading splats.  This commit therefore ensures that if
      the rcu_torture_init() function's call to cpuhp_setup_state() fails,
      rcu_torture_cleanup() avoids invoking cpuhp_remove_state().
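      
      A hedged sketch of the guard (the state-holding variable and the
      callback names are assumptions, not the verbatim patch):
      
      	static int rcutor_hp = -1;  // Assumed holder for the cpuhp state.
      
      	// In rcu_torture_init():
      	err = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "RCU_TORTURE",
      				rcutorture_booster_init, rcutorture_booster_cleanup);
      	if (err >= 0)
      		rcutor_hp = err;  // Remember the dynamically allocated state.
      
      	// In rcu_torture_cleanup(): remove only what was actually set up.
      	if (rcutor_hp >= 0)
      		cpuhp_remove_state(rcutor_hp);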
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcuscale: Warn on individual rcu_scale_init() error conditions · eb77abfd
      Paul E. McKenney authored
      When running rcuscale as a module, any rcu_scale_init() issues will be
      reflected in the error code from modprobe or insmod, as the case may be.
      However, these error codes are not available when running rcuscale
      built-in, for example, when using the kvm.sh script.  This commit
      therefore adds WARN_ON_ONCE() to allow distinguishing rcu_scale_init()
      errors when running rcuscale built-in.
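      
      A sketch of the pattern, which this and the following three commits
      apply to their respective init functions (the initialization site shown
      is illustrative):
      
      	// Warn at each failure site so that built-in runs report which
      	// initialization step failed.
      	firsterr = torture_create_kthread(rcu_scale_writer, NULL, writer_tasks[i]);
      	if (WARN_ON_ONCE(firsterr))
      		goto unwind;  // Unwind partially initialized state (label elided).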
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • refscale: Warn on individual ref_scale_init() error conditions · ed60ad73
      Paul E. McKenney authored
      When running refscale as a module, any ref_scale_init() issues will be
      reflected in the error code from modprobe or insmod, as the case may be.
      However, these error codes are not available when running refscale
      built-in, for example, when using the kvm.sh script.  This commit
      therefore adds WARN_ON_ONCE() to allow distinguishing ref_scale_init()
      errors when running refscale built-in.
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • locktorture: Warn on individual lock_torture_init() error conditions · b3b3cc61
      Paul E. McKenney authored
      When running locktorture as a module, any lock_torture_init() issues will be
      reflected in the error code from modprobe or insmod, as the case may be.
      However, these error codes are not available when running locktorture
      built-in, for example, when using the kvm.sh script.  This commit
      therefore adds WARN_ON_ONCE() to allow distinguishing lock_torture_init()
      errors when running locktorture built-in.
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcutorture: Warn on individual rcu_torture_init() error conditions · efeff6b3
      Paul E. McKenney authored
      When running rcutorture as a module, any rcu_torture_init() issues will be
      reflected in the error code from modprobe or insmod, as the case may be.
      However, these error codes are not available when running rcutorture
      built-in, for example, when using the kvm.sh script.  This commit
      therefore adds WARN_ON_ONCE() to allow distinguishing rcu_torture_init()
      errors when running rcutorture built-in.
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcutorture: Suppressing read-exit testing is not an error · fda84866
      Paul E. McKenney authored
      Currently, specifying the rcutorture.read_exit_burst=0 kernel boot
      parameter will result in a -EINVAL exit code that will stop the rcutorture
      test run before it has fully initialized.  This commit therefore uses a
      zero exit code in that case, thus allowing rcutorture.read_exit_burst=0
      to complete normally.
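      
      The shape of the fix, in sketch form:
      
      	// A zero burst size means read-exit testing is deliberately
      	// disabled, so report success rather than an error.
      	if (read_exit_burst <= 0)
      		return 0;  // Was: return -EINVAL;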
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu-tasks: Wait for trc_read_check_handler() IPIs · cbe0d8d9
      Paul E. McKenney authored
      Currently, RCU Tasks Trace initializes the trc_n_readers_need_end counter
      to the value one, increments it before each trc_read_check_handler()
      IPI, then decrements it within trc_read_check_handler() if the target
      task was in a quiescent state (or if the target task moved to some other
      CPU while the IPI was in flight), complaining if the new value was zero.
      The rationale for complaining is that the initial value of one must be
      decremented away before zero can be reached, and this decrement has not
      yet happened.
      
      Except that trc_read_check_handler() is initiated with an asynchronous
      smp_call_function_single(), which might be significantly delayed.  This
      can result in false-positive complaints about the counter reaching zero.
      
      This commit therefore waits for in-flight IPI handlers to complete before
      decrementing away the initial value of one from the trc_n_readers_need_end
      counter.
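      
      A heavily hedged sketch of the waiting step (the per-CPU in-flight flag
      and the polling loop are assumptions about the mechanism, not the
      verbatim patch):
      
      	int cpu;
      
      	// Poll until no trc_read_check_handler() IPIs remain in flight; only
      	// then is it safe to decrement away the initial count of one.
      	for_each_online_cpu(cpu)
      		while (READ_ONCE(per_cpu(trc_ipi_to_cpu, cpu)))  // Assumed per-CPU flag.
      			schedule_timeout_idle(1);
      	atomic_dec(&trc_n_readers_need_end);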
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu: Fix existing exp request check in sync_sched_exp_online_cleanup() · f0b2b2df
      Neeraj Upadhyay authored
      The sync_sched_exp_online_cleanup() function checks to see if RCU needs
      an expedited quiescent state from the incoming CPU, sending it an IPI
      if so. Before sending the IPI, it checks whether an expedited quiescent
      state has already been requested for the incoming CPU, by checking
      rcu_data.cpu_no_qs.b.exp for the current CPU, that is, the CPU on which
      sync_sched_exp_online_cleanup() is running. This works when the incoming
      CPU is the same as the current CPU. However, when the incoming CPU is
      different from the current CPU, the expedited request won't get marked,
      which can potentially delay reporting of the expedited quiescent state
      for the incoming CPU.
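      
      A hedged sketch of the corrected check (field names per the commit text;
      surrounding logic elided):
      
      	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
      
      	// Consult the incoming CPU's flag, not the current CPU's.
      	if (READ_ONCE(rdp->cpu_no_qs.b.exp))  // Was: __this_cpu_read(...)
      		return;  // Expedited QS already requested for the incoming CPU.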
      
      Fixes: e015a341 ("rcu: Avoid self-IPI in sync_sched_exp_online_cleanup()")
      Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu: Make rcu update module parameters world-readable · 1eac0075
      Juri Lelli authored
      The rcu update module's parameters currently don't appear in sysfs,
      which is a serviceability issue: it might be necessary to access their
      default values at runtime.
      
      Fix this issue by changing the permissions of the rcu update module
      parameters to world-readable.
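      
      A sketch of the change for one such parameter (0444 being the
      conventional world-readable mode; the exact parameter list is per the
      kernel source):
      
      	// World-readable: visible under /sys/module/rcupdate/parameters/.
      	module_param(rcu_expedited, int, 0444);  // Was: module_param(rcu_expedited, int, 0);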
      Suggested-by: Paul E. McKenney <paulmck@kernel.org>
      Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu: Make rcu_normal_after_boot writable again · ebb6d30d
      Juri Lelli authored
      Certain configurations (e.g., systems that make heavy use of netns)
      need to use synchronize_rcu_expedited() to service RCU grace periods
      even after boot.
      
      Even though synchronize_rcu_expedited() has traditionally been
      considered harmful for RT because of its heavy use of IPIs, it is
      perfectly usable under certain conditions (e.g. nohz_full).
      
      Make rcupdate.rcu_normal_after_boot= writable again on RT (if NO_HZ_FULL
      is defined), but keep its default value of 1 (enabled) to avoid
      regressions. Users who need synchronize_rcu_expedited() will boot with
      rcupdate.rcu_normal_after_boot=0 in the kernel cmdline.
      
      Reflect the change in synchronize_rcu_expedited_wait() by removing the
      WARN related to CONFIG_PREEMPT_RT.
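      
      A hedged sketch of the resulting permissions (the exact #if condition
      is an assumption based on the description above):
      
      	#if defined(CONFIG_PREEMPT_RT) && !defined(CONFIG_NO_HZ_FULL)
      	module_param(rcu_normal_after_boot, int, 0444);  // Read-only on RT without NO_HZ_FULL.
      	#else
      	module_param(rcu_normal_after_boot, int, 0644);  // Writable again otherwise.
      	#endif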
      Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu: Make rcutree_dying_cpu() use its "cpu" parameter · 4aa846f9
      Paul E. McKenney authored
      The CPU-hotplug functions take a "cpu" parameter, but rcutree_dying_cpu()
      ignores it in favor of this_cpu_ptr().  This works at the moment, but
      it would be better to be consistent.  This might also work better given
      some possible future changes.  This commit therefore uses per_cpu_ptr()
      to avoid ignoring the rcutree_dying_cpu() function's argument.
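      
      The shape of the change, in sketch form:
      
      	// Derive rcu_data from the "cpu" argument rather than the running CPU.
      	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);  // Was: this_cpu_ptr(&rcu_data)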
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu: Simplify rcu_report_dead() call to rcu_report_exp_rdp() · 768f5d50
      Paul E. McKenney authored
      Currently, rcu_report_dead() disables preemption across its call to
      rcu_report_exp_rdp(), but this is pointless because interrupts are
      already disabled by the caller.  In addition, rcu_report_dead() computes
      the address of the outgoing CPU's rcu_data structure, which is also
      pointless because this address is already present in local variable rdp.
      This commit therefore drops the preemption disabling and passes rdp
      to rcu_report_exp_rdp().
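      
      Before and after, in sketch form:
      
      	/* Before: */
      	preempt_disable();
      	rcu_report_exp_rdp(this_cpu_ptr(&rcu_data));
      	preempt_enable();
      
      	/* After (interrupts are already disabled by the caller): */
      	rcu_report_exp_rdp(rdp);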
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu: Move rcu_dynticks_eqs_online() to rcu_cpu_starting() · 2caebefb
      Paul E. McKenney authored
      The purpose of rcu_dynticks_eqs_online() is to adjust the ->dynticks
      counter of an incoming CPU when required.  It is currently invoked
      from rcutree_prepare_cpu(), which runs before the incoming CPU is
      running, and thus on some other CPU.  This makes the per-CPU accesses in
      rcu_dynticks_eqs_online() iffy at best, and it all "works" only because
      the running CPU cannot possibly be in dyntick-idle mode, which means
      that rcu_dynticks_eqs_online() never has any effect.
      
      It is currently OK for rcu_dynticks_eqs_online() to have no effect, but
      only because the CPU-offline process just happens to leave ->dynticks in
      the correct state.  After all, if ->dynticks were in the wrong state on a
      just-onlined CPU, rcutorture would complain bitterly the next time that
      CPU went idle, at least in kernels built with CONFIG_RCU_EQS_DEBUG=y,
      for example, those built by rcutorture scenario TREE04.  One could
      argue that this means that rcu_dynticks_eqs_online() is unnecessary,
      however, removing it would make the CPU-online process vulnerable to
      slight changes in the CPU-offline process.
      
      One could also ask why it is safe to move the rcu_dynticks_eqs_online()
      call so late in the CPU-online process.  Indeed, there was a time when it
      would not have been safe, which does much to explain its current location.
      However, the marking of a CPU as online from an RCU perspective has long
      since moved from rcutree_prepare_cpu() to rcu_cpu_starting(), and all
      that is required is that ->dynticks be set correctly by the time that
      the CPU is marked as online from an RCU perspective.  After all, the RCU
      grace-period kthread does not check to see if offline CPUs are also idle.
      (In case you were curious, this is one reason why there is quiescent-state
      reporting as part of the offlining process.)
      
      This commit therefore moves the call to rcu_dynticks_eqs_online() from
      rcutree_prepare_cpu() to rcu_cpu_starting(), this latter being guaranteed
      to be running on the incoming CPU.  The call to this function must of
      course be placed before this rcu_cpu_starting() announces this CPU's
      presence to RCU.
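      
      A sketch of the call sites after the move (function bodies elided):
      
      	int rcutree_prepare_cpu(unsigned int cpu)
      	{
      		/* ... rcu_dynticks_eqs_online() is no longer called here ... */
      		return 0;
      	}
      
      	void rcu_cpu_starting(unsigned int cpu)
      	{
      		rcu_dynticks_eqs_online();  // Runs on the incoming CPU, before
      					    // RCU marks that CPU as online.
      		/* ... then announce this CPU's presence to RCU ... */
      	}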
      Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu: Comment rcu_gp_init() code waiting for CPU-hotplug operations · ebc88ad4
      Paul E. McKenney authored
      Near the beginning of rcu_gp_init() is a per-rcu_node loop that waits
      for CPU-hotplug operations that might have started before the new
      grace period did.  This commit adds a comment explaining that this
      wait does not exclude CPU-hotplug operations.
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>