An error occurred fetching the project authors.
  1. 12 Mar, 2014 3 commits
  2. 31 Aug, 2013 2 commits
    • Paul E. McKenney's avatar
      nohz_full: Force RCU's grace-period kthreads onto timekeeping CPU · eb75767b
      Paul E. McKenney authored
      Because RCU's quiescent-state-forcing mechanism is used to drive the
      full-system-idle state machine, and because this mechanism is executed
      by RCU's grace-period kthreads, this commit forces these kthreads to
      run on the timekeeping CPU (tick_do_timer_cpu).  To do otherwise would
      mean that the RCU grace-period kthreads would force the system into
      non-idle state every time they drove the state machine, which would
      be just a bit on the futile side.
      Signed-off-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Reviewed-by: default avatarJosh Triplett <josh@joshtriplett.org>
      eb75767b
    • Paul E. McKenney's avatar
      nohz_full: Add full-system-idle state machine · 0edd1b17
      Paul E. McKenney authored
      This commit adds the state machine that takes the per-CPU idle data
      as input and produces a full-system-idle indication as output.  This
      state machine is driven out of RCU's quiescent-state-forcing
      mechanism, which invokes rcu_sysidle_check_cpu() to collect per-CPU
      idle state and then rcu_sysidle_report() to drive the state machine.
      
      The full-system-idle state is sampled using rcu_sys_is_idle(), which
      also drives the state machine if RCU is idle (and does so by forcing
      RCU to become non-idle).  This function returns true if all but the
      timekeeping CPU (tick_do_timer_cpu) are idle and have been idle long
      enough to avoid memory contention on the full_sysidle_state state
      variable.  The rcu_sysidle_force_exit() may be called externally
      to reset the state machine back into non-idle state.
      
      For large systems the state machine is driven out of RCU's
      force-quiescent-state logic, which provides good scalability at the price
      of millisecond-scale latencies on the transition to full-system-idle
      state.  This is not so good for battery-powered systems, which are usually
      small enough that they don't need to care about scalability, but which
      do care deeply about energy efficiency.  Small systems therefore drive
      the state machine directly out of the idle-entry code.  The number of
      CPUs in a "small" system is defined by a new NO_HZ_FULL_SYSIDLE_SMALL
      Kconfig parameter, which defaults to 8.  Note that this is a build-time
      definition.
      Signed-off-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      [ paulmck: Use true and false for boolean constants per Lai Jiangshan. ]
      Reviewed-by: default avatarJosh Triplett <josh@joshtriplett.org>
      [ paulmck: Simplify logic and provide better comments for memory barriers,
        based on review comments and questions by Lai Jiangshan. ]
      0edd1b17
  3. 19 Aug, 2013 3 commits
    • Paul E. McKenney's avatar
      nohz_full: Add full-system idle states and variables · d4bd54fb
      Paul E. McKenney authored
      This commit adds control variables and states for full-system idle.
      The system will progress through the states in numerical order when
      the system is fully idle (other than the timekeeping CPU), and reset
      down to the initial state if any non-timekeeping CPU goes non-idle.
      The current state is kept in full_sysidle_state.
      
      One flavor of RCU will be in charge of driving the state machine,
      defined by rcu_sysidle_state.  This should be the busiest flavor of RCU.
      Signed-off-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Reviewed-by: default avatarJosh Triplett <josh@joshtriplett.org>
      d4bd54fb
    • Paul E. McKenney's avatar
      nohz_full: Add per-CPU idle-state tracking · eb348b89
      Paul E. McKenney authored
      This commit adds the code that updates the rcu_dyntick structure's
      new fields to track the per-CPU idle state based on interrupts and
      transitions into and out of the idle loop (NMIs are ignored because NMI
      handlers cannot cleanly read out the time anyway).  This code is similar
      to the code that maintains RCU's idea of per-CPU idleness, but differs
      in that RCU treats CPUs running in user mode as idle, where this new
      code does not.
      Signed-off-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Reviewed-by: default avatarJosh Triplett <josh@joshtriplett.org>
      eb348b89
    • Paul E. McKenney's avatar
      nohz_full: Add rcu_dyntick data for scalable detection of all-idle state · 2333210b
      Paul E. McKenney authored
      This commit adds fields to the rcu_dyntick structure that are used to
      detect idle CPUs.  These new fields differ from the existing ones in
      that the existing ones consider a CPU executing in user mode to be idle,
      where the new ones consider CPUs executing in user mode to be busy.
      The handling of these new fields is otherwise quite similar to that for
      the exiting fields.  This commit also adds the initialization required
      for these fields.
      
      So, why is usermode execution treated differently, with RCU considering
      it a quiescent state equivalent to idle, while in contrast the new
      full-system idle state detection considers usermode execution to be
      non-idle?
      
      It turns out that although one of RCU's quiescent states is usermode
      execution, it is not a full-system idle state.  This is because the
      purpose of the full-system idle state is not RCU, but rather determining
      when accurate timekeeping can safely be disabled.  Whenever accurate
      timekeeping is required in a CONFIG_NO_HZ_FULL kernel, at least one
      CPU must keep the scheduling-clock tick going.  If even one CPU is
      executing in user mode, accurate timekeeping is requires, particularly for
      architectures where gettimeofday() and friends do not enter the kernel.
      Only when all CPUs are really and truly idle can accurate timekeeping be
      disabled, allowing all CPUs to turn off the scheduling clock interrupt,
      thus greatly improving energy efficiency.
      
      This naturally raises the question "Why is this code in RCU rather than in
      timekeeping?", and the answer is that RCU has the data and infrastructure
      to efficiently make this determination.
      Signed-off-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Reviewed-by: default avatarJosh Triplett <josh@joshtriplett.org>
      2333210b
  4. 29 Jul, 2013 2 commits
    • Steven Rostedt (Red Hat)'s avatar
      rcu: Have the RCU tracepoints use the tracepoint_string infrastructure · f7f7bac9
      Steven Rostedt (Red Hat) authored
      Currently, RCU tracepoints save only a pointer to strings in the
      ring buffer. When displayed via the /sys/kernel/debug/tracing/trace file
      they are referenced like the printf "%s" that looks at the address
      in the ring buffer and prints out the string it points too. This requires
      that the strings are constant and persistent in the kernel.
      
      The problem with this is for tools like trace-cmd and perf that read the
      binary data from the buffers but have no access to the kernel memory to
      find out what string is represented by the address in the buffer.
      
      By using the tracepoint_string infrastructure, the RCU tracepoint strings
      can be exported such that userspace tools can map the addresses to
      the strings.
      
       # cat /sys/kernel/debug/tracing/printk_formats
      0xffffffff81a4a0e8 : "rcu_preempt"
      0xffffffff81a4a0f4 : "rcu_bh"
      0xffffffff81a4a100 : "rcu_sched"
      0xffffffff818437a0 : "cpuqs"
      0xffffffff818437a6 : "rcu_sched"
      0xffffffff818437a0 : "cpuqs"
      0xffffffff818437b0 : "rcu_bh"
      0xffffffff818437b7 : "Start context switch"
      0xffffffff818437cc : "End context switch"
      0xffffffff818437a0 : "cpuqs"
      [...]
      
      Now userspaces tools can display:
      
       rcu_utilization:      Start context switch
       rcu_dyntick:          Start 1 0
       rcu_utilization:      End context switch
       rcu_batch_start:      rcu_preempt CBs=0/5 bl=10
       rcu_dyntick:          End 0 140000000000000
       rcu_invoke_callback:  rcu_preempt rhp=0xffff880071c0d600 func=proc_i_callback
       rcu_invoke_callback:  rcu_preempt rhp=0xffff880077b5b230 func=__d_free
       rcu_dyntick:          Start 140000000000000 0
       rcu_invoke_callback:  rcu_preempt rhp=0xffff880077563980 func=file_free_rcu
       rcu_batch_end:        rcu_preempt CBs-invoked=3 idle=>c<>c<>c<>c<
       rcu_utilization:      End RCU core
       rcu_grace_period:     rcu_preempt 9741 start
       rcu_dyntick:          Start 1 0
       rcu_dyntick:          End 0 140000000000000
       rcu_dyntick:          Start 140000000000000 0
      
      Instead of:
      
       rcu_utilization:      ffffffff81843110
       rcu_future_grace_period: ffffffff81842f1d 9939 9939 9940 0 0 3 ffffffff81842f32
       rcu_batch_start:      ffffffff81842f1d CBs=0/4 bl=10
       rcu_future_grace_period: ffffffff81842f1d 9939 9939 9940 0 0 3 ffffffff81842f3c
       rcu_grace_period:     ffffffff81842f1d 9939 ffffffff81842f80
       rcu_invoke_callback:  ffffffff81842f1d rhp=0xffff88007888aac0 func=file_free_rcu
       rcu_grace_period:     ffffffff81842f1d 9939 ffffffff81842f95
       rcu_invoke_callback:  ffffffff81842f1d rhp=0xffff88006aeb4600 func=proc_i_callback
       rcu_future_grace_period: ffffffff81842f1d 9939 9939 9940 0 0 3 ffffffff81842f32
       rcu_future_grace_period: ffffffff81842f1d 9939 9939 9940 0 0 3 ffffffff81842f3c
       rcu_invoke_callback:  ffffffff81842f1d rhp=0xffff880071cb9fc0 func=__d_free
       rcu_grace_period:     ffffffff81842f1d 9939 ffffffff81842f80
       rcu_invoke_callback:  ffffffff81842f1d rhp=0xffff88007888ae80 func=file_free_rcu
       rcu_batch_end:        ffffffff81842f1d CBs-invoked=4 idle=>c<>c<>c<>c<
       rcu_utilization:      ffffffff8184311f
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      f7f7bac9
    • Steven Rostedt (Red Hat)'s avatar
      rcu: Simplify RCU_STATE_INITIALIZER() macro · a41bfeb2
      Steven Rostedt (Red Hat) authored
      The RCU_STATE_INITIALIZER() macro is used only in the rcutree.c file
      as well as the rcutree_plugin.h file. It is passed as a rvalue to
      a variable of a similar name. A per_cpu variable is also created
      with a similar name as well.
      
      The uses of RCU_STATE_INITIALIZER() can be simplified to remove some
      of the duplicate code that is done. Currently the three users of this
      macro has this format:
      
      struct rcu_state rcu_sched_state =
      	RCU_STATE_INITIALIZER(rcu_sched, call_rcu_sched);
      DEFINE_PER_CPU(struct rcu_data, rcu_sched_data);
      
      Notice that "rcu_sched" is called three times. This is the same with
      the other two users. This can be condensed to just:
      
      RCU_STATE_INITIALIZER(rcu_sched, call_rcu_sched);
      
      by moving the rest into the macro itself.
      
      This also opens the door to allow the RCU tracepoint strings and
      their addresses to be exported so that userspace tracing tools can
      translate the contents of the pointers of the RCU tracepoints.
      The change will allow for helper code to be placed in the
      RCU_STATE_INITIALIZER() macro to export the name that is used.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      a41bfeb2
  5. 14 Jul, 2013 1 commit
    • Paul Gortmaker's avatar
      rcu: delete __cpuinit usage from all rcu files · 49fb4c62
      Paul Gortmaker authored
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  For example, the fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      
      This removes all the drivers/rcu uses of the __cpuinit macros
      from all C files.
      
      [1] https://lkml.org/lkml/2013/5/20/589
      
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@freedesktop.org>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Reviewed-by: default avatarJosh Triplett <josh@joshtriplett.org>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      49fb4c62
  6. 10 Jun, 2013 4 commits
  7. 15 May, 2013 1 commit
  8. 14 May, 2013 1 commit
  9. 19 Apr, 2013 1 commit
    • Frederic Weisbecker's avatar
      nohz: Ensure full dynticks CPUs are RCU nocbs · d1e43fa5
      Frederic Weisbecker authored
      We need full dynticks CPU to also be RCU nocb so
      that we don't have to keep the tick to handle RCU
      callbacks.
      
      Make sure the range passed to nohz_full= boot
      parameter is a subset of rcu_nocbs=
      
      The CPUs that fail to meet this requirement will be
      excluded from the nohz_full range. This is checked
      early in boot time, before any CPU has the opportunity
      to stop its tick.
      Suggested-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Reviewed-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Kevin Hilman <khilman@linaro.org>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      d1e43fa5
  10. 15 Apr, 2013 1 commit
    • Paul E. McKenney's avatar
      rcu: Kick adaptive-ticks CPUs that are holding up RCU grace periods · 65d798f0
      Paul E. McKenney authored
      Adaptive-ticks CPUs inform RCU when they enter kernel mode, but they do
      not necessarily turn the scheduler-clock tick back on.  This state of
      affairs could result in RCU waiting on an adaptive-ticks CPU running
      for an extended period in kernel mode.  Such a CPU will never run the
      RCU state machine, and could therefore indefinitely extend the RCU state
      machine, sooner or later resulting in an OOM condition.
      
      This patch, inspired by an earlier patch by Frederic Weisbecker, therefore
      causes RCU's force-quiescent-state processing to check for this condition
      and to send an IPI to CPUs that remain in that state for too long.
      "Too long" currently means about three jiffies by default, which is
      quite some time for a CPU to remain in the kernel without blocking.
      The rcu_tree.jiffies_till_first_fqs and rcutree.jiffies_till_next_fqs
      sysfs variables may be used to tune "too long" if needed.
      Reported-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: default avatarJosh Triplett <josh@joshtriplett.org>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Gilad Ben Yossef <gilad@benyossef.com>
      Cc: Hakan Akkan <hakanakkan@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Kevin Hilman <khilman@linaro.org>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      65d798f0
  11. 26 Mar, 2013 11 commits
  12. 13 Mar, 2013 1 commit
  13. 12 Mar, 2013 1 commit
    • Paul E. McKenney's avatar
      rcu: Remove restrictions on no-CBs CPUs · 34ed6246
      Paul E. McKenney authored
      Currently, CPU 0 is constrained to not be a no-CBs CPU, and furthermore
      at least one no-CBs CPU must remain online at any given time.  These
      restrictions are problematic in some situations, such as cases where
      all CPUs must run a real-time workload that needs to be insulated from
      OS jitter and latencies due to RCU callback invocation.  This commit
      therefore provides no-CBs CPUs a (very crude and energy-inefficient)
      way to start and to wait for grace periods independently of the normal
      RCU callback mechanisms.  This approach allows any or all of the CPUs to
      be designated as no-CBs CPUs, and allows any proper subset of the CPUs
      (whether no-CBs CPUs or not) to be offlined.
      
      This commit also provides a fix for a locking bug spotted by Xie
      ChanglongX <changlongx.xie@intel.com>.
      Signed-off-by: default avatarPaul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      34ed6246
  14. 08 Jan, 2013 2 commits
    • Paul Gortmaker's avatar
      rcu: Make rcu_nocb_poll an early_param instead of module_param · 1b0048a4
      Paul Gortmaker authored
      The as-documented rcu_nocb_poll will fail to enable this feature
      for two reasons.  (1) there is an extra "s" in the documented
      name which is not in the code, and (2) since it uses module_param,
      it really is expecting a prefix, akin to "rcutree.fanout_leaf"
      and the prefix isn't documented.
      
      However, there are several reasons why we might not want to
      simply fix the typo and add the prefix:
      
      1) we'd end up with rcutree.rcu_nocb_poll, and rather probably make
      a change to rcutree.nocb_poll
      
      2) if we did #1, then the prefix wouldn't be consistent with the
      rcu_nocbs=<cpumap> parameter (i.e. one with, one without prefix)
      
      3) the use of module_param in a header file is less than desired,
      since it isn't immediately obvious that it will get processed
      via rcutree.c and get the prefix from that (although use of
      module_param_named() could clarify that.)
      
      4) the implied export of /sys/module/rcutree/parameters/rcu_nocb_poll
      data to userspace via module_param() doesn't really buy us anything,
      as it is read-only and we can tell if it is enabled already without
      it, since there is a printk at early boot telling us so.
      
      In light of all that, just change it from a module_param() to an
      early_setup() call, and worry about adding it to /sys later on if
      we decide to allow a dynamic setting of it.
      
      Also change the variable to be tagged as read_mostly, since it
      will only ever be fiddled with at most, once at boot.
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      1b0048a4
    • Paul Gortmaker's avatar
      rcu: Prevent soft-lockup complaints about no-CBs CPUs · 353af9c9
      Paul Gortmaker authored
      The wait_event() at the head of the rcu_nocb_kthread() can result in
      soft-lockup complaints if the CPU in question does not register RCU
      callbacks for an extended period.  This commit therefore changes
      the wait_event() to a wait_event_interruptible().
      Reported-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      353af9c9
  15. 16 Nov, 2012 2 commits
    • Paul E. McKenney's avatar
      rcu: Separate accounting of callbacks from callback-free CPUs · c635a4e1
      Paul E. McKenney authored
      Currently, callback invocations from callback-free CPUs are accounted to
      the CPU that registered the callback, but using the same field that is
      used for normal callbacks.  This makes it impossible to determine from
      debugfs output whether callbacks are in fact being diverted.  This commit
      therefore adds a separate ->n_nocbs_invoked field in the rcu_data structure
      in which diverted callback invocations are counted.  RCU's debugfs tracing
      still displays normal callback invocations using ci=, but displayed
      diverted callbacks with nci=.
      Signed-off-by: default avatarPaul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      c635a4e1
    • Paul E. McKenney's avatar
      rcu: Add callback-free CPUs · 3fbfbf7a
      Paul E. McKenney authored
      RCU callback execution can add significant OS jitter and also can
      degrade both scheduling latency and, in asymmetric multiprocessors,
      energy efficiency.  This commit therefore adds the ability for selected
      CPUs ("rcu_nocbs=" boot parameter) to have their callbacks offloaded
      to kthreads.  If the "rcu_nocb_poll" boot parameter is also specified,
      these kthreads will do polling, removing the need for the offloaded
      CPUs to do wakeups.  At least one CPU must be doing normal callback
      processing: currently CPU 0 cannot be selected as a no-CBs CPU.
      In addition, attempts to offline the last normal-CBs CPU will fail.
      
      This feature was inspired by Jim Houston's and Joe Korty's JRCU, and
      this commit includes fixes to problems located by Fengguang Wu's
      kbuild test robot.
      
      [ paulmck: Added gfp.h include file as suggested by Fengguang Wu. ]
      Signed-off-by: default avatarPaul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      3fbfbf7a
  16. 13 Nov, 2012 1 commit
  17. 08 Nov, 2012 1 commit
  18. 23 Oct, 2012 1 commit
  19. 26 Sep, 2012 1 commit