An error occurred fetching the project authors.
- 12 Mar, 2014 3 commits
-
-
Paul E. McKenney authored
commit c229828c upstream. The rcu_try_advance_all_cbs() function is invoked on each attempted entry to and every exit from idle. If this function determines that there are callbacks ready to invoke, the caller will invoke the RCU core, which in turn will result in a pair of context switches. If a CPU enters and exits idle extremely frequently, this can result in an excessive number of context switches and high CPU overhead. This commit therefore causes rcu_try_advance_all_cbs() to throttle itself, refusing to do work more than once per jiffy. Reported-by:
Tibor Billes <tbilles@gmx.com> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by:
Tibor Billes <tbilles@gmx.com> Reviewed-by:
Josh Triplett <josh@joshtriplett.org> Signed-off-by:
Jiri Slaby <jslaby@suse.cz>
-
Paul E. McKenney authored
commit c337f8f5 upstream. If a non-lazy callback arrives on a CPU that has previously gone idle with no non-lazy callbacks, invoke_rcu_core() forces the RCU core to run. However, it does not update the conditions, which could result in several closely spaced invocations of the RCU core, which in turn could result in an excessively high context-switch rate and resulting high overhead. This commit therefore updates the ->all_lazy and ->nonlazy_posted_snap fields to prevent closely spaced invocations. Reported-by:
Tibor Billes <tbilles@gmx.com> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by:
Tibor Billes <tbilles@gmx.com> Reviewed-by:
Josh Triplett <josh@joshtriplett.org> Signed-off-by:
Jiri Slaby <jslaby@suse.cz>
-
Thomas Gleixner authored
commit d689fe22 upstream. RCU and the fine grained idle time accounting functions check tick_nohz_enabled. But that variable is merily telling that NOHZ has been enabled in the config and not been disabled on the command line. But it does not tell anything about nohz being active. That's what all this should check for. Matthew reported, that the idle accounting on his old P1 machine showed bogus values, when he enabled NOHZ in the config and did not disable it on the kernel command line. The reason is that his machine uses (refined) jiffies as a clocksource which explains why the "fine" grained accounting went into lala land, because it depends on when the system goes and leaves idle relative to the jiffies increment. Provide a tick_nohz_active indicator and let RCU and the accounting code use this instead of tick_nohz_enable. Reported-and-tested-by:
Matthew Whitehead <tedheadster@gmail.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Reviewed-by:
Steven Rostedt <rostedt@goodmis.org> Reviewed-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: john.stultz@linaro.org Cc: mwhitehe@redhat.com Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1311132052240.30673@ionos.tec.linutronix.deSigned-off-by:
Jiri Slaby <jslaby@suse.cz>
-
- 31 Aug, 2013 2 commits
-
-
Paul E. McKenney authored
Because RCU's quiescent-state-forcing mechanism is used to drive the full-system-idle state machine, and because this mechanism is executed by RCU's grace-period kthreads, this commit forces these kthreads to run on the timekeeping CPU (tick_do_timer_cpu). To do otherwise would mean that the RCU grace-period kthreads would force the system into non-idle state every time they drove the state machine, which would be just a bit on the futile side. Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Reviewed-by:
Josh Triplett <josh@joshtriplett.org>
-
Paul E. McKenney authored
This commit adds the state machine that takes the per-CPU idle data as input and produces a full-system-idle indication as output. This state machine is driven out of RCU's quiescent-state-forcing mechanism, which invokes rcu_sysidle_check_cpu() to collect per-CPU idle state and then rcu_sysidle_report() to drive the state machine. The full-system-idle state is sampled using rcu_sys_is_idle(), which also drives the state machine if RCU is idle (and does so by forcing RCU to become non-idle). This function returns true if all but the timekeeping CPU (tick_do_timer_cpu) are idle and have been idle long enough to avoid memory contention on the full_sysidle_state state variable. The rcu_sysidle_force_exit() may be called externally to reset the state machine back into non-idle state. For large systems the state machine is driven out of RCU's force-quiescent-state logic, which provides good scalability at the price of millisecond-scale latencies on the transition to full-system-idle state. This is not so good for battery-powered systems, which are usually small enough that they don't need to care about scalability, but which do care deeply about energy efficiency. Small systems therefore drive the state machine directly out of the idle-entry code. The number of CPUs in a "small" system is defined by a new NO_HZ_FULL_SYSIDLE_SMALL Kconfig parameter, which defaults to 8. Note that this is a build-time definition. Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> [ paulmck: Use true and false for boolean constants per Lai Jiangshan. ] Reviewed-by:
Josh Triplett <josh@joshtriplett.org> [ paulmck: Simplify logic and provide better comments for memory barriers, based on review comments and questions by Lai Jiangshan. ]
-
- 19 Aug, 2013 3 commits
-
-
Paul E. McKenney authored
This commit adds control variables and states for full-system idle. The system will progress through the states in numerical order when the system is fully idle (other than the timekeeping CPU), and reset down to the initial state if any non-timekeeping CPU goes non-idle. The current state is kept in full_sysidle_state. One flavor of RCU will be in charge of driving the state machine, defined by rcu_sysidle_state. This should be the busiest flavor of RCU. Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Reviewed-by:
Josh Triplett <josh@joshtriplett.org>
-
Paul E. McKenney authored
This commit adds the code that updates the rcu_dyntick structure's new fields to track the per-CPU idle state based on interrupts and transitions into and out of the idle loop (NMIs are ignored because NMI handlers cannot cleanly read out the time anyway). This code is similar to the code that maintains RCU's idea of per-CPU idleness, but differs in that RCU treats CPUs running in user mode as idle, where this new code does not. Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by:
Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Reviewed-by:
Josh Triplett <josh@joshtriplett.org>
-
Paul E. McKenney authored
This commit adds fields to the rcu_dyntick structure that are used to detect idle CPUs. These new fields differ from the existing ones in that the existing ones consider a CPU executing in user mode to be idle, where the new ones consider CPUs executing in user mode to be busy. The handling of these new fields is otherwise quite similar to that for the exiting fields. This commit also adds the initialization required for these fields. So, why is usermode execution treated differently, with RCU considering it a quiescent state equivalent to idle, while in contrast the new full-system idle state detection considers usermode execution to be non-idle? It turns out that although one of RCU's quiescent states is usermode execution, it is not a full-system idle state. This is because the purpose of the full-system idle state is not RCU, but rather determining when accurate timekeeping can safely be disabled. Whenever accurate timekeeping is required in a CONFIG_NO_HZ_FULL kernel, at least one CPU must keep the scheduling-clock tick going. If even one CPU is executing in user mode, accurate timekeeping is requires, particularly for architectures where gettimeofday() and friends do not enter the kernel. Only when all CPUs are really and truly idle can accurate timekeeping be disabled, allowing all CPUs to turn off the scheduling clock interrupt, thus greatly improving energy efficiency. This naturally raises the question "Why is this code in RCU rather than in timekeeping?", and the answer is that RCU has the data and infrastructure to efficiently make this determination. Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by:
Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Reviewed-by:
Josh Triplett <josh@joshtriplett.org>
-
- 29 Jul, 2013 2 commits
-
-
Steven Rostedt (Red Hat) authored
Currently, RCU tracepoints save only a pointer to strings in the ring buffer. When displayed via the /sys/kernel/debug/tracing/trace file they are referenced like the printf "%s" that looks at the address in the ring buffer and prints out the string it points too. This requires that the strings are constant and persistent in the kernel. The problem with this is for tools like trace-cmd and perf that read the binary data from the buffers but have no access to the kernel memory to find out what string is represented by the address in the buffer. By using the tracepoint_string infrastructure, the RCU tracepoint strings can be exported such that userspace tools can map the addresses to the strings. # cat /sys/kernel/debug/tracing/printk_formats 0xffffffff81a4a0e8 : "rcu_preempt" 0xffffffff81a4a0f4 : "rcu_bh" 0xffffffff81a4a100 : "rcu_sched" 0xffffffff818437a0 : "cpuqs" 0xffffffff818437a6 : "rcu_sched" 0xffffffff818437a0 : "cpuqs" 0xffffffff818437b0 : "rcu_bh" 0xffffffff818437b7 : "Start context switch" 0xffffffff818437cc : "End context switch" 0xffffffff818437a0 : "cpuqs" [...] Now userspaces tools can display: rcu_utilization: Start context switch rcu_dyntick: Start 1 0 rcu_utilization: End context switch rcu_batch_start: rcu_preempt CBs=0/5 bl=10 rcu_dyntick: End 0 140000000000000 rcu_invoke_callback: rcu_preempt rhp=0xffff880071c0d600 func=proc_i_callback rcu_invoke_callback: rcu_preempt rhp=0xffff880077b5b230 func=__d_free rcu_dyntick: Start 140000000000000 0 rcu_invoke_callback: rcu_preempt rhp=0xffff880077563980 func=file_free_rcu rcu_batch_end: rcu_preempt CBs-invoked=3 idle=>c<>c<>c<>c< rcu_utilization: End RCU core rcu_grace_period: rcu_preempt 9741 start rcu_dyntick: Start 1 0 rcu_dyntick: End 0 140000000000000 rcu_dyntick: Start 140000000000000 0 Instead of: rcu_utilization: ffffffff81843110 rcu_future_grace_period: ffffffff81842f1d 9939 9939 9940 0 0 3 ffffffff81842f32 rcu_batch_start: ffffffff81842f1d CBs=0/4 bl=10 rcu_future_grace_period: ffffffff81842f1d 9939 9939 9940 0 0 3 ffffffff81842f3c rcu_grace_period: ffffffff81842f1d 9939 ffffffff81842f80 rcu_invoke_callback: ffffffff81842f1d rhp=0xffff88007888aac0 func=file_free_rcu rcu_grace_period: ffffffff81842f1d 9939 ffffffff81842f95 rcu_invoke_callback: ffffffff81842f1d rhp=0xffff88006aeb4600 func=proc_i_callback rcu_future_grace_period: ffffffff81842f1d 9939 9939 9940 0 0 3 ffffffff81842f32 rcu_future_grace_period: ffffffff81842f1d 9939 9939 9940 0 0 3 ffffffff81842f3c rcu_invoke_callback: ffffffff81842f1d rhp=0xffff880071cb9fc0 func=__d_free rcu_grace_period: ffffffff81842f1d 9939 ffffffff81842f80 rcu_invoke_callback: ffffffff81842f1d rhp=0xffff88007888ae80 func=file_free_rcu rcu_batch_end: ffffffff81842f1d CBs-invoked=4 idle=>c<>c<>c<>c< rcu_utilization: ffffffff8184311f Signed-off-by:
Steven Rostedt <rostedt@goodmis.org>
-
Steven Rostedt (Red Hat) authored
The RCU_STATE_INITIALIZER() macro is used only in the rcutree.c file as well as the rcutree_plugin.h file. It is passed as a rvalue to a variable of a similar name. A per_cpu variable is also created with a similar name as well. The uses of RCU_STATE_INITIALIZER() can be simplified to remove some of the duplicate code that is done. Currently the three users of this macro has this format: struct rcu_state rcu_sched_state = RCU_STATE_INITIALIZER(rcu_sched, call_rcu_sched); DEFINE_PER_CPU(struct rcu_data, rcu_sched_data); Notice that "rcu_sched" is called three times. This is the same with the other two users. This can be condensed to just: RCU_STATE_INITIALIZER(rcu_sched, call_rcu_sched); by moving the rest into the macro itself. This also opens the door to allow the RCU tracepoint strings and their addresses to be exported so that userspace tracing tools can translate the contents of the pointers of the RCU tracepoints. The change will allow for helper code to be placed in the RCU_STATE_INITIALIZER() macro to export the name that is used. Signed-off-by:
Steven Rostedt <rostedt@goodmis.org>
-
- 14 Jul, 2013 1 commit
-
-
Paul Gortmaker authored
The __cpuinit type of throwaway sections might have made sense some time ago when RAM was more constrained, but now the savings do not offset the cost and complications. For example, the fix in commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time") is a good example of the nasty type of bugs that can be created with improper use of the various __init prefixes. After a discussion on LKML[1] it was decided that cpuinit should go the way of devinit and be phased out. Once all the users are gone, we can then finally remove the macros themselves from linux/init.h. This removes all the drivers/rcu uses of the __cpuinit macros from all C files. [1] https://lkml.org/lkml/2013/5/20/589 Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Josh Triplett <josh@freedesktop.org> Cc: Dipankar Sarma <dipankar@in.ibm.com> Reviewed-by:
Josh Triplett <josh@joshtriplett.org> Signed-off-by:
Paul Gortmaker <paul.gortmaker@windriver.com>
-
- 10 Jun, 2013 4 commits
-
-
Paul E. McKenney authored
Now that TINY_PREEMPT_RCU is no more, exit_rcu() is always an empty function. But if TINY_RCU is going to have an empty function, it should be in include/linux/rcutiny.h, where it does not bloat the kernel. This commit therefore moves exit_rcu() out of kernel/rcupdate.c to kernel/rcutree_plugin.h, and places a static inline empty function in include/linux/rcutiny.h in order to shrink TINY_RCU a bit. Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by:
Josh Triplett <josh@joshtriplett.org>
-
Paul E. McKenney authored
After a release or two, features are no longer experimental. Therefore, this commit removes the "Experimental" tag from them. Reported-by:
Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by:
Josh Triplett <josh@joshtriplett.org>
-
Paul E. McKenney authored
Because note_gp_changes() now incorporates rcu_process_gp_end() function, this commit switches to the former and eliminates the latter. In addition, this commit changes external calls from __rcu_process_gp_end() to __note_gp_changes(). Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by:
Josh Triplett <josh@joshtriplett.org>
-
Paul E. McKenney authored
This commit converts printk() calls to the corresponding pr_*() calls. Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by:
Josh Triplett <josh@joshtriplett.org>
-
- 15 May, 2013 1 commit
-
-
Sasha Levin authored
When rcu_init() is called we already have slab working, allocating bootmem at that point results in warnings and an allocation from slab. This commit therefore changes alloc_bootmem_cpumask_var() to alloc_cpumask_var() in rcu_bootup_announce_oddness(), which is called from rcu_init(). Signed-off-by:
Sasha Levin <sasha.levin@oracle.com> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by:
Josh Triplett <josh@joshtriplett.org> Tested-by:
Robin Holt <holt@sgi.com> [paulmck: convert to zalloc_cpumask_var(), as suggested by Yinghai Lu.]
-
- 14 May, 2013 1 commit
-
-
Paul E. McKenney authored
Commit c0f4dfd4 (rcu: Make RCU_FAST_NO_HZ take advantage of numbered callbacks) introduced a bug that can result in excessively long grace periods. This bug reverse the senes of the "if" statement checking for lazy callbacks, so that RCU takes a lazy approach when there are in fact non-lazy callbacks. This can result in excessive boot, suspend, and resume times. This commit therefore fixes the sense of this "if" statement. Reported-by:
Borislav Petkov <bp@alien8.de> Reported-by:
Bjørn Mork <bjorn@mork.no> Reported-by:
Joerg Roedel <joro@8bytes.org> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by:
Bjørn Mork <bjorn@mork.no> Tested-by:
Joerg Roedel <joro@8bytes.org>
-
- 19 Apr, 2013 1 commit
-
-
Frederic Weisbecker authored
We need full dynticks CPU to also be RCU nocb so that we don't have to keep the tick to handle RCU callbacks. Make sure the range passed to nohz_full= boot parameter is a subset of rcu_nocbs= The CPUs that fail to meet this requirement will be excluded from the nohz_full range. This is checked early in boot time, before any CPU has the opportunity to stop its tick. Suggested-by:
Steven Rostedt <rostedt@goodmis.org> Reviewed-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by:
Frederic Weisbecker <fweisbec@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kevin Hilman <khilman@linaro.org> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de>
-
- 15 Apr, 2013 1 commit
-
-
Paul E. McKenney authored
Adaptive-ticks CPUs inform RCU when they enter kernel mode, but they do not necessarily turn the scheduler-clock tick back on. This state of affairs could result in RCU waiting on an adaptive-ticks CPU running for an extended period in kernel mode. Such a CPU will never run the RCU state machine, and could therefore indefinitely extend the RCU state machine, sooner or later resulting in an OOM condition. This patch, inspired by an earlier patch by Frederic Weisbecker, therefore causes RCU's force-quiescent-state processing to check for this condition and to send an IPI to CPUs that remain in that state for too long. "Too long" currently means about three jiffies by default, which is quite some time for a CPU to remain in the kernel without blocking. The rcu_tree.jiffies_till_first_fqs and rcutree.jiffies_till_next_fqs sysfs variables may be used to tune "too long" if needed. Reported-by:
Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by:
Josh Triplett <josh@joshtriplett.org> Signed-off-by:
Frederic Weisbecker <fweisbec@gmail.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kevin Hilman <khilman@linaro.org> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de>
-
- 26 Mar, 2013 11 commits
-
-
Paul E. McKenney authored
CPUs going idle will need to record the need for a future grace period, but won't actually need to block waiting on it. This commit therefore splits rcu_start_future_gp(), which does the recording, from rcu_nocb_wait_gp(), which now invokes rcu_start_future_gp() to do the recording, after which rcu_nocb_wait_gp() does the waiting. Signed-off-by:
Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
Paul E. McKenney authored
CPUs going idle need to be able to indicate their need for future grace periods. A mechanism for doing this already exists for no-callbacks CPUs, so the idea is to re-use that mechanism. This commit therefore moves the ->n_nocb_gp_requests field of the rcu_node structure out from under the CONFIG_RCU_NOCB_CPU #ifdef and renames it to ->need_future_gp. Signed-off-by:
Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
Paul E. McKenney authored
If CPUs are to give prior notice of needed grace periods, it will be necessary to invoke rcu_start_gp() without dropping the root rcu_node structure's ->lock. This commit takes a second step in this direction by moving the release of this lock to rcu_start_gp()'s callers. Signed-off-by:
Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
Paul E. McKenney authored
Dyntick-idle CPUs need to be able to pre-announce their need for grace periods. This can be done using something similar to the mechanism used by no-CB CPUs to announce their need for grace periods. This commit moves in this direction by renaming the no-CBs grace-period event tracing to suit the new future-grace-period needs. Signed-off-by:
Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
Paul E. McKenney authored
Because RCU callbacks are now associated with the number of the grace period that they must wait for, CPUs can now take advance callbacks corresponding to grace periods that ended while a given CPU was in dyntick-idle mode. This eliminates the need to try forcing the RCU state machine while entering idle, thus reducing the CPU intensiveness of RCU_FAST_NO_HZ, which should increase its energy efficiency. Signed-off-by:
Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
Paul E. McKenney authored
RCU_FAST_NO_HZ operation is controlled by four compile-time C-preprocessor macros, but some use cases benefit greatly from runtime adjustment, particularly when tuning devices. This commit therefore creates the corresponding sysfs entries. Reported-by:
Robin Randhawa <robin.randhawa@arm.com> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
Paul E. McKenney authored
Currently, the per-no-CBs-CPU kthreads are named "rcuo" followed by the CPU number, for example, "rcuo". This is problematic given that there are either two or three RCU flavors, each of which gets a per-CPU kthread with exactly the same name. This commit therefore introduces a one-letter abbreviation for each RCU flavor, namely 'b' for RCU-bh, 'p' for RCU-preempt, and 's' for RCU-sched. This abbreviation is used to distinguish the "rcuo" kthreads, for example, for CPU 0 we would have "rcuob/0", "rcuop/0", and "rcuos/0". Signed-off-by:
Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by:
Dietmar Eggemann <dietmar.eggemann@arm.com>
-
Paul E. McKenney authored
Signed-off-by:
Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
Paul E. McKenney authored
Signed-off-by:
Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
Paul E. McKenney authored
Currently, the no-CBs kthreads do repeated timed waits for grace periods to elapse. This is crude and energy inefficient, so this commit allows no-CBs kthreads to specify exactly which grace period they are waiting for and also allows them to block for the entire duration until the desired grace period completes. Signed-off-by:
Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
Paul E. McKenney authored
Currently, the only way to specify no-CBs CPUs is via the rcu_nocbs kernel command-line parameter. This is inconvenient in some cases, particularly for randconfig testing, so this commit adds a new set of kernel configuration parameters. CONFIG_RCU_NOCB_CPU_NONE (the default) retains the old behavior, CONFIG_RCU_NOCB_CPU_ZERO offloads callback processing from CPU 0 (along with any other CPUs specified by the rcu_nocbs boot-time parameter), and CONFIG_RCU_NOCB_CPU_ALL offloads callback processing from all CPUs. Signed-off-by:
Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
- 13 Mar, 2013 1 commit
-
-
Paul E. McKenney authored
If RCU's softirq handler is prevented from executing, an RCU CPU stall warning can result. Ways to prevent RCU's softirq handler from executing include: (1) CPU spinning with interrupts disabled, (2) infinite loop in some softirq handler, and (3) in -rt kernels, an infinite loop in a set of real-time threads running at priorities higher than that of RCU's softirq handler. Because this situation can be difficult to track down, this commit causes the count of RCU softirq handler invocations to be printed with RCU CPU stall warnings. This information does require some interpretation, as now documented in Documentation/RCU/stallwarn.txt. Reported-by:
Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by:
Paul Gortmaker <paul.gortmaker@windriver.com>
-
- 12 Mar, 2013 1 commit
-
-
Paul E. McKenney authored
Currently, CPU 0 is constrained to not be a no-CBs CPU, and furthermore at least one no-CBs CPU must remain online at any given time. These restrictions are problematic in some situations, such as cases where all CPUs must run a real-time workload that needs to be insulated from OS jitter and latencies due to RCU callback invocation. This commit therefore provides no-CBs CPUs a (very crude and energy-inefficient) way to start and to wait for grace periods independently of the normal RCU callback mechanisms. This approach allows any or all of the CPUs to be designated as no-CBs CPUs, and allows any proper subset of the CPUs (whether no-CBs CPUs or not) to be offlined. This commit also provides a fix for a locking bug spotted by Xie ChanglongX <changlongx.xie@intel.com>. Signed-off-by:
Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
- 08 Jan, 2013 2 commits
-
-
Paul Gortmaker authored
The as-documented rcu_nocb_poll will fail to enable this feature for two reasons. (1) there is an extra "s" in the documented name which is not in the code, and (2) since it uses module_param, it really is expecting a prefix, akin to "rcutree.fanout_leaf" and the prefix isn't documented. However, there are several reasons why we might not want to simply fix the typo and add the prefix: 1) we'd end up with rcutree.rcu_nocb_poll, and rather probably make a change to rcutree.nocb_poll 2) if we did #1, then the prefix wouldn't be consistent with the rcu_nocbs=<cpumap> parameter (i.e. one with, one without prefix) 3) the use of module_param in a header file is less than desired, since it isn't immediately obvious that it will get processed via rcutree.c and get the prefix from that (although use of module_param_named() could clarify that.) 4) the implied export of /sys/module/rcutree/parameters/rcu_nocb_poll data to userspace via module_param() doesn't really buy us anything, as it is read-only and we can tell if it is enabled already without it, since there is a printk at early boot telling us so. In light of all that, just change it from a module_param() to an early_setup() call, and worry about adding it to /sys later on if we decide to allow a dynamic setting of it. Also change the variable to be tagged as read_mostly, since it will only ever be fiddled with at most, once at boot. Signed-off-by:
Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
Paul Gortmaker authored
The wait_event() at the head of the rcu_nocb_kthread() can result in soft-lockup complaints if the CPU in question does not register RCU callbacks for an extended period. This commit therefore changes the wait_event() to a wait_event_interruptible(). Reported-by:
Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by:
Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
- 16 Nov, 2012 2 commits
-
-
Paul E. McKenney authored
Currently, callback invocations from callback-free CPUs are accounted to the CPU that registered the callback, but using the same field that is used for normal callbacks. This makes it impossible to determine from debugfs output whether callbacks are in fact being diverted. This commit therefore adds a separate ->n_nocbs_invoked field in the rcu_data structure in which diverted callback invocations are counted. RCU's debugfs tracing still displays normal callback invocations using ci=, but displayed diverted callbacks with nci=. Signed-off-by:
Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
Paul E. McKenney authored
RCU callback execution can add significant OS jitter and also can degrade both scheduling latency and, in asymmetric multiprocessors, energy efficiency. This commit therefore adds the ability for selected CPUs ("rcu_nocbs=" boot parameter) to have their callbacks offloaded to kthreads. If the "rcu_nocb_poll" boot parameter is also specified, these kthreads will do polling, removing the need for the offloaded CPUs to do wakeups. At least one CPU must be doing normal callback processing: currently CPU 0 cannot be selected as a no-CBs CPU. In addition, attempts to offline the last normal-CBs CPU will fail. This feature was inspired by Jim Houston's and Joe Korty's JRCU, and this commit includes fixes to problems located by Fengguang Wu's kbuild test robot. [ paulmck: Added gfp.h include file as suggested by Fengguang Wu. ] Signed-off-by:
Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
- 13 Nov, 2012 1 commit
-
-
Paul E. McKenney authored
This commit explicitly states the memory-ordering properties of the RCU grace-period primitives. Although these properties were in some sense implied by the fundmental property of RCU ("a grace period must wait for all pre-existing RCU read-side critical sections to complete"), stating it explicitly will be a great labor-saving device. Reported-by:
Oleg Nesterov <oleg@redhat.com> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by:
Oleg Nesterov <oleg@redhat.com>
-
- 08 Nov, 2012 1 commit
-
-
Paul E. McKenney authored
The ->onofflock field in the rcu_state structure at one time synchronized CPU-hotplug operations for RCU. However, its scope has decreased over time so that it now only protects the lists of orphaned RCU callbacks. This commit therefore renames it to ->orphan_lock to reflect its current use. Signed-off-by:
Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
- 23 Oct, 2012 1 commit
-
-
Antti P Miettinen authored
There have been some embedded applications that would benefit from use of expedited grace-period primitives. In some ways, this is similar to synchronize_net() doing either a normal or an expedited grace period depending on lock state, but with control outside of the kernel. This commit therefore adds rcu_expedited boot and sysfs parameters that cause the kernel to substitute expedited primitives for the normal grace-period primitives. [ paulmck: Add trace/event/rcu.h to kernel/srcu.c to avoid build error. Get rid of infinite loop through contention path.] Signed-off-by:
Antti P Miettinen <amiettinen@nvidia.com> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
- 26 Sep, 2012 1 commit
-
-
Paul E. McKenney authored
The current implementation of RCU_FAST_NO_HZ tries reasonably hard to rid the current CPU of RCU callbacks. This is appropriate when the CPU is entering idle, where it doesn't have much useful to do anyway, but is most definitely not what you want when transitioning to user-mode execution. This commit therefore detects the adaptive-tick case, and refrains from burning CPU time getting rid of RCU callbacks in that case. Signed-off-by:
Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by:
Josh Triplett <josh@joshtriplett.org>
-