- 03 Sep, 2020 4 commits
-
-
Paul E. McKenney authored
scftorture.2020.08.24a: Torture tests for smp_call_function() and friends.
-
Paul E. McKenney authored
doc.2020.08.24a: Documentation updates. fixes.2020.09.03b: Miscellaneous fixes. torture.2020.08.24a: Torture-test updates.
-
Zqiang authored
CPUs can go offline shortly after kfree_call_rcu() has been invoked, which can leave memory stranded until those CPUs come back online. This commit therefore drains the kcrp of each CPU, not just the ones that happen to be online. Acked-by: Joel Fernandes <joel@joelfernandes.org> Signed-off-by: Zqiang <qiang.zhang@windriver.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Joel Fernandes (Google) authored
The rcu_segcblist_accelerate() function returns true iff it is necessary to request another grace period. A tracing session showed that this function unnecessarily requests grace periods. For example, consider the following sequence of events: 1. Callbacks are queued only on the NEXT segment of CPU A's callback list. 2. CPU A runs RCU_SOFTIRQ, accelerating these callbacks from NEXT to WAIT. 3. Thus rcu_segcblist_accelerate() returns true, requesting grace period N. 4. RCU's grace-period kthread wakes up on CPU B and starts grace period N. 4. CPU A notices the new grace period and invokes RCU_SOFTIRQ. 5. CPU A's RCU_SOFTIRQ again invokes rcu_segcblist_accelerate(), but there are no new callbacks. However, rcu_segcblist_accelerate() nevertheless (uselessly) requests a new grace period N+1. This extra grace period results in additional lock contention and also additional wakeups, all for no good reason. This commit therefore adds a check to rcu_segcblist_accelerate() that prevents the return of true when there are no new callbacks. This change reduces the number of grace periods (GPs) and wakeups in each of eleven five-second rcutorture runs as follows: +----+-------------------+-------------------+ | # | Number of GPs | Number of Wakeups | +====+=========+=========+=========+=========+ | 1 | With | Without | With | Without | +----+---------+---------+---------+---------+ | 2 | 75 | 89 | 113 | 119 | +----+---------+---------+---------+---------+ | 3 | 62 | 91 | 105 | 123 | +----+---------+---------+---------+---------+ | 4 | 60 | 79 | 98 | 110 | +----+---------+---------+---------+---------+ | 5 | 63 | 79 | 99 | 112 | +----+---------+---------+---------+---------+ | 6 | 57 | 89 | 96 | 123 | +----+---------+---------+---------+---------+ | 7 | 64 | 85 | 97 | 118 | +----+---------+---------+---------+---------+ | 8 | 58 | 83 | 98 | 113 | +----+---------+---------+---------+---------+ | 9 | 57 | 77 | 89 | 104 | +----+---------+---------+---------+---------+ | 10 | 66 | 82 | 98 | 119 | +----+---------+---------+---------+---------+ | 11 | 52 | 82 | 83 | 117 | +----+---------+---------+---------+---------+ The reduction in the number of wakeups ranges from 5% to 40%. Cc: urezki@gmail.com [ paulmck: Rework commit log and comment. ] Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
- 25 Aug, 2020 36 commits
-
-
Paul E. McKenney authored
This commit adds a "--gdb" parameter to kvm.sh, which causes "CONFIG_DEBUG_INFO=y" to be added to the Kconfig options, "nokaslr" to be added to the boot parameters, and "-s -S" to be added to the qemu arguments. Furthermore, the scripting prints messages telling the user how to start up gdb for the run in question. Because of the interactive nature of gdb sessions, only one "--configs" scenario is permitted when "--gdb" is specified. For most torture types, this means that a "--configs" argument is required, and that argument must specify the single scenario of interest. The usual cautions about breakpoints and timing apply, for example, staring at your gdb prompt for too long will likely get you many complaints, including RCU CPU stall warnings. Omar Sandoval further suggests using gdb's "hbreak" command instead of the "break" command on systems supporting hardware breakpoints, and further using the "commands" option because the resulting non-interactive breakpoints are less likely to get you RCU CPU stall warnings. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
This commit adds an rcutorture.leakpointer module parameter that intentionally leaks an RCU-protected pointer out of the RCU read-side critical section and checks to see if the corresponding grace period has elapsed, emitting a WARN_ON_ONCE() if so. This module parameter can be used to test facilities like CONFIG_RCU_STRICT_GRACE_PERIOD that end grace periods quickly. While in the area, also document rcutorture.irqreader, which was previously left out. Reported-by Jann Horn <jannh@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
Currently, registering and unregistering the OOM notifier is done right before and after the test, respectively. This will not work well for multi-threaded tests, so this commit hoists this registering and unregistering up into the rcu_torture_fwd_prog_init() and rcu_torture_fwd_prog_cleanup() functions. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Colin Ian King authored
Currently in the unlikely event that buf fails to be allocated it is dereferenced a few times. Use the errexit flag to determine if buf should be written to to avoid the null pointer dereferences. Addresses-Coverity: ("Dereference after null check") Fixes: f518f154 ("refperf: Dynamically allocate experiment-summary output buffer") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
The current rcutorture forward-progress code assumes that it is the only cause of out-of-memory (OOM) events. For script-based rcutorture testing, this assumption is in fact correct. However, testing based on modprobe/rmmod might well encounter external OOM events, which could happen at any time. This commit therefore properly synchronizes the interaction between rcutorture's forward-progress testing and its OOM notifier by adding a global mutex. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
The conversion of rcu_fwds to dynamic allocation failed to actually allocate the required structure. This commit therefore allocates it, frees it, and updates rcu_fwds accordingly. While in the area, it abstracts the cleanup actions into rcu_torture_fwd_prog_cleanup(). Fixes: 5155be99 ("rcutorture: Dynamically allocate rcu_fwds structure") Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
This commit adds a --help argument (along with its synonym -h) to display the help text. While in the area, this commit also updates the help text. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
Currently, the CONFIG_PROVE_RCU_LIST=y case is untested. This commit therefore adds CONFIG_PROVE_RCU_LIST=y to rcutorture's TREE05 scenario. Cc: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
The rcu-test-image.txt documentation covers a very uncommon case where a real userspace environment is required. However, someone reading this document might reasonably conclude that this is in fact a prerequisite. In addition, the initrd.txt file mentions dracut, which is no longer used. This commit therefore provides the needed updates. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Alexander A. Klimov authored
Rationale: Reduces attack surface on kernel devs opening the links for MITM as HTTPS traffic is much harder to manipulate. Deterministic algorithm: For each file: If not .svg: For each line: If doesn't contain `\bxmlns\b`: For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`: If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`: If both the HTTP and HTTPS versions return 200 OK and serve the same content: Replace HTTP with HTTPS. Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Wei Yongjun authored
The sparse tool complains as follows: kernel/locking/locktorture.c:569:6: warning: symbol 'torture_percpu_rwsem_init' was not declared. Should it be static? And this function is not used outside of locktorture.c, so this commit marks it static. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul Gortmaker authored
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Joel Fernandes (Google) authored
This commit adds code to print the grace-period number at the start of the test along with both the grace-period number and the number of elapsed grace periods at the end of the test. Note that variants of RCU)without the notion of a grace-period number (for example, Tiny RCU) just print zeroes. [ paulmck: Adjust commit log. ] Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
KCSAN is now in mainline, so this commit removes the stubs for the data_race(), ASSERT_EXCLUSIVE_WRITER(), and ASSERT_EXCLUSIVE_ACCESS() macros. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
This commit further avoids conflation of rcuperf with the kernel's perf feature by renaming kernel/rcu/rcuperf.c to kernel/rcu/rcuscale.c, and also by similarly renaming the functions and variables inside this file. This has the side effect of changing the names of the kernel boot parameters, so kernel-parameters.txt and ver_functions.sh are also updated. The rcutorture --torture type was also updated from rcuperf to rcuscale. [ paulmck: Fix bugs located by Stephen Rothwell. ] Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
Although the test loop does randomly delay, which would provide quiescent states and so forth, it is possible for there to be a series of long smp_call_function*() handler runtimes with no delays, which results in softlockup and RCU CPU stall warning messages. This commit therefore inserts a cond_resched() into the main test loop. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
On uniprocessor systems, smp_call_function() does nothing. This commit therefore avoids complaining about the lack of handler accesses in the single-CPU case where there is no handler. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
Currently, CPU-hotplug operations might result in all but two of (say) 100 CPUs being offline, which in turn might result in false-positive diagnostics due to overload. This commit therefore causes scftorture_invoker() kthreads for offline CPUs to loop blocking for 200 milliseconds at a time, thus continuously adjusting the number of threads to match the number of online CPUs. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
This commit adds a "default" case to the switch statement in scftorture_invoke_one() which contains a WARN_ON_ONCE() and an assignment to ->scfc_out to suppress knock-on warnings. These knock-on warnings could otherwise cause the user to think that there was a memory-ordering problem in smp_call_function() instead of a bug in scftorture.c itself. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Wei Yongjun authored
The sparse tool complains as follows kernel/scftorture.c:124:1: warning: symbol '__pcpu_scope_scf_torture_rand' was not declared. Should it be static? And this per-CPU variable is not used outside of scftorture.c, so this commit marks it static. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
Detecting smp_call_function() memory misordering requires close timing, so it is necessary to have the checks immediately before and after the call to the smp_call_function*() function under test. This commit therefore inserts barrier() calls to prevent the compiler from optimizing memory-misordering detection down into the zone of extreme improbability. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
This commit prints error counts on the statistics line and also adds a "!!!" if any of the counters are non-zero. Allocation failures are (somewhat) forgiven, but all other errors result in a "FAILURE" print at the end of the test. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
This commit hoists much of the initialization of the scf_check structure out of the switch statement, thus saving a few lines of code. The initialization of the ->scfc_in field remains in each leg of the switch statement in order to more heavily stress memory ordering. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
This commit moves checking of the ->scfc_out field and the freeing of the scf_check structure down below the end of switch statement, thus saving a few lines of code. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
This commit adds checks for memory misordering across calls to and returns from smp_call_function() in the case where the caller waits. Misordering results in a splat. Note that in contrast to smp_call_function_single(), this code does not test memory ordering into the handler in the no-wait case because none of the handlers would be able to free the scf_check structure without introducing heavy synchronization to work out which was last. [ paulmck: s/GFP_KERNEL/GFP_ATOMIC/ per kernel test robot feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
This commit adds checks for memory misordering across calls to and returns from smp_call_function_many() in the case where the caller waits. Misordering results in a splat. Note that in contrast to smp_call_function_single(), this code does not test memory ordering into the handler in the no-wait case because none of the handlers would be able to free the scf_check structure without introducing heavy synchronization to work out which was last. [ paulmck: s/GFP_KERNEL/GFP_ATOMIC/ per kernel test robot feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
This commit adds checks for memory misordering across calls to smp_call_function_single() and also across returns in the case where the caller waits. Misordering results in a splat. [ paulmck: s/GFP_KERNEL/GFP_ATOMIC/ per kernel test robot feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
This commit summarizes the per-thread statistics, providing counts of the number of single, many, and all calls, both no-wait and wait, and, for the single case, the number where the target CPU was offline. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
Currently, can_stop_idle_tick() prints "NOHZ: local_softirq_pending HH" (where "HH" is the hexadecimal softirq vector number) when one or more non-RCU softirq handlers are still enabled when checking to stop the scheduler-tick interrupt. This message is not as enlightening as one might hope, so this commit changes it to "NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #HH". Reported-by: Andy Lutomirski <luto@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
This commit uses the scftorture.weight* kernel parameters to randomly chooses between smp_call_function_single(), smp_call_function_many(), and smp_call_function(). For each variant, it also randomly chooses whether to invoke it synchronously (wait=1) or asynchronously (wait=0). The percentage weighting for each option are dumped to the console log (search for "scf_sel_dump"). This accumulates statistics, which a later commit will dump out at the end of the run. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
This commit updates the rcutorture scripting to include the new scftorture torture-test module. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
Currently, parse-torture.sh looks at the fifth field of torture-test console output for the version number. This works fine for rcutorture, but not for scftorture, which lacks the pointer field. This commit therefore adjusts matching lines so that the parse-console.sh awk script always sees the version number as the first field in the lines passed to it. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
This commit adds an smp_call_function() torture test that repeatedly invokes this function and complains if things go badly awry. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Paul E. McKenney authored
The x86/entry work removed all uses of __rcu_is_watching(), therefore this commit removes it entirely. Cc: Andy Lutomirski <luto@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <x86@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Joel Fernandes (Google) authored
The RCU grace-period kthread's force-quiescent state (FQS) loop should never see an offline CPU that has not yet reported a quiescent state. After all, the offline CPU should have reported a quiescent state during the CPU-offline process, or, failing that, by rcu_gp_init() if it ran concurrently with either the CPU going offline or the last task on a leaf rcu_node structure exiting its RCU read-side critical section while all CPUs corresponding to that structure are offline. The FQS loop should therefore complain if it does see an offline CPU that has not yet reported a quiescent state. And it does, but only once the grace period has been in force for a full second. This commit therefore makes this warning more aggressive, so that it will trigger as soon as the condition makes its appearance. Light testing with TREE03 and hotplug shows no warnings. This commit also converts the warning to WARN_ON_ONCE() in order to stave off possible log spam. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-
Joel Fernandes (Google) authored
Since at least v4.19, the FQS loop no longer reports quiescent states for offline CPUs except in emergency situations. This commit therefore fixes the comment in rcu_gp_init() to match the current code. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-