Commit 355debb8 authored by Neeraj Upadhyay's avatar Neeraj Upadhyay

Merge branches 'context_tracking.15.08.24a', 'csd.lock.15.08.24a',...

Merge branches 'context_tracking.15.08.24a', 'csd.lock.15.08.24a', 'nocb.09.09.24a', 'rcutorture.14.08.24a', 'rcustall.09.09.24a', 'srcu.12.08.24a', 'rcu.tasks.14.08.24a', 'rcu_scaling_tests.15.08.24a', 'fixes.12.08.24a' and 'misc.11.08.24a' into next.09.09.24a
......@@ -2649,8 +2649,7 @@ those that are idle from RCU's perspective) and then Tasks Rude RCU can
be removed from the kernel.
The tasks-rude-RCU API is also reader-marking-free and thus quite compact,
consisting of call_rcu_tasks_rude(), synchronize_rcu_tasks_rude(),
and rcu_barrier_tasks_rude().
consisting solely of synchronize_rcu_tasks_rude().
Tasks Trace RCU
~~~~~~~~~~~~~~~
......
......@@ -194,14 +194,13 @@ over a rather long period of time, but improvements are always welcome!
when publicizing a pointer to a structure that can
be traversed by an RCU read-side critical section.
5. If any of call_rcu(), call_srcu(), call_rcu_tasks(),
call_rcu_tasks_rude(), or call_rcu_tasks_trace() is used,
the callback function may be invoked from softirq context,
and in any case with bottom halves disabled. In particular,
this callback function cannot block. If you need the callback
to block, run that code in a workqueue handler scheduled from
the callback. The queue_rcu_work() function does this for you
in the case of call_rcu().
5. If any of call_rcu(), call_srcu(), call_rcu_tasks(), or
call_rcu_tasks_trace() is used, the callback function may be
invoked from softirq context, and in any case with bottom halves
disabled. In particular, this callback function cannot block.
If you need the callback to block, run that code in a workqueue
handler scheduled from the callback. The queue_rcu_work()
function does this for you in the case of call_rcu().
6. Since synchronize_rcu() can block, it cannot be called
from any sort of irq context. The same rule applies
......@@ -254,10 +253,10 @@ over a rather long period of time, but improvements are always welcome!
corresponding readers must use rcu_read_lock_trace()
and rcu_read_unlock_trace().
c. If an updater uses call_rcu_tasks_rude() or
synchronize_rcu_tasks_rude(), then the corresponding
readers must use anything that disables preemption,
for example, preempt_disable() and preempt_enable().
c. If an updater uses synchronize_rcu_tasks_rude(),
then the corresponding readers must use anything that
disables preemption, for example, preempt_disable()
and preempt_enable().
Mixing things up will result in confusion and broken kernels, and
has even resulted in an exploitable security issue. Therefore,
......@@ -326,11 +325,9 @@ over a rather long period of time, but improvements are always welcome!
d. Periodically invoke rcu_barrier(), permitting a limited
number of updates per grace period.
The same cautions apply to call_srcu(), call_rcu_tasks(),
call_rcu_tasks_rude(), and call_rcu_tasks_trace(). This is
why there is an srcu_barrier(), rcu_barrier_tasks(),
rcu_barrier_tasks_rude(), and rcu_barrier_tasks_rude(),
respectively.
The same cautions apply to call_srcu(), call_rcu_tasks(), and
call_rcu_tasks_trace(). This is why there is an srcu_barrier(),
rcu_barrier_tasks(), and rcu_barrier_tasks_trace(), respectively.
Note that although these primitives do take action to avoid
memory exhaustion when any given CPU has too many callbacks,
......@@ -383,17 +380,17 @@ over a rather long period of time, but improvements are always welcome!
must use whatever locking or other synchronization is required
to safely access and/or modify that data structure.
Do not assume that RCU callbacks will be executed on
the same CPU that executed the corresponding call_rcu(),
call_srcu(), call_rcu_tasks(), call_rcu_tasks_rude(), or
call_rcu_tasks_trace(). For example, if a given CPU goes offline
while having an RCU callback pending, then that RCU callback
will execute on some surviving CPU. (If this was not the case,
a self-spawning RCU callback would prevent the victim CPU from
ever going offline.) Furthermore, CPUs designated by rcu_nocbs=
might well *always* have their RCU callbacks executed on some
other CPUs, in fact, for some real-time workloads, this is the
whole point of using the rcu_nocbs= kernel boot parameter.
Do not assume that RCU callbacks will be executed on the same
CPU that executed the corresponding call_rcu(), call_srcu(),
call_rcu_tasks(), or call_rcu_tasks_trace(). For example, if
a given CPU goes offline while having an RCU callback pending,
then that RCU callback will execute on some surviving CPU.
(If this was not the case, a self-spawning RCU callback would
prevent the victim CPU from ever going offline.) Furthermore,
CPUs designated by rcu_nocbs= might well *always* have their
RCU callbacks executed on some other CPUs, in fact, for some
real-time workloads, this is the whole point of using the
rcu_nocbs= kernel boot parameter.
In addition, do not assume that callbacks queued in a given order
will be invoked in that order, even if they all are queued on the
......@@ -507,9 +504,9 @@ over a rather long period of time, but improvements are always welcome!
These debugging aids can help you find problems that are
otherwise extremely difficult to spot.
17. If you pass a callback function defined within a module to one of
call_rcu(), call_srcu(), call_rcu_tasks(), call_rcu_tasks_rude(),
or call_rcu_tasks_trace(), then it is necessary to wait for all
17. If you pass a callback function defined within a module
to one of call_rcu(), call_srcu(), call_rcu_tasks(), or
call_rcu_tasks_trace(), then it is necessary to wait for all
pending callbacks to be invoked before unloading that module.
Note that it is absolutely *not* sufficient to wait for a grace
period! For example, synchronize_rcu() implementation is *not*
......@@ -522,7 +519,6 @@ over a rather long period of time, but improvements are always welcome!
- call_rcu() -> rcu_barrier()
- call_srcu() -> srcu_barrier()
- call_rcu_tasks() -> rcu_barrier_tasks()
- call_rcu_tasks_rude() -> rcu_barrier_tasks_rude()
- call_rcu_tasks_trace() -> rcu_barrier_tasks_trace()
However, these barrier functions are absolutely *not* guaranteed
......@@ -539,7 +535,6 @@ over a rather long period of time, but improvements are always welcome!
- Either synchronize_srcu() or synchronize_srcu_expedited(),
together with and srcu_barrier()
- synchronize_rcu_tasks() and rcu_barrier_tasks()
- synchronize_tasks_rude() and rcu_barrier_tasks_rude()
- synchronize_tasks_trace() and rcu_barrier_tasks_trace()
If necessary, you can use something like workqueues to execute
......
......@@ -1103,7 +1103,7 @@ RCU-Tasks-Rude::
Critical sections Grace period Barrier
N/A call_rcu_tasks_rude rcu_barrier_tasks_rude
N/A N/A
synchronize_rcu_tasks_rude
......
......@@ -4937,6 +4937,10 @@
Set maximum number of finished RCU callbacks to
process in one batch.
rcutree.csd_lock_suppress_rcu_stall= [KNL]
Do only a one-line RCU CPU stall warning when
there is an ongoing too-long CSD-lock wait.
rcutree.do_rcu_barrier= [KNL]
Request a call to rcu_barrier(). This is
throttled so that userspace tests can safely
......@@ -5384,7 +5388,13 @@
Time to wait (s) after boot before inducing stall.
rcutorture.stall_cpu_irqsoff= [KNL]
Disable interrupts while stalling if set.
Disable interrupts while stalling if set, but only
on the first stall in the set.
rcutorture.stall_cpu_repeat= [KNL]
Number of times to repeat the stall sequence,
so that rcutorture.stall_cpu_repeat=3 will result
in four stall sequences.
rcutorture.stall_gp_kthread= [KNL]
Duration (s) of forced sleep within RCU
......@@ -5572,14 +5582,6 @@
of zero will disable batching. Batching is
always disabled for synchronize_rcu_tasks().
rcupdate.rcu_tasks_rude_lazy_ms= [KNL]
Set timeout in milliseconds RCU Tasks
Rude asynchronous callback batching for
call_rcu_tasks_rude(). A negative value
will take the default. A value of zero will
disable batching. Batching is always disabled
for synchronize_rcu_tasks_rude().
rcupdate.rcu_tasks_trace_lazy_ms= [KNL]
Set timeout in milliseconds RCU Tasks
Trace asynchronous callback batching for
......
......@@ -185,11 +185,7 @@ struct rcu_cblist {
* ----------------------------------------------------------------------------
*/
#define SEGCBLIST_ENABLED BIT(0)
#define SEGCBLIST_RCU_CORE BIT(1)
#define SEGCBLIST_LOCKING BIT(2)
#define SEGCBLIST_KTHREAD_CB BIT(3)
#define SEGCBLIST_KTHREAD_GP BIT(4)
#define SEGCBLIST_OFFLOADED BIT(5)
#define SEGCBLIST_OFFLOADED BIT(1)
struct rcu_segcblist {
struct rcu_head *head;
......
......@@ -191,7 +191,10 @@ static inline void hlist_del_init_rcu(struct hlist_node *n)
* @old : the element to be replaced
* @new : the new element to insert
*
* The @old entry will be replaced with the @new entry atomically.
* The @old entry will be replaced with the @new entry atomically from
* the perspective of concurrent readers. It is the caller's responsibility
* to synchronize with concurrent updaters, if any.
*
* Note: @old should not be empty.
*/
static inline void list_replace_rcu(struct list_head *old,
......@@ -519,7 +522,9 @@ static inline void hlist_del_rcu(struct hlist_node *n)
* @old : the element to be replaced
* @new : the new element to insert
*
* The @old entry will be replaced with the @new entry atomically.
* The @old entry will be replaced with the @new entry atomically from
* the perspective of concurrent readers. It is the caller's responsibility
* to synchronize with concurrent updaters, if any.
*/
static inline void hlist_replace_rcu(struct hlist_node *old,
struct hlist_node *new)
......
......@@ -34,10 +34,12 @@
#define ULONG_CMP_GE(a, b) (ULONG_MAX / 2 >= (a) - (b))
#define ULONG_CMP_LT(a, b) (ULONG_MAX / 2 < (a) - (b))
#define RCU_SEQ_CTR_SHIFT 2
#define RCU_SEQ_STATE_MASK ((1 << RCU_SEQ_CTR_SHIFT) - 1)
/* Exported common interfaces */
void call_rcu(struct rcu_head *head, rcu_callback_t func);
void rcu_barrier_tasks(void);
void rcu_barrier_tasks_rude(void);
void synchronize_rcu(void);
struct rcu_gp_oldstate;
......@@ -144,11 +146,18 @@ void rcu_init_nohz(void);
int rcu_nocb_cpu_offload(int cpu);
int rcu_nocb_cpu_deoffload(int cpu);
void rcu_nocb_flush_deferred_wakeup(void);
#define RCU_NOCB_LOCKDEP_WARN(c, s) RCU_LOCKDEP_WARN(c, s)
#else /* #ifdef CONFIG_RCU_NOCB_CPU */
static inline void rcu_init_nohz(void) { }
static inline int rcu_nocb_cpu_offload(int cpu) { return -EINVAL; }
static inline int rcu_nocb_cpu_deoffload(int cpu) { return 0; }
static inline void rcu_nocb_flush_deferred_wakeup(void) { }
#define RCU_NOCB_LOCKDEP_WARN(c, s)
#endif /* #else #ifdef CONFIG_RCU_NOCB_CPU */
/*
......@@ -165,6 +174,7 @@ static inline void rcu_nocb_flush_deferred_wakeup(void) { }
} while (0)
void call_rcu_tasks(struct rcu_head *head, rcu_callback_t func);
void synchronize_rcu_tasks(void);
void rcu_tasks_torture_stats_print(char *tt, char *tf);
# else
# define rcu_tasks_classic_qs(t, preempt) do { } while (0)
# define call_rcu_tasks call_rcu
......@@ -191,6 +201,7 @@ void rcu_tasks_trace_qs_blkd(struct task_struct *t);
rcu_tasks_trace_qs_blkd(t); \
} \
} while (0)
void rcu_tasks_trace_torture_stats_print(char *tt, char *tf);
# else
# define rcu_tasks_trace_qs(t) do { } while (0)
# endif
......@@ -202,8 +213,8 @@ do { \
} while (0)
# ifdef CONFIG_TASKS_RUDE_RCU
void call_rcu_tasks_rude(struct rcu_head *head, rcu_callback_t func);
void synchronize_rcu_tasks_rude(void);
void rcu_tasks_rude_torture_stats_print(char *tt, char *tf);
# endif
#define rcu_note_voluntary_context_switch(t) rcu_tasks_qs(t, false)
......
......@@ -294,4 +294,10 @@ int smpcfd_prepare_cpu(unsigned int cpu);
int smpcfd_dead_cpu(unsigned int cpu);
int smpcfd_dying_cpu(unsigned int cpu);
#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
bool csd_lock_is_stuck(void);
#else
static inline bool csd_lock_is_stuck(void) { return false; }
#endif
#endif /* __LINUX_SMP_H */
......@@ -129,10 +129,23 @@ struct srcu_struct {
#define SRCU_STATE_SCAN1 1
#define SRCU_STATE_SCAN2 2
/*
* Values for initializing gp sequence fields. Higher values allow wrap arounds to
* occur earlier.
* The second value with state is useful in the case of static initialization of
* srcu_usage where srcu_gp_seq_needed is expected to have some state value in its
* lower bits (or else it will appear to be already initialized within
* the call check_init_srcu_struct()).
*/
#define SRCU_GP_SEQ_INITIAL_VAL ((0UL - 100UL) << RCU_SEQ_CTR_SHIFT)
#define SRCU_GP_SEQ_INITIAL_VAL_WITH_STATE (SRCU_GP_SEQ_INITIAL_VAL - 1)
#define __SRCU_USAGE_INIT(name) \
{ \
.lock = __SPIN_LOCK_UNLOCKED(name.lock), \
.srcu_gp_seq_needed = -1UL, \
.srcu_gp_seq = SRCU_GP_SEQ_INITIAL_VAL, \
.srcu_gp_seq_needed = SRCU_GP_SEQ_INITIAL_VAL_WITH_STATE, \
.srcu_gp_seq_needed_exp = SRCU_GP_SEQ_INITIAL_VAL, \
.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0), \
}
......
......@@ -54,9 +54,6 @@
* grace-period sequence number.
*/
#define RCU_SEQ_CTR_SHIFT 2
#define RCU_SEQ_STATE_MASK ((1 << RCU_SEQ_CTR_SHIFT) - 1)
/* Low-order bit definition for polled grace-period APIs. */
#define RCU_GET_STATE_COMPLETED 0x1
......@@ -255,6 +252,11 @@ static inline void debug_rcu_head_callback(struct rcu_head *rhp)
kmem_dump_obj(rhp);
}
static inline bool rcu_barrier_cb_is_done(struct rcu_head *rhp)
{
return rhp->next == rhp;
}
extern int rcu_cpu_stall_suppress_at_boot;
static inline bool rcu_stall_is_suppressed_at_boot(void)
......
......@@ -260,17 +260,6 @@ void rcu_segcblist_disable(struct rcu_segcblist *rsclp)
rcu_segcblist_clear_flags(rsclp, SEGCBLIST_ENABLED);
}
/*
* Mark the specified rcu_segcblist structure as offloaded (or not)
*/
void rcu_segcblist_offload(struct rcu_segcblist *rsclp, bool offload)
{
if (offload)
rcu_segcblist_set_flags(rsclp, SEGCBLIST_LOCKING | SEGCBLIST_OFFLOADED);
else
rcu_segcblist_clear_flags(rsclp, SEGCBLIST_OFFLOADED);
}
/*
* Does the specified rcu_segcblist structure contain callbacks that
* are ready to be invoked?
......
......@@ -89,16 +89,7 @@ static inline bool rcu_segcblist_is_enabled(struct rcu_segcblist *rsclp)
static inline bool rcu_segcblist_is_offloaded(struct rcu_segcblist *rsclp)
{
if (IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
rcu_segcblist_test_flags(rsclp, SEGCBLIST_LOCKING))
return true;
return false;
}
static inline bool rcu_segcblist_completely_offloaded(struct rcu_segcblist *rsclp)
{
if (IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
!rcu_segcblist_test_flags(rsclp, SEGCBLIST_RCU_CORE))
rcu_segcblist_test_flags(rsclp, SEGCBLIST_OFFLOADED))
return true;
return false;
......
This diff is collapsed.
This diff is collapsed.
......@@ -28,6 +28,7 @@
#include <linux/rcupdate_trace.h>
#include <linux/reboot.h>
#include <linux/sched.h>
#include <linux/seq_buf.h>
#include <linux/spinlock.h>
#include <linux/smp.h>
#include <linux/stat.h>
......@@ -134,7 +135,7 @@ struct ref_scale_ops {
const char *name;
};
static struct ref_scale_ops *cur_ops;
static const struct ref_scale_ops *cur_ops;
static void un_delay(const int udl, const int ndl)
{
......@@ -170,7 +171,7 @@ static bool rcu_sync_scale_init(void)
return true;
}
static struct ref_scale_ops rcu_ops = {
static const struct ref_scale_ops rcu_ops = {
.init = rcu_sync_scale_init,
.readsection = ref_rcu_read_section,
.delaysection = ref_rcu_delay_section,
......@@ -204,7 +205,7 @@ static void srcu_ref_scale_delay_section(const int nloops, const int udl, const
}
}
static struct ref_scale_ops srcu_ops = {
static const struct ref_scale_ops srcu_ops = {
.init = rcu_sync_scale_init,
.readsection = srcu_ref_scale_read_section,
.delaysection = srcu_ref_scale_delay_section,
......@@ -231,7 +232,7 @@ static void rcu_tasks_ref_scale_delay_section(const int nloops, const int udl, c
un_delay(udl, ndl);
}
static struct ref_scale_ops rcu_tasks_ops = {
static const struct ref_scale_ops rcu_tasks_ops = {
.init = rcu_sync_scale_init,
.readsection = rcu_tasks_ref_scale_read_section,
.delaysection = rcu_tasks_ref_scale_delay_section,
......@@ -270,7 +271,7 @@ static void rcu_trace_ref_scale_delay_section(const int nloops, const int udl, c
}
}
static struct ref_scale_ops rcu_trace_ops = {
static const struct ref_scale_ops rcu_trace_ops = {
.init = rcu_sync_scale_init,
.readsection = rcu_trace_ref_scale_read_section,
.delaysection = rcu_trace_ref_scale_delay_section,
......@@ -309,7 +310,7 @@ static void ref_refcnt_delay_section(const int nloops, const int udl, const int
}
}
static struct ref_scale_ops refcnt_ops = {
static const struct ref_scale_ops refcnt_ops = {
.init = rcu_sync_scale_init,
.readsection = ref_refcnt_section,
.delaysection = ref_refcnt_delay_section,
......@@ -346,7 +347,7 @@ static void ref_rwlock_delay_section(const int nloops, const int udl, const int
}
}
static struct ref_scale_ops rwlock_ops = {
static const struct ref_scale_ops rwlock_ops = {
.init = ref_rwlock_init,
.readsection = ref_rwlock_section,
.delaysection = ref_rwlock_delay_section,
......@@ -383,7 +384,7 @@ static void ref_rwsem_delay_section(const int nloops, const int udl, const int n
}
}
static struct ref_scale_ops rwsem_ops = {
static const struct ref_scale_ops rwsem_ops = {
.init = ref_rwsem_init,
.readsection = ref_rwsem_section,
.delaysection = ref_rwsem_delay_section,
......@@ -418,7 +419,7 @@ static void ref_lock_delay_section(const int nloops, const int udl, const int nd
preempt_enable();
}
static struct ref_scale_ops lock_ops = {
static const struct ref_scale_ops lock_ops = {
.readsection = ref_lock_section,
.delaysection = ref_lock_delay_section,
.name = "lock"
......@@ -453,7 +454,7 @@ static void ref_lock_irq_delay_section(const int nloops, const int udl, const in
preempt_enable();
}
static struct ref_scale_ops lock_irq_ops = {
static const struct ref_scale_ops lock_irq_ops = {
.readsection = ref_lock_irq_section,
.delaysection = ref_lock_irq_delay_section,
.name = "lock-irq"
......@@ -489,7 +490,7 @@ static void ref_acqrel_delay_section(const int nloops, const int udl, const int
preempt_enable();
}
static struct ref_scale_ops acqrel_ops = {
static const struct ref_scale_ops acqrel_ops = {
.readsection = ref_acqrel_section,
.delaysection = ref_acqrel_delay_section,
.name = "acqrel"
......@@ -523,7 +524,7 @@ static void ref_clock_delay_section(const int nloops, const int udl, const int n
stopopts = x;
}
static struct ref_scale_ops clock_ops = {
static const struct ref_scale_ops clock_ops = {
.readsection = ref_clock_section,
.delaysection = ref_clock_delay_section,
.name = "clock"
......@@ -555,7 +556,7 @@ static void ref_jiffies_delay_section(const int nloops, const int udl, const int
stopopts = x;
}
static struct ref_scale_ops jiffies_ops = {
static const struct ref_scale_ops jiffies_ops = {
.readsection = ref_jiffies_section,
.delaysection = ref_jiffies_delay_section,
.name = "jiffies"
......@@ -705,9 +706,9 @@ static void refscale_typesafe_ctor(void *rtsp_in)
preempt_enable();
}
static struct ref_scale_ops typesafe_ref_ops;
static struct ref_scale_ops typesafe_lock_ops;
static struct ref_scale_ops typesafe_seqlock_ops;
static const struct ref_scale_ops typesafe_ref_ops;
static const struct ref_scale_ops typesafe_lock_ops;
static const struct ref_scale_ops typesafe_seqlock_ops;
// Initialize for a typesafe test.
static bool typesafe_init(void)
......@@ -768,7 +769,7 @@ static void typesafe_cleanup(void)
}
// The typesafe_init() function distinguishes these structures by address.
static struct ref_scale_ops typesafe_ref_ops = {
static const struct ref_scale_ops typesafe_ref_ops = {
.init = typesafe_init,
.cleanup = typesafe_cleanup,
.readsection = typesafe_read_section,
......@@ -776,7 +777,7 @@ static struct ref_scale_ops typesafe_ref_ops = {
.name = "typesafe_ref"
};
static struct ref_scale_ops typesafe_lock_ops = {
static const struct ref_scale_ops typesafe_lock_ops = {
.init = typesafe_init,
.cleanup = typesafe_cleanup,
.readsection = typesafe_read_section,
......@@ -784,7 +785,7 @@ static struct ref_scale_ops typesafe_lock_ops = {
.name = "typesafe_lock"
};
static struct ref_scale_ops typesafe_seqlock_ops = {
static const struct ref_scale_ops typesafe_seqlock_ops = {
.init = typesafe_init,
.cleanup = typesafe_cleanup,
.readsection = typesafe_read_section,
......@@ -891,32 +892,34 @@ static u64 process_durations(int n)
{
int i;
struct reader_task *rt;
char buf1[64];
struct seq_buf s;
char *buf;
u64 sum = 0;
buf = kmalloc(800 + 64, GFP_KERNEL);
if (!buf)
return 0;
buf[0] = 0;
sprintf(buf, "Experiment #%d (Format: <THREAD-NUM>:<Total loop time in ns>)",
exp_idx);
seq_buf_init(&s, buf, 800 + 64);
seq_buf_printf(&s, "Experiment #%d (Format: <THREAD-NUM>:<Total loop time in ns>)",
exp_idx);
for (i = 0; i < n && !torture_must_stop(); i++) {
rt = &(reader_tasks[i]);
sprintf(buf1, "%d: %llu\t", i, rt->last_duration_ns);
if (i % 5 == 0)
strcat(buf, "\n");
if (strlen(buf) >= 800) {
pr_alert("%s", buf);
buf[0] = 0;
seq_buf_putc(&s, '\n');
if (seq_buf_used(&s) >= 800) {
pr_alert("%s", seq_buf_str(&s));
seq_buf_clear(&s);
}
strcat(buf, buf1);
seq_buf_printf(&s, "%d: %llu\t", i, rt->last_duration_ns);
sum += rt->last_duration_ns;
}
pr_alert("%s\n", buf);
pr_alert("%s\n", seq_buf_str(&s));
kfree(buf);
return sum;
......@@ -1023,7 +1026,7 @@ static int main_func(void *arg)
}
static void
ref_scale_print_module_parms(struct ref_scale_ops *cur_ops, const char *tag)
ref_scale_print_module_parms(const struct ref_scale_ops *cur_ops, const char *tag)
{
pr_alert("%s" SCALE_FLAG
"--- %s: verbose=%d verbose_batched=%d shutdown=%d holdoff=%d lookup_instances=%ld loops=%ld nreaders=%d nruns=%d readdelay=%d\n", scale_type, tag,
......@@ -1078,7 +1081,7 @@ ref_scale_init(void)
{
long i;
int firsterr = 0;
static struct ref_scale_ops *scale_ops[] = {
static const struct ref_scale_ops *scale_ops[] = {
&rcu_ops, &srcu_ops, RCU_TRACE_OPS RCU_TASKS_OPS &refcnt_ops, &rwlock_ops,
&rwsem_ops, &lock_ops, &lock_irq_ops, &acqrel_ops, &clock_ops, &jiffies_ops,
&typesafe_ref_ops, &typesafe_lock_ops, &typesafe_seqlock_ops,
......
......@@ -137,6 +137,7 @@ static void init_srcu_struct_data(struct srcu_struct *ssp)
sdp->srcu_cblist_invoking = false;
sdp->srcu_gp_seq_needed = ssp->srcu_sup->srcu_gp_seq;
sdp->srcu_gp_seq_needed_exp = ssp->srcu_sup->srcu_gp_seq;
sdp->srcu_barrier_head.next = &sdp->srcu_barrier_head;
sdp->mynode = NULL;
sdp->cpu = cpu;
INIT_WORK(&sdp->work, srcu_invoke_callbacks);
......@@ -247,7 +248,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
mutex_init(&ssp->srcu_sup->srcu_cb_mutex);
mutex_init(&ssp->srcu_sup->srcu_gp_mutex);
ssp->srcu_idx = 0;
ssp->srcu_sup->srcu_gp_seq = 0;
ssp->srcu_sup->srcu_gp_seq = SRCU_GP_SEQ_INITIAL_VAL;
ssp->srcu_sup->srcu_barrier_seq = 0;
mutex_init(&ssp->srcu_sup->srcu_barrier_mutex);
atomic_set(&ssp->srcu_sup->srcu_barrier_cpu_cnt, 0);
......@@ -258,7 +259,7 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
if (!ssp->sda)
goto err_free_sup;
init_srcu_struct_data(ssp);
ssp->srcu_sup->srcu_gp_seq_needed_exp = 0;
ssp->srcu_sup->srcu_gp_seq_needed_exp = SRCU_GP_SEQ_INITIAL_VAL;
ssp->srcu_sup->srcu_last_gp_end = ktime_get_mono_fast_ns();
if (READ_ONCE(ssp->srcu_sup->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC))
......@@ -266,7 +267,8 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_BIG);
}
ssp->srcu_sup->srcu_ssp = ssp;
smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, 0); /* Init done. */
smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed,
SRCU_GP_SEQ_INITIAL_VAL); /* Init done. */
return 0;
err_free_sda:
......@@ -628,6 +630,7 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp)
if (time_after(j, gpstart))
jbase += j - gpstart;
if (!jbase) {
ASSERT_EXCLUSIVE_WRITER(sup->srcu_n_exp_nodelay);
WRITE_ONCE(sup->srcu_n_exp_nodelay, READ_ONCE(sup->srcu_n_exp_nodelay) + 1);
if (READ_ONCE(sup->srcu_n_exp_nodelay) > srcu_max_nodelay_phase)
jbase = 1;
......@@ -1560,6 +1563,7 @@ static void srcu_barrier_cb(struct rcu_head *rhp)
struct srcu_data *sdp;
struct srcu_struct *ssp;
rhp->next = rhp; // Mark the callback as having been invoked.
sdp = container_of(rhp, struct srcu_data, srcu_barrier_head);
ssp = sdp->ssp;
if (atomic_dec_and_test(&ssp->srcu_sup->srcu_barrier_cpu_cnt))
......@@ -1818,6 +1822,7 @@ static void process_srcu(struct work_struct *work)
} else {
j = jiffies;
if (READ_ONCE(sup->reschedule_jiffies) == j) {
ASSERT_EXCLUSIVE_WRITER(sup->reschedule_count);
WRITE_ONCE(sup->reschedule_count, READ_ONCE(sup->reschedule_count) + 1);
if (READ_ONCE(sup->reschedule_count) > srcu_max_nodelay)
curdelay = 1;
......
This diff is collapsed.
......@@ -79,9 +79,6 @@ static void rcu_sr_normal_gp_cleanup_work(struct work_struct *);
static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data) = {
.gpwrap = true,
#ifdef CONFIG_RCU_NOCB_CPU
.cblist.flags = SEGCBLIST_RCU_CORE,
#endif
};
static struct rcu_state rcu_state = {
.level = { &rcu_state.node[0] },
......@@ -97,6 +94,9 @@ static struct rcu_state rcu_state = {
.srs_cleanup_work = __WORK_INITIALIZER(rcu_state.srs_cleanup_work,
rcu_sr_normal_gp_cleanup_work),
.srs_cleanups_pending = ATOMIC_INIT(0),
#ifdef CONFIG_RCU_NOCB_CPU
.nocb_mutex = __MUTEX_INITIALIZER(rcu_state.nocb_mutex),
#endif
};
/* Dump rcu_node combining tree at boot to verify correct setup. */
......@@ -1660,7 +1660,7 @@ static void rcu_sr_normal_gp_cleanup_work(struct work_struct *work)
* the done tail list manipulations are protected here.
*/
done = smp_load_acquire(&rcu_state.srs_done_tail);
if (!done)
if (WARN_ON_ONCE(!done))
return;
WARN_ON_ONCE(!rcu_sr_is_wait_head(done));
......@@ -2394,7 +2394,6 @@ rcu_report_qs_rdp(struct rcu_data *rdp)
{
unsigned long flags;
unsigned long mask;
bool needacc = false;
struct rcu_node *rnp;
WARN_ON_ONCE(rdp->cpu != smp_processor_id());
......@@ -2431,23 +2430,11 @@ rcu_report_qs_rdp(struct rcu_data *rdp)
* to return true. So complain, but don't awaken.
*/
WARN_ON_ONCE(rcu_accelerate_cbs(rnp, rdp));
} else if (!rcu_segcblist_completely_offloaded(&rdp->cblist)) {
/*
* ...but NOCB kthreads may miss or delay callbacks acceleration
* if in the middle of a (de-)offloading process.
*/
needacc = true;
}
rcu_disable_urgency_upon_qs(rdp);
rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags);
/* ^^^ Released rnp->lock */
if (needacc) {
rcu_nocb_lock_irqsave(rdp, flags);
rcu_accelerate_cbs_unlocked(rnp, rdp);
rcu_nocb_unlock_irqrestore(rdp, flags);
}
}
}
......@@ -2802,24 +2789,6 @@ static __latent_entropy void rcu_core(void)
unsigned long flags;
struct rcu_data *rdp = raw_cpu_ptr(&rcu_data);
struct rcu_node *rnp = rdp->mynode;
/*
* On RT rcu_core() can be preempted when IRQs aren't disabled.
* Therefore this function can race with concurrent NOCB (de-)offloading
* on this CPU and the below condition must be considered volatile.
* However if we race with:
*
* _ Offloading: In the worst case we accelerate or process callbacks
* concurrently with NOCB kthreads. We are guaranteed to
* call rcu_nocb_lock() if that happens.
*
* _ Deoffloading: In the worst case we miss callbacks acceleration or
* processing. This is fine because the early stage
* of deoffloading invokes rcu_core() after setting
* SEGCBLIST_RCU_CORE. So we guarantee that we'll process
* what could have been dismissed without the need to wait
* for the next rcu_pending() check in the next jiffy.
*/
const bool do_batch = !rcu_segcblist_completely_offloaded(&rdp->cblist);
if (cpu_is_offline(smp_processor_id()))
return;
......@@ -2839,17 +2808,17 @@ static __latent_entropy void rcu_core(void)
/* No grace period and unregistered callbacks? */
if (!rcu_gp_in_progress() &&
rcu_segcblist_is_enabled(&rdp->cblist) && do_batch) {
rcu_nocb_lock_irqsave(rdp, flags);
rcu_segcblist_is_enabled(&rdp->cblist) && !rcu_rdp_is_offloaded(rdp)) {
local_irq_save(flags);
if (!rcu_segcblist_restempty(&rdp->cblist, RCU_NEXT_READY_TAIL))
rcu_accelerate_cbs_unlocked(rnp, rdp);
rcu_nocb_unlock_irqrestore(rdp, flags);
local_irq_restore(flags);
}
rcu_check_gp_start_stall(rnp, rdp, rcu_jiffies_till_stall_check());
/* If there are callbacks ready, invoke them. */
if (do_batch && rcu_segcblist_ready_cbs(&rdp->cblist) &&
if (!rcu_rdp_is_offloaded(rdp) && rcu_segcblist_ready_cbs(&rdp->cblist) &&
likely(READ_ONCE(rcu_scheduler_fully_active))) {
rcu_do_batch(rdp);
/* Re-invoke RCU core processing if there are callbacks remaining. */
......@@ -3238,7 +3207,7 @@ struct kvfree_rcu_bulk_data {
struct list_head list;
struct rcu_gp_oldstate gp_snap;
unsigned long nr_records;
void *records[];
void *records[] __counted_by(nr_records);
};
/*
......@@ -3550,10 +3519,10 @@ schedule_delayed_monitor_work(struct kfree_rcu_cpu *krcp)
if (delayed_work_pending(&krcp->monitor_work)) {
delay_left = krcp->monitor_work.timer.expires - jiffies;
if (delay < delay_left)
mod_delayed_work(system_wq, &krcp->monitor_work, delay);
mod_delayed_work(system_unbound_wq, &krcp->monitor_work, delay);
return;
}
queue_delayed_work(system_wq, &krcp->monitor_work, delay);
queue_delayed_work(system_unbound_wq, &krcp->monitor_work, delay);
}
static void
......@@ -3645,7 +3614,7 @@ static void kfree_rcu_monitor(struct work_struct *work)
// be that the work is in the pending state when
// channels have been detached following by each
// other.
queue_rcu_work(system_wq, &krwp->rcu_work);
queue_rcu_work(system_unbound_wq, &krwp->rcu_work);
}
}
......@@ -3715,7 +3684,7 @@ run_page_cache_worker(struct kfree_rcu_cpu *krcp)
if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING &&
!atomic_xchg(&krcp->work_in_progress, 1)) {
if (atomic_read(&krcp->backoff_page_cache_fill)) {
queue_delayed_work(system_wq,
queue_delayed_work(system_unbound_wq,
&krcp->page_cache_work,
msecs_to_jiffies(rcu_delay_page_cache_fill_msec));
} else {
......@@ -3778,7 +3747,8 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
}
// Finally insert and update the GP for this page.
bnode->records[bnode->nr_records++] = ptr;
bnode->nr_records++;
bnode->records[bnode->nr_records - 1] = ptr;
get_state_synchronize_rcu_full(&bnode->gp_snap);
atomic_inc(&(*krcp)->bulk_count[idx]);
......@@ -4414,6 +4384,7 @@ static void rcu_barrier_callback(struct rcu_head *rhp)
{
unsigned long __maybe_unused s = rcu_state.barrier_sequence;
rhp->next = rhp; // Mark the callback as having been invoked.
if (atomic_dec_and_test(&rcu_state.barrier_cpu_count)) {
rcu_barrier_trace(TPS("LastCB"), -1, s);
complete(&rcu_state.barrier_completion);
......@@ -5435,6 +5406,8 @@ static void __init rcu_init_one(void)
while (i > rnp->grphi)
rnp++;
per_cpu_ptr(&rcu_data, i)->mynode = rnp;
per_cpu_ptr(&rcu_data, i)->barrier_head.next =
&per_cpu_ptr(&rcu_data, i)->barrier_head;
rcu_boot_init_percpu_data(i);
}
}
......
......@@ -411,7 +411,6 @@ struct rcu_state {
arch_spinlock_t ofl_lock ____cacheline_internodealigned_in_smp;
/* Synchronize offline with */
/* GP pre-initialization. */
int nocb_is_setup; /* nocb is setup from boot */
/* synchronize_rcu() part. */
struct llist_head srs_next; /* request a GP users. */
......@@ -420,6 +419,11 @@ struct rcu_state {
struct sr_wait_node srs_wait_nodes[SR_NORMAL_GP_WAIT_HEAD_MAX];
struct work_struct srs_cleanup_work;
atomic_t srs_cleanups_pending; /* srs inflight worker cleanups. */
#ifdef CONFIG_RCU_NOCB_CPU
struct mutex nocb_mutex; /* Guards (de-)offloading */
int nocb_is_setup; /* nocb is setup from boot */
#endif
};
/* Values for rcu_state structure's gp_flags field. */
......
......@@ -542,6 +542,67 @@ static bool synchronize_rcu_expedited_wait_once(long tlimit)
return false;
}
/*
* Print out an expedited RCU CPU stall warning message.
*/
static void synchronize_rcu_expedited_stall(unsigned long jiffies_start, unsigned long j)
{
int cpu;
unsigned long mask;
int ndetected;
struct rcu_node *rnp;
struct rcu_node *rnp_root = rcu_get_root();
if (READ_ONCE(csd_lock_suppress_rcu_stall) && csd_lock_is_stuck()) {
pr_err("INFO: %s detected expedited stalls, but suppressed full report due to a stuck CSD-lock.\n", rcu_state.name);
return;
}
pr_err("INFO: %s detected expedited stalls on CPUs/tasks: {", rcu_state.name);
ndetected = 0;
rcu_for_each_leaf_node(rnp) {
ndetected += rcu_print_task_exp_stall(rnp);
for_each_leaf_node_possible_cpu(rnp, cpu) {
struct rcu_data *rdp;
mask = leaf_node_cpu_bit(rnp, cpu);
if (!(READ_ONCE(rnp->expmask) & mask))
continue;
ndetected++;
rdp = per_cpu_ptr(&rcu_data, cpu);
pr_cont(" %d-%c%c%c%c", cpu,
"O."[!!cpu_online(cpu)],
"o."[!!(rdp->grpmask & rnp->expmaskinit)],
"N."[!!(rdp->grpmask & rnp->expmaskinitnext)],
"D."[!!data_race(rdp->cpu_no_qs.b.exp)]);
}
}
pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n",
j - jiffies_start, rcu_state.expedited_sequence, data_race(rnp_root->expmask),
".T"[!!data_race(rnp_root->exp_tasks)]);
if (ndetected) {
pr_err("blocking rcu_node structures (internal RCU debug):");
rcu_for_each_node_breadth_first(rnp) {
if (rnp == rnp_root)
continue; /* printed unconditionally */
if (sync_rcu_exp_done_unlocked(rnp))
continue;
pr_cont(" l=%u:%d-%d:%#lx/%c",
rnp->level, rnp->grplo, rnp->grphi, data_race(rnp->expmask),
".T"[!!data_race(rnp->exp_tasks)]);
}
pr_cont("\n");
}
rcu_for_each_leaf_node(rnp) {
for_each_leaf_node_possible_cpu(rnp, cpu) {
mask = leaf_node_cpu_bit(rnp, cpu);
if (!(READ_ONCE(rnp->expmask) & mask))
continue;
dump_cpu_task(cpu);
}
rcu_exp_print_detail_task_stall_rnp(rnp);
}
}
/*
* Wait for the expedited grace period to elapse, issuing any needed
* RCU CPU stall warnings along the way.
......@@ -553,10 +614,8 @@ static void synchronize_rcu_expedited_wait(void)
unsigned long jiffies_stall;
unsigned long jiffies_start;
unsigned long mask;
int ndetected;
struct rcu_data *rdp;
struct rcu_node *rnp;
struct rcu_node *rnp_root = rcu_get_root();
unsigned long flags;
trace_rcu_exp_grace_period(rcu_state.name, rcu_exp_gp_seq_endval(), TPS("startwait"));
......@@ -593,55 +652,7 @@ static void synchronize_rcu_expedited_wait(void)
j = jiffies;
rcu_stall_notifier_call_chain(RCU_STALL_NOTIFY_EXP, (void *)(j - jiffies_start));
trace_rcu_stall_warning(rcu_state.name, TPS("ExpeditedStall"));
pr_err("INFO: %s detected expedited stalls on CPUs/tasks: {",
rcu_state.name);
ndetected = 0;
rcu_for_each_leaf_node(rnp) {
ndetected += rcu_print_task_exp_stall(rnp);
for_each_leaf_node_possible_cpu(rnp, cpu) {
struct rcu_data *rdp;
mask = leaf_node_cpu_bit(rnp, cpu);
if (!(READ_ONCE(rnp->expmask) & mask))
continue;
ndetected++;
rdp = per_cpu_ptr(&rcu_data, cpu);
pr_cont(" %d-%c%c%c%c", cpu,
"O."[!!cpu_online(cpu)],
"o."[!!(rdp->grpmask & rnp->expmaskinit)],
"N."[!!(rdp->grpmask & rnp->expmaskinitnext)],
"D."[!!data_race(rdp->cpu_no_qs.b.exp)]);
}
}
pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n",
j - jiffies_start, rcu_state.expedited_sequence,
data_race(rnp_root->expmask),
".T"[!!data_race(rnp_root->exp_tasks)]);
if (ndetected) {
pr_err("blocking rcu_node structures (internal RCU debug):");
rcu_for_each_node_breadth_first(rnp) {
if (rnp == rnp_root)
continue; /* printed unconditionally */
if (sync_rcu_exp_done_unlocked(rnp))
continue;
pr_cont(" l=%u:%d-%d:%#lx/%c",
rnp->level, rnp->grplo, rnp->grphi,
data_race(rnp->expmask),
".T"[!!data_race(rnp->exp_tasks)]);
}
pr_cont("\n");
}
rcu_for_each_leaf_node(rnp) {
for_each_leaf_node_possible_cpu(rnp, cpu) {
mask = leaf_node_cpu_bit(rnp, cpu);
if (!(READ_ONCE(rnp->expmask) & mask))
continue;
preempt_disable(); // For smp_processor_id() in dump_cpu_task().
dump_cpu_task(cpu);
preempt_enable();
}
rcu_exp_print_detail_task_stall_rnp(rnp);
}
synchronize_rcu_expedited_stall(jiffies_start, j);
jiffies_stall = 3 * rcu_exp_jiffies_till_stall_check() + 3;
panic_on_rcu_stall();
}
......
This diff is collapsed.
......@@ -24,10 +24,11 @@ static bool rcu_rdp_is_offloaded(struct rcu_data *rdp)
* timers have their own means of synchronization against the
* offloaded state updaters.
*/
RCU_LOCKDEP_WARN(
RCU_NOCB_LOCKDEP_WARN(
!(lockdep_is_held(&rcu_state.barrier_mutex) ||
(IS_ENABLED(CONFIG_HOTPLUG_CPU) && lockdep_is_cpus_held()) ||
rcu_lockdep_is_held_nocb(rdp) ||
lockdep_is_held(&rdp->nocb_lock) ||
lockdep_is_held(&rcu_state.nocb_mutex) ||
(!(IS_ENABLED(CONFIG_PREEMPT_COUNT) && preemptible()) &&
rdp == this_cpu_ptr(&rcu_data)) ||
rcu_current_is_nocb_kthread(rdp)),
......
......@@ -9,6 +9,7 @@
#include <linux/kvm_para.h>
#include <linux/rcu_notifier.h>
#include <linux/smp.h>
//////////////////////////////////////////////////////////////////////////////
//
......@@ -370,6 +371,7 @@ static void rcu_dump_cpu_stacks(void)
struct rcu_node *rnp;
rcu_for_each_leaf_node(rnp) {
printk_deferred_enter();
raw_spin_lock_irqsave_rcu_node(rnp, flags);
for_each_leaf_node_possible_cpu(rnp, cpu)
if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) {
......@@ -379,6 +381,7 @@ static void rcu_dump_cpu_stacks(void)
dump_cpu_task(cpu);
}
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
printk_deferred_exit();
}
}
......@@ -719,6 +722,9 @@ static void print_cpu_stall(unsigned long gps)
set_preempt_need_resched();
}
static bool csd_lock_suppress_rcu_stall;
module_param(csd_lock_suppress_rcu_stall, bool, 0644);
static void check_cpu_stall(struct rcu_data *rdp)
{
bool self_detected;
......@@ -791,7 +797,9 @@ static void check_cpu_stall(struct rcu_data *rdp)
return;
rcu_stall_notifier_call_chain(RCU_STALL_NOTIFY_NORM, (void *)j - gps);
if (self_detected) {
if (READ_ONCE(csd_lock_suppress_rcu_stall) && csd_lock_is_stuck()) {
pr_err("INFO: %s detected stall, but suppressed full report due to a stuck CSD-lock.\n", rcu_state.name);
} else if (self_detected) {
/* We haven't checked in, so go dump stack. */
print_cpu_stall(gps);
} else {
......
......@@ -9726,7 +9726,7 @@ struct cgroup_subsys cpu_cgrp_subsys = {
void dump_cpu_task(int cpu)
{
if (cpu == smp_processor_id() && in_hardirq()) {
if (in_hardirq() && cpu == smp_processor_id()) {
struct pt_regs *regs;
regs = get_irq_regs();
......
......@@ -208,12 +208,25 @@ static int csd_lock_wait_getcpu(call_single_data_t *csd)
return -1;
}
static atomic_t n_csd_lock_stuck;
/**
* csd_lock_is_stuck - Has a CSD-lock acquisition been stuck too long?
*
* Returns @true if a CSD-lock acquisition is stuck and has been stuck
* long enough for a "non-responsive CSD lock" message to be printed.
*/
bool csd_lock_is_stuck(void)
{
return !!atomic_read(&n_csd_lock_stuck);
}
/*
* Complain if too much time spent waiting. Note that only
* the CSD_TYPE_SYNC/ASYNC types provide the destination CPU,
* so waiting on other types gets much less information.
*/
static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id)
static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id, unsigned long *nmessages)
{
int cpu = -1;
int cpux;
......@@ -229,15 +242,26 @@ static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, in
cpu = csd_lock_wait_getcpu(csd);
pr_alert("csd: CSD lock (#%d) got unstuck on CPU#%02d, CPU#%02d released the lock.\n",
*bug_id, raw_smp_processor_id(), cpu);
atomic_dec(&n_csd_lock_stuck);
return true;
}
ts2 = sched_clock();
/* How long since we last checked for a stuck CSD lock.*/
ts_delta = ts2 - *ts1;
if (likely(ts_delta <= csd_lock_timeout_ns || csd_lock_timeout_ns == 0))
if (likely(ts_delta <= csd_lock_timeout_ns * (*nmessages + 1) *
(!*nmessages ? 1 : (ilog2(num_online_cpus()) / 2 + 1)) ||
csd_lock_timeout_ns == 0))
return false;
if (ts0 > ts2) {
/* Our own sched_clock went backward; don't blame another CPU. */
ts_delta = ts0 - ts2;
pr_alert("sched_clock on CPU %d went backward by %llu ns\n", raw_smp_processor_id(), ts_delta);
*ts1 = ts2;
return false;
}
firsttime = !*bug_id;
if (firsttime)
*bug_id = atomic_inc_return(&csd_bug_count);
......@@ -249,9 +273,12 @@ static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, in
cpu_cur_csd = smp_load_acquire(&per_cpu(cur_csd, cpux)); /* Before func and info. */
/* How long since this CSD lock was stuck. */
ts_delta = ts2 - ts0;
pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %llu ns for CPU#%02d %pS(%ps).\n",
firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), ts_delta,
pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %lld ns for CPU#%02d %pS(%ps).\n",
firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), (s64)ts_delta,
cpu, csd->func, csd->info);
(*nmessages)++;
if (firsttime)
atomic_inc(&n_csd_lock_stuck);
/*
* If the CSD lock is still stuck after 5 minutes, it is unlikely
* to become unstuck. Use a signed comparison to avoid triggering
......@@ -290,12 +317,13 @@ static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, in
*/
static void __csd_lock_wait(call_single_data_t *csd)
{
unsigned long nmessages = 0;
int bug_id = 0;
u64 ts0, ts1;
ts1 = ts0 = sched_clock();
for (;;) {
if (csd_lock_wait_toolong(csd, ts0, &ts1, &bug_id))
if (csd_lock_wait_toolong(csd, ts0, &ts1, &bug_id, &nmessages))
break;
cpu_relax();
}
......
......@@ -1614,6 +1614,7 @@ config SCF_TORTURE_TEST
config CSD_LOCK_WAIT_DEBUG
bool "Debugging for csd_lock_wait(), called from smp_call_function*()"
depends on DEBUG_KERNEL
depends on SMP
depends on 64BIT
default n
help
......
......@@ -21,12 +21,10 @@ fi
bpftrace -e 'kprobe:kvfree_call_rcu,
kprobe:call_rcu,
kprobe:call_rcu_tasks,
kprobe:call_rcu_tasks_rude,
kprobe:call_rcu_tasks_trace,
kprobe:call_srcu,
kprobe:rcu_barrier,
kprobe:rcu_barrier_tasks,
kprobe:rcu_barrier_tasks_rude,
kprobe:rcu_barrier_tasks_trace,
kprobe:srcu_barrier,
kprobe:synchronize_rcu,
......
......@@ -68,6 +68,8 @@ config_override_param "--gdb options" KcList "$TORTURE_KCONFIG_GDB_ARG"
config_override_param "--kasan options" KcList "$TORTURE_KCONFIG_KASAN_ARG"
config_override_param "--kcsan options" KcList "$TORTURE_KCONFIG_KCSAN_ARG"
config_override_param "--kconfig argument" KcList "$TORTURE_KCONFIG_ARG"
config_override_param "$config_dir/CFcommon.$(uname -m)" KcList \
"`cat $config_dir/CFcommon.$(uname -m) 2> /dev/null`"
cp $T/KcList $resdir/ConfigFragment
base_resdir=`echo $resdir | sed -e 's/\.[0-9]\+$//'`
......
......@@ -19,10 +19,10 @@ PATH=${RCUTORTURE}/bin:$PATH; export PATH
TORTURE_ALLOTED_CPUS="`identify_qemu_vcpus`"
MAKE_ALLOTED_CPUS=$((TORTURE_ALLOTED_CPUS*2))
HALF_ALLOTED_CPUS=$((TORTURE_ALLOTED_CPUS/2))
if test "$HALF_ALLOTED_CPUS" -lt 1
SCALE_ALLOTED_CPUS=$((TORTURE_ALLOTED_CPUS/2))
if test "$SCALE_ALLOTED_CPUS" -lt 1
then
HALF_ALLOTED_CPUS=1
SCALE_ALLOTED_CPUS=1
fi
VERBOSE_BATCH_CPUS=$((TORTURE_ALLOTED_CPUS/16))
if test "$VERBOSE_BATCH_CPUS" -lt 2
......@@ -90,6 +90,7 @@ usage () {
echo " --do-scftorture / --do-no-scftorture / --no-scftorture"
echo " --do-srcu-lockdep / --do-no-srcu-lockdep / --no-srcu-lockdep"
echo " --duration [ <minutes> | <hours>h | <days>d ]"
echo " --guest-cpu-limit N"
echo " --kcsan-kmake-arg kernel-make-arguments"
exit 1
}
......@@ -203,6 +204,21 @@ do
duration_base=$(($ts*mult))
shift
;;
--guest-cpu-limit|--guest-cpu-lim)
checkarg --guest-cpu-limit "(number)" "$#" "$2" '^[0-9]*$' '^--'
if (("$2" <= "$TORTURE_ALLOTED_CPUS" / 2))
then
SCALE_ALLOTED_CPUS="$2"
VERBOSE_BATCH_CPUS="$((SCALE_ALLOTED_CPUS/8))"
if (("$VERBOSE_BATCH_CPUS" < 2))
then
VERBOSE_BATCH_CPUS=0
fi
else
echo "Ignoring value of $2 for --guest-cpu-limit which is greater than (("$TORTURE_ALLOTED_CPUS" / 2))."
fi
shift
;;
--kcsan-kmake-arg|--kcsan-kmake-args)
checkarg --kcsan-kmake-arg "(kernel make arguments)" $# "$2" '.*' '^error$'
kcsan_kmake_args="`echo "$kcsan_kmake_args $2" | sed -e 's/^ *//' -e 's/ *$//'`"
......@@ -425,9 +441,9 @@ fi
if test "$do_scftorture" = "yes"
then
# Scale memory based on the number of CPUs.
scfmem=$((3+HALF_ALLOTED_CPUS/16))
torture_bootargs="scftorture.nthreads=$HALF_ALLOTED_CPUS torture.disable_onoff_at_boot csdlock_debug=1"
torture_set "scftorture" tools/testing/selftests/rcutorture/bin/kvm.sh --torture scf --allcpus --duration "$duration_scftorture" --configs "$configs_scftorture" --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --memory ${scfmem}G --trust-make
scfmem=$((3+SCALE_ALLOTED_CPUS/16))
torture_bootargs="scftorture.nthreads=$SCALE_ALLOTED_CPUS torture.disable_onoff_at_boot csdlock_debug=1"
torture_set "scftorture" tools/testing/selftests/rcutorture/bin/kvm.sh --torture scf --allcpus --duration "$duration_scftorture" --configs "$configs_scftorture" --kconfig "CONFIG_NR_CPUS=$SCALE_ALLOTED_CPUS" --memory ${scfmem}G --trust-make
fi
if test "$do_rt" = "yes"
......@@ -471,8 +487,8 @@ for prim in $primlist
do
if test -n "$firsttime"
then
torture_bootargs="refscale.scale_type="$prim" refscale.nreaders=$HALF_ALLOTED_CPUS refscale.loops=10000 refscale.holdoff=20 torture.disable_onoff_at_boot"
torture_set "refscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture refscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --bootargs "refscale.verbose_batched=$VERBOSE_BATCH_CPUS torture.verbose_sleep_frequency=8 torture.verbose_sleep_duration=$VERBOSE_BATCH_CPUS" --trust-make
torture_bootargs="refscale.scale_type="$prim" refscale.nreaders=$SCALE_ALLOTED_CPUS refscale.loops=10000 refscale.holdoff=20 torture.disable_onoff_at_boot"
torture_set "refscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture refscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$SCALE_ALLOTED_CPUS" --bootargs "refscale.verbose_batched=$VERBOSE_BATCH_CPUS torture.verbose_sleep_frequency=8 torture.verbose_sleep_duration=$VERBOSE_BATCH_CPUS" --trust-make
mv $T/last-resdir-nodebug $T/first-resdir-nodebug || :
if test -f "$T/last-resdir-kasan"
then
......@@ -520,8 +536,8 @@ for prim in $primlist
do
if test -n "$firsttime"
then
torture_bootargs="rcuscale.scale_type="$prim" rcuscale.nwriters=$HALF_ALLOTED_CPUS rcuscale.holdoff=20 torture.disable_onoff_at_boot"
torture_set "rcuscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --trust-make
torture_bootargs="rcuscale.scale_type="$prim" rcuscale.nwriters=$SCALE_ALLOTED_CPUS rcuscale.holdoff=20 torture.disable_onoff_at_boot"
torture_set "rcuscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$SCALE_ALLOTED_CPUS" --trust-make
mv $T/last-resdir-nodebug $T/first-resdir-nodebug || :
if test -f "$T/last-resdir-kasan"
then
......@@ -559,7 +575,7 @@ do_kcsan="$do_kcsan_save"
if test "$do_kvfree" = "yes"
then
torture_bootargs="rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 rcuscale.holdoff=20 rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot"
torture_set "rcuscale-kvfree" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration $duration_rcutorture --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --memory 2G --trust-make
torture_set "rcuscale-kvfree" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration $duration_rcutorture --kconfig "CONFIG_NR_CPUS=$SCALE_ALLOTED_CPUS" --memory 2G --trust-make
fi
if test "$do_clocksourcewd" = "yes"
......
CONFIG_RCU_TORTURE_TEST=y
CONFIG_PRINTK_TIME=y
CONFIG_HYPERVISOR_GUEST=y
CONFIG_PARAVIRT=y
CONFIG_KVM_GUEST=y
CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC=n
CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n
CONFIG_HYPERVISOR_GUEST=y
CONFIG_KVM_GUEST=y
CONFIG_HYPERVISOR_GUEST=y
CONFIG_KVM_GUEST=y
......@@ -2,3 +2,4 @@ nohz_full=2-9
rcutorture.stall_cpu=14
rcutorture.stall_cpu_holdoff=90
rcutorture.fwd_progress=0
rcutree.nohz_full_patience_delay=1000
CONFIG_SMP=n
CONFIG_PREEMPT_NONE=y
CONFIG_PREEMPT_VOLUNTARY=n
CONFIG_PREEMPT=n
CONFIG_PREEMPT_DYNAMIC=n
#CHECK#CONFIG_PREEMPT_RCU=n
CONFIG_HZ_PERIODIC=n
CONFIG_NO_HZ_IDLE=y
CONFIG_NO_HZ_FULL=n
CONFIG_HOTPLUG_CPU=n
CONFIG_SUSPEND=n
CONFIG_HIBERNATION=n
CONFIG_RCU_NOCB_CPU=n
CONFIG_DEBUG_LOCK_ALLOC=n
CONFIG_PROVE_LOCKING=n
CONFIG_RCU_BOOST=n
CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
CONFIG_RCU_EXPERT=y
CONFIG_KPROBES=n
CONFIG_FTRACE=n
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment