ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines

The function tracer needs to be more careful than other subsystems when it
comes to freeing data. Especially if that data is actually executable code.
When a single function is traced, a trampoline can be dynamically allocated
which is called to jump to the function trace callback. When the callback is
no longer needed, the dynamic allocated trampoline needs to be freed. This
is where the issues arise. The dynamically allocated trampoline must not be
used again. As function tracing can trace all subsystems, including
subsystems that are used to serialize aspects of freeing (namely RCU), it
must take extra care when doing the freeing.

Before synchronize_rcu_tasks() was around, there was no way for the function
tracer to know that nothing was using the dynamically allocated trampoline
when CONFIG_PREEMPT was enabled. That's because a task could be indefinitely
preempted while sitting on the trampoline. Now with synchronize_rcu_tasks(),
it will wait till all tasks have either voluntarily scheduled (not on the
trampoline) or goes into userspace (not on the trampoline). Then it is safe
to free the trampoline even with CONFIG_PREEMPT set.
Acked-by: default avatar"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
parent 696ced4f
...@@ -134,7 +134,8 @@ config FUNCTION_TRACER ...@@ -134,7 +134,8 @@ config FUNCTION_TRACER
select KALLSYMS select KALLSYMS
select GENERIC_TRACER select GENERIC_TRACER
select CONTEXT_SWITCH_TRACER select CONTEXT_SWITCH_TRACER
select GLOB select GLOB
select TASKS_RCU if PREEMPT
help help
Enable the kernel to trace every kernel function. This is done Enable the kernel to trace every kernel function. This is done
by using a compiler feature to insert a small, 5-byte No-Operation by using a compiler feature to insert a small, 5-byte No-Operation
......
...@@ -2808,18 +2808,28 @@ static int ftrace_shutdown(struct ftrace_ops *ops, int command) ...@@ -2808,18 +2808,28 @@ static int ftrace_shutdown(struct ftrace_ops *ops, int command)
* callers are done before leaving this function. * callers are done before leaving this function.
* The same goes for freeing the per_cpu data of the per_cpu * The same goes for freeing the per_cpu data of the per_cpu
* ops. * ops.
*
* Again, normal synchronize_sched() is not good enough.
* We need to do a hard force of sched synchronization.
* This is because we use preempt_disable() to do RCU, but
* the function tracers can be called where RCU is not watching
* (like before user_exit()). We can not rely on the RCU
* infrastructure to do the synchronization, thus we must do it
* ourselves.
*/ */
if (ops->flags & (FTRACE_OPS_FL_DYNAMIC | FTRACE_OPS_FL_PER_CPU)) { if (ops->flags & (FTRACE_OPS_FL_DYNAMIC | FTRACE_OPS_FL_PER_CPU)) {
/*
* We need to do a hard force of sched synchronization.
* This is because we use preempt_disable() to do RCU, but
* the function tracers can be called where RCU is not watching
* (like before user_exit()). We can not rely on the RCU
* infrastructure to do the synchronization, thus we must do it
* ourselves.
*/
schedule_on_each_cpu(ftrace_sync); schedule_on_each_cpu(ftrace_sync);
/*
* When the kernel is preeptive, tasks can be preempted
* while on a ftrace trampoline. Just scheduling a task on
* a CPU is not good enough to flush them. Calling
* synchornize_rcu_tasks() will wait for those tasks to
* execute and either schedule voluntarily or enter user space.
*/
if (IS_ENABLED(CONFIG_PREEMPT))
synchronize_rcu_tasks();
arch_ftrace_trampoline_free(ops); arch_ftrace_trampoline_free(ops);
if (ops->flags & FTRACE_OPS_FL_PER_CPU) if (ops->flags & FTRACE_OPS_FL_PER_CPU)
...@@ -5366,22 +5376,6 @@ void __weak arch_ftrace_update_trampoline(struct ftrace_ops *ops) ...@@ -5366,22 +5376,6 @@ void __weak arch_ftrace_update_trampoline(struct ftrace_ops *ops)
static void ftrace_update_trampoline(struct ftrace_ops *ops) static void ftrace_update_trampoline(struct ftrace_ops *ops)
{ {
/*
* Currently there's no safe way to free a trampoline when the kernel
* is configured with PREEMPT. That is because a task could be preempted
* when it jumped to the trampoline, it may be preempted for a long time
* depending on the system load, and currently there's no way to know
* when it will be off the trampoline. If the trampoline is freed
* too early, when the task runs again, it will be executing on freed
* memory and crash.
*/
#ifdef CONFIG_PREEMPT
/* Currently, only non dynamic ops can have a trampoline */
if (ops->flags & FTRACE_OPS_FL_DYNAMIC)
return;
#endif
arch_ftrace_update_trampoline(ops); arch_ftrace_update_trampoline(ops);
} }
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment