Commit b72b5fec authored by Linus Torvalds

Merge tag 'trace-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing updates from Steven Rostedt:

 - Add function names as a way to filter function addresses

 - Add sample module to test ftrace ops and dynamic trampolines

 - Allow stack traces to be passed from a beginning event to an end event
   for synthetic events. This makes it possible to record the stack trace
   when a task is scheduled out and to report it when the task is scheduled
   back in.

 - Add trace event helper __get_buf() to use as a temporary buffer when
   printing out trace event output.

 - Add a kernel command line option to create trace instances on boot up.

 - Add enabling of events to instances created at boot up.

 - Add trace_array_puts() to write into instances.

 - Allow boot instances to take a snapshot at the end of boot up.

 - Allow live patch modules to include trace events

 - Minor fixes and clean ups

* tag 'trace-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (31 commits)
  tracing: Remove unnecessary NULL assignment
  tracepoint: Allow livepatch module add trace event
  tracing: Always use canonical ftrace path
  tracing/histogram: Fix stacktrace histogram Documentation
  tracing/histogram: Fix stacktrace key
  tracing/histogram: Fix a few problems with stacktrace variable printing
  tracing: Add BUILD_BUG() to make sure stacktrace fits in strings
  tracing/histogram: Don't use strlen to find length of stacktrace variables
  tracing: Allow boot instances to have snapshot buffers
  tracing: Add trace_array_puts() to write into instance
  tracing: Add enabling of events to boot instances
  tracing: Add creation of instances at boot command line
  tracing: Fix trace_event_raw_event_synth() if else statement
  samples: ftrace: Make some global variables static
  ftrace: sample: avoid open-coded 64-bit division
  samples: ftrace: Include the nospec-branch.h only for x86
  tracing: Acquire buffer from temporary trace sequence
  tracing/histogram: Wrap remaining shell snippets in code blocks
  tracing/osnoise: No need for schedule_hrtimeout range
  bpf/tracing: Use stage6 of tracing to not duplicate macros
  ...
parents 91914238 7568a21e
...@@ -1509,6 +1509,15 @@
 			boot up that is likely to be overridden by user space
 			start up functionality.
+
+			Optionally, the snapshot can also be defined for a tracing
+			instance that was created by the trace_instance= command
+			line parameter.
+
+			trace_instance=foo,sched_switch ftrace_boot_snapshot=foo
+
+			The above will cause the "foo" tracing instance to trigger
+			a snapshot at the end of boot up.
 
 	ftrace_dump_on_oops[=orig_cpu]
 			[FTRACE] will dump the trace buffers on oops.
 			If no parameter is passed, ftrace will dump
...@@ -6283,6 +6292,26 @@
 			comma-separated list of trace events to enable. See
 			also Documentation/trace/events.rst
+
+	trace_instance=[instance-info]
+			[FTRACE] Create a ring buffer instance early in boot up.
+			This will be listed in:
+
+				/sys/kernel/tracing/instances
+
+			Events can be enabled at the time the instance is created
+			via:
+
+				trace_instance=<name>,<system1>:<event1>,<system2>:<event2>
+
+			Note, the "<system*>:" portion is optional if the event is
+			unique.
+
+				trace_instance=foo,sched:sched_switch,irq_handler_entry,initcall
+
+			will enable the "sched_switch" event (note, the "sched:" is
+			optional, and the same thing would happen if it was left off),
+			the "irq_handler_entry" event, and all events under the
+			"initcall" system.
 
 	trace_options=[option-list]
 			[FTRACE] Enable or disable tracer options at boot.
 			The option-list is a comma delimited list of options
......
...@@ -207,6 +207,18 @@ field name::
 As the kernel will have to know how to retrieve the memory that the pointer
 is at from user space.
 
+You can convert any long type to a function address and search by function name::
+
+  call_site.function == security_prepare_creds
+
+The above will filter when the field "call_site" falls on the address within
+"security_prepare_creds". That is, it will compare the value of "call_site" and
+the filter will return true if it is greater than or equal to the start of
+the function "security_prepare_creds" and less than the end of that function.
+
+The ".function" postfix can only be attached to values of size long, and can only
+be compared with "==" or "!=".
+
 5.2 Setting filters
 -------------------
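A minimal usage sketch for the new postfix, assuming the "kmem:kmalloc"
event, whose "call_site" field is an unsigned long (the event choice is
illustrative and not part of this hunk)::

  # cd /sys/kernel/tracing
  # echo 'call_site.function == security_prepare_creds' > events/kmem/kmalloc/filter

Only kmalloc events whose call_site falls inside security_prepare_creds()
are then recorded.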
......
...@@ -297,7 +297,7 @@ bool mac_pton(const char *s, u8 *mac);
  *
  * Use tracing_on/tracing_off when you want to quickly turn on or off
  * tracing. It simply enables or disables the recording of the trace events.
- * This also corresponds to the user space /sys/kernel/debug/tracing/tracing_on
+ * This also corresponds to the user space /sys/kernel/tracing/tracing_on
  * file, which gives a means for the kernel and userspace to interact.
  * Place a tracing_off() in the kernel where you want tracing to end.
  * From user space, examine the trace, and then echo 1 > tracing_on
......
...@@ -33,6 +33,18 @@ struct trace_array;
 int register_ftrace_export(struct trace_export *export);
 int unregister_ftrace_export(struct trace_export *export);
 
+/**
+ * trace_array_puts - write a constant string into the trace buffer.
+ * @tr:  The trace array to write to
+ * @str: The constant string to write
+ */
+#define trace_array_puts(tr, str)					\
+	({								\
+		str ? __trace_array_puts(tr, _THIS_IP_, str, strlen(str)) : -1; \
+	})
+int __trace_array_puts(struct trace_array *tr, unsigned long ip,
+		       const char *str, int size);
+
 void trace_printk_init_buffers(void);
 __printf(3, 4)
 int trace_array_printk(struct trace_array *tr, unsigned long ip,
......
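A minimal sketch of how a kernel module might use the new API, assuming an
instance named "foo" (the module boilerplate is illustrative; only
trace_array_puts() and __trace_array_puts() come from this hunk):

	#include <linux/module.h>
	#include <linux/trace.h>

	static struct trace_array *tr;

	static int __init example_init(void)
	{
		/* Create or look up /sys/kernel/tracing/instances/foo */
		tr = trace_array_get_by_name("foo");
		if (!tr)
			return -ENOMEM;

		/* Write a constant string into that instance's buffer */
		trace_array_puts(tr, "** example loaded **\n");
		return 0;
	}

	static void __exit example_exit(void)
	{
		/* Drop the reference so user space may delete the instance */
		trace_array_put(tr);
	}

	module_init(example_init);
	module_exit(example_exit);
	MODULE_LICENSE("GPL");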
...@@ -95,6 +95,7 @@ extern void trace_seq_bitmask(struct trace_seq *s, const unsigned long *maskp,
 extern int trace_seq_hex_dump(struct trace_seq *s, const char *prefix_str,
 			      int prefix_type, int rowsize, int groupsize,
 			      const void *buf, size_t len, bool ascii);
+char *trace_seq_acquire(struct trace_seq *s, unsigned int len);
 
 #else /* CONFIG_TRACING */
 static inline __printf(2, 3)
...@@ -139,6 +140,10 @@ static inline int trace_seq_path(struct trace_seq *s, const struct path *path)
 {
 	return 0;
 }
+static inline char *trace_seq_acquire(struct trace_seq *s, unsigned int len)
+{
+	return NULL;
+}
 
 #endif /* CONFIG_TRACING */
 
 #endif /* _LINUX_TRACE_SEQ_H */
...@@ -482,7 +482,7 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
  * * This is how the trace record is structured and will
  * * be saved into the ring buffer. These are the fields
  * * that will be exposed to user-space in
- * * /sys/kernel/debug/tracing/events/<*>/format.
+ * * /sys/kernel/tracing/events/<*>/format.
  * *
  * * The declared 'local variable' is called '__entry'
  * *
...@@ -542,7 +542,7 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
  * tracepoint callback (this is used by programmatic plugins and
  * can also be used by generic instrumentation like SystemTap), and
  * it is also used to expose a structured trace record in
- * /sys/kernel/debug/tracing/events/.
+ * /sys/kernel/tracing/events/.
  *
  * A set of (un)registration functions can be passed to the variant
  * TRACE_EVENT_FN to perform any (un)registration work.
......
...@@ -4,50 +4,7 @@
 
 #ifdef CONFIG_BPF_EVENTS
 
-#undef __entry
-#define __entry entry
-
-#undef __get_dynamic_array
-#define __get_dynamic_array(field)	\
-		((void *)__entry + (__entry->__data_loc_##field & 0xffff))
-
-#undef __get_dynamic_array_len
-#define __get_dynamic_array_len(field)	\
-		((__entry->__data_loc_##field >> 16) & 0xffff)
-
-#undef __get_str
-#define __get_str(field) ((char *)__get_dynamic_array(field))
-
-#undef __get_bitmask
-#define __get_bitmask(field) (char *)__get_dynamic_array(field)
-
-#undef __get_cpumask
-#define __get_cpumask(field) (char *)__get_dynamic_array(field)
-
-#undef __get_sockaddr
-#define __get_sockaddr(field) ((struct sockaddr *)__get_dynamic_array(field))
-
-#undef __get_rel_dynamic_array
-#define __get_rel_dynamic_array(field)	\
-		((void *)(&__entry->__rel_loc_##field) +	\
-		 sizeof(__entry->__rel_loc_##field) +		\
-		 (__entry->__rel_loc_##field & 0xffff))
-
-#undef __get_rel_dynamic_array_len
-#define __get_rel_dynamic_array_len(field)	\
-		((__entry->__rel_loc_##field >> 16) & 0xffff)
-
-#undef __get_rel_str
-#define __get_rel_str(field) ((char *)__get_rel_dynamic_array(field))
-
-#undef __get_rel_bitmask
-#define __get_rel_bitmask(field) (char *)__get_rel_dynamic_array(field)
-
-#undef __get_rel_cpumask
-#define __get_rel_cpumask(field) (char *)__get_rel_dynamic_array(field)
-
-#undef __get_rel_sockaddr
-#define __get_rel_sockaddr(field) ((struct sockaddr *)__get_rel_dynamic_array(field))
+#include "stages/stage6_event_callback.h"
 
 #undef __perf_count
 #define __perf_count(c)	(c)
......
...@@ -4,51 +4,7 @@
 
 #ifdef CONFIG_PERF_EVENTS
 
-#undef __entry
-#define __entry entry
-
-#undef __get_dynamic_array
-#define __get_dynamic_array(field)	\
-		((void *)__entry + (__entry->__data_loc_##field & 0xffff))
-
-#undef __get_dynamic_array_len
-#define __get_dynamic_array_len(field)	\
-		((__entry->__data_loc_##field >> 16) & 0xffff)
-
-#undef __get_str
-#define __get_str(field) ((char *)__get_dynamic_array(field))
-
-#undef __get_bitmask
-#define __get_bitmask(field) (char *)__get_dynamic_array(field)
-
-#undef __get_cpumask
-#define __get_cpumask(field) (char *)__get_dynamic_array(field)
-
-#undef __get_sockaddr
-#define __get_sockaddr(field) ((struct sockaddr *)__get_dynamic_array(field))
-
-#undef __get_rel_dynamic_array
-#define __get_rel_dynamic_array(field)	\
-		((void *)__entry +		\
-		 offsetof(typeof(*__entry), __rel_loc_##field) +	\
-		 sizeof(__entry->__rel_loc_##field) +	\
-		 (__entry->__rel_loc_##field & 0xffff))
-
-#undef __get_rel_dynamic_array_len
-#define __get_rel_dynamic_array_len(field)	\
-		((__entry->__rel_loc_##field >> 16) & 0xffff)
-
-#undef __get_rel_str
-#define __get_rel_str(field) ((char *)__get_rel_dynamic_array(field))
-
-#undef __get_rel_bitmask
-#define __get_rel_bitmask(field) (char *)__get_rel_dynamic_array(field)
-
-#undef __get_rel_cpumask
-#define __get_rel_cpumask(field) (char *)__get_rel_dynamic_array(field)
-
-#undef __get_rel_sockaddr
-#define __get_rel_sockaddr(field) ((struct sockaddr *)__get_rel_dynamic_array(field))
+#include "stages/stage6_event_callback.h"
 
 #undef __perf_count
 #define __perf_count(c)	(__count = (c))
......
...@@ -139,3 +139,6 @@
 		u64 ____val = (u64)(value);				\
 		(u32) do_div(____val, NSEC_PER_SEC);			\
 	})
+
+#undef __get_buf
+#define __get_buf(len)		trace_seq_acquire(p, (len))
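A hedged sketch of how a trace event could use __get_buf() as scratch space
in its print format (the event and helper are hypothetical; __get_buf() simply
hands back "len" bytes of the temporary trace_seq "p" via trace_seq_acquire()):

	/* Hypothetical helper that formats into the borrowed buffer */
	#define show_status(buf, len, status)				\
		({ snprintf(buf, len, "status=%d", status); buf; })

	TRACE_EVENT(example_status,
		TP_PROTO(int status),
		TP_ARGS(status),
		TP_STRUCT__entry(
			__field(int, status)
		),
		TP_fast_assign(
			__entry->status = status;
		),
		/* Borrow 32 bytes of the output trace_seq as a temp buffer */
		TP_printk("%s", show_status(__get_buf(32), 32, __entry->status))
	);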
...@@ -2,6 +2,9 @@
 
 /* Stage 6 definitions for creating trace events */
 
+/* Reuse some of the stage 3 macros */
+#include "stage3_trace_output.h"
+
 #undef __entry
 #define __entry entry
......
...@@ -23,6 +23,7 @@
 #undef __get_rel_sockaddr
 #undef __print_array
 #undef __print_hex_dump
+#undef __get_buf
 
 /*
  * The below is not executed in the kernel. It is only what is
......
...@@ -242,7 +242,7 @@ config DYNAMIC_FTRACE
 	  enabled, and the functions not enabled will not affect
 	  performance of the system.
 
-	  See the files in /sys/kernel/debug/tracing:
+	  See the files in /sys/kernel/tracing:
 
 	   available_filter_functions
 	   set_ftrace_filter
 	   set_ftrace_notrace
...@@ -306,7 +306,7 @@ config STACK_TRACER
 	select KALLSYMS
 	help
 	  This special tracer records the maximum stack footprint of the
-	  kernel and displays it in /sys/kernel/debug/tracing/stack_trace.
+	  kernel and displays it in /sys/kernel/tracing/stack_trace.
 
 	  This tracer works by hooking into every function call that the
 	  kernel executes, and keeping a maximum stack depth value and
...@@ -346,7 +346,7 @@ config IRQSOFF_TRACER
 	  disabled by default and can be runtime (re-)started
 	  via:
 
-	      echo 0 > /sys/kernel/debug/tracing/tracing_max_latency
+	      echo 0 > /sys/kernel/tracing/tracing_max_latency
 
 	  (Note that kernel size and overhead increase with this option
 	  enabled. This option and the preempt-off timing option can be
...@@ -370,7 +370,7 @@ config PREEMPT_TRACER
 	  disabled by default and can be runtime (re-)started
 	  via:
 
-	      echo 0 > /sys/kernel/debug/tracing/tracing_max_latency
+	      echo 0 > /sys/kernel/tracing/tracing_max_latency
 
 	  (Note that kernel size and overhead increase with this option
 	  enabled. This option and the irqs-off timing option can be
...@@ -522,7 +522,7 @@ config TRACER_SNAPSHOT
 	  Allow tracing users to take snapshot of the current buffer using the
 	  ftrace interface, e.g.:
 
-	      echo 1 > /sys/kernel/debug/tracing/snapshot
+	      echo 1 > /sys/kernel/tracing/snapshot
 	      cat snapshot
 
 config TRACER_SNAPSHOT_PER_CPU_SWAP
...@@ -534,7 +534,7 @@ config TRACER_SNAPSHOT_PER_CPU_SWAP
 	  full swap (all buffers). If this is set, then the following is
 	  allowed:
 
-	      echo 1 > /sys/kernel/debug/tracing/per_cpu/cpu2/snapshot
+	      echo 1 > /sys/kernel/tracing/per_cpu/cpu2/snapshot
 
 	  After which, only the tracing buffer for CPU 2 was swapped with
 	  the main tracing buffer, and the other CPU buffers remain the same.
...@@ -581,7 +581,7 @@ config PROFILE_ANNOTATED_BRANCHES
 	  This tracer profiles all likely and unlikely macros
 	  in the kernel. It will display the results in:
 
-	  /sys/kernel/debug/tracing/trace_stat/branch_annotated
+	  /sys/kernel/tracing/trace_stat/branch_annotated
 
 	  Note: this will add a significant overhead; only turn this
 	  on if you need to profile the system's use of these macros.
...@@ -594,7 +594,7 @@ config PROFILE_ALL_BRANCHES
 	  taken in the kernel is recorded whether it hit or miss.
 	  The results will be displayed in:
 
-	  /sys/kernel/debug/tracing/trace_stat/branch_all
+	  /sys/kernel/tracing/trace_stat/branch_all
 
 	  This option also enables the likely/unlikely profiler.
...@@ -645,8 +645,8 @@ config BLK_DEV_IO_TRACE
 	  Tracing also is possible using the ftrace interface, e.g.:
 
 	    echo 1 > /sys/block/sda/sda1/trace/enable
-	    echo blk > /sys/kernel/debug/tracing/current_tracer
-	    cat /sys/kernel/debug/tracing/trace_pipe
+	    echo blk > /sys/kernel/tracing/current_tracer
+	    cat /sys/kernel/tracing/trace_pipe
 
 	  If unsure, say N.
......
...@@ -21,7 +21,7 @@
  * Then:
  *
  * # insmod kernel/trace/kprobe_event_gen_test.ko
- * # cat /sys/kernel/debug/tracing/trace
+ * # cat /sys/kernel/tracing/trace
  *
  * You should see many instances of the "gen_kprobe_test" and
  * "gen_kretprobe_test" events in the trace buffer.
......
...@@ -2864,7 +2864,7 @@ rb_check_timestamp(struct ring_buffer_per_cpu *cpu_buffer,
 		  sched_clock_stable() ? "" :
 		  "If you just came from a suspend/resume,\n"
 		  "please switch to the trace global clock:\n"
-		  "  echo global > /sys/kernel/debug/tracing/trace_clock\n"
+		  "  echo global > /sys/kernel/tracing/trace_clock\n"
 		  "or add trace_clock=global to the kernel command line\n");
 }
...@@ -5604,11 +5604,16 @@ EXPORT_SYMBOL_GPL(ring_buffer_alloc_read_page);
  */
 void ring_buffer_free_read_page(struct trace_buffer *buffer, int cpu, void *data)
 {
-	struct ring_buffer_per_cpu *cpu_buffer = buffer->buffers[cpu];
+	struct ring_buffer_per_cpu *cpu_buffer;
 	struct buffer_data_page *bpage = data;
 	struct page *page = virt_to_page(bpage);
 	unsigned long flags;
 
+	if (!buffer || !buffer->buffers || !buffer->buffers[cpu])
+		return;
+
+	cpu_buffer = buffer->buffers[cpu];
+
 	/* If the page is still in use someplace else, we can't reuse it */
 	if (page_ref_count(page) > 1)
 		goto out;
......
...@@ -22,7 +22,7 @@
  * Then:
  *
  * # insmod kernel/trace/synth_event_gen_test.ko
- * # cat /sys/kernel/debug/tracing/trace
+ * # cat /sys/kernel/tracing/trace
  *
  * You should see several events in the trace buffer -
  * "create_synth_test", "empty_synth_test", and several instances of
......
...@@ -49,6 +49,8 @@
 #include <linux/irq_work.h>
 #include <linux/workqueue.h>
 
+#include <asm/setup.h> /* COMMAND_LINE_SIZE */
+
 #include "trace.h"
 #include "trace_output.h"
...@@ -186,6 +188,12 @@ static char *default_bootup_tracer;
 static bool allocate_snapshot;
 static bool snapshot_at_boot;
 
+static char boot_instance_info[COMMAND_LINE_SIZE] __initdata;
+static int boot_instance_index;
+
+static char boot_snapshot_info[COMMAND_LINE_SIZE] __initdata;
+static int boot_snapshot_index;
+
 static int __init set_cmdline_ftrace(char *str)
 {
 	strlcpy(bootup_tracer_buf, str, MAX_TRACER_SIZE);
...@@ -222,9 +230,22 @@ __setup("traceoff_on_warning", stop_trace_on_warning);
 
 static int __init boot_alloc_snapshot(char *str)
 {
-	allocate_snapshot = true;
-	/* We also need the main ring buffer expanded */
-	ring_buffer_expanded = true;
+	char *slot = boot_snapshot_info + boot_snapshot_index;
+	int left = sizeof(boot_snapshot_info) - boot_snapshot_index;
+	int ret;
+
+	if (str[0] == '=') {
+		str++;
+		if (strlen(str) >= left)
+			return -1;
+
+		ret = snprintf(slot, left, "%s\t", str);
+		boot_snapshot_index += ret;
+	} else {
+		allocate_snapshot = true;
+		/* We also need the main ring buffer expanded */
+		ring_buffer_expanded = true;
+	}
+
 	return 1;
 }
 __setup("alloc_snapshot", boot_alloc_snapshot);
...@@ -239,6 +260,23 @@ static int __init boot_snapshot(char *str)
 __setup("ftrace_boot_snapshot", boot_snapshot);
 
+static int __init boot_instance(char *str)
+{
+	char *slot = boot_instance_info + boot_instance_index;
+	int left = sizeof(boot_instance_info) - boot_instance_index;
+	int ret;
+
+	if (strlen(str) >= left)
+		return -1;
+
+	ret = snprintf(slot, left, "%s\t", str);
+	boot_instance_index += ret;
+
+	return 1;
+}
+__setup("trace_instance=", boot_instance);
+
 static char trace_boot_options_buf[MAX_TRACER_SIZE] __initdata;
 
 static int __init set_trace_boot_options(char *str)
...@@ -1001,13 +1039,8 @@ __buffer_unlock_commit(struct trace_buffer *buffer, struct ring_buffer_event *ev
 	ring_buffer_unlock_commit(buffer);
 }
 
-/**
- * __trace_puts - write a constant string into the trace buffer.
- * @ip:	   The address of the caller
- * @str:   The constant string to write
- * @size:  The size of the string.
- */
-int __trace_puts(unsigned long ip, const char *str, int size)
+int __trace_array_puts(struct trace_array *tr, unsigned long ip,
+		       const char *str, int size)
 {
 	struct ring_buffer_event *event;
 	struct trace_buffer *buffer;
...@@ -1015,7 +1048,7 @@ int __trace_puts(unsigned long ip, const char *str, int size)
 	unsigned int trace_ctx;
 	int alloc;
 
-	if (!(global_trace.trace_flags & TRACE_ITER_PRINTK))
+	if (!(tr->trace_flags & TRACE_ITER_PRINTK))
 		return 0;
 
 	if (unlikely(tracing_selftest_running || tracing_disabled))
...@@ -1024,7 +1057,7 @@ int __trace_puts(unsigned long ip, const char *str, int size)
 	alloc = sizeof(*entry) + size + 2; /* possible \n added */
 
 	trace_ctx = tracing_gen_ctx();
-	buffer = global_trace.array_buffer.buffer;
+	buffer = tr->array_buffer.buffer;
 	ring_buffer_nest_start(buffer);
 	event = __trace_buffer_lock_reserve(buffer, TRACE_PRINT, alloc,
 					    trace_ctx);
...@@ -1046,11 +1079,23 @@ int __trace_puts(unsigned long ip, const char *str, int size)
 	entry->buf[size] = '\0';
 
 	__buffer_unlock_commit(buffer, event);
-	ftrace_trace_stack(&global_trace, buffer, trace_ctx, 4, NULL);
+	ftrace_trace_stack(tr, buffer, trace_ctx, 4, NULL);
  out:
 	ring_buffer_nest_end(buffer);
 	return size;
 }
+EXPORT_SYMBOL_GPL(__trace_array_puts);
+
+/**
+ * __trace_puts - write a constant string into the trace buffer.
+ * @ip:	   The address of the caller
+ * @str:   The constant string to write
+ * @size:  The size of the string.
+ */
+int __trace_puts(unsigned long ip, const char *str, int size)
+{
+	return __trace_array_puts(&global_trace, ip, str, size);
+}
 EXPORT_SYMBOL_GPL(__trace_puts);
 
 /**
...@@ -1142,7 +1187,7 @@ void tracing_snapshot_instance(struct trace_array *tr)
  *
  * Note, make sure to allocate the snapshot with either
  * a tracing_snapshot_alloc(), or by doing it manually
- * with: echo 1 > /sys/kernel/debug/tracing/snapshot
+ * with: echo 1 > /sys/kernel/tracing/snapshot
  *
  * If the snapshot buffer is not allocated, it will stop tracing.
  * Basically making a permanent snapshot.
...@@ -5760,7 +5805,7 @@ static const char readme_msg[] =
 #ifdef CONFIG_SYNTH_EVENTS
 	"  events/synthetic_events\t- Create/append/remove/show synthetic events\n"
 	"\t  Write into this file to define/undefine new synthetic events.\n"
-	"\t     example: echo 'myevent u64 lat; char name[]' >> synthetic_events\n"
+	"\t     example: echo 'myevent u64 lat; char name[]; long[] stack' >> synthetic_events\n"
 #endif
 #endif
 ;
...@@ -9225,10 +9270,6 @@ static int allocate_trace_buffers(struct trace_array *tr, int size)
 	}
 	tr->allocated_snapshot = allocate_snapshot;
 
-	/*
-	 * Only the top level trace array gets its snapshot allocated
-	 * from the kernel command line.
-	 */
 	allocate_snapshot = false;
 #endif
...@@ -10144,6 +10185,79 @@ ssize_t trace_parse_run_command(struct file *file, const char __user *buffer,
 	return ret;
 }
 
+#ifdef CONFIG_TRACER_MAX_TRACE
+__init static bool tr_needs_alloc_snapshot(const char *name)
+{
+	char *test;
+	int len = strlen(name);
+	bool ret;
+
+	if (!boot_snapshot_index)
+		return false;
+
+	if (strncmp(name, boot_snapshot_info, len) == 0 &&
+	    boot_snapshot_info[len] == '\t')
+		return true;
+
+	test = kmalloc(strlen(name) + 3, GFP_KERNEL);
+	if (!test)
+		return false;
+
+	sprintf(test, "\t%s\t", name);
+	ret = strstr(boot_snapshot_info, test) == NULL;
+	kfree(test);
+	return ret;
+}
+
+__init static void do_allocate_snapshot(const char *name)
+{
+	if (!tr_needs_alloc_snapshot(name))
+		return;
+
+	/*
+	 * When allocate_snapshot is set, the next call to
+	 * allocate_trace_buffers() (called by trace_array_get_by_name())
+	 * will allocate the snapshot buffer. That will also clear
+	 * this flag.
+	 */
+	allocate_snapshot = true;
+}
+#else
+static inline void do_allocate_snapshot(const char *name) { }
+#endif
+
+__init static void enable_instances(void)
+{
+	struct trace_array *tr;
+	char *curr_str;
+	char *str;
+	char *tok;
+
+	/* A tab is always appended */
+	boot_instance_info[boot_instance_index - 1] = '\0';
+	str = boot_instance_info;
+
+	while ((curr_str = strsep(&str, "\t"))) {
+
+		tok = strsep(&curr_str, ",");
+
+		if (IS_ENABLED(CONFIG_TRACER_MAX_TRACE))
+			do_allocate_snapshot(tok);
+
+		tr = trace_array_get_by_name(tok);
+		if (!tr) {
+			pr_warn("Failed to create instance buffer %s\n", curr_str);
+			continue;
+		}
+		/* Allow user space to delete it */
+		trace_array_put(tr);
+
+		while ((tok = strsep(&curr_str, ","))) {
+			early_enable_events(tr, tok, true);
+		}
+	}
+}
+
 __init static int tracer_alloc_buffers(void)
 {
 	int ring_buf_size;
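Taken together with the __setup() handlers above, a command line such as the
following (an illustrative example, not taken from this patch):

	trace_instance=foo,sched:sched_switch trace_instance=bar ftrace_boot_snapshot=foo

would create the instances "foo" (with sched_switch enabled and, via
do_allocate_snapshot(), a snapshot buffer) and "bar" under
/sys/kernel/tracing/instances/ during boot.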
...@@ -10277,10 +10391,19 @@ __init static int tracer_alloc_buffers(void)
 
 void __init ftrace_boot_snapshot(void)
 {
+	struct trace_array *tr;
+
 	if (snapshot_at_boot) {
 		tracing_snapshot();
 		internal_trace_puts("** Boot snapshot taken **\n");
 	}
+
+	list_for_each_entry(tr, &ftrace_trace_arrays, list) {
+		if (tr == &global_trace)
+			continue;
+		trace_array_puts(tr, "** Boot snapshot taken **\n");
+		tracing_snapshot_instance(tr);
+	}
 }
 
 void __init early_trace_init(void)
...@@ -10302,6 +10425,9 @@ void __init early_trace_init(void)
 void __init trace_init(void)
 {
 	trace_event_init();
+
+	if (boot_instance_index)
+		enable_instances();
 }
 
 __init static void clear_boot_tracer(void)
......
...@@ -113,6 +113,10 @@ enum trace_type {
 #define MEM_FAIL(condition, fmt, ...)					\
 	DO_ONCE_LITE_IF(condition, pr_err, "ERROR: " fmt, ##__VA_ARGS__)
 
+#define HIST_STACKTRACE_DEPTH	16
+#define HIST_STACKTRACE_SIZE	(HIST_STACKTRACE_DEPTH * sizeof(unsigned long))
+#define HIST_STACKTRACE_SKIP	5
+
 /*
  * syscalls are special, and need special handling, this is why
  * they are not included in trace_entries.h
...@@ -1331,6 +1335,8 @@ DECLARE_PER_CPU(int, trace_buffered_event_cnt);
 void trace_buffered_event_disable(void);
 void trace_buffered_event_enable(void);
 
+void early_enable_events(struct trace_array *tr, char *buf, bool disable_first);
+
 static inline void
 __trace_event_discard_commit(struct trace_buffer *buffer,
 			     struct ring_buffer_event *event)
......
...@@ -2281,8 +2281,6 @@ create_new_subsystem(const char *name)
 	if (!system->name)
 		goto out_free;
 
-	system->filter = NULL;
-
 	system->filter = kzalloc(sizeof(struct event_filter), GFP_KERNEL);
 	if (!system->filter)
 		goto out_free;
...@@ -2843,7 +2841,7 @@ static __init int setup_trace_triggers(char *str)
 		if (!trigger)
 			break;
 		bootup_triggers[i].event = strsep(&trigger, ".");
-		bootup_triggers[i].trigger = strsep(&trigger, ".");
+		bootup_triggers[i].trigger = trigger;
 		if (!bootup_triggers[i].trigger)
 			break;
 	}
...@@ -3771,10 +3769,9 @@ static __init int event_trace_memsetup(void)
 	return 0;
 }
 
-static __init void
-early_enable_events(struct trace_array *tr, bool disable_first)
+__init void
+early_enable_events(struct trace_array *tr, char *buf, bool disable_first)
 {
-	char *buf = bootup_event_buf;
 	char *token;
 	int ret;
...@@ -3827,7 +3824,7 @@ static __init int event_trace_enable(void)
 	 */
 	__trace_early_add_events(tr);
 
-	early_enable_events(tr, false);
+	early_enable_events(tr, bootup_event_buf, false);
 
 	trace_printk_start_comm();
...@@ -3855,7 +3852,7 @@ static __init int event_trace_enable_again(void)
 	if (!tr)
 		return -ENODEV;
 
-	early_enable_events(tr, true);
+	early_enable_events(tr, bootup_event_buf, true);
 
 	return 0;
 }
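The strsep() change above matters because only the first '.' separates the
event name from its trigger; the trigger itself may contain dots. As a hedged
illustration, a boot parameter such as

	trace_trigger="kmalloc.hist:keys=call_site.hex"

previously had its trigger cut off at the second '.', while the new code
keeps the full trigger string intact.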
......
...@@ -64,6 +64,7 @@ enum filter_pred_fn {
 	FILTER_PRED_FN_PCHAR_USER,
 	FILTER_PRED_FN_PCHAR,
 	FILTER_PRED_FN_CPU,
+	FILTER_PRED_FN_FUNCTION,
 	FILTER_PRED_FN_,
 	FILTER_PRED_TEST_VISITED,
 };
...@@ -71,6 +72,7 @@ enum filter_pred_fn {
 struct filter_pred {
 	enum filter_pred_fn	fn_num;
 	u64			val;
+	u64			val2;
 	struct regex		regex;
 	unsigned short		*ops;
 	struct ftrace_event_field *field;
...@@ -103,6 +105,7 @@ struct filter_pred {
 	C(INVALID_FILTER,	"Meaningless filter expression"), \
 	C(IP_FIELD_ONLY,	"Only 'ip' field is supported for function trace"), \
 	C(INVALID_VALUE,	"Invalid value (did you forget quotes)?"), \
+	C(NO_FUNCTION,		"Function not found"), \
 	C(ERRNO,		"Error"), \
 	C(NO_FILTER,		"No filter found")
...@@ -876,6 +879,17 @@ static int filter_pred_comm(struct filter_pred *pred, void *event)
 	return cmp ^ pred->not;
 }
 
+/* Filter predicate for functions. */
+static int filter_pred_function(struct filter_pred *pred, void *event)
+{
+	unsigned long *addr = (unsigned long *)(event + pred->offset);
+	unsigned long start = (unsigned long)pred->val;
+	unsigned long end = (unsigned long)pred->val2;
+	int ret = *addr >= start && *addr < end;
+
+	return pred->op == OP_EQ ? ret : !ret;
+}
+
 /*
  * regex_match_foo - Basic regex callbacks
  *
...@@ -1335,6 +1349,8 @@ static int filter_pred_fn_call(struct filter_pred *pred, void *event)
 		return filter_pred_pchar(pred, event);
 	case FILTER_PRED_FN_CPU:
 		return filter_pred_cpu(pred, event);
+	case FILTER_PRED_FN_FUNCTION:
+		return filter_pred_function(pred, event);
 	case FILTER_PRED_TEST_VISITED:
 		return test_pred_visited_fn(pred, event);
 	default:
...@@ -1350,8 +1366,13 @@ static int parse_pred(const char *str, void *data,
 	struct trace_event_call *call = data;
 	struct ftrace_event_field *field;
 	struct filter_pred *pred = NULL;
+	unsigned long offset;
+	unsigned long size;
+	unsigned long ip;
 	char num_buf[24];	/* Big enough to hold an address */
 	char *field_name;
+	char *name;
+	bool function = false;
 	bool ustring = false;
 	char q;
 	u64 val;
...@@ -1393,6 +1414,12 @@ static int parse_pred(const char *str, void *data,
 		i += len;
 	}
 
+	/* See if the field is a kernel function name */
+	if ((len = str_has_prefix(str + i, ".function"))) {
+		function = true;
+		i += len;
+	}
+
 	while (isspace(str[i]))
 		i++;
...@@ -1423,7 +1450,71 @@ static int parse_pred(const char *str, void *data,
 	pred->offset = field->offset;
 	pred->op = op;
 
-	if (ftrace_event_is_function(call)) {
+	if (function) {
+		/* The field must be the same size as long */
+		if (field->size != sizeof(long)) {
+			parse_error(pe, FILT_ERR_ILLEGAL_FIELD_OP, pos + i);
+			goto err_free;
+		}
+
+		/* Function only works with '==' or '!=' and an unquoted string */
+		switch (op) {
+		case OP_NE:
+		case OP_EQ:
+			break;
+		default:
+			parse_error(pe, FILT_ERR_INVALID_OP, pos + i);
+			goto err_free;
+		}
+
+		if (isdigit(str[i])) {
+			/* We allow 0xDEADBEEF */
+			while (isalnum(str[i]))
+				i++;
+
+			len = i - s;
+			/* 0xfeedfacedeadbeef is 18 chars max */
+			if (len >= sizeof(num_buf)) {
+				parse_error(pe, FILT_ERR_OPERAND_TOO_LONG, pos + i);
+				goto err_free;
+			}
+
+			strncpy(num_buf, str + s, len);
+			num_buf[len] = 0;
+
+			ret = kstrtoul(num_buf, 0, &ip);
+			if (ret) {
+				parse_error(pe, FILT_ERR_INVALID_VALUE, pos + i);
+				goto err_free;
+			}
+		} else {
+			s = i;
+			for (; str[i] && !isspace(str[i]); i++)
+				;
+
+			len = i - s;
+			name = kmemdup_nul(str + s, len, GFP_KERNEL);
+			if (!name)
+				goto err_mem;
+
+			ip = kallsyms_lookup_name(name);
+			kfree(name);
+			if (!ip) {
+				parse_error(pe, FILT_ERR_NO_FUNCTION, pos + i);
+				goto err_free;
+			}
+		}
+
+		/* Now find the function start and end address */
+		if (!kallsyms_lookup_size_offset(ip, &size, &offset)) {
+			parse_error(pe, FILT_ERR_NO_FUNCTION, pos + i);
+			goto err_free;
+		}
+
+		pred->fn_num = FILTER_PRED_FN_FUNCTION;
+		pred->val = ip - offset;
+		pred->val2 = pred->val + size;
+	} else if (ftrace_event_is_function(call)) {
 		/*
 		 * Perf does things different with function events.
 		 * It only allows an "ip" field, and expects a string.
......
...@@ -135,6 +135,7 @@ enum hist_field_fn {
 	HIST_FIELD_FN_DIV_NOT_POWER2,
 	HIST_FIELD_FN_DIV_MULT_SHIFT,
 	HIST_FIELD_FN_EXECNAME,
+	HIST_FIELD_FN_STACK,
 };
 
 /*
...@@ -480,10 +481,6 @@ DEFINE_HIST_FIELD_FN(u8);
 #define for_each_hist_key_field(i, hist_data)	\
 	for ((i) = (hist_data)->n_vals; (i) < (hist_data)->n_fields; (i)++)
 
-#define HIST_STACKTRACE_DEPTH	16
-#define HIST_STACKTRACE_SIZE	(HIST_STACKTRACE_DEPTH * sizeof(unsigned long))
-#define HIST_STACKTRACE_SKIP	5
-
 #define HITCOUNT_IDX 0
 #define HIST_KEY_SIZE_MAX	(MAX_FILTER_STR_VAL + HIST_STACKTRACE_SIZE)
...@@ -1360,7 +1357,12 @@ static const char *hist_field_name(struct hist_field *field,
 			field_name = field->name;
 	} else if (field->flags & HIST_FIELD_FL_TIMESTAMP)
 		field_name = "common_timestamp";
-	else if (field->flags & HIST_FIELD_FL_HITCOUNT)
+	else if (field->flags & HIST_FIELD_FL_STACKTRACE) {
+		if (field->field)
+			field_name = field->field->name;
+		else
+			field_name = "stacktrace";
+	} else if (field->flags & HIST_FIELD_FL_HITCOUNT)
 		field_name = "hitcount";
 
 	if (field_name == NULL)
...@@ -1718,6 +1720,8 @@ static const char *get_hist_field_flags(struct hist_field *hist_field)
 		flags_str = "percent";
 	else if (hist_field->flags & HIST_FIELD_FL_GRAPH)
 		flags_str = "graph";
+	else if (hist_field->flags & HIST_FIELD_FL_STACKTRACE)
+		flags_str = "stacktrace";
 
 	return flags_str;
 }
...@@ -1979,7 +1983,14 @@ static struct hist_field *create_hist_field(struct hist_trigger_data *hist_data,
 	}
 
 	if (flags & HIST_FIELD_FL_STACKTRACE) {
-		hist_field->fn_num = HIST_FIELD_FN_NOP;
+		if (field)
+			hist_field->fn_num = HIST_FIELD_FN_STACK;
+		else
+			hist_field->fn_num = HIST_FIELD_FN_NOP;
+		hist_field->size = HIST_STACKTRACE_SIZE;
+		hist_field->type = kstrdup_const("unsigned long[]", GFP_KERNEL);
+		if (!hist_field->type)
+			goto free;
 		goto out;
 	}
...@@ -2312,6 +2323,8 @@ parse_field(struct hist_trigger_data *hist_data, struct trace_event_file *file,
 			*flags |= HIST_FIELD_FL_EXECNAME;
 		else if (strcmp(modifier, "syscall") == 0)
 			*flags |= HIST_FIELD_FL_SYSCALL;
+		else if (strcmp(modifier, "stacktrace") == 0)
+			*flags |= HIST_FIELD_FL_STACKTRACE;
 		else if (strcmp(modifier, "log2") == 0)
 			*flags |= HIST_FIELD_FL_LOG2;
 		else if (strcmp(modifier, "usecs") == 0)
...@@ -2351,6 +2364,8 @@ parse_field(struct hist_trigger_data *hist_data, struct trace_event_file *file,
 			hist_data->enable_timestamps = true;
 			if (*flags & HIST_FIELD_FL_TIMESTAMP_USECS)
 				hist_data->attrs->ts_in_usecs = true;
+		} else if (strcmp(field_name, "stacktrace") == 0) {
+			*flags |= HIST_FIELD_FL_STACKTRACE;
 		} else if (strcmp(field_name, "common_cpu") == 0)
 			*flags |= HIST_FIELD_FL_CPU;
 		else if (strcmp(field_name, "hitcount") == 0)
...@@ -3111,6 +3126,9 @@ static inline void __update_field_vars(struct tracing_map_elt *elt,
 	unsigned int i, j, var_idx;
 	u64 var_val;
 
+	/* Make sure stacktrace can fit in the string variable length */
+	BUILD_BUG_ON((HIST_STACKTRACE_DEPTH + 1) * sizeof(long) >= STR_VAR_LEN_MAX);
+
 	for (i = 0, j = field_var_str_start; i < n_field_vars; i++) {
 		struct field_var *field_var = field_vars[i];
 		struct hist_field *var = field_var->var;
...@@ -3119,13 +3137,26 @@ static inline void __update_field_vars(struct tracing_map_elt *elt,
 		var_val = hist_fn_call(val, elt, buffer, rbe, rec);
 		var_idx = var->var.idx;
 
-		if (val->flags & HIST_FIELD_FL_STRING) {
+		if (val->flags & (HIST_FIELD_FL_STRING |
+				  HIST_FIELD_FL_STACKTRACE)) {
 			char *str = elt_data->field_var_str[j++];
 			char *val_str = (char *)(uintptr_t)var_val;
 			unsigned int size;
 
-			size = min(val->size, STR_VAR_LEN_MAX);
-			strscpy(str, val_str, size);
+			if (val->flags & HIST_FIELD_FL_STRING) {
+				size = min(val->size, STR_VAR_LEN_MAX);
+				strscpy(str, val_str, size);
+			} else {
+				char *stack_start = str + sizeof(unsigned long);
+				int e;
+
+				e = stack_trace_save((void *)stack_start,
+						     HIST_STACKTRACE_DEPTH,
+						     HIST_STACKTRACE_SKIP);
+				if (e < HIST_STACKTRACE_DEPTH - 1)
+					((unsigned long *)stack_start)[e] = 0;
+				*((unsigned long *)str) = e;
+			}
+
 			var_val = (u64)(uintptr_t)str;
 		}
 		tracing_map_set_var(elt, var_idx, var_val);
...@@ -3824,7 +3855,8 @@ static void save_field_var(struct hist_trigger_data *hist_data,
 {
 	hist_data->field_vars[hist_data->n_field_vars++] = field_var;
 
-	if (field_var->val->flags & HIST_FIELD_FL_STRING)
+	/* Stack traces are saved in the string storage too */
+	if (field_var->val->flags & (HIST_FIELD_FL_STRING | HIST_FIELD_FL_STACKTRACE))
 		hist_data->n_field_var_str++;
 }
...@@ -3849,6 +3881,9 @@ static int check_synth_field(struct synth_event *event,
 	    && field->is_dynamic)
 		return 0;
 
+	if (strstr(hist_field->type, "long[") && field->is_stack)
+		return 0;
+
 	if (strcmp(field->type, hist_field->type) != 0) {
 		if (field->size != hist_field->size ||
 		    (!field->is_string && field->is_signed != hist_field->is_signed))
...@@ -4103,7 +4138,8 @@ static int action_create(struct hist_trigger_data *hist_data,
 			}
 
 			hist_data->save_vars[hist_data->n_save_vars++] = field_var;
-			if (field_var->val->flags & HIST_FIELD_FL_STRING)
+			if (field_var->val->flags &
+			    (HIST_FIELD_FL_STRING | HIST_FIELD_FL_STACKTRACE))
 				hist_data->n_save_var_str++;
 			kfree(param);
 		}
...@@ -4242,6 +4278,19 @@ static u64 hist_field_execname(struct hist_field *hist_field,
 	return (u64)(unsigned long)(elt_data->comm);
 }
 
+static u64 hist_field_stack(struct hist_field *hist_field,
+			    struct tracing_map_elt *elt,
+			    struct trace_buffer *buffer,
+			    struct ring_buffer_event *rbe,
+			    void *event)
+{
+	u32 str_item = *(u32 *)(event + hist_field->field->offset);
+	int str_loc = str_item & 0xffff;
+	char *addr = (char *)(event + str_loc);
+
+	return (u64)(unsigned long)addr;
+}
+
 static u64 hist_fn_call(struct hist_field *hist_field,
 			struct tracing_map_elt *elt,
 			struct trace_buffer *buffer,
...@@ -4305,6 +4354,8 @@ static u64 hist_fn_call(struct hist_field *hist_field,
 		return div_by_mult_and_shift(hist_field, elt, buffer, rbe, event);
 	case HIST_FIELD_FN_EXECNAME:
 		return hist_field_execname(hist_field, elt, buffer, rbe, event);
+	case HIST_FIELD_FN_STACK:
+		return hist_field_stack(hist_field, elt, buffer, rbe, event);
 	default:
 		return 0;
 	}
...@@ -4351,7 +4402,8 @@ static int create_var_field(struct hist_trigger_data *hist_data,
 	if (!ret && hist_data->fields[val_idx]->flags & HIST_FIELD_FL_EXECNAME)
 		update_var_execname(hist_data->fields[val_idx]);
 
-	if (!ret && hist_data->fields[val_idx]->flags & HIST_FIELD_FL_STRING)
+	if (!ret && hist_data->fields[val_idx]->flags &
+	    (HIST_FIELD_FL_STRING | HIST_FIELD_FL_STACKTRACE))
 		hist_data->fields[val_idx]->var_str_idx = hist_data->n_var_str++;
 
 	return ret;
...@@ -5092,7 +5144,8 @@ static void hist_trigger_elt_update(struct hist_trigger_data *hist_data,
 		if (hist_field->flags & HIST_FIELD_FL_VAR) {
 			var_idx = hist_field->var.idx;
 
-			if (hist_field->flags & HIST_FIELD_FL_STRING) {
+			if (hist_field->flags &
+			    (HIST_FIELD_FL_STRING | HIST_FIELD_FL_STACKTRACE)) {
 				unsigned int str_start, var_str_idx, idx;
 				char *str, *val_str;
 				unsigned int size;
...@@ -5105,9 +5158,20 @@ static void hist_trigger_elt_update(struct hist_trigger_data *hist_data,
 				str = elt_data->field_var_str[idx];
 				val_str = (char *)(uintptr_t)hist_val;
 
-				size = min(hist_field->size, STR_VAR_LEN_MAX);
-				strscpy(str, val_str, size);
+				if (hist_field->flags & HIST_FIELD_FL_STRING) {
+					size = min(hist_field->size, STR_VAR_LEN_MAX);
+					strscpy(str, val_str, size);
+				} else {
+					char *stack_start = str + sizeof(unsigned long);
+					int e;
+
+					e = stack_trace_save((void *)stack_start,
+							     HIST_STACKTRACE_DEPTH,
+							     HIST_STACKTRACE_SKIP);
+					if (e < HIST_STACKTRACE_DEPTH - 1)
+						((unsigned long *)stack_start)[e] = 0;
+					*((unsigned long *)str) = e;
+				}
 
 				hist_val = (u64)(uintptr_t)str;
 			}
 			tracing_map_set_var(elt, var_idx, hist_val);
...@@ -5193,8 +5257,17 @@ static void event_hist_trigger(struct event_trigger_data *data,
 
 		if (key_field->flags & HIST_FIELD_FL_STACKTRACE) {
 			memset(entries, 0, HIST_STACKTRACE_SIZE);
-			stack_trace_save(entries, HIST_STACKTRACE_DEPTH,
-					 HIST_STACKTRACE_SKIP);
+			if (key_field->field) {
+				unsigned long *stack, n_entries;
+
+				field_contents = hist_fn_call(key_field, elt, buffer, rbe, rec);
+				stack = (unsigned long *)(long)field_contents;
+				n_entries = *stack;
+				memcpy(entries, ++stack, n_entries * sizeof(unsigned long));
+			} else {
+				stack_trace_save(entries, HIST_STACKTRACE_DEPTH,
+						 HIST_STACKTRACE_SKIP);
+			}
 			key = entries;
 		} else {
 			field_contents = hist_fn_call(key_field, elt, buffer, rbe, rec);
...@@ -5297,7 +5370,10 @@ static void hist_trigger_print_key(struct seq_file *m,
 			seq_printf(m, "%s: %-30s[%3llu]", field_name,
 				   syscall_name, uval);
 		} else if (key_field->flags & HIST_FIELD_FL_STACKTRACE) {
-			seq_puts(m, "stacktrace:\n");
+			if (key_field->field)
+				seq_printf(m, "%s.stacktrace", key_field->field->name);
+			else
+				seq_puts(m, "stacktrace:\n");
 			hist_trigger_stacktrace_print(m,
 						      key + key_field->offset,
 						      HIST_STACKTRACE_DEPTH);
...@@ -5842,7 +5918,8 @@ static void hist_field_print(struct seq_file *m, struct hist_field *hist_field)
 	if (hist_field->flags) {
 		if (!(hist_field->flags & HIST_FIELD_FL_VAR_REF) &&
-		    !(hist_field->flags & HIST_FIELD_FL_EXPR)) {
+		    !(hist_field->flags & HIST_FIELD_FL_EXPR) &&
+		    !(hist_field->flags & HIST_FIELD_FL_STACKTRACE)) {
 			const char *flags = get_hist_field_flags(hist_field);
 
 			if (flags)
...@@ -5875,9 +5952,12 @@ static int event_hist_trigger_print(struct seq_file *m,
 		if (i > hist_data->n_vals)
 			seq_puts(m, ",");
 
-		if (field->flags & HIST_FIELD_FL_STACKTRACE)
-			seq_puts(m, "stacktrace");
-		else
+		if (field->flags & HIST_FIELD_FL_STACKTRACE) {
+			if (field->field)
+				seq_printf(m, "%s.stacktrace", field->field->name);
+			else
+				seq_puts(m, "stacktrace");
+		} else
 			hist_field_print(m, field);
 	}
......
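A hedged end-to-end sketch of the feature these hunks implement, modeled on
the series' documentation examples (the synthetic event name and trigger
strings are illustrative):

	# cd /sys/kernel/tracing
	# echo 's:block_lat pid_t pid; u64 delta; unsigned long[] stack;' >> dynamic_events
	# echo 'hist:keys=next_pid:ts=common_timestamp.usecs,st=stacktrace if prev_state == 2' \
		>> events/sched/sched_switch/trigger
	# echo 'hist:keys=prev_pid:delta=common_timestamp.usecs-$ts,s=$st:onmax($delta).trace(block_lat,prev_pid,$delta,$s)' \
		>> events/sched/sched_switch/trigger

The first trigger saves a timestamp and a stack trace when a task blocks
(prev_state == 2); the second computes the blocked time when the task is
scheduled back in and hands both the delta and the saved stack to the
"block_lat" synthetic event.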
...@@ -173,6 +173,14 @@ static int synth_field_is_string(char *type)
 	return false;
 }
 
+static int synth_field_is_stack(char *type)
+{
+	if (strstr(type, "long[") != NULL)
+		return true;
+
+	return false;
+}
+
 static int synth_field_string_size(char *type)
 {
 	char buf[4], *end, *start;
...@@ -248,6 +256,8 @@ static int synth_field_size(char *type)
 		size = sizeof(gfp_t);
 	else if (synth_field_is_string(type))
 		size = synth_field_string_size(type);
+	else if (synth_field_is_stack(type))
+		size = 0;
 
 	return size;
 }
...@@ -292,6 +302,8 @@ static const char *synth_field_fmt(char *type)
 		fmt = "%x";
 	else if (synth_field_is_string(type))
 		fmt = "%.*s";
+	else if (synth_field_is_stack(type))
+		fmt = "%s";
 
 	return fmt;
 }
...@@ -371,6 +383,23 @@ static enum print_line_t print_synth_event(struct trace_iterator *iter, ...@@ -371,6 +383,23 @@ static enum print_line_t print_synth_event(struct trace_iterator *iter,
i == se->n_fields - 1 ? "" : " "); i == se->n_fields - 1 ? "" : " ");
n_u64 += STR_VAR_LEN_MAX / sizeof(u64); n_u64 += STR_VAR_LEN_MAX / sizeof(u64);
} }
} else if (se->fields[i]->is_stack) {
u32 offset, data_offset, len;
unsigned long *p, *end;
offset = (u32)entry->fields[n_u64];
data_offset = offset & 0xffff;
len = offset >> 16;
p = (void *)entry + data_offset;
end = (void *)p + len - (sizeof(long) - 1);
trace_seq_printf(s, "%s=STACK:\n", se->fields[i]->name);
for (; *p && p < end; p++)
trace_seq_printf(s, "=> %pS\n", (void *)*p);
n_u64++;
} else { } else {
struct trace_print_flags __flags[] = { struct trace_print_flags __flags[] = {
__def_gfpflag_names, {-1, NULL} }; __def_gfpflag_names, {-1, NULL} };
@@ -416,8 +445,7 @@ static unsigned int trace_string(struct synth_trace_event *entry,
 	if (is_dynamic) {
 		u32 data_offset;

-		data_offset = offsetof(typeof(*entry), fields);
-		data_offset += event->n_u64 * sizeof(u64);
+		data_offset = struct_size(entry, fields, event->n_u64);
 		data_offset += data_size;

 		len = kern_fetch_store_strlen((unsigned long)str_val);
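For reference, the struct_size() form computes the same offset as the two lines it replaces, while also saturating instead of wrapping if the multiplication were to overflow:

	/* Both compute the start of the dynamic data area: */
	data_offset = offsetof(typeof(*entry), fields) + event->n_u64 * sizeof(u64);
	data_offset = struct_size(entry, fields, event->n_u64);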
@@ -447,6 +475,43 @@ static unsigned int trace_string(struct synth_trace_event *entry,
 	return len;
 }

+static unsigned int trace_stack(struct synth_trace_event *entry,
+				struct synth_event *event,
+				long *stack,
+				unsigned int data_size,
+				unsigned int *n_u64)
+{
+	unsigned int len;
+	u32 data_offset;
+	void *data_loc;
+
+	data_offset = struct_size(entry, fields, event->n_u64);
+	data_offset += data_size;
+
+	for (len = 0; len < HIST_STACKTRACE_DEPTH; len++) {
+		if (!stack[len])
+			break;
+	}
+
+	/* Include the zero'd element if it fits */
+	if (len < HIST_STACKTRACE_DEPTH)
+		len++;
+
+	len *= sizeof(long);
+
+	/* Find the dynamic section to copy the stack into. */
+	data_loc = (void *)entry + data_offset;
+	memcpy(data_loc, stack, len);
+
+	/* Fill in the field that holds the offset/len combo */
+	data_offset |= len << 16;
+	*(u32 *)&entry->fields[*n_u64] = data_offset;
+	(*n_u64)++;
+
+	return len;
+}
+
 static notrace void trace_event_raw_event_synth(void *__data,
 						u64 *var_ref_vals,
 						unsigned int *var_ref_idx)
@@ -473,7 +538,12 @@ static notrace void trace_event_raw_event_synth(void *__data,
 			val_idx = var_ref_idx[field_pos];
 			str_val = (char *)(long)var_ref_vals[val_idx];

-			len = kern_fetch_store_strlen((unsigned long)str_val);
+			if (event->dynamic_fields[i]->is_stack) {
+				len = *((unsigned long *)str_val);
+				len *= sizeof(unsigned long);
+			} else {
+				len = kern_fetch_store_strlen((unsigned long)str_val);
+			}

 			fields_size += len;
 		}
@@ -499,6 +569,12 @@ static notrace void trace_event_raw_event_synth(void *__data,
 					   event->fields[i]->is_dynamic,
 					   data_size, &n_u64);
 			data_size += len; /* only dynamic string increments */
+		} else if (event->fields[i]->is_stack) {
+			long *stack = (long *)(long)var_ref_vals[val_idx];
+
+			len = trace_stack(entry, event, stack,
+					  data_size, &n_u64);
+			data_size += len;
 		} else {
 			struct synth_field *field = event->fields[i];
 			u64 val = var_ref_vals[val_idx];
@@ -561,6 +637,9 @@ static int __set_synth_event_print_fmt(struct synth_event *event,
 		    event->fields[i]->is_dynamic)
 			pos += snprintf(buf + pos, LEN_OR_ZERO,
 				", __get_str(%s)", event->fields[i]->name);
+		else if (event->fields[i]->is_stack)
+			pos += snprintf(buf + pos, LEN_OR_ZERO,
+				", __get_stacktrace(%s)", event->fields[i]->name);
 		else
 			pos += snprintf(buf + pos, LEN_OR_ZERO,
 				", REC->%s", event->fields[i]->name);
@@ -697,7 +776,8 @@ static struct synth_field *parse_synth_field(int argc, char **argv,
 		ret = -EINVAL;
 		goto free;
 	} else if (size == 0) {
-		if (synth_field_is_string(field->type)) {
+		if (synth_field_is_string(field->type) ||
+		    synth_field_is_stack(field->type)) {
 			char *type;

 			len = sizeof("__data_loc ") + strlen(field->type) + 1;
@@ -728,6 +808,8 @@ static struct synth_field *parse_synth_field(int argc, char **argv,
 	if (synth_field_is_string(field->type))
 		field->is_string = true;
+	else if (synth_field_is_stack(field->type))
+		field->is_stack = true;

 	field->is_signed = synth_field_signed(field->type);
 out:
...
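With parse_synth_field() accepting the stack type, a synthetic event carrying a stack field can be declared from user space, for example with the same command the selftest below uses:

	# echo 's:wake_lat pid_t pid; u64 delta; unsigned long[] stack;' > dynamic_events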
@@ -1539,7 +1539,7 @@ static void osnoise_sleep(void)
 	wake_time = ktime_add_us(ktime_get(), interval);
 	__set_current_state(TASK_INTERRUPTIBLE);
-	while (schedule_hrtimeout_range(&wake_time, 0, HRTIMER_MODE_ABS)) {
+	while (schedule_hrtimeout(&wake_time, HRTIMER_MODE_ABS)) {
 		if (kthread_should_stop())
 			break;
 	}
...
@@ -403,3 +403,26 @@ int trace_seq_hex_dump(struct trace_seq *s, const char *prefix_str,
 	return 1;
 }
 EXPORT_SYMBOL(trace_seq_hex_dump);
+
+/*
+ * trace_seq_acquire - acquire seq buffer with size len
+ * @s: trace sequence descriptor
+ * @len: size of buffer to be acquired
+ *
+ * Acquire a buffer of size @len from the trace_seq for output usage;
+ * the user can fill a string into that buffer.
+ *
+ * Returns the start address of the acquired buffer.
+ *
+ * It allows multiple uses in one trace output function call.
+ */
+char *trace_seq_acquire(struct trace_seq *s, unsigned int len)
+{
+	char *ret = trace_seq_buffer_ptr(s);
+
+	if (!WARN_ON_ONCE(seq_buf_buffer_left(&s->seq) < len))
+		seq_buf_commit(&s->seq, len);
+
+	return ret;
+}
+EXPORT_SYMBOL(trace_seq_acquire);
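As a usage sketch (a hypothetical caller, not part of this patch): an output routine can reserve part of the trace_seq and format into it directly, rather than formatting into a temporary on-stack buffer and copying. The caller must ensure @len does not exceed the space left in the sequence:

	#include <linux/trace_seq.h>

	/* Hypothetical example; assumes at least "len" bytes remain in "s". */
	static void print_hex_id(struct trace_seq *s, u64 id)
	{
		unsigned int len = 2 * sizeof(id) + 1;	/* 16 hex digits + NUL */
		char *buf = trace_seq_acquire(s, len);

		/* If fewer than "len" bytes were left, trace_seq_acquire()
		 * warns and skips the commit, so a robust caller bounds
		 * "len" by the remaining buffer space first. */
		snprintf(buf, len, "%016llx", id);
	}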
@@ -18,6 +18,7 @@ struct synth_field {
 	bool is_signed;
 	bool is_string;
 	bool is_dynamic;
+	bool is_stack;
 };

 struct synth_event {
...
@@ -571,8 +571,8 @@ static void for_each_tracepoint_range(
 bool trace_module_has_bad_taint(struct module *mod)
 {
 	return mod->taints & ~((1 << TAINT_OOT_MODULE) | (1 << TAINT_CRAP) |
-			       (1 << TAINT_UNSIGNED_MODULE) |
-			       (1 << TAINT_TEST));
+			       (1 << TAINT_UNSIGNED_MODULE) | (1 << TAINT_TEST) |
+			       (1 << TAINT_LIVEPATCH));
 }

 static BLOCKING_NOTIFIER_HEAD(tracepoint_notify_list);
...
@@ -46,6 +46,13 @@ config SAMPLE_FTRACE_DIRECT_MULTI
 	  that hooks to wake_up_process and schedule, and prints
 	  the function addresses.

+config SAMPLE_FTRACE_OPS
+	tristate "Build custom ftrace ops example"
+	depends on FUNCTION_TRACER
+	help
+	  This builds an ftrace ops example that hooks two functions and
+	  measures the time taken to invoke one function a number of times.
+
 config SAMPLE_TRACE_ARRAY
 	tristate "Build sample module for kernel access to Ftrace instances"
 	depends on EVENT_TRACING && m
...
@@ -24,6 +24,7 @@ obj-$(CONFIG_SAMPLE_TRACE_CUSTOM_EVENTS) += trace_events/
 obj-$(CONFIG_SAMPLE_TRACE_PRINTK) += trace_printk/
 obj-$(CONFIG_SAMPLE_FTRACE_DIRECT) += ftrace/
 obj-$(CONFIG_SAMPLE_FTRACE_DIRECT_MULTI) += ftrace/
+obj-$(CONFIG_SAMPLE_FTRACE_OPS) += ftrace/
 obj-$(CONFIG_SAMPLE_TRACE_ARRAY) += ftrace/
 subdir-$(CONFIG_SAMPLE_UHID) += uhid
 obj-$(CONFIG_VIDEO_PCI_SKELETON) += v4l/
...
@@ -5,6 +5,7 @@ obj-$(CONFIG_SAMPLE_FTRACE_DIRECT) += ftrace-direct-too.o
 obj-$(CONFIG_SAMPLE_FTRACE_DIRECT) += ftrace-direct-modify.o
 obj-$(CONFIG_SAMPLE_FTRACE_DIRECT_MULTI) += ftrace-direct-multi.o
 obj-$(CONFIG_SAMPLE_FTRACE_DIRECT_MULTI) += ftrace-direct-multi-modify.o
+obj-$(CONFIG_SAMPLE_FTRACE_OPS) += ftrace-ops.o

 CFLAGS_sample-trace-array.o := -I$(src)
 obj-$(CONFIG_SAMPLE_TRACE_ARRAY) += sample-trace-array.o
@@ -3,7 +3,6 @@
 #include <linux/kthread.h>
 #include <linux/ftrace.h>
 #include <asm/asm-offsets.h>
-#include <asm/nospec-branch.h>

 extern void my_direct_func1(void);
 extern void my_direct_func2(void);
@@ -26,6 +25,7 @@ static unsigned long my_ip = (unsigned long)schedule;

 #ifdef CONFIG_X86_64
 #include <asm/ibt.h>
+#include <asm/nospec-branch.h>

 asm (
 "	.pushsection	.text, \"ax\", @progbits\n"
...
@@ -3,7 +3,6 @@
 #include <linux/kthread.h>
 #include <linux/ftrace.h>
 #include <asm/asm-offsets.h>
-#include <asm/nospec-branch.h>

 extern void my_direct_func1(unsigned long ip);
 extern void my_direct_func2(unsigned long ip);
@@ -24,6 +23,7 @@ extern void my_tramp2(void *);

 #ifdef CONFIG_X86_64
 #include <asm/ibt.h>
+#include <asm/nospec-branch.h>

 asm (
 "	.pushsection	.text, \"ax\", @progbits\n"
...
@@ -5,7 +5,6 @@
 #include <linux/ftrace.h>
 #include <linux/sched/stat.h>
 #include <asm/asm-offsets.h>
-#include <asm/nospec-branch.h>

 extern void my_direct_func(unsigned long ip);
@@ -19,6 +18,7 @@ extern void my_tramp(void *);

 #ifdef CONFIG_X86_64
 #include <asm/ibt.h>
+#include <asm/nospec-branch.h>

 asm (
 "	.pushsection	.text, \"ax\", @progbits\n"
...
@@ -4,7 +4,6 @@
 #include <linux/mm.h> /* for handle_mm_fault() */
 #include <linux/ftrace.h>
 #include <asm/asm-offsets.h>
-#include <asm/nospec-branch.h>

 extern void my_direct_func(struct vm_area_struct *vma,
 			   unsigned long address, unsigned int flags);
@@ -21,6 +20,7 @@ extern void my_tramp(void *);

 #ifdef CONFIG_X86_64
 #include <asm/ibt.h>
+#include <asm/nospec-branch.h>

 asm (
 "	.pushsection	.text, \"ax\", @progbits\n"
...
@@ -4,7 +4,6 @@
 #include <linux/sched.h> /* for wake_up_process() */
 #include <linux/ftrace.h>
 #include <asm/asm-offsets.h>
-#include <asm/nospec-branch.h>

 extern void my_direct_func(struct task_struct *p);
@@ -18,6 +17,7 @@ extern void my_tramp(void *);

 #ifdef CONFIG_X86_64
 #include <asm/ibt.h>
+#include <asm/nospec-branch.h>

 asm (
 "	.pushsection	.text, \"ax\", @progbits\n"
...
// SPDX-License-Identifier: GPL-2.0-only
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/ftrace.h>
#include <linux/ktime.h>
#include <linux/module.h>
#include <asm/barrier.h>
/*
* Arbitrary large value chosen to be sufficiently large to minimize noise but
* sufficiently small to complete quickly.
*/
static unsigned int nr_function_calls = 100000;
module_param(nr_function_calls, uint, 0);
MODULE_PARM_DESC(nr_function_calls, "How many times to call the relevant tracee");
/*
* The number of ops associated with a call site affects whether a tracer can
* be called directly or whether it's necessary to go via the list func, which
* can be significantly more expensive.
*/
static unsigned int nr_ops_relevant = 1;
module_param(nr_ops_relevant, uint, 0);
MODULE_PARM_DESC(nr_ops_relevant, "How many ftrace_ops to associate with the relevant tracee");
/*
* On architectures where all call sites share the same trampoline, having
* tracers enabled for distinct functions can force the use of the list func
* and incur overhead for all call sites.
*/
static unsigned int nr_ops_irrelevant;
module_param(nr_ops_irrelevant, uint, 0);
MODULE_PARM_DESC(nr_ops_irrelevant, "How many ftrace_ops to associate with the irrelevant tracee");
/*
* On architectures with DYNAMIC_FTRACE_WITH_REGS, saving the full pt_regs can
* be more expensive than only saving the minimal necessary regs.
*/
static bool save_regs;
module_param(save_regs, bool, 0);
MODULE_PARM_DESC(save_regs, "Register ops with FTRACE_OPS_FL_SAVE_REGS (save all registers in the trampoline)");
static bool assist_recursion;
module_param(assist_recursion, bool, 0);
MODULE_PARM_DESC(assist_recursion, "Register ops with FTRACE_OPS_FL_RECURSION");
static bool assist_rcu;
module_param(assist_rcu, bool, 0);
MODULE_PARM_DESC(assist_rcu, "Register ops with FTRACE_OPS_FL_RCU");
/*
* By default, a trivial tracer is used which immediately returns to minimize
* overhead. Sometimes a consistency check using a more expensive tracer is
* desirable.
*/
static bool check_count;
module_param(check_count, bool, 0);
MODULE_PARM_DESC(check_count, "Check that tracers are called the expected number of times");
/*
* Usually it's not interesting to leave the ops registered after the test
* runs, but sometimes it can be useful to leave them registered so that they
* can be inspected through the tracefs 'enabled_functions' file.
*/
static bool persist;
module_param(persist, bool, 0);
MODULE_PARM_DESC(persist, "Successfully load module and leave ftrace ops registered after test completes");
/*
* Marked as noinline to ensure that an out-of-line traceable copy is
* generated by the compiler.
*
* The barrier() ensures the compiler won't elide calls by determining there
* are no side-effects.
*/
static noinline void tracee_relevant(void)
{
barrier();
}
/*
* Marked as noinline to ensure that an out-of-line traceable copy is
* generated by the compiler.
*
* The barrier() ensures the compiler won't elide calls by determining there
* are no side-effects.
*/
static noinline void tracee_irrelevant(void)
{
barrier();
}
struct sample_ops {
struct ftrace_ops ops;
unsigned int count;
};
static void ops_func_nop(unsigned long ip, unsigned long parent_ip,
struct ftrace_ops *op,
struct ftrace_regs *fregs)
{
/* do nothing */
}
static void ops_func_count(unsigned long ip, unsigned long parent_ip,
struct ftrace_ops *op,
struct ftrace_regs *fregs)
{
struct sample_ops *self;
self = container_of(op, struct sample_ops, ops);
self->count++;
}
static struct sample_ops *ops_relevant;
static struct sample_ops *ops_irrelevant;
static struct sample_ops *ops_alloc_init(void *tracee, ftrace_func_t func,
unsigned long flags, int nr)
{
struct sample_ops *ops;
ops = kcalloc(nr, sizeof(*ops), GFP_KERNEL);
if (WARN_ON_ONCE(!ops))
return NULL;
for (unsigned int i = 0; i < nr; i++) {
ops[i].ops.func = func;
ops[i].ops.flags = flags;
WARN_ON_ONCE(ftrace_set_filter_ip(&ops[i].ops, (unsigned long)tracee, 0, 0));
WARN_ON_ONCE(register_ftrace_function(&ops[i].ops));
}
return ops;
}
static void ops_destroy(struct sample_ops *ops, int nr)
{
if (!ops)
return;
for (unsigned int i = 0; i < nr; i++) {
WARN_ON_ONCE(unregister_ftrace_function(&ops[i].ops));
ftrace_free_filter(&ops[i].ops);
}
kfree(ops);
}
static void ops_check(struct sample_ops *ops, int nr,
unsigned int expected_count)
{
if (!ops || !check_count)
return;
for (unsigned int i = 0; i < nr; i++) {
		if (ops[i].count == expected_count)
			continue;
		pr_warn("Counter called %u times (expected %u)\n",
			ops[i].count, expected_count);
}
}
static ftrace_func_t tracer_relevant = ops_func_nop;
static ftrace_func_t tracer_irrelevant = ops_func_nop;
static int __init ftrace_ops_sample_init(void)
{
unsigned long flags = 0;
ktime_t start, end;
u64 period;
if (!IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS) && save_regs) {
pr_info("this kernel does not support saving registers\n");
save_regs = false;
} else if (save_regs) {
flags |= FTRACE_OPS_FL_SAVE_REGS;
}
if (assist_recursion)
flags |= FTRACE_OPS_FL_RECURSION;
if (assist_rcu)
flags |= FTRACE_OPS_FL_RCU;
if (check_count) {
tracer_relevant = ops_func_count;
tracer_irrelevant = ops_func_count;
}
pr_info("registering:\n"
" relevant ops: %u\n"
" tracee: %ps\n"
" tracer: %ps\n"
" irrelevant ops: %u\n"
" tracee: %ps\n"
" tracer: %ps\n"
" saving registers: %s\n"
" assist recursion: %s\n"
" assist RCU: %s\n",
nr_ops_relevant, tracee_relevant, tracer_relevant,
nr_ops_irrelevant, tracee_irrelevant, tracer_irrelevant,
save_regs ? "YES" : "NO",
assist_recursion ? "YES" : "NO",
assist_rcu ? "YES" : "NO");
ops_relevant = ops_alloc_init(tracee_relevant, tracer_relevant,
flags, nr_ops_relevant);
ops_irrelevant = ops_alloc_init(tracee_irrelevant, tracer_irrelevant,
flags, nr_ops_irrelevant);
start = ktime_get();
for (unsigned int i = 0; i < nr_function_calls; i++)
tracee_relevant();
end = ktime_get();
ops_check(ops_relevant, nr_ops_relevant, nr_function_calls);
ops_check(ops_irrelevant, nr_ops_irrelevant, 0);
period = ktime_to_ns(ktime_sub(end, start));
pr_info("Attempted %u calls to %ps in %lluns (%lluns / call)\n",
nr_function_calls, tracee_relevant,
period, div_u64(period, nr_function_calls));
if (persist)
return 0;
ops_destroy(ops_relevant, nr_ops_relevant);
ops_destroy(ops_irrelevant, nr_ops_irrelevant);
/*
* The benchmark completed successfully, but there's no reason to keep
* the module around. Return an error so the user doesn't have to
* manually unload the module.
*/
return -EINVAL;
}
module_init(ftrace_ops_sample_init);
static void __exit ftrace_ops_sample_exit(void)
{
ops_destroy(ops_relevant, nr_ops_relevant);
ops_destroy(ops_irrelevant, nr_ops_irrelevant);
}
module_exit(ftrace_ops_sample_exit);
MODULE_AUTHOR("Mark Rutland");
MODULE_DESCRIPTION("Example of using custom ftrace_ops");
MODULE_LICENSE("GPL");
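A usage sketch for the sample module above (parameter names from this file; timing output illustrative): the module reports the measured per-call overhead when loaded, and by design returns an error after a successful run so it does not stay loaded unless persist=1 is set.

	# insmod ftrace-ops.ko nr_function_calls=100000 check_count=1
	# dmesg | tail -1
	ftrace_ops: Attempted 100000 calls to tracee_relevant in ...ns (...ns / call)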
@@ -23,8 +23,8 @@
 #endif

 /* Assumes debugfs is mounted */
-const char *data_file = "/sys/kernel/debug/tracing/user_events_data";
-const char *status_file = "/sys/kernel/debug/tracing/user_events_status";
+const char *data_file = "/sys/kernel/tracing/user_events_data";
+const char *status_file = "/sys/kernel/tracing/user_events_status";

 static int event_status(long **status)
 {
...
@@ -12,9 +12,9 @@ calls. Only the functions's names and the call time are provided.

 Usage:
 	Be sure that you have CONFIG_FUNCTION_TRACER
-	# mount -t debugfs nodev /sys/kernel/debug
-	# echo function > /sys/kernel/debug/tracing/current_tracer
-	$ cat /sys/kernel/debug/tracing/trace_pipe > ~/raw_trace_func
+	# mount -t tracefs nodev /sys/kernel/tracing
+	# echo function > /sys/kernel/tracing/current_tracer
+	$ cat /sys/kernel/tracing/trace_pipe > ~/raw_trace_func
 	Wait some times but not too much, the script is a bit slow.
 	Break the pipe (Ctrl + Z)
 	$ scripts/tracing/draw_functrace.py < ~/raw_trace_func > draw_functrace
...
@@ -14,8 +14,8 @@
 #include "tracing_path.h"

 static char tracing_mnt[PATH_MAX] = "/sys/kernel/debug";
-static char tracing_path[PATH_MAX] = "/sys/kernel/debug/tracing";
-static char tracing_events_path[PATH_MAX] = "/sys/kernel/debug/tracing/events";
+static char tracing_path[PATH_MAX] = "/sys/kernel/tracing";
+static char tracing_events_path[PATH_MAX] = "/sys/kernel/tracing/events";

 static void __tracing_path_set(const char *tracing, const char *mountpoint)
 {
...
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
# description: event filter function - test event filtering on functions
# requires: set_event events/kmem/kmem_cache_free/filter
# flags: instance
fail() { #msg
echo $1
exit_fail
}
echo "Test event filter function name"
echo 0 > tracing_on
echo 0 > events/enable
echo > trace
echo 'call_site.function == exit_mmap' > events/kmem/kmem_cache_free/filter
echo 1 > events/kmem/kmem_cache_free/enable
echo 1 > tracing_on
ls > /dev/null
echo 0 > events/kmem/kmem_cache_free/enable
hitcnt=`grep kmem_cache_free trace| grep exit_mmap | wc -l`
misscnt=`grep kmem_cache_free trace| grep -v exit_mmap | wc -l`
if [ $hitcnt -eq 0 ]; then
exit_fail
fi
if [ $misscnt -gt 0 ]; then
exit_fail
fi
address=`grep ' exit_mmap$' /proc/kallsyms | cut -d' ' -f1`
echo "Test event filter function address"
echo 0 > tracing_on
echo 0 > events/enable
echo > trace
echo "call_site.function == 0x$address" > events/kmem/kmem_cache_free/filter
echo 1 > events/kmem/kmem_cache_free/enable
echo 1 > tracing_on
sleep 1
echo 0 > events/kmem/kmem_cache_free/enable
hitcnt=`grep kmem_cache_free trace| grep exit_mmap | wc -l`
misscnt=`grep kmem_cache_free trace| grep -v exit_mmap | wc -l`
if [ $hitcnt -eq 0 ]; then
exit_fail
fi
if [ $misscnt -gt 0 ]; then
exit_fail
fi
reset_events_filter
exit 0
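For reference, the two spellings of the new function filter this test exercises (values taken from the script above) are:

	call_site.function == exit_mmap
	call_site.function == 0x<address of exit_mmap from /proc/kallsyms>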
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
# description: event trigger - test inter-event histogram trigger trace action with stacktrace
# requires: set_event synthetic_events events/sched/sched_process_exec/hist "long[]' >> synthetic_events":README
fail() { #msg
echo $1
exit_fail
}
echo "Test create synthetic event with stack"
echo 's:wake_lat pid_t pid; u64 delta; unsigned long[] stack;' > dynamic_events
echo 'hist:keys=next_pid:ts=common_timestamp.usecs,st=stacktrace if prev_state == 1||prev_state == 2' >> events/sched/sched_switch/trigger
echo 'hist:keys=prev_pid:delta=common_timestamp.usecs-$ts,s=$st:onmax($delta).trace(wake_lat,prev_pid,$delta,$s)' >> events/sched/sched_switch/trigger
echo 1 > events/synthetic/wake_lat/enable
sleep 1
if ! grep -q "=>.*sched" trace; then
fail "Failed to create synthetic event with stack"
fi
exit 0
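The grep above matches the stack dump format emitted by print_synth_event(): the stack field prints as "stack=STACK:" followed by one "=> symbol" line per stored entry, along the lines of (illustrative, values and offsets elided):

	wake_lat: pid=... delta=... stack=STACK:
	=> __schedule+0x.../0x...
	=> schedule+0x.../0x...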
@@ -70,6 +70,12 @@ grep "myevent[[:space:]]unsigned long var" synthetic_events
 echo "myevent char var[10]" > synthetic_events
 grep "myevent[[:space:]]char\[10\] var" synthetic_events

+if grep -q 'long\[\]' README; then
+  # test stacktrace type
+  echo "myevent unsigned long[] var" > synthetic_events
+  grep "myevent[[:space:]]unsigned long\[\] var" synthetic_events
+fi
+
 do_reset

 exit 0
@@ -1584,7 +1584,7 @@ static void *do_printloop(void *arg)
 	/*
 	 * Toss a coin to decide if we want to sleep before printing
 	 * out the backtrace. The reason for this is that opening
-	 * /sys/kernel/debug/tracing/trace will cause a blackout of
+	 * /sys/kernel/tracing/trace will cause a blackout of
 	 * hundreds of ms, where no latencies will be noted by the
 	 * latency tracer. Thus by randomly sleeping we try to avoid
 	 * missing traces systematically due to this. With this option
...