tracing: Update Documentation/trace/ftrace.txt

The documentation of ftrace.txt has become rather outdated. Bring it closer
to reality of todays kernel.
Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
parent b11fb737
...@@ -5,10 +5,11 @@ Copyright 2008 Red Hat Inc. ...@@ -5,10 +5,11 @@ Copyright 2008 Red Hat Inc.
Author: Steven Rostedt <srostedt@redhat.com> Author: Steven Rostedt <srostedt@redhat.com>
License: The GNU Free Documentation License, Version 1.2 License: The GNU Free Documentation License, Version 1.2
(dual licensed under the GPL v2) (dual licensed under the GPL v2)
Reviewers: Elias Oltmanns, Randy Dunlap, Andrew Morton, Original Reviewers: Elias Oltmanns, Randy Dunlap, Andrew Morton,
John Kacur, and David Teigland. John Kacur, and David Teigland.
Written for: 2.6.28-rc2 Written for: 2.6.28-rc2
Updated for: 3.10 Updated for: 3.10
Updated for: 4.13 - Copyright 2017 VMware Inc. Steven Rostedt
Introduction Introduction
------------ ------------
...@@ -26,9 +27,11 @@ a task is woken to the task is actually scheduled in. ...@@ -26,9 +27,11 @@ a task is woken to the task is actually scheduled in.
One of the most common uses of ftrace is the event tracing. One of the most common uses of ftrace is the event tracing.
Through out the kernel is hundreds of static event points that Through out the kernel is hundreds of static event points that
can be enabled via the debugfs file system to see what is can be enabled via the tracefs file system to see what is
going on in certain parts of the kernel. going on in certain parts of the kernel.
See events.txt for more information.
Implementation Details Implementation Details
---------------------- ----------------------
...@@ -39,34 +42,47 @@ See ftrace-design.txt for details for arch porters and such. ...@@ -39,34 +42,47 @@ See ftrace-design.txt for details for arch porters and such.
The File System The File System
--------------- ---------------
Ftrace uses the debugfs file system to hold the control files as Ftrace uses the tracefs file system to hold the control files as
well as the files to display output. well as the files to display output.
When debugfs is configured into the kernel (which selecting any ftrace When tracefs is configured into the kernel (which selecting any ftrace
option will do) the directory /sys/kernel/debug will be created. To mount option will do) the directory /sys/kernel/tracing will be created. To mount
this directory, you can add to your /etc/fstab file: this directory, you can add to your /etc/fstab file:
debugfs /sys/kernel/debug debugfs defaults 0 0 tracefs /sys/kernel/tracing tracefs defaults 0 0
Or you can mount it at run time with: Or you can mount it at run time with:
mount -t debugfs nodev /sys/kernel/debug mount -t tracefs nodev /sys/kernel/tracing
For quicker access to that directory you may want to make a soft link to For quicker access to that directory you may want to make a soft link to
it: it:
ln -s /sys/kernel/debug /debug ln -s /sys/kernel/tracing /tracing
*** NOTICE ***
Before 4.1, all ftrace tracing control files were within the debugfs
file system, which is typically located at /sys/kernel/debug/tracing.
For backward compatibility, when mounting the debugfs file system,
the tracefs file system will be automatically mounted at:
/sys/kernel/debug/tracing
Any selected ftrace option will also create a directory called tracing All files located in the tracefs file system will be located in that
within the debugfs. The rest of the document will assume that you are in debugfs file system directory as well.
the ftrace directory (cd /sys/kernel/debug/tracing) and will only concentrate
on the files within that directory and not distract from the content with *** NOTICE ***
the extended "/sys/kernel/debug/tracing" path name.
Any selected ftrace option will also create the tracefs file system.
The rest of the document will assume that you are in the ftrace directory
(cd /sys/kernel/tracing) and will only concentrate on the files within that
directory and not distract from the content with the extended
"/sys/kernel/tracing" path name.
That's it! (assuming that you have ftrace configured into your kernel) That's it! (assuming that you have ftrace configured into your kernel)
After mounting debugfs, you can see a directory called After mounting tracefs you will have access to the control and output files
"tracing". This directory contains the control and output files
of ftrace. Here is a list of some of the key files: of ftrace. Here is a list of some of the key files:
...@@ -92,10 +108,20 @@ of ftrace. Here is a list of some of the key files: ...@@ -92,10 +108,20 @@ of ftrace. Here is a list of some of the key files:
writing to the ring buffer, the tracing overhead may writing to the ring buffer, the tracing overhead may
still be occurring. still be occurring.
The kernel function tracing_off() can be used within the
kernel to disable writing to the ring buffer, which will
set this file to "0". User space can re-enable tracing by
echoing "1" into the file.
Note, the function and event trigger "traceoff" will also
set this file to zero and stop tracing. Which can also
be re-enabled by user space using this file.
trace: trace:
This file holds the output of the trace in a human This file holds the output of the trace in a human
readable format (described below). readable format (described below). Note, tracing is temporarily
disabled while this file is being read (opened).
trace_pipe: trace_pipe:
...@@ -109,7 +135,8 @@ of ftrace. Here is a list of some of the key files: ...@@ -109,7 +135,8 @@ of ftrace. Here is a list of some of the key files:
will not be read again with a sequential read. The will not be read again with a sequential read. The
"trace" file is static, and if the tracer is not "trace" file is static, and if the tracer is not
adding more data, it will display the same adding more data, it will display the same
information every time it is read. information every time it is read. This file will not
disable tracing while being read.
trace_options: trace_options:
...@@ -128,12 +155,14 @@ of ftrace. Here is a list of some of the key files: ...@@ -128,12 +155,14 @@ of ftrace. Here is a list of some of the key files:
tracing_max_latency: tracing_max_latency:
Some of the tracers record the max latency. Some of the tracers record the max latency.
For example, the time interrupts are disabled. For example, the maximum time that interrupts are disabled.
This time is saved in this file. The max trace The maximum time is saved in this file. The max trace will also be
will also be stored, and displayed by "trace". stored, and displayed by "trace". A new max trace will only be
A new max trace will only be recorded if the recorded if the latency is greater than the value in this file
latency is greater than the value in this (in microseconds).
file. (in microseconds)
By echoing in a time into this file, no latency will be recorded
unless it is greater than the time in this file.
tracing_thresh: tracing_thresh:
...@@ -152,32 +181,34 @@ of ftrace. Here is a list of some of the key files: ...@@ -152,32 +181,34 @@ of ftrace. Here is a list of some of the key files:
that the kernel uses for allocation, usually 4 KB in size). that the kernel uses for allocation, usually 4 KB in size).
If the last page allocated has room for more bytes If the last page allocated has room for more bytes
than requested, the rest of the page will be used, than requested, the rest of the page will be used,
making the actual allocation bigger than requested. making the actual allocation bigger than requested or shown.
( Note, the size may not be a multiple of the page size ( Note, the size may not be a multiple of the page size
due to buffer management meta-data. ) due to buffer management meta-data. )
Buffer sizes for individual CPUs may vary
(see "per_cpu/cpu0/buffer_size_kb" below), and if they do
this file will show "X".
buffer_total_size_kb: buffer_total_size_kb:
This displays the total combined size of all the trace buffers. This displays the total combined size of all the trace buffers.
free_buffer: free_buffer:
If a process is performing the tracing, and the ring buffer If a process is performing tracing, and the ring buffer should be
should be shrunk "freed" when the process is finished, even shrunk "freed" when the process is finished, even if it were to be
if it were to be killed by a signal, this file can be used killed by a signal, this file can be used for that purpose. On close
for that purpose. On close of this file, the ring buffer will of this file, the ring buffer will be resized to its minimum size.
be resized to its minimum size. Having a process that is tracing Having a process that is tracing also open this file, when the process
also open this file, when the process exits its file descriptor exits its file descriptor for this file will be closed, and in doing so,
for this file will be closed, and in doing so, the ring buffer the ring buffer will be "freed".
will be "freed".
It may also stop tracing if disable_on_free option is set. It may also stop tracing if disable_on_free option is set.
tracing_cpumask: tracing_cpumask:
This is a mask that lets the user only trace This is a mask that lets the user only trace on specified CPUs.
on specified CPUs. The format is a hex string The format is a hex string representing the CPUs.
representing the CPUs.
set_ftrace_filter: set_ftrace_filter:
...@@ -190,6 +221,9 @@ of ftrace. Here is a list of some of the key files: ...@@ -190,6 +221,9 @@ of ftrace. Here is a list of some of the key files:
to be traced. Echoing names of functions into this file to be traced. Echoing names of functions into this file
will limit the trace to only those functions. will limit the trace to only those functions.
The functions listed in "available_filter_functions" are what
can be written into this file.
This interface also allows for commands to be used. See the This interface also allows for commands to be used. See the
"Filter commands" section for more details. "Filter commands" section for more details.
...@@ -202,7 +236,14 @@ of ftrace. Here is a list of some of the key files: ...@@ -202,7 +236,14 @@ of ftrace. Here is a list of some of the key files:
set_ftrace_pid: set_ftrace_pid:
Have the function tracer only trace a single thread. Have the function tracer only trace the threads whose PID are
listed in this file.
If the "function-fork" option is set, then when a task whose
PID is listed in this file forks, the child's PID will
automatically be added to this file, and the child will be
traced by the function tracer as well. This option will also
cause PIDs of tasks that exit to be removed from the file.
set_event_pid: set_event_pid:
...@@ -217,17 +258,28 @@ of ftrace. Here is a list of some of the key files: ...@@ -217,17 +258,28 @@ of ftrace. Here is a list of some of the key files:
set_graph_function: set_graph_function:
Set a "trigger" function where tracing should start Functions listed in this file will cause the function graph
with the function graph tracer (See the section tracer to only trace these functions and the functions that
"dynamic ftrace" for more details). they call. (See the section "dynamic ftrace" for more details).
set_graph_notrace:
Similar to set_graph_function, but will disable function graph
tracing when the function is hit until it exits the function.
This makes it possible to ignore tracing functions that are called
by a specific function.
available_filter_functions: available_filter_functions:
This lists the functions that ftrace This lists the functions that ftrace has processed and can trace.
has processed and can trace. These are the function These are the function names that you can pass to
names that you can pass to "set_ftrace_filter" or "set_ftrace_filter" or "set_ftrace_notrace".
"set_ftrace_notrace". (See the section "dynamic ftrace" (See the section "dynamic ftrace" below for more details.)
below for more details.)
dyn_ftrace_total_info:
This file is for debugging purposes. The number of functions that
have been converted to nops and are available to be traced.
enabled_functions: enabled_functions:
...@@ -250,12 +302,21 @@ of ftrace. Here is a list of some of the key files: ...@@ -250,12 +302,21 @@ of ftrace. Here is a list of some of the key files:
an 'I' will be displayed on the same line as the function that an 'I' will be displayed on the same line as the function that
can be overridden. can be overridden.
If the architecture supports it, it will also show what callback
is being directly called by the function. If the count is greater
than 1 it most likely will be ftrace_ops_list_func().
If the callback of the function jumps to a trampoline that is
specific to a the callback and not the standard trampoline,
its address will be printed as well as the function that the
trampoline calls.
function_profile_enabled: function_profile_enabled:
When set it will enable all functions with either the function When set it will enable all functions with either the function
tracer, or if enabled, the function graph tracer. It will tracer, or if configured, the function graph tracer. It will
keep a histogram of the number of functions that were called keep a histogram of the number of functions that were called
and if run with the function graph tracer, it will also keep and if the function graph tracer was configured, it will also keep
track of the time spent in those functions. The histogram track of the time spent in those functions. The histogram
content can be displayed in the files: content can be displayed in the files:
...@@ -283,12 +344,11 @@ of ftrace. Here is a list of some of the key files: ...@@ -283,12 +344,11 @@ of ftrace. Here is a list of some of the key files:
printk_formats: printk_formats:
This is for tools that read the raw format files. If an event in This is for tools that read the raw format files. If an event in
the ring buffer references a string (currently only trace_printk() the ring buffer references a string, only a pointer to the string
does this), only a pointer to the string is recorded into the buffer is recorded into the buffer and not the string itself. This prevents
and not the string itself. This prevents tools from knowing what tools from knowing what that string was. This file displays the string
that string was. This file displays the string and address for and address for the string allowing tools to map the pointers to what
the string allowing tools to map the pointers to what the the strings were.
strings were.
saved_cmdlines: saved_cmdlines:
...@@ -298,6 +358,22 @@ of ftrace. Here is a list of some of the key files: ...@@ -298,6 +358,22 @@ of ftrace. Here is a list of some of the key files:
comms for events. If a pid for a comm is not listed, then comms for events. If a pid for a comm is not listed, then
"<...>" is displayed in the output. "<...>" is displayed in the output.
If the option "record-cmd" is set to "0", then comms of tasks
will not be saved during recording. By default, it is enabled.
saved_cmdlines_size:
By default, 128 comms are saved (see "saved_cmdlines" above). To
increase or decrease the amount of comms that are cached, echo
in a the number of comms to cache, into this file.
saved_tgids:
If the option "record-tgid" is set, on each scheduling context switch
the Task Group ID of a task is saved in a table mapping the PID of
the thread to its TGID. By default, the "record-tgid" option is
disabled.
snapshot: snapshot:
This displays the "snapshot" buffer and also lets the user This displays the "snapshot" buffer and also lets the user
...@@ -336,6 +412,9 @@ of ftrace. Here is a list of some of the key files: ...@@ -336,6 +412,9 @@ of ftrace. Here is a list of some of the key files:
# cat trace_clock # cat trace_clock
[local] global counter x86-tsc [local] global counter x86-tsc
The clock with the square brackets around it is the one
in effect.
local: Default clock, but may not be in sync across CPUs local: Default clock, but may not be in sync across CPUs
global: This clock is in sync with all CPUs but may global: This clock is in sync with all CPUs but may
...@@ -448,6 +527,23 @@ of ftrace. Here is a list of some of the key files: ...@@ -448,6 +527,23 @@ of ftrace. Here is a list of some of the key files:
See events.txt for more information. See events.txt for more information.
set_event:
By echoing in the event into this file, will enable that event.
See events.txt for more information.
available_events:
A list of events that can be enabled in tracing.
See events.txt for more information.
hwlat_detector:
Directory for the Hardware Latency Detector.
See "Hardware Latency Detector" section below.
per_cpu: per_cpu:
This is a directory that contains the trace per_cpu information. This is a directory that contains the trace per_cpu information.
...@@ -539,13 +635,25 @@ Here is the list of current tracers that may be configured. ...@@ -539,13 +635,25 @@ Here is the list of current tracers that may be configured.
to draw a graph of function calls similar to C code to draw a graph of function calls similar to C code
source. source.
"blk"
The block tracer. The tracer used by the blktrace user
application.
"hwlat"
The Hardware Latency tracer is used to detect if the hardware
produces any latency. See "Hardware Latency Detector" section
below.
"irqsoff" "irqsoff"
Traces the areas that disable interrupts and saves Traces the areas that disable interrupts and saves
the trace with the longest max latency. the trace with the longest max latency.
See tracing_max_latency. When a new max is recorded, See tracing_max_latency. When a new max is recorded,
it replaces the old trace. It is best to view this it replaces the old trace. It is best to view this
trace with the latency-format option enabled. trace with the latency-format option enabled, which
happens automatically when the tracer is selected.
"preemptoff" "preemptoff"
...@@ -571,6 +679,26 @@ Here is the list of current tracers that may be configured. ...@@ -571,6 +679,26 @@ Here is the list of current tracers that may be configured.
RT tasks (as the current "wakeup" does). This is useful RT tasks (as the current "wakeup" does). This is useful
for those interested in wake up timings of RT tasks. for those interested in wake up timings of RT tasks.
"wakeup_dl"
Traces and records the max latency that it takes for
a SCHED_DEADLINE task to be woken (as the "wakeup" and
"wakeup_rt" does).
"mmiotrace"
A special tracer that is used to trace binary module.
It will trace all the calls that a module makes to the
hardware. Everything it writes and reads from the I/O
as well.
"branch"
This tracer can be configured when tracing likely/unlikely
calls within the kernel. It will trace when a likely and
unlikely branch is hit and if it was correct in its prediction
of being correct.
"nop" "nop"
This is the "trace nothing" tracer. To remove all This is the "trace nothing" tracer. To remove all
...@@ -582,7 +710,7 @@ Examples of using the tracer ...@@ -582,7 +710,7 @@ Examples of using the tracer
---------------------------- ----------------------------
Here are typical examples of using the tracers when controlling Here are typical examples of using the tracers when controlling
them only with the debugfs interface (without using any them only with the tracefs interface (without using any
user-land utilities). user-land utilities).
Output format: Output format:
...@@ -674,7 +802,7 @@ why a latency happened. Here is a typical trace. ...@@ -674,7 +802,7 @@ why a latency happened. Here is a typical trace.
This shows that the current tracer is "irqsoff" tracing the time This shows that the current tracer is "irqsoff" tracing the time
for which interrupts were disabled. It gives the trace version (which for which interrupts were disabled. It gives the trace version (which
never changes) and the version of the kernel upon which this was executed on never changes) and the version of the kernel upon which this was executed on
(3.10). Then it displays the max latency in microseconds (259 us). The number (3.8). Then it displays the max latency in microseconds (259 us). The number
of trace entries displayed and the total number (both are four: #4/4). of trace entries displayed and the total number (both are four: #4/4).
VP, KP, SP, and HP are always zero and are reserved for later use. VP, KP, SP, and HP are always zero and are reserved for later use.
#P is the number of online CPUs (#P:4). #P is the number of online CPUs (#P:4).
...@@ -709,6 +837,8 @@ explains which is which. ...@@ -709,6 +837,8 @@ explains which is which.
'.' otherwise. '.' otherwise.
hardirq/softirq: hardirq/softirq:
'Z' - NMI occurred inside a hardirq
'z' - NMI is running
'H' - hard irq occurred inside a softirq. 'H' - hard irq occurred inside a softirq.
'h' - hard irq is running 'h' - hard irq is running
's' - soft irq is running 's' - soft irq is running
...@@ -757,24 +887,24 @@ nohex ...@@ -757,24 +887,24 @@ nohex
nobin nobin
noblock noblock
trace_printk trace_printk
nobranch
annotate annotate
nouserstacktrace nouserstacktrace
nosym-userobj nosym-userobj
noprintk-msg-only noprintk-msg-only
context-info context-info
nolatency-format nolatency-format
sleep-time
graph-time
record-cmd record-cmd
norecord-tgid
overwrite overwrite
nodisable_on_free nodisable_on_free
irq-info irq-info
markers markers
noevent-fork noevent-fork
function-trace function-trace
nofunction-fork
nodisplay-graph nodisplay-graph
nostacktrace nostacktrace
nobranch
To disable one of the options, echo in the option prepended with To disable one of the options, echo in the option prepended with
"no". "no".
...@@ -830,8 +960,6 @@ Here are the available options: ...@@ -830,8 +960,6 @@ Here are the available options:
trace_printk - Can disable trace_printk() from writing into the buffer. trace_printk - Can disable trace_printk() from writing into the buffer.
branch - Enable branch tracing with the tracer.
annotate - It is sometimes confusing when the CPU buffers are full annotate - It is sometimes confusing when the CPU buffers are full
and one CPU buffer had a lot of events recently, thus and one CPU buffer had a lot of events recently, thus
a shorter time frame, were another CPU may have only had a shorter time frame, were another CPU may have only had
...@@ -850,7 +978,8 @@ Here are the available options: ...@@ -850,7 +978,8 @@ Here are the available options:
<idle>-0 [001] .Ns3 21169.031485: sub_preempt_count <-_raw_spin_unlock <idle>-0 [001] .Ns3 21169.031485: sub_preempt_count <-_raw_spin_unlock
userstacktrace - This option changes the trace. It records a userstacktrace - This option changes the trace. It records a
stacktrace of the current userspace thread. stacktrace of the current user space thread after
each trace event.
sym-userobj - when user stacktrace are enabled, look up which sym-userobj - when user stacktrace are enabled, look up which
object the address belongs to, and print a object the address belongs to, and print a
...@@ -873,29 +1002,21 @@ x494] <- /root/a.out[+0x4a8] <- /lib/libc-2.7.so[+0x1e1a6] ...@@ -873,29 +1002,21 @@ x494] <- /root/a.out[+0x4a8] <- /lib/libc-2.7.so[+0x1e1a6]
context-info - Show only the event data. Hides the comm, PID, context-info - Show only the event data. Hides the comm, PID,
timestamp, CPU, and other useful data. timestamp, CPU, and other useful data.
latency-format - This option changes the trace. When latency-format - This option changes the trace output. When it is enabled,
it is enabled, the trace displays the trace displays additional information about the
additional information about the latency, as described in "Latency trace format".
latencies, as described in "Latency
trace format".
sleep-time - When running function graph tracer, to include
the time a task schedules out in its function.
When enabled, it will account time the task has been
scheduled out as part of the function call.
graph-time - When running function profiler with function graph tracer,
to include the time to call nested functions. When this is
not set, the time reported for the function will only
include the time the function itself executed for, not the
time for functions that it called.
record-cmd - When any event or tracer is enabled, a hook is enabled record-cmd - When any event or tracer is enabled, a hook is enabled
in the sched_switch trace point to fill comm cache in the sched_switch trace point to fill comm cache
with mapped pids and comms. But this may cause some with mapped pids and comms. But this may cause some
overhead, and if you only care about pids, and not the overhead, and if you only care about pids, and not the
name of the task, disabling this option can lower the name of the task, disabling this option can lower the
impact of tracing. impact of tracing. See "saved_cmdlines".
record-tgid - When any event or tracer is enabled, a hook is enabled
in the sched_switch trace point to fill the cache of
mapped Thread Group IDs (TGID) mapping to pids. See
"saved_tgids".
overwrite - This controls what happens when the trace buffer is overwrite - This controls what happens when the trace buffer is
full. If "1" (default), the oldest events are full. If "1" (default), the oldest events are
...@@ -935,19 +1056,98 @@ x494] <- /root/a.out[+0x4a8] <- /lib/libc-2.7.so[+0x1e1a6] ...@@ -935,19 +1056,98 @@ x494] <- /root/a.out[+0x4a8] <- /lib/libc-2.7.so[+0x1e1a6]
functions. This keeps the overhead of the tracer down functions. This keeps the overhead of the tracer down
when performing latency tests. when performing latency tests.
function-fork - When set, tasks with PIDs listed in set_ftrace_pid will
have the PIDs of their children added to set_ftrace_pid
when those tasks fork. Also, when tasks with PIDs in
set_ftrace_pid exit, their PIDs will be removed from the
file.
display-graph - When set, the latency tracers (irqsoff, wakeup, etc) will display-graph - When set, the latency tracers (irqsoff, wakeup, etc) will
use function graph tracing instead of function tracing. use function graph tracing instead of function tracing.
stacktrace - This is one of the options that changes the trace stacktrace - When set, a stack trace is recorded after any trace event
itself. When a trace is recorded, so is the stack is recorded.
of functions. This allows for back traces of
trace sites. branch - Enable branch tracing with the tracer. This enables branch
tracer along with the currently set tracer. Enabling this
with the "nop" tracer is the same as just enabling the
"branch" tracer.
Note: Some tracers have their own options. They only appear in this Note: Some tracers have their own options. They only appear in this
file when the tracer is active. They always appear in the file when the tracer is active. They always appear in the
options directory. options directory.
Here are the per tracer options:
Options for function tracer:
func_stack_trace - When set, a stack trace is recorded after every
function that is recorded. NOTE! Limit the functions
that are recorded before enabling this, with
"set_ftrace_filter" otherwise the system performance
will be critically degraded. Remember to disable
this option before clearing the function filter.
Options for function_graph tracer:
Since the function_graph tracer has a slightly different output
it has its own options to control what is displayed.
funcgraph-overrun - When set, the "overrun" of the graph stack is
displayed after each function traced. The
overrun, is when the stack depth of the calls
is greater than what is reserved for each task.
Each task has a fixed array of functions to
trace in the call graph. If the depth of the
calls exceeds that, the function is not traced.
The overrun is the number of functions missed
due to exceeding this array.
funcgraph-cpu - When set, the CPU number of the CPU where the trace
occurred is displayed.
funcgraph-overhead - When set, if the function takes longer than
A certain amount, then a delay marker is
displayed. See "delay" above, under the
header description.
funcgraph-proc - Unlike other tracers, the process' command line
is not displayed by default, but instead only
when a task is traced in and out during a context
switch. Enabling this options has the command
of each process displayed at every line.
funcgraph-duration - At the end of each function (the return)
the duration of the amount of time in the
function is displayed in microseconds.
funcgraph-abstime - When set, the timestamp is displayed at each
line.
funcgraph-irqs - When disabled, functions that happen inside an
interrupt will not be traced.
funcgraph-tail - When set, the return event will include the function
that it represents. By default this is off, and
only a closing curly bracket "}" is displayed for
the return of a function.
sleep-time - When running function graph tracer, to include
the time a task schedules out in its function.
When enabled, it will account time the task has been
scheduled out as part of the function call.
graph-time - When running function profiler with function graph tracer,
to include the time to call nested functions. When this is
not set, the time reported for the function will only
include the time the function itself executed for, not the
time for functions that it called.
Options for blk tracer:
blk_classic - Shows a more minimalistic output.
irqsoff irqsoff
------- -------
...@@ -1711,6 +1911,85 @@ events. ...@@ -1711,6 +1911,85 @@ events.
<idle>-0 2d..3 6us : 0:120:R ==> [002] 5882: 94:R sleep <idle>-0 2d..3 6us : 0:120:R ==> [002] 5882: 94:R sleep
Hardware Latency Detector
-------------------------
The hardware latency detector is executed by enabling the "hwlat" tracer.
NOTE, this tracer will affect the performance of the system as it will
periodically make a CPU constantly busy with interrupts disabled.
# echo hwlat > current_tracer
# sleep 100
# cat trace
# tracer: hwlat
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
<...>-3638 [001] d... 19452.055471: #1 inner/outer(us): 12/14 ts:1499801089.066141940
<...>-3638 [003] d... 19454.071354: #2 inner/outer(us): 11/9 ts:1499801091.082164365
<...>-3638 [002] dn.. 19461.126852: #3 inner/outer(us): 12/9 ts:1499801098.138150062
<...>-3638 [001] d... 19488.340960: #4 inner/outer(us): 8/12 ts:1499801125.354139633
<...>-3638 [003] d... 19494.388553: #5 inner/outer(us): 8/12 ts:1499801131.402150961
<...>-3638 [003] d... 19501.283419: #6 inner/outer(us): 0/12 ts:1499801138.297435289 nmi-total:4 nmi-count:1
The above output is somewhat the same in the header. All events will have
interrupts disabled 'd'. Under the FUNCTION title there is:
#1 - This is the count of events recorded that were greater than the
tracing_threshold (See below).
inner/outer(us): 12/14
This shows two numbers as "inner latency" and "outer latency". The test
runs in a loop checking a timestamp twice. The latency detected within
the two timestamps is the "inner latency" and the latency detected
after the previous timestamp and the next timestamp in the loop is
the "outer latency".
ts:1499801089.066141940
The absolute timestamp that the event happened.
nmi-total:4 nmi-count:1
On architectures that support it, if an NMI comes in during the
test, the time spent in NMI is reported in "nmi-total" (in
microseconds).
All architectures that have NMIs will show the "nmi-count" if an
NMI comes in during the test.
hwlat files:
tracing_threshold - This gets automatically set to "10" to represent 10
microseconds. This is the threshold of latency that
needs to be detected before the trace will be recorded.
Note, when hwlat tracer is finished (another tracer is
written into "current_tracer"), the original value for
tracing_threshold is placed back into this file.
hwlat_detector/width - The length of time the test runs with interrupts
disabled.
hwlat_detector/window - The length of time of the window which the test
runs. That is, the test will run for "width"
microseconds per "window" microseconds
tracing_cpumask - When the test is started. A kernel thread is created that
runs the test. This thread will alternate between CPUs
listed in the tracing_cpumask between each period
(one "window"). To limit the test to specific CPUs
set the mask in this file to only the CPUs that the test
should run on.
function function
-------- --------
...@@ -1821,15 +2100,15 @@ something like this simple program: ...@@ -1821,15 +2100,15 @@ something like this simple program:
#define STR(x) _STR(x) #define STR(x) _STR(x)
#define MAX_PATH 256 #define MAX_PATH 256
const char *find_debugfs(void) const char *find_tracefs(void)
{ {
static char debugfs[MAX_PATH+1]; static char tracefs[MAX_PATH+1];
static int debugfs_found; static int tracefs_found;
char type[100]; char type[100];
FILE *fp; FILE *fp;
if (debugfs_found) if (tracefs_found)
return debugfs; return tracefs;
if ((fp = fopen("/proc/mounts","r")) == NULL) { if ((fp = fopen("/proc/mounts","r")) == NULL) {
perror("/proc/mounts"); perror("/proc/mounts");
...@@ -1839,27 +2118,27 @@ const char *find_debugfs(void) ...@@ -1839,27 +2118,27 @@ const char *find_debugfs(void)
while (fscanf(fp, "%*s %" while (fscanf(fp, "%*s %"
STR(MAX_PATH) STR(MAX_PATH)
"s %99s %*s %*d %*d\n", "s %99s %*s %*d %*d\n",
debugfs, type) == 2) { tracefs, type) == 2) {
if (strcmp(type, "debugfs") == 0) if (strcmp(type, "tracefs") == 0)
break; break;
} }
fclose(fp); fclose(fp);
if (strcmp(type, "debugfs") != 0) { if (strcmp(type, "tracefs") != 0) {
fprintf(stderr, "debugfs not mounted"); fprintf(stderr, "tracefs not mounted");
return NULL; return NULL;
} }
strcat(debugfs, "/tracing/"); strcat(tracefs, "/tracing/");
debugfs_found = 1; tracefs_found = 1;
return debugfs; return tracefs;
} }
const char *tracing_file(const char *file_name) const char *tracing_file(const char *file_name)
{ {
static char trace_file[MAX_PATH+1]; static char trace_file[MAX_PATH+1];
snprintf(trace_file, MAX_PATH, "%s/%s", find_debugfs(), file_name); snprintf(trace_file, MAX_PATH, "%s/%s", find_tracefs(), file_name);
return trace_file; return trace_file;
} }
...@@ -1898,12 +2177,12 @@ Or this simple script! ...@@ -1898,12 +2177,12 @@ Or this simple script!
------ ------
#!/bin/bash #!/bin/bash
debugfs=`sed -ne 's/^debugfs \(.*\) debugfs.*/\1/p' /proc/mounts` tracefs=`sed -ne 's/^tracefs \(.*\) tracefs.*/\1/p' /proc/mounts`
echo nop > $debugfs/tracing/current_tracer echo nop > $tracefs/tracing/current_tracer
echo 0 > $debugfs/tracing/tracing_on echo 0 > $tracefs/tracing/tracing_on
echo $$ > $debugfs/tracing/set_ftrace_pid echo $$ > $tracefs/tracing/set_ftrace_pid
echo function > $debugfs/tracing/current_tracer echo function > $tracefs/tracing/current_tracer
echo 1 > $debugfs/tracing/tracing_on echo 1 > $tracefs/tracing/tracing_on
exec "$@" exec "$@"
------ ------
...@@ -2145,13 +2424,18 @@ include the -pg switch in the compiling of the kernel.) ...@@ -2145,13 +2424,18 @@ include the -pg switch in the compiling of the kernel.)
At compile time every C file object is run through the At compile time every C file object is run through the
recordmcount program (located in the scripts directory). This recordmcount program (located in the scripts directory). This
program will parse the ELF headers in the C object to find all program will parse the ELF headers in the C object to find all
the locations in the .text section that call mcount. (Note, only the locations in the .text section that call mcount. Starting
white listed .text sections are processed, since processing other with gcc verson 4.6, the -mfentry has been added for x86, which
sections like .init.text may cause races due to those sections calls "__fentry__" instead of "mcount". Which is called before
being freed unexpectedly). the creation of the stack frame.
A new section called "__mcount_loc" is created that holds Note, not all sections are traced. They may be prevented by either
references to all the mcount call sites in the .text section. a notrace, or blocked another way and all inline functions are not
traced. Check the "available_filter_functions" file to see what functions
can be traced.
A section called "__mcount_loc" is created that holds
references to all the mcount/fentry call sites in the .text section.
The recordmcount program re-links this section back into the The recordmcount program re-links this section back into the
original object. The final linking stage of the kernel will add all these original object. The final linking stage of the kernel will add all these
references into a single table. references into a single table.
...@@ -2679,7 +2963,7 @@ in time without stopping tracing. Ftrace swaps the current ...@@ -2679,7 +2963,7 @@ in time without stopping tracing. Ftrace swaps the current
buffer with a spare buffer, and tracing continues in the new buffer with a spare buffer, and tracing continues in the new
current (=previous spare) buffer. current (=previous spare) buffer.
The following debugfs files in "tracing" are related to this The following tracefs files in "tracing" are related to this
feature: feature:
snapshot: snapshot:
...@@ -2752,7 +3036,7 @@ cat: snapshot: Device or resource busy ...@@ -2752,7 +3036,7 @@ cat: snapshot: Device or resource busy
Instances Instances
--------- ---------
In the debugfs tracing directory is a directory called "instances". In the tracefs tracing directory is a directory called "instances".
This directory can have new directories created inside of it using This directory can have new directories created inside of it using
mkdir, and removing directories with rmdir. The directory created mkdir, and removing directories with rmdir. The directory created
with mkdir in this directory will already contain files and other with mkdir in this directory will already contain files and other
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment