Commit 3ce5aceb authored by Ingo Molnar

Merge tag 'perf-core-for-mingo-5.3-20190611' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

perf record:

  Alexey Budankov:

  - Allow mixing --user-regs with --call-graph=dwarf, making sure that
    the minimal set of registers for DWARF unwinding is present in the
    set of user registers requested to be present in each sample, while
    warning the user that this may make callchains unreliable if more
    than the minimal set of registers is needed to unwind.

  yuzhoujian:

  - Add support to collect callchains from kernel or user space only,
    IOW allow setting the perf_event_attr.exclude_callchain_{kernel,user}
    bits from the command line.

perf trace:

  Arnaldo Carvalho de Melo:

  - Remove x86_64 specific syscall numbers from the augmented_raw_syscalls
    BPF in-kernel collector of augmented raw_syscalls:sys_{enter,exit}
    payloads, use instead the syscall numbers obtained either by the
    arch specific syscalltbl generators or from audit-libs.

  - Allow 'perf trace' to ask for the number of bytes to collect for
    string arguments, for now ask for PATH_MAX, i.e. the whole
    pathnames, which ends up being just a way to specify which syscall
    args are pathnames and thus should be read using bpf_probe_read_str().

  - Skip unknown syscalls when expanding strace like syscall groups.
    This helps the 'string' group of syscalls work on arm64, where
    some of the syscalls present in x86_64 that deal with strings,
    for instance 'access', are deprecated and thus should not be
    requested for tracing.

  Leo Yan:

  - Exit when failing to build eBPF program.

perf config:

  Arnaldo Carvalho de Melo:

  - Bail out when a handler returns failure for a key-value pair. This
    helps with cases where processing a key-value pair is not just a
    matter of setting some tool specific knob, involving, for instance
    building a BPF program to then attach to the list of events 'perf
    trace' will use, e.g. augmented_raw_syscalls.c.

perf.data:

  Kan Liang:

  - Read and store die ID information available in new Intel processors
    in CPUID.1F in the CPU topology written in the perf.data header.

perf stat:

  Kan Liang:

  - Support per-die aggregation.

Documentation:

  Arnaldo Carvalho de Melo:

  - Update perf.data documentation about the CPU_TOPOLOGY, MEM_TOPOLOGY,
    CLOCKID and DIR_FORMAT headers.

  Song Liu:

  - Add description of headers HEADER_BPF_PROG_INFO and HEADER_BPF_BTF.

  Leo Yan:

  - Update default value for llvm.clang-bpf-cmd-template in 'man perf-config'.

JVMTI:

  Jiri Olsa:

  - Address gcc string overflow warning for strncpy()

core:

  - Remove superfluous nthreads system_wide setup in perf_evsel__alloc_fd().

Intel PT:

  Adrian Hunter:

  - Add support for samples to contain IPC ratio, collecting cycles
    information from CYC packets and showing the IPC info periodically.
    Because Intel PT does not update the cycle count on every branch or
    instruction, the incremental values will often be zero.  When there
    are values, they will be the number of instructions and number of
    cycles since the last update, and thus represent the average IPC
    since the last IPC value.

    E.g.:

    # perf record --cpu 1 -m200000 -a -e intel_pt/cyc/u sleep 0.0001
    rounding mmap pages size to 1024M (262144 pages)
    [ perf record: Woken up 0 times to write data ]
    [ perf record: Captured and wrote 2.208 MB perf.data ]
    # perf script --insn-trace --xed -F+ipc,-dso,-cpu,-tid
    #
    <SNIP + add line numbering to make sense of IPC counts e.g.: (18/3)>
    1   cc1 63501.650479626: 7f5219ac27bf _int_free+0x3f   jnz 0x7f5219ac2af0       IPC: 0.81 (36/44)
    2   cc1 63501.650479626: 7f5219ac27c5 _int_free+0x45   cmp $0x1f, %rbp
    3   cc1 63501.650479626: 7f5219ac27c9 _int_free+0x49   jbe 0x7f5219ac2b00
    4   cc1 63501.650479626: 7f5219ac27cf _int_free+0x4f   test $0x8, %al
    5   cc1 63501.650479626: 7f5219ac27d1 _int_free+0x51   jnz 0x7f5219ac2b00
    6   cc1 63501.650479626: 7f5219ac27d7 _int_free+0x57   movq  0x13c58a(%rip), %rcx
    7   cc1 63501.650479626: 7f5219ac27de _int_free+0x5e   mov %rdi, %r12
    8   cc1 63501.650479626: 7f5219ac27e1 _int_free+0x61   movq  %fs:(%rcx), %rax
    9   cc1 63501.650479626: 7f5219ac27e5 _int_free+0x65   test %rax, %rax
   10   cc1 63501.650479626: 7f5219ac27e8 _int_free+0x68   jz 0x7f5219ac2821
   11   cc1 63501.650479626: 7f5219ac27ea _int_free+0x6a   leaq  -0x11(%rbp), %rdi
   12   cc1 63501.650479626: 7f5219ac27ee _int_free+0x6e   mov %rdi, %rsi
   13   cc1 63501.650479626: 7f5219ac27f1 _int_free+0x71   shr $0x4, %rsi
   14   cc1 63501.650479626: 7f5219ac27f5 _int_free+0x75   cmpq  %rsi, 0x13caf4(%rip)
   15   cc1 63501.650479626: 7f5219ac27fc _int_free+0x7c   jbe 0x7f5219ac2821
   16   cc1 63501.650479626: 7f5219ac2821 _int_free+0xa1   cmpq  0x13f138(%rip), %rbp
   17   cc1 63501.650479626: 7f5219ac2828 _int_free+0xa8   jnbe 0x7f5219ac28d8
   18   cc1 63501.650479626: 7f5219ac28d8 _int_free+0x158  testb  $0x2, 0x8(%rbx)
   19   cc1 63501.650479628: 7f5219ac28dc _int_free+0x15c  jnz 0x7f5219ac2ab0       IPC: 6.00 (18/3)
    <SNIP>

  - Allow using time ranges with Intel PT, i.e. these features, already
    present but not optimally usable with Intel PT, should be now:

        Select the second 10% time slice:

        $ perf script --time 10%/2

        Select from 0% to 10% time slice:

        $ perf script --time 0%-10%

        Select the first and second 10% time slices:

        $ perf script --time 10%/1,10%/2

        Select from 0% to 10% and 30% to 40% slices:

        $ perf script --time 0%-10%,30%-40%

cs-etm (ARM):

  Mathieu Poirier:

  - Add support for CPU-wide trace scenarios.

s390:

  Thomas Richter:

  - Fix missing kvm module load for s390.

  - Fix OOM error in TUI mode on s390

  - Support s390 diag event display when doing analysis on !s390
    architectures.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
parents d0e1a507 04c41bcb
Database Export
===============

perf tool's python scripting engine:

	tools/perf/util/scripting-engines/trace-event-python.c

supports scripts:

	tools/perf/scripts/python/export-to-sqlite.py
	tools/perf/scripts/python/export-to-postgresql.py

which export data to a SQLite3 or PostgreSQL database.

The export process provides records with unique sequential ids which allows the
data to be imported directly to a database and provides the relationships
between tables.

Over time it is possible to continue to expand the export while maintaining
backward and forward compatibility, by following some simple rules:

1. Because of the nature of SQL, existing tables and columns can continue to be
used so long as the names and meanings (and to some extent data types) remain
the same.

2. New tables and columns can be added, without affecting existing SQL queries,
so long as the new names are unique.

3. Scripts that use a database (e.g. exported-sql-viewer.py) can maintain
backward compatibility by testing for the presence of new tables and columns
before using them. e.g. function IsSelectable() in exported-sql-viewer.py

4. The export scripts themselves maintain forward compatibility (i.e. an existing
script will continue to work with new versions of perf) by accepting a variable
number of arguments (e.g. def call_return_table(*x)) i.e. perf can pass more
arguments which old scripts will ignore.

5. The scripting engine tests for the existence of script handler functions
before calling them. The scripting engine can also test for the support of new
or optional features by checking for the existence and value of script global
variables.
......@@ -103,6 +103,36 @@ The flags are "bcrosyiABEx" which stand for branch, call, return, conditional,
system, asynchronous, interrupt, transaction abort, trace begin, trace end, and
in transaction, respectively.
Another interesting field that is not printed by default is 'ipc' which can be
displayed as follows:
perf script --itrace=be -F+ipc
There are two ways that instructions-per-cycle (IPC) can be calculated depending
on the recording.
If the 'cyc' config term (see config terms section below) was used, then IPC is
calculated using the cycle count from CYC packets, otherwise MTC packets are
used - refer to the 'mtc' config term. When MTC is used, however, the values
are less accurate because the timing is less accurate.
Because Intel PT does not update the cycle count on every branch or instruction,
the values will often be zero. When there are values, they will be the number
of instructions and number of cycles since the last update, and thus represent
the average IPC since the last IPC for that event type. Note IPC for "branches"
events is calculated separately from IPC for "instructions" events.
Also note that the IPC instruction count may or may not include the current
instruction. If the cycle count is associated with an asynchronous branch
(e.g. page fault or interrupt), then the instruction count does not include the
current instruction, otherwise it does. That is consistent with whether or not
that instruction has retired when the cycle count is updated.
Another note, in the case of "branches" events, non-taken branches are not
presently sampled, so IPC values for them do not appear e.g. a CYC packet with a
TNT packet that starts with a non-taken branch. To see every possible IPC
value, "instructions" events can be used e.g. --itrace=i0ns
While it is possible to create scripts to analyze the data, an alternative
approach is available to export the data to a sqlite or postgresql database.
Refer to script export-to-sqlite.py or export-to-postgresql.py for more details,
......
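
The IPC value described in the hunk above can be produced with integer arithmetic alone. The following is a minimal sketch, not the perf code itself: the helper name format_ipc() is hypothetical, and only the scaled division is modelled on perf_sample__fprintf_ipc() from this commit.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format instructions-per-cycle with two decimals
 * using integer math only. Returns the number of characters written,
 * or 0 when either count is zero (no IPC value for this sample).
 */
static int format_ipc(char *buf, size_t size, uint64_t insn_cnt, uint64_t cyc_cnt)
{
	uint64_t ipc;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';
	if (!insn_cnt || !cyc_cnt)
		return 0;

	/* Scale by 100 before dividing so two decimal digits survive truncation */
	ipc = (insn_cnt * 100) / cyc_cnt;

	return snprintf(buf, size,
			"IPC: %" PRIu64 ".%02" PRIu64 " (%" PRIu64 "/%" PRIu64 ")",
			ipc / 100, ipc % 100, insn_cnt, cyc_cnt);
}
```

With the counts from the commit message example, 36 instructions over 44 cycles formats as 0.81 and 18 instructions over 3 cycles as 6.00. The result is truncated rather than rounded.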
......@@ -564,9 +564,12 @@ llvm.*::
llvm.clang-bpf-cmd-template::
Cmdline template. Below lines show its default value. Environment
variable is used to pass options.
"$CLANG_EXEC -D__KERNEL__ $CLANG_OPTIONS $KERNEL_INC_OPTIONS \
-Wno-unused-value -Wno-pointer-sign -working-directory \
$WORKING_DIR -c $CLANG_SOURCE -target bpf -O2 -o -"
"$CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS "\
"-DLINUX_VERSION_CODE=$LINUX_VERSION_CODE " \
"$CLANG_OPTIONS $PERF_BPF_INC_OPTIONS $KERNEL_INC_OPTIONS " \
"-Wno-unused-value -Wno-pointer-sign " \
"-working-directory $WORKING_DIR " \
"-c \"$CLANG_SOURCE\" -target bpf $CLANG_EMIT_LLVM -O2 -o - $LLVM_OPTIONS_PIPE"
llvm.clang-opt::
Options passed to clang.
......
......@@ -142,12 +142,14 @@ OPTIONS
perf diff --time 0%-10%,30%-40%
It also supports analyzing samples within a given time window
<start>,<stop>. Times have the format seconds.microseconds. If 'start'
is not given (i.e., time string is ',x.y') then analysis starts at
the beginning of the file. If stop time is not given (i.e, time
string is 'x.y,') then analysis goes to the end of the file. Time string is
'a1.b1,c1.d1:a2.b2,c2.d2'. Use ':' to separate timestamps for different
perf.data files.
<start>,<stop>. Times have the format seconds.nanoseconds. If 'start'
is not given (i.e. time string is ',x.y') then analysis starts at
the beginning of the file. If stop time is not given (i.e. time
string is 'x.y,') then analysis goes to the end of the file.
Multiple ranges can be separated by spaces, which requires the argument
to be quoted e.g. --time "1234.567,1234.789 1235,"
Time string is 'a1.b1,c1.d1:a2.b2,c2.d2'. Use ':' to separate timestamps
for different perf.data files.
For example, we get the timestamp information from 'perf script'.
......
......@@ -490,6 +490,17 @@ Configure all used events to run in kernel space.
--all-user::
Configure all used events to run in user space.
--kernel-callchains::
Collect callchains only from kernel space. I.e. this option sets
perf_event_attr.exclude_callchain_user to 1.
--user-callchains::
Collect callchains only from user space. I.e. this option sets
perf_event_attr.exclude_callchain_kernel to 1.
Don't use both --kernel-callchains and --user-callchains at the same time or no
callchains will be collected.
--timestamp-filename
Append timestamp to output file name.
......
......@@ -412,12 +412,13 @@ OPTIONS
--time::
Only analyze samples within given time window: <start>,<stop>. Times
have the format seconds.microseconds. If start is not given (i.e., time
have the format seconds.nanoseconds. If start is not given (i.e. time
string is ',x.y') then analysis starts at the beginning of the file. If
stop time is not given (i.e, time string is 'x.y,') then analysis goes
to end of file.
stop time is not given (i.e. time string is 'x.y,') then analysis goes
to end of file. Multiple ranges can be separated by spaces, which
requires the argument to be quoted e.g. --time "1234.567,1234.789 1235,"
Also support time percent with multiple time range. Time string is
Also support time percent with multiple time ranges. Time string is
'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
For example:
......
......@@ -117,7 +117,7 @@ OPTIONS
Comma separated list of fields to print. Options are:
comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
srcline, period, iregs, uregs, brstack, brstacksym, flags, bpf-output, brstackinsn,
brstackoff, callindent, insn, insnlen, synth, phys_addr, metric, misc, srccode.
brstackoff, callindent, insn, insnlen, synth, phys_addr, metric, misc, srccode, ipc.
Field list can be prepended with the type, trace, sw or hw,
to indicate to which event type the field list applies.
e.g., -F sw:comm,tid,time,ip,sym and -F trace:time,cpu,trace
......@@ -203,6 +203,9 @@ OPTIONS
The synth field is used by synthesized events which may be created when
Instruction Trace decoding.
The ipc (instructions per cycle) field is synthesized and may have a value when
Instruction Trace decoding.
Finally, a user may not set fields to none for all event types.
i.e., -F "" is not allowed.
......@@ -358,12 +361,13 @@ include::itrace.txt[]
--time::
Only analyze samples within given time window: <start>,<stop>. Times
have the format seconds.microseconds. If start is not given (i.e., time
have the format seconds.nanoseconds. If start is not given (i.e. time
string is ',x.y') then analysis starts at the beginning of the file. If
stop time is not given (i.e, time string is 'x.y,') then analysis goes
to end of file.
stop time is not given (i.e. time string is 'x.y,') then analysis goes
to end of file. Multiple ranges can be separated by spaces, which
requires the argument to be quoted e.g. --time "1234.567,1234.789 1235,"
Also support time percent with multipe time range. Time string is
Also support time percent with multiple time ranges. Time string is
'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
For example:
......
......@@ -200,6 +200,13 @@ use --per-socket in addition to -a. (system-wide). The output includes the
socket number and the number of online processors on that socket. This is
useful to gauge the amount of aggregation.
--per-die::
Aggregate counts per processor die for system-wide mode measurements. This
is a useful mode to detect imbalance between dies. To enable this mode,
use --per-die in addition to -a. (system-wide). The output includes the
die number and the number of online processors on that die. This is
useful to gauge the amount of aggregation.
--per-core::
Aggregate counts per physical processor for system-wide mode measurements. This
is a useful mode to detect imbalance between physical cores. To enable this mode,
......@@ -239,6 +246,9 @@ Input file name.
--per-socket::
Aggregate counts per processor socket for system-wide mode measurements.
--per-die::
Aggregate counts per processor die for system-wide mode measurements.
--per-core::
Aggregate counts per physical processor for system-wide mode measurements.
......
......@@ -151,25 +151,45 @@ struct {
HEADER_CPU_TOPOLOGY = 13,
String lists defining the core and CPU threads topology.
The string lists are followed by a variable length array
which contains core_id and socket_id of each cpu.
The number of entries can be determined by the size of the
section minus the sizes of both string lists.
struct {
/*
* First revision of HEADER_CPU_TOPOLOGY
*
* See 'struct perf_header_string_list' definition earlier
* in this file.
*/
struct perf_header_string_list cores; /* Variable length */
struct perf_header_string_list threads; /* Variable length */
/*
* Second revision of HEADER_CPU_TOPOLOGY, older tools
* will not consider what comes next
*/
struct {
uint32_t core_id;
uint32_t socket_id;
} cpus[nr]; /* Variable length records */
/* 'nr' comes from previously processed HEADER_NRCPUS's nr_cpu_avail */
/*
* Third revision of HEADER_CPU_TOPOLOGY, older tools
* will not consider what comes next
*/
struct perf_header_string_list dies; /* Variable length */
uint32_t die_id[nr_cpus_avail]; /* from previously processed HEADER_NR_CPUS, VLA */
};
Example:
sibling cores : 0-3
sibling sockets : 0-8
sibling dies : 0-3
sibling dies : 4-7
sibling threads : 0-1
sibling threads : 2-3
sibling threads : 4-5
sibling threads : 6-7
HEADER_NUMA_TOPOLOGY = 14,
......@@ -272,6 +292,69 @@ struct {
Two uint64_t for the time of first sample and the time of last sample.
HEADER_MEM_TOPOLOGY = 22,
Physical memory map and its node assignments.
The format of data in MEM_TOPOLOGY is as follows:
0 - version | for future changes
8 - block_size_bytes | /sys/devices/system/memory/block_size_bytes
16 - count | number of nodes
For each node we store map of physical indexes:
32 - node id | node index
40 - size | size of bitmap
48 - bitmap | bitmap of memory indexes that belongs to node
| /sys/devices/system/node/node<NODE>/memory<INDEX>
The MEM_TOPOLOGY can be displayed with following command:
$ perf report --header-only -I
...
# memory nodes (nr 1, block size 0x8000000):
# 0 [7G]: 0-23,32-69
HEADER_CLOCKID = 23,
One uint64_t for the clockid frequency, specified, for instance, via 'perf
record -k' (see clock_gettime()), to enable timestamps derived metrics
conversion into wall clock time on the reporting stage.
HEADER_DIR_FORMAT = 24,
The data files layout is described by HEADER_DIR_FORMAT feature. Currently it
holds only version number (1):
uint64_t version;
The current version holds only version value (1) means that data files:
- Follow the 'data.*' name format.
- Contain raw events data in standard perf format as read from kernel (and need
to be sorted)
Future versions are expected to describe different data files layout according
to special needs.
HEADER_BPF_PROG_INFO = 25,
struct bpf_prog_info_linear, which contains detailed information about
a BPF program, including type, id, tag, jited/xlated instructions, etc.
HEADER_BPF_BTF = 26,
Contains BPF Type Format (BTF). For more information about BTF, please
refer to Documentation/bpf/btf.rst.
struct {
u32 id;
u32 data_size;
char data[];
};
HEADER_COMPRESSED = 27,
struct {
......
......@@ -413,6 +413,9 @@ ifdef CORESIGHT
$(call feature_check,libopencsd)
ifeq ($(feature-libopencsd), 1)
CFLAGS += -DHAVE_CSTRACE_SUPPORT $(LIBOPENCSD_CFLAGS)
ifeq ($(feature-reallocarray), 0)
CFLAGS += -DCOMPAT_NEED_REALLOCARRAY
endif
LDFLAGS += $(LIBOPENCSD_LDFLAGS)
EXTLIBS += $(OPENCSDLIBS)
$(call detected,CONFIG_LIBOPENCSD)
......
......@@ -2191,6 +2191,10 @@ static struct option __record_options[] = {
OPT_BOOLEAN_FLAG(0, "all-user", &record.opts.all_user,
"Configure all used events to run in user space.",
PARSE_OPT_EXCLUSIVE),
OPT_BOOLEAN(0, "kernel-callchains", &record.opts.kernel_callchains,
"collect kernel callchains"),
OPT_BOOLEAN(0, "user-callchains", &record.opts.user_callchains,
"collect user callchains"),
OPT_STRING(0, "clang-path", &llvm_param.clang_path, "clang path",
"clang binary to use for compiling BPF scriptlets"),
OPT_STRING(0, "clang-opt", &llvm_param.clang_opt, "clang options",
......
......@@ -1428,6 +1428,10 @@ int cmd_report(int argc, const char **argv)
&report.range_num);
if (ret < 0)
goto error;
itrace_synth_opts__set_time_range(&itrace_synth_opts,
report.ptime_range,
report.range_num);
}
if (session->tevent.pevent &&
......@@ -1449,8 +1453,10 @@ int cmd_report(int argc, const char **argv)
ret = 0;
error:
if (report.ptime_range)
if (report.ptime_range) {
itrace_synth_opts__clear_time_range(&itrace_synth_opts);
zfree(&report.ptime_range);
}
zstd_fini(&(session->zstd_data));
perf_session__delete(session);
return ret;
......
......@@ -102,6 +102,7 @@ enum perf_output_field {
PERF_OUTPUT_METRIC = 1U << 28,
PERF_OUTPUT_MISC = 1U << 29,
PERF_OUTPUT_SRCCODE = 1U << 30,
PERF_OUTPUT_IPC = 1U << 31,
};
struct output_option {
......@@ -139,6 +140,7 @@ struct output_option {
{.str = "metric", .field = PERF_OUTPUT_METRIC},
{.str = "misc", .field = PERF_OUTPUT_MISC},
{.str = "srccode", .field = PERF_OUTPUT_SRCCODE},
{.str = "ipc", .field = PERF_OUTPUT_IPC},
};
enum {
......@@ -1268,6 +1270,20 @@ static int perf_sample__fprintf_insn(struct perf_sample *sample,
return printed;
}
static int perf_sample__fprintf_ipc(struct perf_sample *sample,
struct perf_event_attr *attr, FILE *fp)
{
unsigned int ipc;
if (!PRINT_FIELD(IPC) || !sample->cyc_cnt || !sample->insn_cnt)
return 0;
ipc = (sample->insn_cnt * 100) / sample->cyc_cnt;
return fprintf(fp, " \t IPC: %u.%02u (%" PRIu64 "/%" PRIu64 ") ",
ipc / 100, ipc % 100, sample->insn_cnt, sample->cyc_cnt);
}
static int perf_sample__fprintf_bts(struct perf_sample *sample,
struct perf_evsel *evsel,
struct thread *thread,
......@@ -1312,6 +1328,8 @@ static int perf_sample__fprintf_bts(struct perf_sample *sample,
printed += perf_sample__fprintf_addr(sample, thread, attr, fp);
}
printed += perf_sample__fprintf_ipc(sample, attr, fp);
if (print_srcline_last)
printed += map__fprintf_srcline(al->map, al->addr, "\n ", fp);
......@@ -1859,6 +1877,9 @@ static void process_event(struct perf_script *script,
if (PRINT_FIELD(PHYS_ADDR))
fprintf(fp, "%16" PRIx64, sample->phys_addr);
perf_sample__fprintf_ipc(sample, attr, fp);
fprintf(fp, "\n");
if (PRINT_FIELD(SRCCODE)) {
......@@ -3433,7 +3454,7 @@ int cmd_script(int argc, const char **argv)
"Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
"addr,symoff,srcline,period,iregs,uregs,brstack,"
"brstacksym,flags,bpf-output,brstackinsn,brstackoff,"
"callindent,insn,insnlen,synth,phys_addr,metric,misc",
"callindent,insn,insnlen,synth,phys_addr,metric,misc,ipc",
parse_output_fields),
OPT_BOOLEAN('a', "all-cpus", &system_wide,
"system-wide collection from all CPUs"),
......@@ -3808,6 +3829,10 @@ int cmd_script(int argc, const char **argv)
&script.range_num);
if (err < 0)
goto out_delete;
itrace_synth_opts__set_time_range(&itrace_synth_opts,
script.ptime_range,
script.range_num);
}
err = __cmd_script(&script);
......@@ -3815,8 +3840,10 @@ int cmd_script(int argc, const char **argv)
flush_scripting();
out_delete:
if (script.ptime_range)
if (script.ptime_range) {
itrace_synth_opts__clear_time_range(&itrace_synth_opts);
zfree(&script.ptime_range);
}
perf_evlist__free_stats(session->evlist);
perf_session__delete(session);
......
......@@ -776,6 +776,8 @@ static struct option stat_options[] = {
"stop workload and print counts after a timeout period in ms (>= 10ms)"),
OPT_SET_UINT(0, "per-socket", &stat_config.aggr_mode,
"aggregate counts per processor socket", AGGR_SOCKET),
OPT_SET_UINT(0, "per-die", &stat_config.aggr_mode,
"aggregate counts per processor die", AGGR_DIE),
OPT_SET_UINT(0, "per-core", &stat_config.aggr_mode,
"aggregate counts per physical processor core", AGGR_CORE),
OPT_SET_UINT(0, "per-thread", &stat_config.aggr_mode,
......@@ -800,6 +802,12 @@ static int perf_stat__get_socket(struct perf_stat_config *config __maybe_unused,
return cpu_map__get_socket(map, cpu, NULL);
}
static int perf_stat__get_die(struct perf_stat_config *config __maybe_unused,
struct cpu_map *map, int cpu)
{
return cpu_map__get_die(map, cpu, NULL);
}
static int perf_stat__get_core(struct perf_stat_config *config __maybe_unused,
struct cpu_map *map, int cpu)
{
......@@ -840,6 +848,12 @@ static int perf_stat__get_socket_cached(struct perf_stat_config *config,
return perf_stat__get_aggr(config, perf_stat__get_socket, map, idx);
}
static int perf_stat__get_die_cached(struct perf_stat_config *config,
struct cpu_map *map, int idx)
{
return perf_stat__get_aggr(config, perf_stat__get_die, map, idx);
}
static int perf_stat__get_core_cached(struct perf_stat_config *config,
struct cpu_map *map, int idx)
{
......@@ -870,6 +884,13 @@ static int perf_stat_init_aggr_mode(void)
}
stat_config.aggr_get_id = perf_stat__get_socket_cached;
break;
case AGGR_DIE:
if (cpu_map__build_die_map(evsel_list->cpus, &stat_config.aggr_map)) {
perror("cannot build die map");
return -1;
}
stat_config.aggr_get_id = perf_stat__get_die_cached;
break;
case AGGR_CORE:
if (cpu_map__build_core_map(evsel_list->cpus, &stat_config.aggr_map)) {
perror("cannot build core map");
......@@ -935,21 +956,55 @@ static int perf_env__get_socket(struct cpu_map *map, int idx, void *data)
return cpu == -1 ? -1 : env->cpu[cpu].socket_id;
}
static int perf_env__get_die(struct cpu_map *map, int idx, void *data)
{
struct perf_env *env = data;
int die_id = -1, cpu = perf_env__get_cpu(env, map, idx);
if (cpu != -1) {
/*
* Encode socket in bit range 15:8
* die_id is relative to socket,
* we need a global id. So we combine
* socket + die id
*/
if (WARN_ONCE(env->cpu[cpu].socket_id >> 8, "The socket id number is too big.\n"))
return -1;
if (WARN_ONCE(env->cpu[cpu].die_id >> 8, "The die id number is too big.\n"))
return -1;
die_id = (env->cpu[cpu].socket_id << 8) | (env->cpu[cpu].die_id & 0xff);
}
return die_id;
}
static int perf_env__get_core(struct cpu_map *map, int idx, void *data)
{
struct perf_env *env = data;
int core = -1, cpu = perf_env__get_cpu(env, map, idx);
if (cpu != -1) {
int socket_id = env->cpu[cpu].socket_id;
/*
* Encode socket in upper 16 bits
* core_id is relative to socket, and
* Encode socket in bit range 31:24
* encode die id in bit range 23:16
* core_id is relative to socket and die,
* we need a global id. So we combine
* socket + core id.
* socket + die id + core id
*/
core = (socket_id << 16) | (env->cpu[cpu].core_id & 0xffff);
if (WARN_ONCE(env->cpu[cpu].socket_id >> 8, "The socket id number is too big.\n"))
return -1;
if (WARN_ONCE(env->cpu[cpu].die_id >> 8, "The die id number is too big.\n"))
return -1;
if (WARN_ONCE(env->cpu[cpu].core_id >> 16, "The core id number is too big.\n"))
return -1;
core = (env->cpu[cpu].socket_id << 24) |
(env->cpu[cpu].die_id << 16) |
(env->cpu[cpu].core_id & 0xffff);
}
return core;
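
The id layout used by perf_env__get_die() and perf_env__get_core() above can be sketched in isolation. This is an illustration, not the perf code: the helper names are hypothetical, and the sketch uses unsigned fields with a 64-bit return value (the diff uses int) so that a socket id of 128 or more shifted into bits 31:24 does not overflow a signed int.

```c
#include <assert.h>
#include <stdint.h>

/* Die aggregation id: socket in bits 15:8, die in bits 7:0. Returns -1 if a field is too big. */
static int64_t encode_die_id(uint32_t socket_id, uint32_t die_id)
{
	if ((socket_id >> 8) || (die_id >> 8))
		return -1;
	return (socket_id << 8) | die_id;
}

/* Core aggregation id: socket in bits 31:24, die in bits 23:16, core in bits 15:0 */
static int64_t encode_core_id(uint32_t socket_id, uint32_t die_id, uint32_t core_id)
{
	if ((socket_id >> 8) || (die_id >> 8) || (core_id >> 16))
		return -1;
	return (socket_id << 24) | (die_id << 16) | core_id;
}

/* Recover the socket number from a core aggregation id */
static uint32_t core_id_to_socket(int64_t id)
{
	assert(id >= 0);
	return (uint32_t)(id >> 24) & 0xff;
}
```

Because die and core ids are only unique within their parent, combining them with the socket (and die) bits yields an id that is unique across the whole system.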
......@@ -961,6 +1016,12 @@ static int perf_env__build_socket_map(struct perf_env *env, struct cpu_map *cpus
return cpu_map__build_map(cpus, sockp, perf_env__get_socket, env);
}
static int perf_env__build_die_map(struct perf_env *env, struct cpu_map *cpus,
struct cpu_map **diep)
{
return cpu_map__build_map(cpus, diep, perf_env__get_die, env);
}
static int perf_env__build_core_map(struct perf_env *env, struct cpu_map *cpus,
struct cpu_map **corep)
{
......@@ -972,6 +1033,11 @@ static int perf_stat__get_socket_file(struct perf_stat_config *config __maybe_un
{
return perf_env__get_socket(map, idx, &perf_stat.session->header.env);
}
static int perf_stat__get_die_file(struct perf_stat_config *config __maybe_unused,
struct cpu_map *map, int idx)
{
return perf_env__get_die(map, idx, &perf_stat.session->header.env);
}
static int perf_stat__get_core_file(struct perf_stat_config *config __maybe_unused,
struct cpu_map *map, int idx)
......@@ -991,6 +1057,13 @@ static int perf_stat_init_aggr_mode_file(struct perf_stat *st)
}
stat_config.aggr_get_id = perf_stat__get_socket_file;
break;
case AGGR_DIE:
if (perf_env__build_die_map(env, evsel_list->cpus, &stat_config.aggr_map)) {
perror("cannot build die map");
return -1;
}
stat_config.aggr_get_id = perf_stat__get_die_file;
break;
case AGGR_CORE:
if (perf_env__build_core_map(env, evsel_list->cpus, &stat_config.aggr_map)) {
perror("cannot build core map");
......@@ -1541,6 +1614,8 @@ static int __cmd_report(int argc, const char **argv)
OPT_STRING('i', "input", &input_name, "file", "input file name"),
OPT_SET_UINT(0, "per-socket", &perf_stat.aggr_mode,
"aggregate counts per processor socket", AGGR_SOCKET),
OPT_SET_UINT(0, "per-die", &perf_stat.aggr_mode,
"aggregate counts per processor die", AGGR_DIE),
OPT_SET_UINT(0, "per-core", &perf_stat.aggr_mode,
"aggregate counts per physical processor core", AGGR_CORE),
OPT_SET_UINT('A', "no-aggr", &perf_stat.aggr_mode,
......
......@@ -971,8 +971,14 @@ struct syscall {
struct syscall_arg_fmt *arg_fmt;
};
/*
* Must match what is in the BPF program:
*
* tools/perf/examples/bpf/augmented_raw_syscalls.c
*/
struct bpf_map_syscall_entry {
bool enabled;
u16 string_args_len[6];
};
/*
......@@ -1226,8 +1232,17 @@ static void thread__set_filename_pos(struct thread *thread, const char *bf,
static size_t syscall_arg__scnprintf_augmented_string(struct syscall_arg *arg, char *bf, size_t size)
{
struct augmented_arg *augmented_arg = arg->augmented.args;
size_t printed = scnprintf(bf, size, "\"%.*s\"", augmented_arg->size, augmented_arg->value);
/*
* So that the next arg with a payload can consume its augmented arg, i.e. for rename* syscalls
* we would have two strings, each prefixed by its size.
*/
int consumed = sizeof(*augmented_arg) + augmented_arg->size;
arg->augmented.args += consumed;
arg->augmented.size -= consumed;
return scnprintf(bf, size, "\"%.*s\"", augmented_arg->size, augmented_arg->value);
return printed;
}
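
The consume-and-advance step added above lets a syscall with several string arguments (e.g. the rename* family) read each one in turn. A self-contained sketch of that pattern follows. The record layout used here (an unsigned int size followed by the bytes) and all names are assumptions made for illustration, not the layout of perf's struct augmented_arg.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cursor over a payload of size-prefixed strings */
struct augmented_cursor {
	const unsigned char *args;	/* next unread record */
	size_t size;			/* bytes left in the payload */
};

/* Append one record at offset 'off' and return the new offset */
static size_t put_augmented_string(unsigned char *buf, size_t off, const char *s)
{
	unsigned int len = (unsigned int)strlen(s);

	memcpy(buf + off, &len, sizeof(len));
	memcpy(buf + off + sizeof(len), s, len);
	return off + sizeof(len) + len;
}

/*
 * Copy the next string into 'out' (NUL-terminated, truncated if needed) and
 * advance the cursor past the record so the following argument can be read.
 * Returns the stored string length, or -1 if the payload is exhausted or malformed.
 */
static int next_augmented_string(struct augmented_cursor *c, char *out, size_t out_size)
{
	unsigned int len;
	size_t consumed, n;

	assert(c != NULL && out != NULL && out_size > 0);
	if (c->size < sizeof(len))
		return -1;
	memcpy(&len, c->args, sizeof(len));	/* memcpy avoids unaligned reads */
	if (len > c->size - sizeof(len))
		return -1;

	n = len < out_size - 1 ? len : out_size - 1;
	memcpy(out, c->args + sizeof(len), n);
	out[n] = '\0';

	consumed = sizeof(len) + len;
	c->args += consumed;
	c->size -= consumed;
	return (int)len;
}
```

The caller's buffer passed to put_augmented_string() must be large enough for every record. The sketch leaves that check out for brevity.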
static size_t syscall_arg__scnprintf_filename(char *bf, size_t size,
......@@ -1415,10 +1430,11 @@ static int syscall__set_arg_fmts(struct syscall *sc)
if (sc->fmt && sc->fmt->arg[idx].scnprintf)
continue;
len = strlen(field->name);
if (strcmp(field->type, "const char *") == 0 &&
(strcmp(field->name, "filename") == 0 ||
strcmp(field->name, "path") == 0 ||
strcmp(field->name, "pathname") == 0))
((len >= 4 && strcmp(field->name + len - 4, "name") == 0) ||
strstr(field->name, "path") != NULL))
sc->arg_fmt[idx].scnprintf = SCA_FILENAME;
else if ((field->flags & TEP_FIELD_IS_POINTER) || strstr(field->name, "addr"))
sc->arg_fmt[idx].scnprintf = SCA_PTR;
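
The widened pathname heuristic in this hunk (a 'const char *' argument whose name ends in "name" or contains "path") can be tried out on its own. A minimal sketch, with looks_like_pathname_arg() as a hypothetical name:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: decide whether a syscall argument should be treated
 * as a pathname, based on its tracepoint field type and name.
 */
static bool looks_like_pathname_arg(const char *type, const char *name)
{
	size_t len;

	assert(type != NULL && name != NULL);
	if (strcmp(type, "const char *") != 0)
		return false;

	len = strlen(name);
	/* Suffix match on "name" (filename, newname, ...) or "path" anywhere */
	return (len >= 4 && strcmp(name + len - 4, "name") == 0) ||
	       strstr(name, "path") != NULL;
}
```

Compared with the three exact names matched before, this also picks up arguments such as "oldname", "newname" and "oldpath".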
......@@ -1429,8 +1445,7 @@ static int syscall__set_arg_fmts(struct syscall *sc)
else if ((strcmp(field->type, "int") == 0 ||
strcmp(field->type, "unsigned int") == 0 ||
strcmp(field->type, "long") == 0) &&
(len = strlen(field->name)) >= 2 &&
strcmp(field->name + len - 2, "fd") == 0) {
len >= 2 && strcmp(field->name + len - 2, "fd") == 0) {
/*
* /sys/kernel/tracing/events/syscalls/sys_enter*
* egrep 'field:.*fd;' .../format|sed -r 's/.*field:([a-z ]+) [a-z_]*fd.+/\1/g'|sort|uniq -c
......@@ -1513,6 +1528,7 @@ static int trace__read_syscall_info(struct trace *trace, int id)
static int trace__validate_ev_qualifier(struct trace *trace)
{
int err = 0, i;
bool printed_invalid_prefix = false;
size_t nr_allocated;
struct str_node *pos;
......@@ -1539,14 +1555,15 @@ static int trace__validate_ev_qualifier(struct trace *trace)
if (id >= 0)
goto matches;
if (err == 0) {
fputs("Error:\tInvalid syscall ", trace->output);
err = -EINVAL;
if (!printed_invalid_prefix) {
pr_debug("Skipping unknown syscalls: ");
printed_invalid_prefix = true;
} else {
fputs(", ", trace->output);
pr_debug(", ");
}
fputs(sc, trace->output);
pr_debug("%s", sc);
continue;
}
matches:
trace->ev_qualifier_ids.entries[i++] = id;
......@@ -1575,15 +1592,14 @@ static int trace__validate_ev_qualifier(struct trace *trace)
}
}
if (err < 0) {
fputs("\nHint:\ttry 'perf list syscalls:sys_enter_*'"
"\nHint:\tand: 'man syscalls'\n", trace->output);
out_free:
zfree(&trace->ev_qualifier_ids.entries);
trace->ev_qualifier_ids.nr = 0;
}
out:
if (printed_invalid_prefix)
pr_debug("\n");
return err;
out_free:
zfree(&trace->ev_qualifier_ids.entries);
trace->ev_qualifier_ids.nr = 0;
goto out;
}
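
The behaviour change in trace__validate_ev_qualifier() is to skip names that do not resolve to a syscall id instead of failing the whole list. That can be sketched as below. The table lookup and every name in this block are hypothetical simplifications, not the perf syscalltbl API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup: index of 'name' in 'table', or -1 if it is not known */
static int lookup_syscall_id(const char *const *table, size_t nr, const char *name)
{
	for (size_t i = 0; i < nr; i++) {
		if (strcmp(table[i], name) == 0)
			return (int)i;
	}
	return -1;
}

/*
 * Resolve the requested names to ids, skipping unknown ones.
 * 'ids' must have room for nr_names entries. Returns the number of ids
 * stored and reports how many names were skipped via 'skipped'.
 */
static size_t resolve_syscalls(const char *const *table, size_t nr_table,
			       const char *const *names, size_t nr_names,
			       int *ids, size_t *skipped)
{
	size_t nr_ids = 0;

	assert(ids != NULL && skipped != NULL);
	*skipped = 0;
	for (size_t i = 0; i < nr_names; i++) {
		int id = lookup_syscall_id(table, nr_table, names[i]);

		if (id < 0) {
			/* Unknown on this architecture: note it and move on */
			(*skipped)++;
			continue;
		}
		ids[nr_ids++] = id;
	}
	return nr_ids;
}
```

With a table that lacks 'access', as on arm64, a group containing 'access' still resolves its remaining members.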
/*
......@@ -2710,6 +2726,25 @@ static int trace__set_ev_qualifier_tp_filter(struct trace *trace)
}
#ifdef HAVE_LIBBPF_SUPPORT
static void trace__init_bpf_map_syscall_args(struct trace *trace, int id, struct bpf_map_syscall_entry *entry)
{
struct syscall *sc = trace__syscall_info(trace, NULL, id);
int arg = 0;
if (sc == NULL)
goto out;
for (; arg < sc->nr_args; ++arg) {
entry->string_args_len[arg] = 0;
if (sc->arg_fmt[arg].scnprintf == SCA_FILENAME) {
/* Should be set like strace -s strsize */
entry->string_args_len[arg] = PATH_MAX;
}
}
out:
for (; arg < 6; ++arg)
entry->string_args_len[arg] = 0;
}
static int trace__set_ev_qualifier_bpf_filter(struct trace *trace)
{
int fd = bpf_map__fd(trace->syscalls.map);
......@@ -2722,6 +2757,9 @@ static int trace__set_ev_qualifier_bpf_filter(struct trace *trace)
for (i = 0; i < trace->ev_qualifier_ids.nr; ++i) {
int key = trace->ev_qualifier_ids.entries[i];
if (value.enabled)
trace__init_bpf_map_syscall_args(trace, key, &value);
err = bpf_map_update_elem(fd, &key, &value, BPF_EXIST);
if (err)
break;
......@@ -2739,6 +2777,9 @@ static int __trace__init_syscalls_bpf_map(struct trace *trace, bool enabled)
int err = 0, key;
for (key = 0; key < trace->sctbl->syscalls.nr_entries; ++key) {
if (enabled)
trace__init_bpf_map_syscall_args(trace, key, &value);
err = bpf_map_update_elem(fd, &key, &value, BPF_ANY);
if (err)
break;
......@@ -3662,7 +3703,12 @@ static int trace__config(const char *var, const char *value, void *arg)
struct option o = OPT_CALLBACK('e', "event", &trace->evlist, "event",
"event selector. use 'perf list' to list available events",
parse_events_option);
err = parse_events_option(&o, value, 0);
/*
* We can't propagate parse_events_option() return, as it is 1
* for failure while perf_config() expects -1.
*/
if (parse_events_option(&o, value, 0))
err = -1;
} else if (!strcmp(var, "trace.show_timestamp")) {
trace->show_tstamp = perf_config_bool(var, value);
} else if (!strcmp(var, "trace.show_duration")) {
......
// SPDX-License-Identifier: GPL-2.0
#include <linux/compiler.h>
#include <linux/string.h>
#include <sys/types.h>
#include <stdio.h>
#include <string.h>
......@@ -162,8 +163,7 @@ copy_class_filename(const char * class_sign, const char * file_name, char * resu
result[i] = '\0';
} else {
/* fallback case */
size_t file_name_len = strlen(file_name);
strncpy(result, file_name, file_name_len < max_length ? file_name_len : max_length);
strlcpy(result, file_name, max_length);
}
}
......
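The fallback hunk above swaps a length-limited `strncpy()` for `strlcpy()`. With the old call, the copy length never exceeded `strlen(file_name)`, so `strncpy()` never wrote a terminating NUL. The sketch below illustrates the behaviour the replacement relies on: copy at most `size - 1` bytes and always terminate. `bounded_copy` is a hypothetical stand-in written for this note, not the helper perf actually links against.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for strlcpy(): copy at most size - 1 bytes of src
 * into dst and always NUL-terminate when size > 0.
 * Returns strlen(src), so a return value >= size signals truncation.
 */
size_t bounded_copy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size > 0) {
		size_t n = len < size - 1 ? len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}
```

A caller can compare the return value against the buffer size to detect that the file name did not fit.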
......@@ -61,6 +61,8 @@ struct record_opts {
bool record_switch_events;
bool all_kernel;
bool all_user;
bool kernel_callchains;
bool user_callchains;
bool tail_synthesize;
bool overwrite;
bool ignore_missing_thread;
......
......@@ -394,7 +394,9 @@ if branches:
'to_ip bigint,'
'branch_type integer,'
'in_tx boolean,'
'call_path_id bigint)')
'call_path_id bigint,'
'insn_count bigint,'
'cyc_count bigint)')
else:
do_query(query, 'CREATE TABLE samples ('
'id bigint NOT NULL,'
......@@ -418,7 +420,9 @@ else:
'data_src bigint,'
'branch_type integer,'
'in_tx boolean,'
'call_path_id bigint)')
'call_path_id bigint,'
'insn_count bigint,'
'cyc_count bigint)')
if perf_db_export_calls or perf_db_export_callchains:
do_query(query, 'CREATE TABLE call_paths ('
......@@ -439,7 +443,9 @@ if perf_db_export_calls:
'return_id bigint,'
'parent_call_path_id bigint,'
'flags integer,'
'parent_id bigint)')
'parent_id bigint,'
'insn_count bigint,'
'cyc_count bigint)')
do_query(query, 'CREATE VIEW machines_view AS '
'SELECT '
......@@ -521,6 +527,9 @@ if perf_db_export_calls:
'return_time,'
'return_time - call_time AS elapsed_time,'
'branch_count,'
'insn_count,'
'cyc_count,'
'CASE WHEN cyc_count=0 THEN CAST(0 AS NUMERIC(20, 2)) ELSE CAST((CAST(insn_count AS FLOAT) / cyc_count) AS NUMERIC(20, 2)) END AS IPC,'
'call_id,'
'return_id,'
'CASE WHEN flags=0 THEN \'\' WHEN flags=1 THEN \'no call\' WHEN flags=2 THEN \'no return\' WHEN flags=3 THEN \'no call/return\' WHEN flags=6 THEN \'jump\' ELSE CAST ( flags AS VARCHAR(6) ) END AS flags,'
......@@ -546,7 +555,10 @@ do_query(query, 'CREATE VIEW samples_view AS '
'to_sym_offset,'
'(SELECT short_name FROM dsos WHERE id = to_dso_id) AS to_dso_short_name,'
'(SELECT name FROM branch_types WHERE id = branch_type) AS branch_type_name,'
'in_tx'
'in_tx,'
'insn_count,'
'cyc_count,'
'CASE WHEN cyc_count=0 THEN CAST(0 AS NUMERIC(20, 2)) ELSE CAST((CAST(insn_count AS FLOAT) / cyc_count) AS NUMERIC(20, 2)) END AS IPC'
' FROM samples')
......@@ -618,10 +630,10 @@ def trace_begin():
comm_table(0, "unknown")
dso_table(0, 0, "unknown", "unknown", "")
symbol_table(0, 0, 0, 0, 0, "unknown")
sample_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
sample_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
if perf_db_export_calls or perf_db_export_callchains:
call_path_table(0, 0, 0, 0)
call_return_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
call_return_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
unhandled_count = 0
......@@ -772,11 +784,11 @@ def branch_type_table(branch_type, name, *x):
value = struct.pack(fmt, 2, 4, branch_type, n, name)
branch_type_file.write(value)
def sample_table(sample_id, evsel_id, machine_id, thread_id, comm_id, dso_id, symbol_id, sym_offset, ip, time, cpu, to_dso_id, to_symbol_id, to_sym_offset, to_ip, period, weight, transaction, data_src, branch_type, in_tx, call_path_id, *x):
def sample_table(sample_id, evsel_id, machine_id, thread_id, comm_id, dso_id, symbol_id, sym_offset, ip, time, cpu, to_dso_id, to_symbol_id, to_sym_offset, to_ip, period, weight, transaction, data_src, branch_type, in_tx, call_path_id, insn_cnt, cyc_cnt, *x):
if branches:
value = struct.pack("!hiqiqiqiqiqiqiqiqiqiqiiiqiqiqiqiiiBiq", 18, 8, sample_id, 8, evsel_id, 8, machine_id, 8, thread_id, 8, comm_id, 8, dso_id, 8, symbol_id, 8, sym_offset, 8, ip, 8, time, 4, cpu, 8, to_dso_id, 8, to_symbol_id, 8, to_sym_offset, 8, to_ip, 4, branch_type, 1, in_tx, 8, call_path_id)
value = struct.pack("!hiqiqiqiqiqiqiqiqiqiqiiiqiqiqiqiiiBiqiqiq", 20, 8, sample_id, 8, evsel_id, 8, machine_id, 8, thread_id, 8, comm_id, 8, dso_id, 8, symbol_id, 8, sym_offset, 8, ip, 8, time, 4, cpu, 8, to_dso_id, 8, to_symbol_id, 8, to_sym_offset, 8, to_ip, 4, branch_type, 1, in_tx, 8, call_path_id, 8, insn_cnt, 8, cyc_cnt)
else:
value = struct.pack("!hiqiqiqiqiqiqiqiqiqiqiiiqiqiqiqiqiqiqiqiiiBiq", 22, 8, sample_id, 8, evsel_id, 8, machine_id, 8, thread_id, 8, comm_id, 8, dso_id, 8, symbol_id, 8, sym_offset, 8, ip, 8, time, 4, cpu, 8, to_dso_id, 8, to_symbol_id, 8, to_sym_offset, 8, to_ip, 8, period, 8, weight, 8, transaction, 8, data_src, 4, branch_type, 1, in_tx, 8, call_path_id)
value = struct.pack("!hiqiqiqiqiqiqiqiqiqiqiiiqiqiqiqiqiqiqiqiiiBiqiqiq", 24, 8, sample_id, 8, evsel_id, 8, machine_id, 8, thread_id, 8, comm_id, 8, dso_id, 8, symbol_id, 8, sym_offset, 8, ip, 8, time, 4, cpu, 8, to_dso_id, 8, to_symbol_id, 8, to_sym_offset, 8, to_ip, 8, period, 8, weight, 8, transaction, 8, data_src, 4, branch_type, 1, in_tx, 8, call_path_id, 8, insn_cnt, 8, cyc_cnt)
sample_file.write(value)
def call_path_table(cp_id, parent_id, symbol_id, ip, *x):
......@@ -784,7 +796,7 @@ def call_path_table(cp_id, parent_id, symbol_id, ip, *x):
value = struct.pack(fmt, 4, 8, cp_id, 8, parent_id, 8, symbol_id, 8, ip)
call_path_file.write(value)
def call_return_table(cr_id, thread_id, comm_id, call_path_id, call_time, return_time, branch_count, call_id, return_id, parent_call_path_id, flags, parent_id, *x):
fmt = "!hiqiqiqiqiqiqiqiqiqiqiiiq"
value = struct.pack(fmt, 12, 8, cr_id, 8, thread_id, 8, comm_id, 8, call_path_id, 8, call_time, 8, return_time, 8, branch_count, 8, call_id, 8, return_id, 8, parent_call_path_id, 4, flags, 8, parent_id)
def call_return_table(cr_id, thread_id, comm_id, call_path_id, call_time, return_time, branch_count, call_id, return_id, parent_call_path_id, flags, parent_id, insn_cnt, cyc_cnt, *x):
fmt = "!hiqiqiqiqiqiqiqiqiqiqiiiqiqiq"
value = struct.pack(fmt, 14, 8, cr_id, 8, thread_id, 8, comm_id, 8, call_path_id, 8, call_time, 8, return_time, 8, branch_count, 8, call_id, 8, return_id, 8, parent_call_path_id, 4, flags, 8, parent_id, 8, insn_cnt, 8, cyc_cnt)
call_file.write(value)
......@@ -218,7 +218,9 @@ if branches:
'to_ip bigint,'
'branch_type integer,'
'in_tx boolean,'
'call_path_id bigint)')
'call_path_id bigint,'
'insn_count bigint,'
'cyc_count bigint)')
else:
do_query(query, 'CREATE TABLE samples ('
'id integer NOT NULL PRIMARY KEY,'
......@@ -242,7 +244,9 @@ else:
'data_src bigint,'
'branch_type integer,'
'in_tx boolean,'
'call_path_id bigint)')
'call_path_id bigint,'
'insn_count bigint,'
'cyc_count bigint)')
if perf_db_export_calls or perf_db_export_callchains:
do_query(query, 'CREATE TABLE call_paths ('
......@@ -263,7 +267,9 @@ if perf_db_export_calls:
'return_id bigint,'
'parent_call_path_id bigint,'
'flags integer,'
'parent_id bigint)')
'parent_id bigint,'
'insn_count bigint,'
'cyc_count bigint)')
# printf was added to sqlite in version 3.8.3
sqlite_has_printf = False
......@@ -359,6 +365,9 @@ if perf_db_export_calls:
'return_time,'
'return_time - call_time AS elapsed_time,'
'branch_count,'
'insn_count,'
'cyc_count,'
'CASE WHEN cyc_count=0 THEN CAST(0 AS FLOAT) ELSE ROUND(CAST(insn_count AS FLOAT) / cyc_count, 2) END AS IPC,'
'call_id,'
'return_id,'
'CASE WHEN flags=0 THEN \'\' WHEN flags=1 THEN \'no call\' WHEN flags=2 THEN \'no return\' WHEN flags=3 THEN \'no call/return\' WHEN flags=6 THEN \'jump\' ELSE flags END AS flags,'
......@@ -384,7 +393,10 @@ do_query(query, 'CREATE VIEW samples_view AS '
'to_sym_offset,'
'(SELECT short_name FROM dsos WHERE id = to_dso_id) AS to_dso_short_name,'
'(SELECT name FROM branch_types WHERE id = branch_type) AS branch_type_name,'
'in_tx'
'in_tx,'
'insn_count,'
'cyc_count,'
'CASE WHEN cyc_count=0 THEN CAST(0 AS FLOAT) ELSE ROUND(CAST(insn_count AS FLOAT) / cyc_count, 2) END AS IPC'
' FROM samples')
do_query(query, 'END TRANSACTION')
......@@ -407,15 +419,15 @@ branch_type_query = QSqlQuery(db)
branch_type_query.prepare("INSERT INTO branch_types VALUES (?, ?)")
sample_query = QSqlQuery(db)
if branches:
sample_query.prepare("INSERT INTO samples VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)")
sample_query.prepare("INSERT INTO samples VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)")
else:
sample_query.prepare("INSERT INTO samples VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)")
sample_query.prepare("INSERT INTO samples VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)")
if perf_db_export_calls or perf_db_export_callchains:
call_path_query = QSqlQuery(db)
call_path_query.prepare("INSERT INTO call_paths VALUES (?, ?, ?, ?)")
if perf_db_export_calls:
call_query = QSqlQuery(db)
call_query.prepare("INSERT INTO calls VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)")
call_query.prepare("INSERT INTO calls VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)")
def trace_begin():
printdate("Writing records...")
......@@ -427,10 +439,10 @@ def trace_begin():
comm_table(0, "unknown")
dso_table(0, 0, "unknown", "unknown", "")
symbol_table(0, 0, 0, 0, 0, "unknown")
sample_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
sample_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
if perf_db_export_calls or perf_db_export_callchains:
call_path_table(0, 0, 0, 0)
call_return_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
call_return_table(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
unhandled_count = 0
......@@ -486,14 +498,14 @@ def sample_table(*x):
if branches:
for xx in x[0:15]:
sample_query.addBindValue(str(xx))
for xx in x[19:22]:
for xx in x[19:24]:
sample_query.addBindValue(str(xx))
do_query_(sample_query)
else:
bind_exec(sample_query, 22, x)
bind_exec(sample_query, 24, x)
def call_path_table(*x):
bind_exec(call_path_query, 4, x)
def call_return_table(*x):
bind_exec(call_query, 12, x)
bind_exec(call_query, 14, x)
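Both export scripts above add an `IPC` column to their views: `insn_count / cyc_count` rounded to two decimals, with 0 reported when `cyc_count` is 0. A minimal C sketch of the same guard and rounding follows. It returns the value in hundredths to stay in integer arithmetic. The function name `ipc_hundredths` is an assumption made for this illustration and does not appear in the commit.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the SQL CASE expression:
 * returns instructions-per-cycle scaled by 100 (so 150 means 1.50),
 * or 0 when no cycles were counted.
 */
uint64_t ipc_hundredths(uint64_t insn_count, uint64_t cyc_count)
{
	if (cyc_count == 0)
		return 0;	/* same zero guard as "CASE WHEN cyc_count=0" */

	/*
	 * Round to nearest hundredth. Note: insn_count * 100 can overflow
	 * for counts above roughly 1.8e17; a real tool would need to
	 * handle that case.
	 */
	return (insn_count * 100 + cyc_count / 2) / cyc_count;
}
```

For example, 150 instructions over 100 cycles yields 150, which would be displayed as an IPC of 1.50.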
......@@ -51,6 +51,7 @@ perf-y += clang.o
perf-y += unit_number__scnprintf.o
perf-y += mem2node.o
perf-y += map_groups.o
perf-y += time-utils-test.o
$(OUTPUT)tests/llvm-src-base.c: tests/bpf-script-example.c tests/Build
$(call rule_mkdir)
......
......@@ -289,6 +289,10 @@ static struct test generic_tests[] = {
.desc = "mem2node",
.func = test__mem2node,
},
{
.desc = "time utils",
.func = test__time_utils,
},
{
.desc = "map_groups__merge_in",
.func = test__map_groups__merge_in,
......
......@@ -18,6 +18,32 @@
#define PERF_TP_SAMPLE_TYPE (PERF_SAMPLE_RAW | PERF_SAMPLE_TIME | \
PERF_SAMPLE_CPU | PERF_SAMPLE_PERIOD)
#if defined(__s390x__)
/* Return true if kvm module is available and loaded. Test this
* and return success when trace point kvm_s390_create_vm
* exists. Otherwise this test always fails.
*/
static bool kvm_s390_create_vm_valid(void)
{
char *eventfile;
bool rc = false;
eventfile = get_events_file("kvm-s390");
if (eventfile) {
DIR *mydir = opendir(eventfile);
if (mydir) {
rc = true;
closedir(mydir);
}
put_events_file(eventfile);
}
return rc;
}
#endif
static int test__checkevent_tracepoint(struct perf_evlist *evlist)
{
struct perf_evsel *evsel = perf_evlist__first(evlist);
......@@ -1642,6 +1668,7 @@ static struct evlist_test test__events[] = {
{
.name = "kvm-s390:kvm_s390_create_vm",
.check = test__checkevent_tracepoint,
.valid = kvm_s390_create_vm_valid,
.id = 100,
},
#endif
......
......@@ -108,6 +108,7 @@ int test__clang_subtest_get_nr(void);
int test__unit_number__scnprint(struct test *test, int subtest);
int test__mem2node(struct test *t, int subtest);
int test__map_groups__merge_in(struct test *t, int subtest);
int test__time_utils(struct test *t, int subtest);
bool test__bp_signal_is_supported(void);
bool test__wp_is_supported(void);
......
// SPDX-License-Identifier: GPL-2.0
#include <linux/compiler.h>
#include <linux/time64.h>
#include <inttypes.h>
#include <string.h>
#include "time-utils.h"
#include "evlist.h"
#include "session.h"
#include "debug.h"
#include "tests.h"
static bool test__parse_nsec_time(const char *str, u64 expected)
{
u64 ptime;
int err;
pr_debug("\nparse_nsec_time(\"%s\")\n", str);
err = parse_nsec_time(str, &ptime);
if (err) {
pr_debug("error %d\n", err);
return false;
}
if (ptime != expected) {
pr_debug("Failed. ptime %" PRIu64 " expected %" PRIu64 "\n",
ptime, expected);
return false;
}
pr_debug("%" PRIu64 "\n", ptime);
return true;
}
static bool test__perf_time__parse_str(const char *ostr, u64 start, u64 end)
{
struct perf_time_interval ptime;
int err;
pr_debug("\nperf_time__parse_str(\"%s\")\n", ostr);
err = perf_time__parse_str(&ptime, ostr);
if (err) {
pr_debug("Error %d\n", err);
return false;
}
if (ptime.start != start || ptime.end != end) {
pr_debug("Failed. Expected %" PRIu64 " to %" PRIu64 "\n",
start, end);
return false;
}
return true;
}
#define TEST_MAX 64
struct test_data {
const char *str;
u64 first;
u64 last;
struct perf_time_interval ptime[TEST_MAX];
int num;
u64 skip[TEST_MAX];
u64 noskip[TEST_MAX];
};
static bool test__perf_time__parse_for_ranges(struct test_data *d)
{
struct perf_evlist evlist = {
.first_sample_time = d->first,
.last_sample_time = d->last,
};
struct perf_session session = { .evlist = &evlist };
struct perf_time_interval *ptime = NULL;
int range_size, range_num;
bool pass = false;
int i, err;
pr_debug("\nperf_time__parse_for_ranges(\"%s\")\n", d->str);
if (strchr(d->str, '%'))
pr_debug("first_sample_time %" PRIu64 " last_sample_time %" PRIu64 "\n",
d->first, d->last);
err = perf_time__parse_for_ranges(d->str, &session, &ptime, &range_size,
&range_num);
if (err) {
pr_debug("error %d\n", err);
goto out;
}
if (range_size < d->num || range_num != d->num) {
pr_debug("bad size: range_size %d range_num %d expected num %d\n",
range_size, range_num, d->num);
goto out;
}
for (i = 0; i < d->num; i++) {
if (ptime[i].start != d->ptime[i].start ||
ptime[i].end != d->ptime[i].end) {
pr_debug("bad range %d expected %" PRIu64 " to %" PRIu64 "\n",
i, d->ptime[i].start, d->ptime[i].end);
goto out;
}
}
if (perf_time__ranges_skip_sample(ptime, d->num, 0)) {
pr_debug("failed to keep 0\n");
goto out;
}
for (i = 0; i < TEST_MAX; i++) {
if (d->skip[i] &&
!perf_time__ranges_skip_sample(ptime, d->num, d->skip[i])) {
pr_debug("failed to skip %" PRIu64 "\n", d->skip[i]);
goto out;
}
if (d->noskip[i] &&
perf_time__ranges_skip_sample(ptime, d->num, d->noskip[i])) {
pr_debug("failed to keep %" PRIu64 "\n", d->noskip[i]);
goto out;
}
}
pass = true;
out:
free(ptime);
return pass;
}
int test__time_utils(struct test *t __maybe_unused, int subtest __maybe_unused)
{
bool pass = true;
pass &= test__parse_nsec_time("0", 0);
pass &= test__parse_nsec_time("1", 1000000000ULL);
pass &= test__parse_nsec_time("0.000000001", 1);
pass &= test__parse_nsec_time("1.000000001", 1000000001ULL);
pass &= test__parse_nsec_time("123456.123456", 123456123456000ULL);
pass &= test__parse_nsec_time("1234567.123456789", 1234567123456789ULL);
pass &= test__parse_nsec_time("18446744073.709551615",
0xFFFFFFFFFFFFFFFFULL);
pass &= test__perf_time__parse_str("1234567.123456789,1234567.123456789",
1234567123456789ULL, 1234567123456789ULL);
pass &= test__perf_time__parse_str("1234567.123456789,1234567.123456790",
1234567123456789ULL, 1234567123456790ULL);
pass &= test__perf_time__parse_str("1234567.123456789,",
1234567123456789ULL, 0);
pass &= test__perf_time__parse_str(",1234567.123456789",
0, 1234567123456789ULL);
pass &= test__perf_time__parse_str("0,1234567.123456789",
0, 1234567123456789ULL);
{
u64 b = 1234567123456789ULL;
struct test_data d = {
.str = "1234567.123456789,1234567.123456790",
.ptime = { {b, b + 1}, },
.num = 1,
.skip = { b - 1, b + 2, },
.noskip = { b, b + 1, },
};
pass &= test__perf_time__parse_for_ranges(&d);
}
{
u64 b = 1234567123456789ULL;
u64 c = 7654321987654321ULL;
u64 e = 8000000000000000ULL;
struct test_data d = {
.str = "1234567.123456789,1234567.123456790 "
"7654321.987654321,7654321.987654444 "
"8000000,8000000.000000005",
.ptime = { {b, b + 1}, {c, c + 123}, {e, e + 5}, },
.num = 3,
.skip = { b - 1, b + 2, c - 1, c + 124, e - 1, e + 6 },
.noskip = { b, b + 1, c, c + 123, e, e + 5 },
};
pass &= test__perf_time__parse_for_ranges(&d);
}
{
u64 b = 7654321ULL * NSEC_PER_SEC;
struct test_data d = {
.str = "10%/1",
.first = b,
.last = b + 100,
.ptime = { {b, b + 9}, },
.num = 1,
.skip = { b - 1, b + 10, },
.noskip = { b, b + 9, },
};
pass &= test__perf_time__parse_for_ranges(&d);
}
{
u64 b = 7654321ULL * NSEC_PER_SEC;
struct test_data d = {
.str = "10%/2",
.first = b,
.last = b + 100,
.ptime = { {b + 10, b + 19}, },
.num = 1,
.skip = { b + 9, b + 20, },
.noskip = { b + 10, b + 19, },
};
pass &= test__perf_time__parse_for_ranges(&d);
}
{
u64 b = 11223344ULL * NSEC_PER_SEC;
struct test_data d = {
.str = "10%/1,10%/2",
.first = b,
.last = b + 100,
.ptime = { {b, b + 9}, {b + 10, b + 19}, },
.num = 2,
.skip = { b - 1, b + 20, },
.noskip = { b, b + 8, b + 9, b + 10, b + 11, b + 12, b + 19, },
};
pass &= test__perf_time__parse_for_ranges(&d);
}
{
u64 b = 11223344ULL * NSEC_PER_SEC;
struct test_data d = {
.str = "10%/1,10%/3,10%/10",
.first = b,
.last = b + 100,
.ptime = { {b, b + 9}, {b + 20, b + 29}, { b + 90, b + 100}, },
.num = 3,
.skip = { b - 1, b + 10, b + 19, b + 30, b + 89, b + 101 },
.noskip = { b, b + 9, b + 20, b + 29, b + 90, b + 100},
};
pass &= test__perf_time__parse_for_ranges(&d);
}
pr_debug("\n");
return pass ? 0 : TEST_FAIL;
}
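The percent cases in the test above ("10%/1", "10%/2", "10%/10" over a 100 ns window) imply a simple slicing rule: slice N starts at `first + (N - 1) * len`, and its end is pulled back by one nanosecond unless it coincides with the last sample time. The sketch below reproduces only those expectations. It is inferred from the test data, not taken from the `perf_time__parse_for_ranges()` implementation, and the names `time_range` and `percent_slice` are hypothetical.

```c
#include <stdint.h>

struct time_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Hypothetical sketch: compute the idx-th (1-based) slice covering pct
 * percent of [first, last]. Returns 0 on success, -1 on invalid input.
 * Note: (last - first) * pct may overflow for very long windows.
 */
int percent_slice(uint64_t first, uint64_t last, unsigned int pct,
		  unsigned int idx, struct time_range *r)
{
	uint64_t len;

	if (last < first || pct == 0 || idx == 0 ||
	    (uint64_t)pct * idx > 100)
		return -1;

	len = (last - first) * pct / 100;
	r->start = first + len * (idx - 1);
	r->end = r->start + len;

	/* Keep adjacent slices disjoint; only the final slice reaches 'last'. */
	if (len > 0 && r->end < last)
		r->end -= 1;
	return 0;
}
```

With `first = b` and `last = b + 100`, slice 1 of 10% is `[b, b + 9]` and slice 10 is `[b + 90, b + 100]`, matching the `.ptime` values in the test data.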
......@@ -931,9 +931,8 @@ static int symbol__inc_addr_samples(struct symbol *sym, struct map *map,
if (sym == NULL)
return 0;
src = symbol__hists(sym, evsel->evlist->nr_entries);
if (src == NULL)
return -ENOMEM;
return __symbol__inc_addr_samples(sym, map, src, evsel->idx, addr, sample);
return (src) ? __symbol__inc_addr_samples(sym, map, src, evsel->idx,
addr, sample) : 0;
}
static int symbol__account_cycles(u64 addr, u64 start,
......
......@@ -74,6 +74,8 @@ enum itrace_period_type {
* @period_type: 'instructions' events period type
* @initial_skip: skip N events at the beginning.
* @cpu_bitmap: CPUs for which to synthesize events, or NULL for all
* @ptime_range: time intervals to trace or NULL
* @range_num: number of time intervals to trace
*/
struct itrace_synth_opts {
bool set;
......@@ -98,6 +100,8 @@ struct itrace_synth_opts {
enum itrace_period_type period_type;
unsigned long initial_skip;
unsigned long *cpu_bitmap;
struct perf_time_interval *ptime_range;
int range_num;
};
/**
......@@ -590,6 +594,21 @@ static inline void auxtrace__free(struct perf_session *session)
" PERIOD[ns|us|ms|i|t]: specify period to sample stream\n" \
" concatenate multiple options. Default is ibxwpe or cewp\n"
static inline
void itrace_synth_opts__set_time_range(struct itrace_synth_opts *opts,
struct perf_time_interval *ptime_range,
int range_num)
{
opts->ptime_range = ptime_range;
opts->range_num = range_num;
}
static inline
void itrace_synth_opts__clear_time_range(struct itrace_synth_opts *opts)
{
opts->ptime_range = NULL;
opts->range_num = 0;
}
#else
......@@ -733,6 +752,21 @@ void auxtrace_mmap_params__set_idx(struct auxtrace_mmap_params *mp,
#define ITRACE_HELP ""
static inline
void itrace_synth_opts__set_time_range(struct itrace_synth_opts *opts
__maybe_unused,
struct perf_time_interval *ptime_range
__maybe_unused,
int range_num __maybe_unused)
{
}
static inline
void itrace_synth_opts__clear_time_range(struct itrace_synth_opts *opts
__maybe_unused)
{
}
#endif
#endif
......@@ -739,11 +739,15 @@ int perf_config(config_fn_t fn, void *data)
if (ret < 0) {
pr_err("Error: wrong config key-value pair %s=%s\n",
key, value);
break;
/*
* Can't be just a 'break', as perf_config_set__for_each_entry()
* expands to two nested for() loops.
*/
goto out;
}
}
}
out:
return ret;
}
......
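The comment in the hunk above explains why `break` had to become `goto out`: the iteration macro expands to two nested `for()` loops, so `break` only leaves the inner one. The sketch below shows the difference with a hypothetical macro, `for_each_cell`, standing in for `perf_config_set__for_each_entry()`. The macro and both functions are assumptions written for this note.

```c
/* Hypothetical macro that, like the perf one, expands to two nested loops. */
#define for_each_cell(r, c, rows, cols)			\
	for ((r) = 0; (r) < (rows); (r)++)		\
		for ((c) = 0; (c) < (cols); (c)++)

/* Counts cells visited when 'break' is used at the failing cell. */
int visit_with_break(int rows, int cols, int bad_r, int bad_c)
{
	int r, c, visited = 0;

	for_each_cell(r, c, rows, cols) {
		if (r == bad_r && c == bad_c)
			break;	/* leaves only the inner loop; outer loop goes on */
		visited++;
	}
	return visited;
}

/* Counts cells visited when 'goto' is used at the failing cell. */
int visit_with_goto(int rows, int cols, int bad_r, int bad_c)
{
	int r, c, visited = 0;

	for_each_cell(r, c, rows, cols) {
		if (r == bad_r && c == bad_c)
			goto out;	/* leaves both loops at once */
		visited++;
	}
out:
	return visited;
}
```

On a 3x3 grid with the failure at cell (1, 1), the `break` variant still visits the whole third row, while the `goto` variant stops immediately.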
......@@ -373,6 +373,46 @@ int cpu_map__build_map(struct cpu_map *cpus, struct cpu_map **res,
return 0;
}
int cpu_map__get_die_id(int cpu)
{
int value, ret = cpu__get_topology_int(cpu, "die_id", &value);
return ret ?: value;
}
int cpu_map__get_die(struct cpu_map *map, int idx, void *data)
{
int cpu, die_id, s;
if (idx > map->nr)
return -1;
cpu = map->map[idx];
die_id = cpu_map__get_die_id(cpu);
/* There is no die_id on legacy system. */
if (die_id == -1)
die_id = 0;
s = cpu_map__get_socket(map, idx, data);
if (s == -1)
return -1;
/*
* Encode socket in bit range 15:8
* die_id is relative to socket, and
* we need a global id. So we combine
* socket + die id
*/
if (WARN_ONCE(die_id >> 8, "The die id number is too big.\n"))
return -1;
if (WARN_ONCE(s >> 8, "The socket id number is too big.\n"))
return -1;
return (s << 8) | (die_id & 0xff);
}
int cpu_map__get_core_id(int cpu)
{
int value, ret = cpu__get_topology_int(cpu, "core_id", &value);
......@@ -381,7 +421,7 @@ int cpu_map__get_core_id(int cpu)
int cpu_map__get_core(struct cpu_map *map, int idx, void *data)
{
int cpu, s;
int cpu, s_die;
if (idx > map->nr)
return -1;
......@@ -390,17 +430,22 @@ int cpu_map__get_core(struct cpu_map *map, int idx, void *data)
cpu = cpu_map__get_core_id(cpu);
s = cpu_map__get_socket(map, idx, data);
if (s == -1)
/* s_die is the combination of socket + die id */
s_die = cpu_map__get_die(map, idx, data);
if (s_die == -1)
return -1;
/*
* encode socket in upper 16 bits
* core_id is relative to socket, and
* encode socket in bit range 31:24
* encode die id in bit range 23:16
* core_id is relative to socket and die,
* we need a global id. So we combine
* socket+ core id
* socket + die id + core id
*/
return (s << 16) | (cpu & 0xffff);
if (WARN_ONCE(cpu >> 16, "The core id number is too big.\n"))
return -1;
return (s_die << 16) | (cpu & 0xffff);
}
int cpu_map__build_socket_map(struct cpu_map *cpus, struct cpu_map **sockp)
......@@ -408,6 +453,11 @@ int cpu_map__build_socket_map(struct cpu_map *cpus, struct cpu_map **sockp)
return cpu_map__build_map(cpus, sockp, cpu_map__get_socket, NULL);
}
int cpu_map__build_die_map(struct cpu_map *cpus, struct cpu_map **diep)
{
return cpu_map__build_map(cpus, diep, cpu_map__get_die, NULL);
}
int cpu_map__build_core_map(struct cpu_map *cpus, struct cpu_map **corep)
{
return cpu_map__build_map(cpus, corep, cpu_map__get_core, NULL);
......
......@@ -25,9 +25,12 @@ size_t cpu_map__snprint_mask(struct cpu_map *map, char *buf, size_t size);
size_t cpu_map__fprintf(struct cpu_map *map, FILE *fp);
int cpu_map__get_socket_id(int cpu);
int cpu_map__get_socket(struct cpu_map *map, int idx, void *data);
int cpu_map__get_die_id(int cpu);
int cpu_map__get_die(struct cpu_map *map, int idx, void *data);
int cpu_map__get_core_id(int cpu);
int cpu_map__get_core(struct cpu_map *map, int idx, void *data);
int cpu_map__build_socket_map(struct cpu_map *cpus, struct cpu_map **sockp);
int cpu_map__build_die_map(struct cpu_map *cpus, struct cpu_map **diep);
int cpu_map__build_core_map(struct cpu_map *cpus, struct cpu_map **corep);
const struct cpu_map *cpu_map__online(void); /* thread unsafe */
......@@ -43,7 +46,12 @@ static inline int cpu_map__socket(struct cpu_map *sock, int s)
static inline int cpu_map__id_to_socket(int id)
{
return id >> 16;
return id >> 24;
}
static inline int cpu_map__id_to_die(int id)
{
return (id >> 16) & 0xff;
}
static inline int cpu_map__id_to_cpu(int id)
......
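Taken together, the cpumap.c and cpumap.h hunks above pack a global core id as socket in bits 31:24, die in bits 23:16 and core in bits 15:0. The sketch below shows that layout as a round trip. It uses `uint32_t` rather than `int` so that a socket id above 127 does not shift into the sign bit. That choice, and the `topo_id_*` names, are assumptions made for this illustration and differ from the perf helpers.

```c
#include <stdint.h>

/*
 * Hypothetical encoder for the layout described in the diff:
 *   bits 31:24 socket, bits 23:16 die, bits 15:0 core.
 * Returns 0 on success, -1 if any field does not fit its bit range.
 */
int topo_id_encode(unsigned int socket, unsigned int die,
		   unsigned int core, uint32_t *id)
{
	if (socket > 0xff || die > 0xff || core > 0xffff)
		return -1;

	*id = ((uint32_t)socket << 24) | ((uint32_t)die << 16) | (uint32_t)core;
	return 0;
}

/* Decoders mirroring the shifts and masks shown in cpumap.h. */
unsigned int topo_id_to_socket(uint32_t id)
{
	return id >> 24;
}

unsigned int topo_id_to_die(uint32_t id)
{
	return (id >> 16) & 0xff;
}

unsigned int topo_id_to_core(uint32_t id)
{
	return id & 0xffff;
}
```

Socket 1, die 2, core 3 encodes to `0x01020003`, and each decoder recovers its own field from that value.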
// SPDX-License-Identifier: GPL-2.0
#include <sys/param.h>
#include <sys/utsname.h>
#include <inttypes.h>
#include <api/fs/fs.h>
......@@ -8,11 +9,14 @@
#include "util.h"
#include "env.h"
#define CORE_SIB_FMT \
"%s/devices/system/cpu/cpu%d/topology/core_siblings_list"
#define DIE_SIB_FMT \
"%s/devices/system/cpu/cpu%d/topology/die_cpus_list"
#define THRD_SIB_FMT \
"%s/devices/system/cpu/cpu%d/topology/thread_siblings_list"
#define THRD_SIB_FMT_NEW \
"%s/devices/system/cpu/cpu%d/topology/core_cpus_list"
#define NODE_ONLINE_FMT \
"%s/devices/system/node/online"
#define NODE_MEMINFO_FMT \
......@@ -34,12 +38,12 @@ static int build_cpu_topology(struct cpu_topology *tp, int cpu)
sysfs__mountpoint(), cpu);
fp = fopen(filename, "r");
if (!fp)
goto try_threads;
goto try_dies;
sret = getline(&buf, &len, fp);
fclose(fp);
if (sret <= 0)
goto try_threads;
goto try_dies;
p = strchr(buf, '\n');
if (p)
......@@ -57,9 +61,44 @@ static int build_cpu_topology(struct cpu_topology *tp, int cpu)
}
ret = 0;
try_dies:
if (!tp->die_siblings)
goto try_threads;
scnprintf(filename, MAXPATHLEN, DIE_SIB_FMT,
sysfs__mountpoint(), cpu);
fp = fopen(filename, "r");
if (!fp)
goto try_threads;
sret = getline(&buf, &len, fp);
fclose(fp);
if (sret <= 0)
goto try_threads;
p = strchr(buf, '\n');
if (p)
*p = '\0';
for (i = 0; i < tp->die_sib; i++) {
if (!strcmp(buf, tp->die_siblings[i]))
break;
}
if (i == tp->die_sib) {
tp->die_siblings[i] = buf;
tp->die_sib++;
buf = NULL;
len = 0;
}
ret = 0;
try_threads:
scnprintf(filename, MAXPATHLEN, THRD_SIB_FMT,
scnprintf(filename, MAXPATHLEN, THRD_SIB_FMT_NEW,
sysfs__mountpoint(), cpu);
if (access(filename, F_OK) == -1) {
scnprintf(filename, MAXPATHLEN, THRD_SIB_FMT,
sysfs__mountpoint(), cpu);
}
fp = fopen(filename, "r");
if (!fp)
goto done;
......@@ -98,21 +137,46 @@ void cpu_topology__delete(struct cpu_topology *tp)
for (i = 0 ; i < tp->core_sib; i++)
zfree(&tp->core_siblings[i]);
if (tp->die_sib) {
for (i = 0 ; i < tp->die_sib; i++)
zfree(&tp->die_siblings[i]);
}
for (i = 0 ; i < tp->thread_sib; i++)
zfree(&tp->thread_siblings[i]);
free(tp);
}
static bool has_die_topology(void)
{
char filename[MAXPATHLEN];
struct utsname uts;
if (uname(&uts) < 0)
return false;
if (strncmp(uts.machine, "x86_64", 6))
return false;
scnprintf(filename, MAXPATHLEN, DIE_SIB_FMT,
sysfs__mountpoint(), 0);
if (access(filename, F_OK) == -1)
return false;
return true;
}
struct cpu_topology *cpu_topology__new(void)
{
struct cpu_topology *tp = NULL;
void *addr;
u32 nr, i;
u32 nr, i, nr_addr;
size_t sz;
long ncpus;
int ret = -1;
struct cpu_map *map;
bool has_die = has_die_topology();
ncpus = cpu__max_present_cpu();
......@@ -126,7 +190,11 @@ struct cpu_topology *cpu_topology__new(void)
nr = (u32)(ncpus & UINT_MAX);
sz = nr * sizeof(char *);
addr = calloc(1, sizeof(*tp) + 2 * sz);
if (has_die)
nr_addr = 3;
else
nr_addr = 2;
addr = calloc(1, sizeof(*tp) + nr_addr * sz);
if (!addr)
goto out_free;
......@@ -134,6 +202,10 @@ struct cpu_topology *cpu_topology__new(void)
addr += sizeof(*tp);
tp->core_siblings = addr;
addr += sz;
if (has_die) {
tp->die_siblings = addr;
addr += sz;
}
tp->thread_siblings = addr;
for (i = 0; i < nr; i++) {
......
......@@ -7,8 +7,10 @@
struct cpu_topology {
u32 core_sib;
u32 die_sib;
u32 thread_sib;
char **core_siblings;
char **die_siblings;
char **thread_siblings;
};
......
......@@ -14,43 +14,12 @@
#include <stdio.h>
struct cs_etm_decoder;
enum cs_etm_sample_type {
CS_ETM_EMPTY,
CS_ETM_RANGE,
CS_ETM_DISCONTINUITY,
CS_ETM_EXCEPTION,
CS_ETM_EXCEPTION_RET,
};
enum cs_etm_isa {
CS_ETM_ISA_UNKNOWN,
CS_ETM_ISA_A64,
CS_ETM_ISA_A32,
CS_ETM_ISA_T32,
};
struct cs_etm_packet {
enum cs_etm_sample_type sample_type;
enum cs_etm_isa isa;
u64 start_addr;
u64 end_addr;
u32 instr_count;
u32 last_instr_type;
u32 last_instr_subtype;
u32 flags;
u32 exception_number;
u8 last_instr_cond;
u8 last_instr_taken_branch;
u8 last_instr_size;
u8 trace_chan_id;
int cpu;
};
struct cs_etm_packet;
struct cs_etm_packet_queue;
struct cs_etm_queue;
typedef u32 (*cs_etm_mem_cb_type)(struct cs_etm_queue *, u64,
size_t, u8 *);
typedef u32 (*cs_etm_mem_cb_type)(struct cs_etm_queue *, u8, u64, size_t, u8 *);
struct cs_etmv3_trace_params {
u32 reg_ctrl;
......@@ -119,7 +88,7 @@ int cs_etm_decoder__add_mem_access_cb(struct cs_etm_decoder *decoder,
u64 start, u64 end,
cs_etm_mem_cb_type cb_func);
int cs_etm_decoder__get_packet(struct cs_etm_decoder *decoder,
int cs_etm_decoder__get_packet(struct cs_etm_packet_queue *packet_queue,
struct cs_etm_packet *packet);
int cs_etm_decoder__reset(struct cs_etm_decoder *decoder);
......
This diff is collapsed.
......@@ -246,6 +246,7 @@ int perf_env__read_cpu_topology_map(struct perf_env *env)
for (cpu = 0; cpu < nr_cpus; ++cpu) {
env->cpu[cpu].core_id = cpu_map__get_core_id(cpu);
env->cpu[cpu].socket_id = cpu_map__get_socket_id(cpu);
env->cpu[cpu].die_id = cpu_map__get_die_id(cpu);
}
env->nr_cpus_avail = nr_cpus;
......
......@@ -9,6 +9,7 @@
struct cpu_topology_map {
int socket_id;
int die_id;
int core_id;
};
......@@ -49,6 +50,7 @@ struct perf_env {
int nr_cmdline;
int nr_sibling_cores;
int nr_sibling_dies;
int nr_sibling_threads;
int nr_numa_nodes;
int nr_memory_nodes;
......@@ -57,6 +59,7 @@ struct perf_env {
char *cmdline;
const char **cmdline_argv;
char *sibling_cores;
char *sibling_dies;
char *sibling_threads;
char *pmu_mappings;
struct cpu_topology_map *cpu;
......
......@@ -204,6 +204,8 @@ struct perf_sample {
u64 period;
u64 weight;
u64 transaction;
u64 insn_cnt;
u64 cyc_cnt;
u32 cpu;
u32 raw_size;
u64 data_src;
......
......@@ -679,6 +679,10 @@ static void __perf_evsel__config_callchain(struct perf_evsel *evsel,
attr->sample_max_stack = param->max_stack;
if (opts->kernel_callchains)
attr->exclude_callchain_user = 1;
if (opts->user_callchains)
attr->exclude_callchain_kernel = 1;
if (param->record_mode == CALLCHAIN_LBR) {
if (!opts->branch_stack) {
if (attr->exclude_user) {
......@@ -701,7 +705,14 @@ static void __perf_evsel__config_callchain(struct perf_evsel *evsel,
if (!function) {
perf_evsel__set_sample_bit(evsel, REGS_USER);
perf_evsel__set_sample_bit(evsel, STACK_USER);
attr->sample_regs_user |= PERF_REGS_MASK;
if (opts->sample_user_regs && DWARF_MINIMAL_REGS != PERF_REGS_MASK) {
attr->sample_regs_user |= DWARF_MINIMAL_REGS;
pr_warning("WARNING: The use of --call-graph=dwarf may require all the user registers, "
"specifying a subset with --user-regs may render DWARF unwinding unreliable, "
"so the minimal registers set (IP, SP) is explicitly forced.\n");
} else {
attr->sample_regs_user |= PERF_REGS_MASK;
}
attr->sample_stack_user = param->dump_size;
attr->exclude_callchain_user = 1;
} else {
......@@ -1136,9 +1147,6 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts,
static int perf_evsel__alloc_fd(struct perf_evsel *evsel, int ncpus, int nthreads)
{
if (evsel->system_wide)
nthreads = 1;
evsel->fd = xyarray__new(ncpus, nthreads, sizeof(int));
if (evsel->fd) {
......
This diff is collapsed.
......@@ -23,8 +23,12 @@ int smt_on(void)
char fn[256];
snprintf(fn, sizeof fn,
"devices/system/cpu/cpu%d/topology/thread_siblings",
cpu);
"devices/system/cpu/cpu%d/topology/core_cpus", cpu);
if (access(fn, F_OK) == -1) {
snprintf(fn, sizeof fn,
"devices/system/cpu/cpu%d/topology/thread_siblings",
cpu);
}
if (sysfs__read_str(fn, &str, &strlen) < 0)
continue;
/* Entry is hex, but does not have 0x, so need custom parser */
......
This diff is collapsed.