- 27 Apr, 2016 1 commit
-
-
Arnaldo Carvalho de Melo authored
The default remains 127, which is good for most cases, and not even hit most of the time, but then for some cases, as reported by Brendan, 1024+ deep frames are appearing on the radar for things like groovy, ruby. And in some workloads putting a _lower_ cap on this may make sense. One that is per event still needs to be put in place tho. The new file is: # cat /proc/sys/kernel/perf_event_max_stack 127 Chaging it: # echo 256 > /proc/sys/kernel/perf_event_max_stack # cat /proc/sys/kernel/perf_event_max_stack 256 But as soon as there is some event using callchains we get: # echo 512 > /proc/sys/kernel/perf_event_max_stack -bash: echo: write error: Device or resource busy # Because we only allocate the callchain percpu data structures when there is a user, which allows for changing the max easily, its just a matter of having no callchain users at that point. Reported-and-Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: David Ahern <dsahern@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/r/20160426002928.GB16708@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 26 Apr, 2016 14 commits
-
-
Arnaldo Carvalho de Melo authored
Propagate the error instead. Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-z6erjg35d1gekevwujoa0223@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
Introduced in commit 4babf2c5 ("x86: wire up preadv2 and pwritev2"). This will make 'perf trace' aware of them. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-vojoylgce2cetsy36446s5ny@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Ravi Bangoria authored
Perf is not able to register probe in kernel module when dwarf supprt is not there(and so it goes for symtab). Perf passes full path of module where only module name is required which is causing the problem. This patch fixes this issue. Before applying patch: $ dpkg -s libdw-dev dpkg-query: package 'libdw-dev' is not installed and no information is... $ sudo ./perf probe -m /linux/samples/kprobes/kprobe_example.ko kprobe_init Added new event: probe:kprobe_init (on kprobe_init in /linux/samples/kprobes/kprobe_example.ko) You can now use it in all perf tools, such as: perf record -e probe:kprobe_init -aR sleep 1 $ sudo cat /sys/kernel/debug/tracing/kprobe_events p:probe/kprobe_init /linux/samples/kprobes/kprobe_example.ko:kprobe_init $ sudo ./perf record -a -e probe:kprobe_init [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.105 MB perf.data ] $ sudo ./perf script # No output here After applying patch: $ sudo ./perf probe -m /linux/samples/kprobes/kprobe_example.ko kprobe_init Added new event: probe:kprobe_init (on kprobe_init in kprobe_example) You can now use it in all perf tools, such as: perf record -e probe:kprobe_init -aR sleep 1 $ sudo cat /sys/kernel/debug/tracing/kprobe_events p:probe/kprobe_init kprobe_example:kprobe_init $ sudo ./perf record -a -e probe:kprobe_init [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.105 MB perf.data (2 samples) ] $ sudo ./perf script insmod 13990 [002] 5961.216833: probe:kprobe_init: ... insmod 13995 [002] 5962.889384: probe:kprobe_init: ... Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1461680741-12517-1-git-send-email-ravi.bangoria@linux.vnet.ibm.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Ravi Bangoria authored
Perf can add a probe on kernel module which has not been loaded yet. The current implementation finds the module name from path. But if the filename is different from the actual module name then perf fails to register a probe while loading module because of mismatch in the names. For example, samples/kobject/kobject-example.ko is loaded as kobject_example. Before applying patch: $ sudo ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show Added new event: probe:foo_show (on foo_show in kobject-example) You can now use it in all perf tools, such as: perf record -e probe:foo_show -aR sleep 1 $ cat /sys/kernel/debug/tracing/kprobe_events p:probe/foo_show kobject-example:foo_show $ insmod kobject-example.ko $ lsmod Module Size Used by kobject_example 16384 0 Generate read to /sys/kernel/kobject_example/foo while recording data with below command $ sudo ./perf record -e probe:foo_show -a [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.093 MB perf.data ] $./perf report --stdio -F overhead,comm,dso,sym Error: The perf.data.old file has no samples! After applying patch: $ sudo ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show Added new event: probe:foo_show (on foo_show in kobject_example) You can now use it in all perf tools, such as: perf record -e probe:foo_show -aR sleep 1 $ sudo cat /sys/kernel/debug/tracing/kprobe_events p:probe/foo_show kobject_example:foo_show $ insmod kobject-example.ko $ lsmod Module Size Used by kobject_example 16384 0 Generate read to /sys/kernel/kobject_example/foo while recording data with below command $ sudo ./perf record -e probe:foo_show -a [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.097 MB perf.data (8 samples) ] $ sudo ./perf report --stdio -F overhead,comm,dso,sym ... # Samples: 8 of event 'probe:foo_show' # Event count (approx.): 8 # # Overhead Command Shared Object Symbol # ........ ....... ................. ............ # 100.00% cat [kobject_example] [k] foo_show Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1461680741-12517-2-git-send-email-ravi.bangoria@linux.vnet.ibm.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
We get notifications for threads that gets created while we're tracing, but for preexisting threads we may end not having synthesized them, like when tracing a 'perf trace' session that will use '--pid' to trace some other thread. And besides we should probably stop synthesizing those records and instead read thread information in a lazy way, i.e. just when we need, like done in this patch: Now the 'pid_t' argument in 'perf_event_open' gets translated to a COMM: # perf trace -e perf_event_open perf stat -e cycles -p 31601 0.027 ( 0.027 ms): perf/23393 perf_event_open(attr_uptr: 0x2fdd0d8, pid: 31601 (abrt-dump-journ), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ = 3 ^C And in other syscalls containing pid_t without thread->comm_set at the time of the formatting. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-ioeps6dlwst17d6oozc9shtk@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
Will be used for lazy comm loading in 'perf trace'. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-7ogbkuoka1y2qsmcckqxvl5m@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
To read things like /proc/self/comm. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-ztpkbmseidt0hq2psr46o0h9@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
Leave it alone so that it ends up assigned to SCA_PID via its type, 'pid_t', that will look up the pid on the machine thread rb_tree and possibly find its COMM. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-r7dujgmhtxxfajuunpt1bkuo@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
To reduce the size of builtin-trace.c. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-8r3gmymyn3r0ynt4yuzspp9g@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Masami Hiramatsu authored
Set kprobe group name as "probe" if it is not given. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20160426090413.11891.95640.stgit@devboxSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Masami Hiramatsu authored
Since other methods return 0 if succeeded (or filedesc), let probe_file__add_event() return 0 instead of the length of written bytes. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20160426090303.11891.18232.stgit@devboxSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Masami Hiramatsu authored
As a utility function, add lsdir() which reads given directory and store entry name into a strlist. lsdir accepts a filter function so that user can filter out unneeded entries. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20160426090242.11891.79014.stgit@devbox [ Do not use the 'dirname' it is used in some distros ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Masami Hiramatsu authored
Fix a bug to close target elf file in get_text_start_address(). Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20160426064737.1443.44093.stgit@devboxSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Wang Nan authored
Don't read broken data after 'head' pointer. Following commits will feed perf_evlist__mmap_read() with some 'head' pointers not maintained by kernel. If 'head' pointer breaks an event, we should avoid reading from the broken event. This can happen in backward ring buffer. For example: old head | | V V +---+------+----------+----+-----+--+ |..E|D....D|C........C|B..B|A....|E.| +---+------+----------+----+-----+--+ 'old' pointer points to the beginning of 'A' and trying read from it, but 'A' has been overwritten. In this case, don't try to read from 'A', simply return NULL. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1461637738-62722-2-git-send-email-wangnan0@huawei.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 25 Apr, 2016 14 commits
-
-
Kan Liang authored
The accumulated period for dummy entry should also be 0. Otherwise, the total overhead could be overcounted. $ perf record -e '{LLC-load-misses,cpu/instructions/}' --call-graph=lbr ./tchain $ perf report --stdio # To display the perf.data header info, please use --header/--header-only options. # # Total Lost Samples: 0 # # Samples: 21K of event 'anon group { LLC-load-misses, cpu/instructions/ }' # Event count (approx.): 16313667937 # # Children Self Command Shared Object Symbol # ................ ................ ........... ................ ............................ # 4769.98% 0.01% 0.00% 0.01% tchain_edit [kernel.vmlinux] [k] update_fast_timekeeper 4356.18% 0.01% 0.00% 0.01% tchain_edit [kernel.vmlinux] [k] trigger_load_balance 3181.12% 0.01% 0.00% 0.01% tchain_edit [kernel.vmlinux] [k] irq_work_tick 1592.37% 0.00% 0.00% 0.00% tchain_edit [kernel.vmlinux] [k] cpu_needs_another_gp Signed-off-by: Kan Liang <kan.liang@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1461565689-5862-1-git-send-email-kan.liang@intel.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Colin Ian King authored
The check for the maximum code is off-by-one; the current comparison of a code that is INTEL_PT_ERR_MAX will cause the strlcpy to perform an out of bounds array access on the intel_pt_err_msgs array. Fix this with a >= comparison. Signed-off-by: Colin Ian King <colin.king@canonical.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461524203-10224-1-git-send-email-colin.king@canonical.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Davidlohr Bueso authored
Given that the 'val' parameter is ignored for FUTEX_LOCK_PI, get rid of the bogus deadlock detection flag in the wrapper code and avoid the extra argument, making it resemble its unlock counterpart. And if nothing else, we already only pass 0 anyway. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Cc: Davidlohr Bueso <dbueso@suse.de> Link: http://lkml.kernel.org/r/1461208447-29328-1-git-send-email-dave@stgolabs.netSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Colin Ian King authored
The current assert check is checking an assignment, which will always be true. Instead, the assert should be checking if scale is equal to 0.122 Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461419154-16918-1-git-send-email-colin.king@canonical.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Eric Engestrom authored
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461577678-29517-1-git-send-email-eric.engestrom@imgtec.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
Fix perf_clean target to follow the same logic as perf target. Fixes the following make invokation: $ cd <kernelsrc> && make tools/perf_clean Reported-by: TJ <linux@iam.tj> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=116411 Link: http://lkml.kernel.org/r/1461615438-27894-2-git-send-email-jolsa@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
Turn current clean output: $ make clean rm -f arch/x86/include/generated/asm/syscalls_64.c CLEAN libbpf CLEAN libapi into: $ make clean CLEAN x86 CLEAN libapi CLEAN libbpf Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: TJ <linux@iam.tj> Link: http://lkml.kernel.org/r/1461615438-27894-1-git-send-email-jolsa@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
While trying to use --call-graph lbr in 'perf trace', since we only are interested in the callchain for userspace, up to the callchain, I found that 'perf evlist' is not decoding the branch_sample_type field, fix it. Before: # perf record --call-graph lbr usleep 1 # perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|BRANCH_STACK, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, branch_sample_type: 51201 ^^^^^^^^^^^^^^^^^^^^^^^^^ After: # perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|BRANCH_STACK, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, branch_sample_type: USER|CALL_STACK|NO_FLAGS|NO_CYCLES ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-hozai7974u0ulgx13k96fcaw@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
To check deeply nested page fault callchains. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-wuji34xx003kr88nmqt6jkgf@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-shj0fazntmskhjild5i6x73l@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Chris Phlipot authored
This fixes a bug caused by an unitialized callchain cursor. The crash frist appeared in: 6f736735 ("perf evsel: Require that callchains be resolved before calling fprintf_{sym,callchain}") The callchain cursor is a struct that contains pointers, that when uninitialized will cause unpredictable behavior (usually a crash) when trying to append to the callchain. The existing implementation has the following issues: 1. The callchain cursor used is not initialized, resulting in unpredictable behavior when used. 2. The cursor is declared on the stack. Even if it is properly initalized, the implmentation will leak memory when the function returns, since all the references to the callchain_nodes allocated by callchain_cursor_append will be lost when the cursor goes out of scope. 3. Storing the cursor on the stack is inefficient. Even if memory is properly freed when it goes out of scope, a performance penalty will be incurred due to reallocation of callchain nodes. callchain_cursor_append is designed to avoid these reallocations when an existing cursor is reused. This patch fixes the crash by replacing cursor_callchain with a reference to the global callchain_cursor which also resolves all 3 issues mentioned above. How to reproduce the crash: $ perf record --call-graph=dwarf stress -t 1 -c 1 $ perf script > /dev/null Segfault Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: 6f736735 ("perf evsel: Require that callchains be resolved before calling fprintf_{sym,callchain}") Link: http://lkml.kernel.org/r/1461119531-2529-1-git-send-email-cphlipot0@gmail.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
Forgot about page faults, a software event, when adding support for callchains, fix it: # trace --no-syscalls --pf maj --call dwarf 0.000 ( 0.000 ms): Xorg/2068 majfault [sfbSegment1+0x0] => /usr/lib64/xorg/modules/drivers/intel_drv.so@0x11b490 (x.) sfbSegment1+0x0 (/usr/lib64/xorg/modules/drivers/intel_drv.so) fbPolySegment32+0x361 (/usr/lib64/xorg/modules/drivers/intel_drv.so) sna_poly_segment+0x743 (/usr/lib64/xorg/modules/drivers/intel_drv.so) damagePolySegment+0x77 (/usr/libexec/Xorg) ProcPolySegment+0xe7 (/usr/libexec/Xorg) Dispatch+0x25f (/usr/libexec/Xorg) dix_main+0x3c3 (/usr/libexec/Xorg) __libc_start_main+0xf0 (/usr/lib64/libc-2.22.so) _start+0x29 (/usr/libexec/Xorg) 0.257 ( 0.000 ms): Xorg/2068 majfault [miZeroClipLine+0x0] => /usr/libexec/Xorg@0x18e830 (x.) miZeroClipLine+0x0 (/usr/libexec/Xorg) _fbSegment+0x2c0 (/usr/lib64/xorg/modules/drivers/intel_drv.so) sfbSegment1+0x67 (/usr/lib64/xorg/modules/drivers/intel_drv.so) fbPolySegment32+0x361 (/usr/lib64/xorg/modules/drivers/intel_drv.so) sna_poly_segment+0x743 (/usr/lib64/xorg/modules/drivers/intel_drv.so) damagePolySegment+0x77 (/usr/libexec/Xorg) ProcPolySegment+0xe7 (/usr/libexec/Xorg) Dispatch+0x25f (/usr/libexec/Xorg) dix_main+0x3c3 (/usr/libexec/Xorg) __libc_start_main+0xf0 (/usr/lib64/libc-2.22.so) _start+0x29 (/usr/libexec/Xorg) ^C# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-8h6ssirw5z15qyhy2lwd6f89@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
Prep work for next patches, where we'll need access to the created evsels, to possibly configure callchains. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-2pcgsgnkgellhlcao4aub8tu@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Andrey Ryabinin authored
write_buildid() increments 'name_len' with intention to take into account trailing zero byte. However, 'name_len' was already incremented in machine__write_buildid_table() before. So this leads to out-of-bounds read in do_write(): $ ./perf record sleep 0 [ perf record: Woken up 1 times to write data ] ================================================================= ==15899==ERROR: AddressSanitizer: global-buffer-overflow on address 0x00000099fc92 at pc 0x7f1aa9c7eab5 bp 0x7fff940f84d0 sp 0x7fff940f7c78 READ of size 19 at 0x00000099fc92 thread T0 #0 0x7f1aa9c7eab4 (/usr/lib/gcc/x86_64-pc-linux-gnu/5.3.0/libasan.so.2+0x44ab4) #1 0x649c5b in do_write util/header.c:67 #2 0x649c5b in write_padded util/header.c:82 #3 0x57e8bc in write_buildid util/build-id.c:239 #4 0x57e8bc in machine__write_buildid_table util/build-id.c:278 ... 0x00000099fc92 is located 0 bytes to the right of global variable '*.LC99' defined in 'util/symbol.c' (0x99fc80) of size 18 '*.LC99' is ascii string '[kernel.kallsyms]' ... Shadow bytes around the buggy address: 0x00008012bf80: f9 f9 f9 f9 00 00 00 00 00 00 03 f9 f9 f9 f9 f9 =>0x00008012bf90: 00 00[02]f9 f9 f9 f9 f9 00 00 00 00 00 05 f9 f9 0x00008012bfa0: f9 f9 f9 f9 00 03 f9 f9 f9 f9 f9 f9 00 00 00 00 Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461053847-5633-1-git-send-email-aryabinin@virtuozzo.com [ Remove the off-by one at the origin, to keep len(s) == strlen(s) assumption ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 23 Apr, 2016 11 commits
-
-
Ingo Molnar authored
Merge tag 'perf-core-for-mingo-20160419' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: Build fixes: - Fix 'perf trace' build when DWARF unwind isn't available (Arnaldo Carvalho de Melo) - Remove x86 references from arch-neutral Build, fixing it in !x86 arches, reported as breaking the build for powerpc64le in linux-next (Arnaldo Carvalho de Melo) Infrastructure changes: - Do memset() variable 'st' using the correct size in the jit code (Colin Ian King) - Fix postgresql ubuntu 'perf script' install instructions (Chris Phlipot) - Use callchain_param more thoroughly when checking how callchains were configured, eventually will be the only way to look for callchain parameters (Arnaldo Carvalho de Melo) - Fix some issues in the 'perf test kallsyms' entry (Arnaldo Carvalho de Melo) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Peter Zijlstra authored
With the array aligned as per events/intel/core.c it was fairly obvious we missed one, add it in. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Peter Zijlstra authored
Re-order the model array to match the order in events/intel/core.c, to easier spot gaps and such. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Srinivas Pandruvada authored
Add Skylake client support for RAPL domains. In addition to RAPL domains in Broadwell clients, it has support for platform domain (aka PSys). The PSys domain controls the entire SoC instead of just a CPU package. Unlike package domain, PSys support requires more than just processor level implementation. The other parts in the system need additional HW level signaling, which OEMs need to support. When not supported, the energy counter register in PSys domain returns 0. Also corrected error in comment for GPU counter, which previously was DRAM counter. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com [ Cnverted to model_match stuff. ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: bp@alien8.de Cc: hpa@zytor.com Cc: jacob.jun.pan@linux.intel.com Cc: rjw@rjwysocki.net Link: http://lkml.kernel.org/r/1460930581-29748-2-git-send-email-srinivas.pandruvada@linux.intel.comSigned-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Wang Nan authored
This patch introduces 'write_backward' bit to perf_event_attr, which controls the direction of a ring buffer. After set, the corresponding ring buffer is written from end to beginning. This feature is design to support reading from overwritable ring buffer. Ring buffer can be created by mapping a perf event fd. Kernel puts event records into ring buffer, user tooling like perf fetch them from address returned by mmap(). To prevent racing between kernel and tooling, they communicate to each other through 'head' and 'tail' pointers. Kernel maintains 'head' pointer, points it to the next free area (tail of the last record). Tooling maintains 'tail' pointer, points it to the tail of last consumed record (record has already been fetched). Kernel determines the available space in a ring buffer using these two pointers to avoid overwrite unfetched records. By mapping without 'PROT_WRITE', an overwritable ring buffer is created. Different from normal ring buffer, tooling is unable to maintain 'tail' pointer because writing is forbidden. Therefore, for this type of ring buffers, kernel overwrite old records unconditionally, works like flight recorder. This feature would be useful if reading from overwritable ring buffer were as easy as reading from normal ring buffer. However, there's an obscure problem. The following figure demonstrates a full overwritable ring buffer. In this figure, the 'head' pointer points to the end of last record, and a long record 'E' is pending. For a normal ring buffer, a 'tail' pointer would have pointed to position (X), so kernel knows there's no more space in the ring buffer. However, for an overwritable ring buffer, kernel ignore the 'tail' pointer. (X) head . | . V +------+-------+----------+------+---+ |A....A|B.....B|C........C|D....D| | +------+-------+----------+------+---+ Record 'A' is overwritten by event 'E': head | V +--+---+-------+----------+------+---+ |.E|..A|B.....B|C........C|D....D|E..| +--+---+-------+----------+------+---+ Now tooling decides to read from this ring buffer. However, none of these two natural positions, 'head' and the start of this ring buffer, are pointing to the head of a record. Even the full ring buffer can be accessed by tooling, it is unable to find a position to start decoding. The first attempt tries to solve this problem AFAIK can be found from [1]. It makes kernel to maintain 'tail' pointer: updates it when ring buffer is half full. However, this approach introduces overhead to fast path. Test result shows a 1% overhead [2]. In addition, this method utilizes no more tham 50% records. Another attempt can be found from [3], which allows putting the size of an event at the end of each record. This approach allows tooling to find records in a backward manner from 'head' pointer by reading size of a record from its tail. However, because of alignment requirement, it needs 8 bytes to record the size of a record, which is a huge waste. Its performance is also not good, because more data need to be written. This approach also introduces some extra branch instructions to fast path. 'write_backward' is a better solution to this problem. Following figure demonstrates the state of the overwritable ring buffer when 'write_backward' is set before overwriting: head | V +---+------+----------+-------+------+ | |D....D|C........C|B.....B|A....A| +---+------+----------+-------+------+ and after overwriting: head | V +---+------+----------+-------+---+--+ |..E|D....D|C........C|B.....B|A..|E.| +---+------+----------+-------+---+--+ In each situation, 'head' points to the beginning of the newest record. From this record, tooling can iterate over the full ring buffer and fetch records one by one. The only limitation that needs to be considered is back-to-back reading. Due to the non-deterministic of user programs, it is impossible to ensure the ring buffer keeps stable during reading. Consider an extreme situation: tooling is scheduled out after reading record 'D', then a burst of events come, eat up the whole ring buffer (one or multiple rounds). When the tooling process comes back, reading after 'D' is incorrect now. To prevent this problem, we need to find a way to ensure the ring buffer is stable during reading. ioctl(PERF_EVENT_IOC_PAUSE_OUTPUT) is suggested because its overhead is lower than ioctl(PERF_EVENT_IOC_ENABLE). By carefully verifying 'header' pointer, reader can avoid pausing the ring-buffer. For example: /* A union of all possible events */ union perf_event event; p = head = perf_mmap__read_head(); while (true) { /* copy header of next event */ fetch(&event.header, p, sizeof(event.header)); /* read 'head' pointer */ head = perf_mmap__read_head(); /* check overwritten: is the header good? */ if (!verify(sizeof(event.header), p, head)) break; /* copy the whole event */ fetch(&event, p, event.header.size); /* read 'head' pointer again */ head = perf_mmap__read_head(); /* is the whole event good? */ if (!verify(event.header.size, p, head)) break; p += event.header.size; } However, the overhead is high because: a) In-place decoding is not safe. Copying-verifying-decoding is required. b) Fetching 'head' pointer requires additional synchronization. (From Alexei Starovoitov: Even when this trick works, pause is needed for more than stability of reading. When we collect the events into overwrite buffer we're waiting for some other trigger (like all cpu utilization spike or just one cpu running and all others are idle) and when it happens the buffer has valuable info from the past. At this point new events are no longer interesting and buffer should be paused, events read and unpaused until next trigger comes.) This patch utilizes event's default overflow_handler introduced previously. perf_event_output_backward() is created as the default overflow handler for backward ring buffers. To avoid extra overhead to fast path, original perf_event_output() becomes __perf_event_output() and marked '__always_inline'. In theory, there's no extra overhead introduced to fast path. Performance testing: Calling 3000000 times of 'close(-1)', use gettimeofday() to check duration. Use 'perf record -o /dev/null -e raw_syscalls:*' to capture system calls. In ns. Testing environment: CPU : Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz Kernel : v4.5.0 MEAN STDVAR BASE 800214.950 2853.083 PRE1 2253846.700 9997.014 PRE2 2257495.540 8516.293 POST 2250896.100 8933.921 Where 'BASE' is pure performance without capturing. 'PRE1' is test result of pure 'v4.5.0' kernel. 'PRE2' is test result before this patch. 'POST' is test result after this patch. See [4] for the detailed experimental setup. Considering the stdvar, this patch doesn't introduce performance overhead to the fast path. [1] http://lkml.iu.edu/hypermail/linux/kernel/1304.1/04584.html [2] http://lkml.iu.edu/hypermail/linux/kernel/1307.1/00535.html [3] http://lkml.iu.edu/hypermail/linux/kernel/1512.0/01265.html [4] http://lkml.kernel.org/g/56F89DCD.1040202@huawei.comSigned-off-by: Wang Nan <wangnan0@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: <acme@kernel.org> Cc: <pi3orama@163.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/r/1459865478-53413-1-git-send-email-wangnan0@huawei.com [ Fixed the changelog some more. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Kan Liang authored
LBR filtering is also supported on the Silvermont and Airmont microarchitectures. The layout of MSR_LBR_SELECT is the same as Nehalem. Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1460706825-46163-1-git-send-email-kan.liang@intel.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Kan Liang authored
Add perf core PMU support for Intel Goldmont CPU cores: - The init code is based on Silvermont. - There is a new cache event list, based on the Silvermont cache event list. - Goldmont has 32 LBR entries. It also uses new LBRv6 format, which report the cycle information using upper 16-bit of the LBR_TO. - It's recommended to use CPU_CLK_UNHALTED.CORE_P + NPEBS for precise cycles. For details, please refer to the latest SDM058: http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3b-part-2-manual.pdfSigned-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1460706167-45320-1-git-send-email-kan.liang@intel.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Ingo Molnar authored
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Peter Zijlstra authored
Markus reported that 0 should also disable the throttling we per Documentation/sysctl/kernel.txt. Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: 91a612ee ("perf/core: Fix dynamic interrupt throttle") Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Srinivas Pandruvada authored
Added one missing Haswell model. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: bp@alien8.de Cc: hpa@zytor.com Link: http://lkml.kernel.org/r/1460907809-11897-1-git-send-email-srinivas.pandruvada@linux.intel.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Andi Kleen authored
Everything the same as base Skylake, just a new model number. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1460751933-2264-1-git-send-email-andi@firstfloor.orgSigned-off-by: Ingo Molnar <mingo@kernel.org>
-