- 16 Nov, 2022 9 commits
-
-
Namhyung Kim authored
We don't know how long cgroup name is, but at least we can align short ones like below. $ perf stat -a --for-each-cgroup system.slice,user.slice true Performance counter stats for 'system wide': 0.13 msec cpu-clock system.slice # 0.010 CPUs utilized 4 context-switches system.slice # 31.989 K/sec 1 cpu-migrations system.slice # 7.997 K/sec 0 page-faults system.slice # 0.000 /sec 450,673 cycles system.slice # 3.604 GHz (92.41%) 161,216 instructions system.slice # 0.36 insn per cycle (92.41%) 32,678 branches system.slice # 261.332 M/sec (92.41%) 2,628 branch-misses system.slice # 8.04% of all branches (92.41%) 14.29 msec cpu-clock user.slice # 1.163 CPUs utilized 35 context-switches user.slice # 2.449 K/sec 12 cpu-migrations user.slice # 839.691 /sec 57 page-faults user.slice # 3.989 K/sec 49,683,026 cycles user.slice # 3.477 GHz (99.38%) 110,790,266 instructions user.slice # 2.23 insn per cycle (99.38%) 24,552,255 branches user.slice # 1.718 G/sec (99.38%) 127,779 branch-misses user.slice # 0.52% of all branches (99.38%) 0.012289431 seconds time elapsed Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-10-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
Unfortunately, event running time, percentage and noise data are printed in different positions in normal output than CSV/JSON. I think it's better to put such details in where it actually prints. So add before_metric argument to print_noise() and print_running() and call them twice before and after the metric. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-9-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
In the printout() function, it checks if the event is bad (i.e. not counted or not supported) and print the result. But it does the same what abs_printout() is doing. So add an argument to indicate the value is ok or not and use the same function in both cases. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-8-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
And split it for each output mode like others. I believe it makes the code simpler and more intuitive. Now abs_printout() becomes just to call sub-functions. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-7-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
The aggr_printout() function is to print aggr_id and count (nr). Split it for each output mode to simplify the code. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-6-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
Likewise, split print_cgroup() for each output mode. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-5-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
Likewise, split print_noise_pct() for each output mode. Although it's a tiny function, more logic will be added soon so it'd be better split it and treat it in the same way. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-4-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
To make the code more obvious and hopefully simpler, factor out the code for each output mode - stdio, CSV, JSON. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-3-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
The --interval-clear option makes perf stat to clear the terminal at each interval. But it doesn't need to clear the screen when it saves to a file. Make it fail when it's enabled with the output options. $ perf stat -I 1 --interval-clear -o myfile true --interval-clear does not work with output Usage: perf stat [<options>] [<command>] -o, --output <file> output file name --log-fd <n> log output to fd, instead of stderr --interval-clear clear screen in between new interval Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221114230227.1255976-2-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 15 Nov, 2022 8 commits
-
-
Ian Rogers authored
Previously print_pmu_events() would compute the values to be printed, place them in struct sevent, sort them and then print them. Modify the code so that struct sevent holds just the PMU and event, sort these and then in the main print loop calculate aliases for names, etc. This avoids memory allocations for copied values as they are computed then printed. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xin Gao <gaoxin@cdjrlc.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20221114210723.2749751-9-irogers@google.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Ian Rogers authored
The current code computes an array of symbol names then sorts and prints them. Use a strlist to create a list of names that is sorted and then print it. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xin Gao <gaoxin@cdjrlc.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20221114210723.2749751-8-irogers@google.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Ian Rogers authored
The current code computes an array of cache names then sorts and prints them. Use a strlist to create a list of names that is sorted. Keep the hybrid names, it is unclear how to generalize it, but drop the computation of evt_pmus that is never used. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xin Gao <gaoxin@cdjrlc.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20221114210723.2749751-7-irogers@google.com [ Fixed up clash with cf9f67b3 ("perf print-events: Remove redundant comparison with zero")] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Ian Rogers authored
Deprecate the --cputype option and add a --unit option where '--unit cpu_atom' behaves like '--cputype atom'. The --unit option can be used with arbitrary PMUs, for example: ``` $ perf list --unit msr pmu List of pre-defined events (to be used in -e or -M): msr/aperf/ [Kernel PMU event] msr/cpu_thermal_margin/ [Kernel PMU event] msr/mperf/ [Kernel PMU event] msr/pperf/ [Kernel PMU event] msr/smi/ [Kernel PMU event] msr/tsc/ [Kernel PMU event] ``` Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xin Gao <gaoxin@cdjrlc.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20221114210723.2749751-6-irogers@google.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Ian Rogers authored
In print_tracepoint_events() use tracing_events__scandir_alphasort() and scandir alphasort so that the subsystem and events are sorted and don't need a secondary qsort. Locally this results in the following change: ... ext4:ext4_zero_range [Tracepoint event] - fib6:fib6_table_lookup [Tracepoint event] fib:fib_table_lookup [Tracepoint event] + fib6:fib6_table_lookup [Tracepoint event] filelock:break_lease_block [Tracepoint event] ... ie fib6 now is after fib and not before it. This is more consistent with how numbers are more generally sorted, such as: ... syscalls:sys_enter_renameat [Tracepoint event] syscalls:sys_enter_renameat2 [Tracepoint event] ... and so an improvement over the qsort approach. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xin Gao <gaoxin@cdjrlc.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20221114210723.2749751-5-irogers@google.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Ian Rogers authored
tracing_events__opendir() allows iteration over files in <debugfs>/tracing/events but with an arbitrary sort order. Add a scandir alternative where the results are alphabetically sorted. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xin Gao <gaoxin@cdjrlc.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20221114210723.2749751-4-irogers@google.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Ian Rogers authored
Add documentation to 'struct perf_pmu' and the associated structs of 'perf_pmu_alias' and 'perf_pmu_format'. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xin Gao <gaoxin@cdjrlc.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20221114210723.2749751-3-irogers@google.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Ian Rogers authored
Replace usage with perf_pmu__is_hybrid(). Suggested-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xin Gao <gaoxin@cdjrlc.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20221114210723.2749751-2-irogers@google.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 14 Nov, 2022 11 commits
-
-
Namhyung Kim authored
It should have a comma after 'cpus' for socket and die aggregation mode. The output of the following command shows the issue. $ sudo perf stat -a --per-socket -x, --metric-only -I1 true Before: +--- here V time,socket,cpusGhz,insn per cycle,branch-misses of all branches, 0.000908461,S0,8,0.950,1.65,1.21, After: time,socket,cpus,GHz,insn per cycle,branch-misses of all branches, 0.000683094,S0,8,0.593,2.00,0.60, Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221112032244.1077370-12-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
It should not print "summary" for each event when --metric-only is set. Before: $ sudo perf stat -a --per-socket --summary -x, --metric-only true time,socket,cpusGhz,insn per cycle,branch-misses of all branches, 0.000709079,S0,8,0.893,2.40,0.45, S0,8, summary, summary, summary, summary, summary,0.893, summary,2.40, summary, summary,0.45, After: $ sudo perf stat -a --per-socket --summary -x, --metric-only true time,socket,cpusGHz,insn per cycle,branch-misses of all branches, 0.000882297,S0,8,0.598,1.64,0.64, summary,S0,8,0.598,1.64,0.64, Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221112032244.1077370-11-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
To pick up fixes that went thru perf/urgent. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
The pm variable holds an appropriate function to print metrics for CSV anf JSON already. So we can combine the if statement to simplify the code a little bit. This also matches to the above condition for non-CSV and non-JSON case. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221107213314.3239159-10-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
The num_print_interval and config->interval_clear should be checked together like other places like later in the function. Otherwise, the --interval-clear option could print the headers for the CSV or JSON output unnecessarily. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221107213314.3239159-9-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
It missed to print a matching header line for intervals. Before: # perf stat -a -e cycles,instructions --metric-only -j -I 500 {"unit" : "insn per cycle"} {"interval" : 0.500544283}{"metric-value" : "1.96"} ^C After: # perf stat -a -e cycles,instructions --metric-only -j -I 500 {"unit" : "sec"}{"unit" : "insn per cycle"} {"interval" : 0.500515681}{"metric-value" : "2.31"} ^C Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221107213314.3239159-8-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
Currently --metric-only with --json indents header lines. This is not needed for JSON. $ perf stat -aA --metric-only -j true {"unit" : "GHz"}{"unit" : "insn per cycle"}{"unit" : "branch-misses of all branches"} {"cpu" : "0", {"metric-value" : "0.101"}{"metric-value" : "0.86"}{"metric-value" : "1.91"} {"cpu" : "1", {"metric-value" : "0.102"}{"metric-value" : "0.87"}{"metric-value" : "2.02"} {"cpu" : "2", {"metric-value" : "0.085"}{"metric-value" : "1.02"}{"metric-value" : "1.69"} ... Note that the other lines are broken JSON, but it will be handled later. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221107213314.3239159-7-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
Currently it prints all metric headers for JSON output. But actually it skips some metrics with valid_only_metric(). So the output looks like: $ perf stat --metric-only --json true {"unit" : "CPUs utilized", "unit" : "/sec", "unit" : "/sec", "unit" : "/sec", "unit" : "GHz", "unit" : "insn per cycle", "unit" : "/sec", "unit" : "branch-misses of all branches"} {"metric-value" : "3.861"}{"metric-value" : "0.79"}{"metric-value" : "3.04"} As you can see there are 8 units in the header but only 3 metric-values are there. It should skip the unused headers as well. Also each unit should be printed as a separate object like metric values. With this patch: $ perf stat --metric-only --json true {"unit" : "GHz"}{"unit" : "insn per cycle"}{"unit" : "branch-misses of all branches"} {"metric-value" : "4.166"}{"metric-value" : "0.73"}{"metric-value" : "2.96"} Fixes: df936cad ("perf stat: Add JSON output option") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Claire Jensen <cjense@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221107213314.3239159-6-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
The struct perf_stat_output_ctx is set in a loop with the same values. Move the code out of the loop and keep the loop minimal. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221107213314.3239159-5-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
The --interval-clear option makes perf stat to clear the terminal at each interval. But it doesn't need to clear the screen when it saves to a file. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221107213314.3239159-4-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
When perf stat is called with very detailed events, the output doesn't align well like below: $ sudo perf stat -a -ddd sleep 1 Performance counter stats for 'system wide': 8,020.23 msec cpu-clock # 7.997 CPUs utilized 3,970 context-switches # 494.998 /sec 169 cpu-migrations # 21.072 /sec 586 page-faults # 73.065 /sec 649,568,060 cycles # 0.081 GHz (30.42%) 304,044,345 instructions # 0.47 insn per cycle (38.40%) 60,313,022 branches # 7.520 M/sec (38.89%) 2,766,919 branch-misses # 4.59% of all branches (39.26%) 74,422,951 L1-dcache-loads # 9.279 M/sec (39.39%) 8,025,568 L1-dcache-load-misses # 10.78% of all L1-dcache accesses (39.22%) 3,314,995 LLC-loads # 413.329 K/sec (30.83%) 1,225,619 LLC-load-misses # 36.97% of all LL-cache accesses (30.45%) <not supported> L1-icache-loads 20,420,493 L1-icache-load-misses # 0.00% of all L1-icache accesses (30.29%) 58,017,947 dTLB-loads # 7.234 M/sec (30.37%) 704,677 dTLB-load-misses # 1.21% of all dTLB cache accesses (30.27%) 234,225 iTLB-loads # 29.204 K/sec (30.29%) 417,166 iTLB-load-misses # 178.10% of all iTLB cache accesses (30.32%) <not supported> L1-dcache-prefetches <not supported> L1-dcache-prefetch-misses 1.002947355 seconds time elapsed Increase the METRIC_LEN by 3 so that it can align properly. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221107213314.3239159-3-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 13 Nov, 2022 3 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linuxLinus Torvalds authored
Pull MIPS fixes from Thomas Bogendoerfer: - fix jump label branch range check - check kmalloc failures in Loongson64 kexec - fix builds with clang-14 - fix char/int handling in pic32 * tag 'mips-fixes_6.1_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: pic32: treat port as signed integer MIPS: jump_label: Fix compat branch range check mips: alchemy: gpio: Include the right header MIPS: Loongson64: Add WARN_ON on kexec related kmalloc failed MIPS: fix duplicate definitions for exported symbols mips: boot/compressed: use __NO_FORTIFY
-
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efiLinus Torvalds authored
Pull EFI fixes from Ard Biesheuvel: - Force the use of SetVirtualAddressMap() on Ampera Altra arm64 machines, which crash in SetTime() if no virtual remapping is used This is the first time we've added an SMBIOS based quirk on arm64, but fortunately, we can just call a EFI protocol to grab the type #1 SMBIOS record when running in the stub, so we don't need all the machinery we have in the kernel proper to parse SMBIOS data. - Drop a spurious warning on misaligned runtime regions when using 16k or 64k pages on arm64 * tag 'efi-fixes-for-v6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: arm64: efi: Fix handling of misaligned runtime regions and drop warning arm64: efi: Force the use of SetVirtualAddressMap() on Altra machines
-
- 12 Nov, 2022 6 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds authored
Pull SCSI fixes from James Bottomley: "Three small fixes, all in drivers. The sas one is in an unlikely error leg, the debug one is to make it more standards conformant and the ibmvfc one is to fix a user visible bug where a failover could lose all paths to the device" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: scsi_debug: Make the READ CAPACITY response compliant with ZBC scsi: scsi_transport_sas: Fix error handling in sas_phy_add() scsi: ibmvfc: Avoid path failures during live migration
-
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/soundLinus Torvalds authored
Pull additional sound fix from Takashi Iwai: "A regression fix for the latest memalloc helper change" * tag 'sound-fix-6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: memalloc: Try dma_alloc_noncontiguous() at first
-
Takashi Iwai authored
The latest fix for the non-contiguous memalloc helper changed the allocation method for a non-IOMMU system to use only the fallback allocator. This should have worked, but it caused a problem sometimes when too many non-contiguous pages are allocated that can't be treated by HD-audio controller. As a quirk workaround, go back to the original strategy: use dma_alloc_noncontiguous() at first, and apply the fallback only when it fails, but only for non-IOMMU case. We'll need a better fix in the fallback code as well, but this workaround should paper over most cases. Fixes: 9736a325 ("ALSA: memalloc: Don't fall back for SG-buffer with IOMMU") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/CAHk-=wgSH5ubdvt76gNwa004ooZAEJL_1Q-Fyw5M2FDdqL==dg@mail.gmail.com Link: https://lore.kernel.org/r/20221112084718.3305-1-tiwai@suse.deSigned-off-by: Takashi Iwai <tiwai@suse.de>
-
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libataLinus Torvalds authored
Pull ata fixes from Damien Le Moal: "Several libata generic code fixes for rc5: - Add missing translation of the SYNCHRONIZE CACHE 16 scsi command as this command is mandatory for host-managed ZBC drives. The lack of support for it in libata-scsi was causing issues with some passthrough applications using ZBC drives (from Shin'ichiro). - Fix the error path of libata-transport host, port, link and device attributes initialization (from Yingliang). - Prevent issuing new commands to a drive that is in the NCQ error state and undergoing recovery (From Niklas). This bug went unnoticed for a long time as commands issued to a drive in error state are aborted immediately and retried by the scsi layer, hiding the useless abort-and-retry sequence" * tag 'ata-6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata: libata-core: do not issue non-internal commands once EH is pending ata: libata-transport: fix error handling in ata_tdev_add() ata: libata-transport: fix error handling in ata_tlink_add() ata: libata-transport: fix error handling in ata_tport_add() ata: libata-transport: fix double ata_host_put() in ata_tport_add() ata: libata-scsi: fix SYNCHRONIZE CACHE (16) command failure
-
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mmLinus Torvalds authored
Pull misc hotfixes from Andrew Morton: "22 hotfixes. Eight are cc:stable and the remainder address issues which were introduced post-6.0 or which aren't considered serious enough to justify a -stable backport" * tag 'mm-hotfixes-stable-2022-11-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (22 commits) docs: kmsan: fix formatting of "Example report" mm/damon/dbgfs: check if rm_contexts input is for a real context maple_tree: don't set a new maximum on the node when not reusing nodes maple_tree: fix depth tracking in maple_state arch/x86/mm/hugetlbpage.c: pud_huge() returns 0 when using 2-level paging fs: fix leaked psi pressure state nilfs2: fix use-after-free bug of ns_writer on remount x86/traps: avoid KMSAN bugs originating from handle_bug() kmsan: make sure PREEMPT_RT is off Kconfig.debug: ensure early check for KMSAN in CONFIG_KMSAN_WARN x86/uaccess: instrument copy_from_user_nmi() kmsan: core: kmsan_in_runtime() should return true in NMI context mm: hugetlb_vmemmap: include missing linux/moduleparam.h mm/shmem: use page_mapping() to detect page cache for uffd continue mm/memremap.c: map FS_DAX device memory as decrypted Partly revert "mm/thp: carry over dirty bit when thp splits on pmd" nilfs2: fix deadlock in nilfs_count_free_blocks() mm/mmap: fix memory leak in mmap_region() hugetlbfs: don't delete error page from pagecache maple_tree: reorganize testing to restore module testing ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds authored
Pull arm64 fixes from Catalin Marinas: - Another fix for rodata=full. Since rodata= is not a simple boolean on arm64 (accepting 'full' as well), it got inadvertently broken by changes in the core code. If rodata=on is the default and rodata=off is passed on the kernel command line, rodata_full is never disabled - Fix gcc compiler warning of shifting 0xc0 into bits 31:24 without an explicit conversion to u32 (triggered by the AMPERE1 MIDR definition) - Include asm/ptrace.h in asm/syscall_wrapper.h to fix an incomplete struct pt_regs type causing the BPF verifier to refuse to load a tracing program which accesses pt_regs * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64/syscall: Include asm/ptrace.h in syscall_wrapper header. arm64: Fix bit-shifting UB in the MIDR_CPU_MODEL() macro arm64: fix rodata=full again
-
- 11 Nov, 2022 3 commits
-
-
Niklas Cassel authored
While the ATA specification states that a device should return command aborted for all commands queued after the device has entered error state, since ATA only keeps the sense data for the latest command (in non-NCQ case), we really don't want to send block layer commands to the device after it has entered error state. (Only ATA EH commands should be sent, to read the sense data etc.) Currently, scsi_queue_rq() will check if scsi_host_in_recovery() (state is SHOST_RECOVERY), and if so, it will _not_ issue a command via: scsi_dispatch_cmd() -> host->hostt->queuecommand() (ata_scsi_queuecmd()) -> __ata_scsi_queuecmd() -> ata_scsi_translate() -> ata_qc_issue() Before commit e494f6a7 ("[SCSI] improved eh timeout handler"), when receiving a TFES error IRQ, the call chain looked like this: ahci_error_intr() -> ata_port_abort() -> ata_do_link_abort() -> ata_qc_complete() -> ata_qc_schedule_eh() -> blk_abort_request() -> blk_rq_timed_out() -> q->rq_timed_out_fn() (scsi_times_out()) -> scsi_eh_scmd_add() -> scsi_host_set_state(shost, SHOST_RECOVERY) Which meant that as soon as an error IRQ was serviced, SHOST_RECOVERY would be set. However, after commit e494f6a7 ("[SCSI] improved eh timeout handler"), scsi_times_out() will instead call scsi_abort_command() which will queue delayed work, and the worker function scmd_eh_abort_handler() will call scsi_eh_scmd_add(), which calls scsi_host_set_state(shost, SHOST_RECOVERY). So now, after the TFES error IRQ has been serviced, we need to wait for the SCSI workqueue to run its work before SHOST_RECOVERY gets set. It is worth noting that, even before commit e494f6a7 ("[SCSI] improved eh timeout handler"), we could receive an error IRQ from the time when scsi_queue_rq() checks scsi_host_in_recovery(), to the time when ata_scsi_queuecmd() is actually called. In order to handle both the delayed setting of SHOST_RECOVERY and the window where we can receive an error IRQ, add a check against ATA_PFLAG_EH_PENDING (which gets set when servicing the error IRQ), inside ata_scsi_queuecmd() itself, while holding the ap->lock. (Since the ap->lock is held while servicing IRQs.) Fixes: e494f6a7 ("[SCSI] improved eh timeout handler") Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Tested-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
-
git://git.kernel.dk/linuxLinus Torvalds authored
Pull block fixes from Jens Axboe: - NVMe pull request via Christoph: - Quiet user passthrough command errors (Keith Busch) - Fix memory leak in nvmet_subsys_attr_model_store_locked - Fix a memory leak in nvmet-auth (Sagi Grimberg) - Fix a potential NULL point deref in bfq (Yu) - Allocate command/response buffers separately for DMA for sed-opal, rather than rely on embedded alignment (Serge) * tag 'block-6.1-2022-11-11' of git://git.kernel.dk/linux: nvmet: fix a memory leak nvmet: fix memory leak in nvmet_subsys_attr_model_store_locked nvme: quiet user passthrough command errors block: sed-opal: kmalloc the cmd/resp buffers block, bfq: fix null pointer dereference in bfq_bio_bfqg()
-
git://git.kernel.dk/linuxLinus Torvalds authored
Pull io_uring fixes from Jens Axboe: "Nothing major, just a few minor tweaks: - Tweak for the TCP zero-copy io_uring self test (Pavel) - Rather than use our internal cached value of number of CQ events available, use what the user can see (Dylan) - Fix a typo in a comment, added in this release (me) - Don't allow wrapping while adding provided buffers (me) - Fix a double poll race, and add a lockdep assertion for it too (Pavel)" * tag 'io_uring-6.1-2022-11-11' of git://git.kernel.dk/linux: io_uring/poll: lockdep annote io_poll_req_insert_locked io_uring/poll: fix double poll req->flags races io_uring: check for rollover of buffer ID when providing buffers io_uring: calculate CQEs from the user visible value io_uring: fix typo in io_uring.h comment selftests/net: don't tests batched TCP io_uring zc
-