- 29 Apr, 2021 21 commits
-
-
Jin Yao authored
For cache events, there are pre-defined configs. The kernel needs to know where a cache event comes from (e.g. from the cpu_core pmu or from the cpu_atom pmu), but the perf type PERF_TYPE_HW_CACHE can't carry pmu information.

Now the type PERF_TYPE_HW_CACHE is extended to be a PMU-aware type. The PMU type ID is stored at attr.config[63:32].

When enabling a hybrid cache event without a specified pmu, such as 'perf stat -e LLC-loads -a', two events are created automatically: one for atom, the other for core.

  # perf stat -e LLC-loads -a -vv -- sleep 1
  Control descriptor is not initialized
  ------------------------------------------------------------
  perf_event_attr:
    type                             3
    size                             120
    config                           0x400000002
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 3
  ------------------------------------------------------------
  ...
  ------------------------------------------------------------
  perf_event_attr:
    type                             3
    size                             120
    config                           0x400000002
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 19
  ------------------------------------------------------------
  perf_event_attr:
    type                             3
    size                             120
    config                           0x800000002
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 20
  ------------------------------------------------------------
  ...
  ------------------------------------------------------------
  perf_event_attr:
    type                             3
    size                             120
    config                           0x800000002
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 27
  LLC-loads: 0: 1507 1001800280 1001800280
  LLC-loads: 1: 666 1001812250 1001812250
  LLC-loads: 2: 3353 1001813453 1001813453
  LLC-loads: 3: 514 1001848795 1001848795
  LLC-loads: 4: 627 1001952832 1001952832
  LLC-loads: 5: 4399 1001451154 1001451154
  LLC-loads: 6: 1240 1001481052 1001481052
  LLC-loads: 7: 478 1001520348 1001520348
  LLC-loads: 8: 691 1001551236 1001551236
  LLC-loads: 9: 310 1001578945 1001578945
  LLC-loads: 10: 1018 1001594354 1001594354
  LLC-loads: 11: 3656 1001622355 1001622355
  LLC-loads: 12: 882 1001661416 1001661416
  LLC-loads: 13: 506 1001693963 1001693963
  LLC-loads: 14: 3547 1001721013 1001721013
  LLC-loads: 15: 1399 1001734818 1001734818
  LLC-loads: 0: 1314 1001793826 1001793826
  LLC-loads: 1: 2857 1001752764 1001752764
  LLC-loads: 2: 646 1001830694 1001830694
  LLC-loads: 3: 1612 1001864861 1001864861
  LLC-loads: 4: 2244 1001912381 1001912381
  LLC-loads: 5: 1255 1001943889 1001943889
  LLC-loads: 6: 4624 1002021109 1002021109
  LLC-loads: 7: 2703 1001959302 1001959302
  LLC-loads: 24793 16026838264 16026838264
  LLC-loads: 17255 8015078826 8015078826

   Performance counter stats for 'system wide':

              24,793      cpu_core/LLC-loads/
              17,255      cpu_atom/LLC-loads/

         1.001970988 seconds time elapsed

0x4 in 0x400000002 indicates the cpu_core pmu.
0x8 in 0x800000002 indicates the cpu_atom pmu.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-10-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
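The config layout described in this commit message can be sketched as follows. This is a minimal illustration, not code from the patch: the helper names are hypothetical, and the constant names are chosen for readability. It only shows how a PMU type ID in bits 63:32 combines with the event config in the low 32 bits.

```python
from typing import Tuple

# Assumed names for illustration: the PMU type ID lives in attr.config[63:32],
# the pre-defined event config in attr.config[31:0].
PMU_TYPE_SHIFT = 32
EVENT_MASK = 0xFFFFFFFF


def encode_config(pmu_type: int, event_config: int) -> int:
    """Pack a PMU type ID and an event config into one 64-bit config value."""
    return (pmu_type << PMU_TYPE_SHIFT) | (event_config & EVENT_MASK)


def decode_config(config: int) -> Tuple[int, int]:
    """Split a config value back into (pmu_type, event_config)."""
    return config >> PMU_TYPE_SHIFT, config & EVENT_MASK


# Values taken from the -vv output in the commit message:
# 0x400000002 is LLC-loads on cpu_core (type 4), 0x800000002 on cpu_atom (type 8).
core_config = encode_config(4, 0x2)
atom_type, atom_event = decode_config(0x800000002)
```

Decoding both config values from the dump yields the same low 32 bits (0x2), which is why the two opened events count the same cache event on different PMUs.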
-
Jin Yao authored
Current hardware events have a special perf type, PERF_TYPE_HARDWARE, but it doesn't pass the PMU type in the user interface. On a hybrid system, the perf kernel code doesn't know which PMU the events belong to.

So now this type is extended to be a PMU-aware type. The PMU type ID is stored at attr.config[63:32].

The PMU type ID is retrieved from sysfs:

  root@lkp-adl-d01:/sys/devices/cpu_atom# cat type
  8

  root@lkp-adl-d01:/sys/devices/cpu_core# cat type
  4

When enabling a hybrid hardware event without a specified pmu, such as 'perf stat -e cycles -a', two events are created automatically: one for atom, the other for core.

  # perf stat -e cycles -a -vv -- sleep 1
  Control descriptor is not initialized
  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x400000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 3
  ------------------------------------------------------------
  ...
  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x400000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 19
  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x800000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 20
  ------------------------------------------------------------
  ...
  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x800000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 27
  cycles: 0: 836272 1001525722 1001525722
  cycles: 1: 628564 1001580453 1001580453
  cycles: 2: 872693 1001605997 1001605997
  cycles: 3: 70417 1001641369 1001641369
  cycles: 4: 88593 1001726722 1001726722
  cycles: 5: 470495 1001752993 1001752993
  cycles: 6: 484733 1001840440 1001840440
  cycles: 7: 1272477 1001593105 1001593105
  cycles: 8: 209185 1001608616 1001608616
  cycles: 9: 204391 1001633962 1001633962
  cycles: 10: 264121 1001661745 1001661745
  cycles: 11: 826104 1001689904 1001689904
  cycles: 12: 89935 1001728861 1001728861
  cycles: 13: 70639 1001756757 1001756757
  cycles: 14: 185266 1001784810 1001784810
  cycles: 15: 171094 1001825466 1001825466
  cycles: 0: 129624 1001854843 1001854843
  cycles: 1: 122533 1001840421 1001840421
  cycles: 2: 90055 1001882506 1001882506
  cycles: 3: 139607 1001896463 1001896463
  cycles: 4: 141791 1001907838 1001907838
  cycles: 5: 530927 1001883880 1001883880
  cycles: 6: 143246 1001852529 1001852529
  cycles: 7: 667769 1001872626 1001872626
  cycles: 6744979 16026956922 16026956922
  cycles: 1965552 8014991106 8014991106

   Performance counter stats for 'system wide':

           6,744,979      cpu_core/cycles/
           1,965,552      cpu_atom/cycles/

         1.001882711 seconds time elapsed

0x4 in 0x400000000 indicates the cpu_core pmu.
0x8 in 0x800000000 indicates the cpu_atom pmu.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-9-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
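The sysfs lookup mentioned in this commit message could be sketched like this. It is an illustration under assumptions, not the perf tool's code: the function names are hypothetical, and the sysfs root is a parameter so the sketch can be tried against a scratch directory on a machine without hybrid PMUs.

```python
import os


def read_pmu_type(pmu_name: str, sysfs_root: str = "/sys/devices") -> int:
    """Read the numeric PMU type ID from <sysfs_root>/<pmu_name>/type."""
    with open(os.path.join(sysfs_root, pmu_name, "type")) as f:
        return int(f.read().strip())


def hybrid_hw_config(pmu_type: int, hw_event_id: int) -> int:
    """Place the PMU type in config[63:32] and the hardware event ID below it."""
    return (pmu_type << 32) | (hw_event_id & 0xFFFFFFFF)


if __name__ == "__main__":
    # On the machine from the commit message this would print 0x400000000
    # for 'cycles' (hardware event ID 0) on cpu_core; elsewhere the file may
    # not exist, so the error is reported rather than hidden.
    try:
        print(hex(hybrid_hw_config(read_pmu_type("cpu_core"), 0)))
    except OSError as err:
        print("no cpu_core pmu here:", err)
```

Keeping the lookup separate from the bit packing makes it easy to check the packing against the values printed in the -vv output.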
-
Jin Yao authored
It would be useful to let the user know which pmu the event belongs to.

perf-stat already supports the '--no-merge' option, which can print the pmu name after the event name, such as:

  "cycles [cpu_core]"

Now this option is enabled by default on hybrid platforms, but the format changes to:

  "cpu_core/cycles/"

If the user configures the name, we still use the user-specified name.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-8-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jin Yao authored
The functions perf_pmu__is_hybrid and perf_pmu__find_hybrid_pmu can be used to identify the hybrid platform and return the found hybrid cpu pmu. All the detected hybrid pmus have been saved in the 'perf_pmu__hybrid_pmus' list, so we just need to search this list.

perf_pmu__hybrid_type_to_pmu converts the user-specified string to a hybrid pmu name. This is used to support the '--cputype' option in the next patches.

perf_pmu__has_hybrid checks for the existence of a hybrid pmu. Note that we have to define it in pmu.c (so that pmu-hybrid.c adds no further symbol dependency), otherwise 'perf test python' would fail.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-7-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jin Yao authored
We identify the cpu_core pmu and cpu_atom pmu by explicitly checking the following files:

For cpu_core, check:
  "/sys/bus/event_source/devices/cpu_core/cpus"

For cpu_atom, check:
  "/sys/bus/event_source/devices/cpu_atom/cpus"

If the 'cpus' file exists and it has data, the pmu exists.

However, to avoid hardcoding "cpu_core" and "cpu_atom" and to keep the code generic, the rule is: if the path "/sys/bus/event_source/devices/cpu_xxx/cpus" exists, the hybrid pmu exists.

All the detected hybrid pmus are linked to a global list 'perf_pmu__hybrid_pmus'; afterwards we just need to iterate the list with perf_pmu__for_each_hybrid_pmu to get every hybrid pmu.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-6-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
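The detection rule in this commit message (a 'cpu_xxx' directory whose 'cpus' file exists and has data) can be sketched as below. This is an illustration only: the function name is hypothetical, and the real implementation is C code that fills the 'perf_pmu__hybrid_pmus' list.

```python
import glob
import os


def find_hybrid_pmus(devices_dir: str = "/sys/bus/event_source/devices"):
    """Return the names of cpu_* PMUs whose 'cpus' file exists and is non-empty."""
    pmus = []
    for cpus_path in sorted(glob.glob(os.path.join(devices_dir, "cpu_*", "cpus"))):
        try:
            with open(cpus_path) as f:
                has_data = bool(f.read().strip())
        except OSError:
            # Unreadable file: treat the pmu as absent.
            continue
        if has_data:
            # The PMU name is the directory that contains the 'cpus' file.
            pmus.append(os.path.basename(os.path.dirname(cpus_path)))
    return pmus


if __name__ == "__main__":
    # Prints an empty list on non-hybrid machines.
    print(find_hybrid_pmus())
```

The glob pattern "cpu_*" deliberately skips the plain "cpu" PMU found on non-hybrid systems, which has no underscore suffix.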
-
Jin Yao authored
On a hybrid platform, an event is available on one pmu (for example, on cpu_core or on cpu_atom).

This patch saves the pmu name to the pmu field of struct perf_pmu_alias, so that later we can tell which pmu the event can be enabled on.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-5-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jin Yao authored
Simplify the arguments of __perf_pmu__new_alias() by passing the whole 'struct pme_event' pointer.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-4-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jin Yao authored
Some Intel platforms, such as Alderlake, are hybrid platforms that consist of atom cpus and core cpus. Each cpu type has a dedicated event list: some events are available on the core cpu, others on the atom cpu.

The kernel exports new cpu pmus: cpu_core and cpu_atom. Events in json gain a new field "Unit" to indicate which pmu the event is available on.

For example, one event in cache.json:

  {
    "BriefDescription": "Counts the number of load ops retired that",
    "CollectPEBSRecord": "2",
    "Counter": "0,1,2,3",
    "EventCode": "0xd2",
    "EventName": "MEM_LOAD_UOPS_RETIRED_MISC.MMIO",
    "PEBScounters": "0,1,2,3",
    "SampleAfterValue": "1000003",
    "UMask": "0x80",
    "Unit": "cpu_atom"
  },

The unit "cpu_atom" indicates this event is only available on "cpu_atom".

In the generated pmu-events.c, we can see:

  {
    .name = "mem_load_uops_retired_misc.mmio",
    .event = "period=1000003,umask=0x80,event=0xd2",
    .desc = "Counts the number of load ops retired that. Unit: cpu_atom ",
    .topic = "cache",
    .pmu = "cpu_atom",
  },

Without this patch, the "uncore_" prefix is added before "cpu_atom", such as:

  .pmu = "uncore_cpu_atom"

That would be a wrong pmu.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
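The naming rule this commit message describes might look roughly like the sketch below. It is not the generator's actual code: the function name and the "cpu_" prefix test are assumptions made for illustration, and any lookup table the real tool uses for known uncore units is left out.

```python
def unit_to_pmu(unit: str) -> str:
    """Map a JSON "Unit" value to a pmu name.

    Assumption for illustration: units naming a cpu pmu (cpu_core, cpu_atom)
    are used as-is, everything else gets the "uncore_" prefix.
    """
    if unit.startswith("cpu_"):
        return unit
    return "uncore_" + unit


# The event from the commit message carries "Unit": "cpu_atom".
event = {"EventName": "MEM_LOAD_UOPS_RETIRED_MISC.MMIO", "Unit": "cpu_atom"}
pmu = unit_to_pmu(event["Unit"])
```

With the prefix check removed, the same input would produce "uncore_cpu_atom", which is the wrong pmu name the commit message warns about.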
-
Jin Yao authored
To get the changes in:

  Liang Kan's patch 55bcf6ef ("perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE")

Kan's patch is in the tip/perf/core branch, so the next perf tool patches need this interface for hybrid support.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-2-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
It's sometimes useful to see how the number of samples compares with other events in the data file, as percent values.

  $ perf report --stat

  Aggregated stats:
             TOTAL events:      20064
              MMAP events:        239  ( 1.2%)
              COMM events:       1518  ( 7.6%)
              EXIT events:          1  ( 0.0%)
              FORK events:       1517  ( 7.6%)
            SAMPLE events:       4015  (20.0%)
             MMAP2 events:      12769  (63.6%)
    FINISHED_ROUND events:          2  ( 0.0%)
        THREAD_MAP events:          1  ( 0.0%)
           CPU_MAP events:          1  ( 0.0%)
         TIME_CONV events:          1  ( 0.0%)

  cycles stats:
            SAMPLE events:       2475

  instructions stats:
            SAMPLE events:       1540

Suggested-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
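The percent column in this output is each event count divided by the total. A minimal sketch of that arithmetic follows; the function name and the column widths are illustrative choices and are not taken from the perf source.

```python
def format_stats(counts):
    """Render event counts with their share of the total, one line per event type."""
    total = sum(counts.values())
    lines = ["TOTAL events: %10d" % total]
    for name, count in counts.items():
        share = 100.0 * count / total
        lines.append("%s events: %10d  (%4.1f%%)" % (name, count, share))
    return lines


# Counts copied from the 'perf report --stat' output in the commit message.
counts = {
    "MMAP": 239, "COMM": 1518, "EXIT": 1, "FORK": 1517, "SAMPLE": 4015,
    "MMAP2": 12769, "FINISHED_ROUND": 2, "THREAD_MAP": 1, "CPU_MAP": 1,
    "TIME_CONV": 1,
}
for line in format_stats(counts):
    print(line)
```

The counts add up to the 20064 total shown in the commit message, so for example MMAP2 works out to 12769 / 20064, about 63.6%.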
-
Namhyung Kim authored
so that the compact output is shown by default. Also add the 'report.skip-empty' config option to override the default. Users can also use the --no-skip-empty command line option to change the behavior anytime.

Committer testing:

  $ perf report --stat

  Aggregated stats:
             TOTAL events:         19
              COMM events:          2
              EXIT events:          1
            SAMPLE events:          8
             MMAP2 events:          4
    FINISHED_ROUND events:          1
        THREAD_MAP events:          1
           CPU_MAP events:          1
         TIME_CONV events:          1

  cycles:u stats:
            SAMPLE events:          8

  $ perf config report.skip-empty=false
  $ perf report --stat

  Aggregated stats:
             TOTAL events:         19
              MMAP events:          0
              LOST events:          0
              COMM events:          2
              EXIT events:          1
          THROTTLE events:          0
        UNTHROTTLE events:          0
              FORK events:          0
              READ events:          0
            SAMPLE events:          8
             MMAP2 events:          4
               AUX events:          0
      ITRACE_START events:          0
      LOST_SAMPLES events:          0
            SWITCH events:          0
   SWITCH_CPU_WIDE events:          0
        NAMESPACES events:          0
           KSYMBOL events:          0
         BPF_EVENT events:          0
            CGROUP events:          0
         TEXT_POKE events:          0
              ATTR events:          0
        EVENT_TYPE events:          0
      TRACING_DATA events:          0
          BUILD_ID events:          0
    FINISHED_ROUND events:          1
          ID_INDEX events:          0
     AUXTRACE_INFO events:          0
          AUXTRACE events:          0
    AUXTRACE_ERROR events:          0
        THREAD_MAP events:          1
           CPU_MAP events:          1
       STAT_CONFIG events:          0
              STAT events:          0
        STAT_ROUND events:          0
      EVENT_UPDATE events:          0
         TIME_CONV events:          1
           FEATURE events:          0
        COMPRESSED events:          0

  cycles:u stats:
            SAMPLE events:          8

  $ perf config report.skip-empty
  report.skip-empty=false
  $

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
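One way to picture how the default, the 'report.skip-empty' config option and the command line option interact is the sketch below. The precedence shown (command line over config over the built-in default) is an assumption drawn from the wording of the commit message, and both function names are hypothetical.

```python
def effective_skip_empty(cli_option=None, config_value=None):
    """Resolve skip-empty: command line first, then config, then default True.

    None means "not given" at that level (assumed precedence, see lead-in).
    """
    if cli_option is not None:
        return cli_option
    if config_value is not None:
        return config_value
    return True


def visible_stats(stats, skip_empty):
    """Drop zero-count entries when skip_empty is set; keep everything otherwise."""
    return [(name, count) for name, count in stats if count or not skip_empty]


# A few rows from the committer's test run.
stats = [("MMAP", 0), ("LOST", 0), ("COMM", 2), ("EXIT", 1), ("SAMPLE", 8)]
compact = visible_stats(stats, effective_skip_empty())
full = visible_stats(stats, effective_skip_empty(config_value=False))
```

Here `compact` corresponds to the first, shorter listing in the commit message and `full` to the listing printed after 'perf config report.skip-empty=false'.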
-
Namhyung Kim authored
To make the output more readable, I think it's better to remove the 0's from the output. Also, the dummy event has no event stats, so it just wastes space. Let's use the --skip-empty option to suppress it.

  $ perf report --stat --skip-empty

  Aggregated stats:
             TOTAL events:      16530
              MMAP events:        226
              COMM events:       1596
              EXIT events:          2
          THROTTLE events:        121
        UNTHROTTLE events:        117
              FORK events:       1595
            SAMPLE events:        719
             MMAP2 events:      12147
            CGROUP events:          2
    FINISHED_ROUND events:          2
        THREAD_MAP events:          1
           CPU_MAP events:          1
         TIME_CONV events:          1

  cycles stats:
            SAMPLE events:        719

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
To make the output identical with perf report -D, it needs to show per-event sample counts along with the aggregated stat at the end.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
Each struct hists has an events_stats, but most of its fields are not used there. In hists it only counts the number of samples and periods, filtered or not; the other fields are used only by evlist. So it is better to split it into hists_stats and events_stats to reduce wasted memory in struct hists.

This also makes the event statistics output of perf report more compact, because 0 events are skipped in each evsel/hists.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
It's mainly to count lost events for the warning so it should be ok to use the evlist->stats instead. This is needed for changes in the next commit.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Nicholas Fraser authored
This adds a feature to export perf data to JSON. The resolved symbols are exported into the JSON so that external tools don't need to load the dsos themselves (or even have access to them at all.) This makes it easy to load and analyze perf data with standalone tools where direct perf or libbabeltrace integration is impractical.

The exporter uses a minimal inline JSON encoding without any external dependencies. Currently it only outputs some headers and sample metadata but it's easily extensible.

Use it like this:

  $ perf data convert --to-json out.json

Committer notes:

Fixup a __printf() bug that broke the build:

  util/data-convert-json.c:103:11: error: expected ‘)’ before numeric constant
    103 |  __(printf, 5, 6)
        |           ^~
        |           )
  util/data-convert-json.c: In function ‘output_sample_callchain_entry’:
  util/data-convert-json.c:124:2: error: implicit declaration of function ‘output_json_key_format’; did you mean ‘output_json_format’? [-Werror=implicit-function-declaration]
    124 |  output_json_key_format(out, false, 5, "ip", "\"0x%" PRIx64 "\"", ip);
        |  ^~~~~~~~~~~~~~~~~~~~~~
        |  output_json_format

Also had to add this patch to fix errors reported by various versions of clang:

  -       if (al && al->sym && al->sym->name && strlen(al->sym->name) > 0) {
  +       if (al && al->sym && al->sym->namelen) {

al->sym->name is a zero sized array, to avoid one extra alloc in the symbol__new() constructor, sym->namelen carries its strlen.

Committer testing:

  $ ls -la out.json
  ls: cannot access 'out.json': No such file or directory
  $ perf record sleep 0.1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.001 MB perf.data (8 samples) ]
  $ perf report --stats | grep -w SAMPLE
            SAMPLE events:          8
  $ perf data convert --to-json out.json
  [ perf data convert: Converted 'perf.data' into JSON data 'out.json' ]
  [ perf data convert: Converted and wrote 0.002 MB (8 samples) ]
  $ ls -la out.json
  -rw-rw-r--. 1 acme acme 2017 Apr 26 17:29 out.json
  $ cat out.json
  {
    "linux-perf-json-version": 1,
    "headers": {
      "header-version": 1,
      "captured-on": "2021-04-26T20:28:57Z",
      "data-offset": 432,
      "data-size": 1016,
      "feat-offset": 1448,
      "hostname": "five",
      "os-release": "5.11.14-200.fc33.x86_64",
      "arch": "x86_64",
      "cpu-desc": "AMD Ryzen 9 3900X 12-Core Processor",
      "cpuid": "AuthenticAMD,23,113,0",
      "nrcpus-online": 24,
      "nrcpus-avail": 24,
      "perf-version": "5.12.gee134f3189bd",
      "cmdline": [ "/home/acme/bin/perf", "record", "sleep", "0.1" ]
    },
    "samples": [
      { "timestamp": 170517539043684, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0xffffffffa6268827" } ] },
      { "timestamp": 170517539048443, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0xffffffffa661359d" } ] },
      { "timestamp": 170517539051018, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0xffffffffa6311e18" } ] },
      { "timestamp": 170517539053652, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0x7fdb77b4812b", "symbol": "_dl_start", "dso": "ld-2.32.so" } ] },
      { "timestamp": 170517539055306, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0xffffffffa6269286" } ] },
      { "timestamp": 170517539057590, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0xffffffffa62abd8b" } ] },
      { "timestamp": 170517539067559, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0x7fdb77b5e9e9", "symbol": "__GI___tunables_init", "dso": "ld-2.32.so" } ] },
      { "timestamp": 170517539282452, "pid": 375844, "tid": 375844, "comm": "sleep", "callchain": [ { "ip": "0x7fdb779978d2", "symbol": "getenv", "dso": "libc-2.32.so" } ] }
    ]
  }
  $

Signed-off-by: Nicholas Fraser <nfraser@codeweavers.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: Ulrich Czekalla <uczekalla@codeweavers.com>
Link: http://lore.kernel.org/lkml/3884969f-804d-2f53-c648-e2b0bd85edff@codeweavers.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
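As an illustration of the standalone analysis this export is meant to enable, the sketch below reads the JSON with Python's standard library only. The embedded document is a two-sample excerpt of the committer's out.json, trimmed to the fields used here, and `top_frames` is a hypothetical helper name.

```python
import json

# Excerpt of the committer's out.json: two of the eight samples, headers omitted.
EXPORT = """
{
  "linux-perf-json-version": 1,
  "samples": [
    {"timestamp": 170517539043684, "pid": 375844, "tid": 375844, "comm": "sleep",
     "callchain": [{"ip": "0xffffffffa6268827"}]},
    {"timestamp": 170517539053652, "pid": 375844, "tid": 375844, "comm": "sleep",
     "callchain": [{"ip": "0x7fdb77b4812b", "symbol": "_dl_start", "dso": "ld-2.32.so"}]}
  ]
}
"""


def top_frames(doc):
    """Label each sample by its first callchain entry: symbol if resolved, else raw ip."""
    labels = []
    for sample in doc["samples"]:
        chain = sample.get("callchain", [])
        if not chain:
            continue
        frame = chain[0]
        labels.append(frame.get("symbol", frame["ip"]))
    return labels


doc = json.loads(EXPORT)
labels = top_frames(doc)
print(labels)
```

Because resolved symbols and dso names are already in the file, the script needs neither perf nor access to the binaries that were profiled.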
-
Song Liu authored
Introduce bpf_counter_ops->disable(), which is used to stop counting the event.

Committer notes:

Added a dummy bpf_counter__disable() to the python binding to avoid having 'perf test python' failing.

bpf_counter isn't supported in the python binding.

Signed-off-by: Song Liu <song@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: kernel-team@fb.com
Link: https://lore.kernel.org/r/20210425214333.1090950-6-song@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Song Liu authored
Introduce the 'b' modifier to the event parser, which means: use a BPF program to manage this event. This is the same as the --bpf-counters option, but only applies to this event. For example,

  perf stat -e cycles:b,cs               # use bpf for cycles, but not cs
  perf stat -e cycles,cs --bpf-counters  # use bpf for both cycles and cs

Suggested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Song Liu <song@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/r/20210425214333.1090950-5-song@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Song Liu authored
Currently, to use BPF to aggregate perf event counters, the user uses the --bpf-counters option. Enable "use bpf by default" events with a config option, stat.bpf-counter-events. Events whose names are listed in the option will use BPF.

This also enables mixing BPF events and regular events in the same session. For example:

  perf config stat.bpf-counter-events=instructions
  perf stat -e instructions,cs

The second command will use BPF for "instructions" but not "cs".

Signed-off-by: Song Liu <song@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/r/20210425214333.1090950-4-song@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
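The per-event decision described in this series can be pictured with the sketch below. It is an illustration under assumptions: `uses_bpf` is a hypothetical name, the config value is assumed to be a comma-separated list of event names, and the modifier parsing only handles simple "name:modifiers" events (names that themselves contain a colon, such as tracepoints, are not handled).

```python
def uses_bpf(event, config_events="", bpf_counters=False):
    """Decide whether one event from an -e list would be managed by BPF.

    event         -- simple event spec such as "cycles", "cycles:b" or "cs"
    config_events -- assumed comma-separated value of stat.bpf-counter-events
    bpf_counters  -- True when --bpf-counters was given on the command line
    """
    name, _, modifiers = event.partition(":")
    if bpf_counters:
        return True  # global option applies to every event
    if "b" in modifiers:
        return True  # per-event 'b' modifier
    listed = [entry.strip() for entry in config_events.split(",") if entry.strip()]
    return name in listed


# 'perf config stat.bpf-counter-events=instructions' then
# 'perf stat -e instructions,cs' from the commit message:
decisions = {ev: uses_bpf(ev, config_events="instructions")
             for ev in ("instructions", "cs")}
```

The result matches the commit message: "instructions" is counted through BPF while "cs" stays a regular event.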
-
Song Liu authored
perf_attr_map could be shared among different versions of the perf binary. Add bperf_attr_map_compatible() to check whether the existing attr_map is compatible with the current perf binary.

Signed-off-by: Song Liu <song@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: kernel-team@fb.com
Link: https://lore.kernel.org/r/20210425214333.1090950-3-song@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Song Liu authored
By following the same protocol, other tools can share hardware PMCs with perf. Move perf_event_attr_map_entry and BPF_PERF_DEFAULT_ATTR_MAP_PATH to bpf_perf.h for other tools to use.

Signed-off-by: Song Liu <song@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: kernel-team@fb.com
Link: https://lore.kernel.org/r/20210425214333.1090950-2-song@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 26 Apr, 2021 1 commit
-
-
Arnaldo Carvalho de Melo authored
To pick up fixes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 25 Apr, 2021 7 commits
-
-
Linus Torvalds authored
-
Linus Torvalds authored
Merge tag 'perf-tools-fixes-for-v5.12-2021-04-25' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix potential NULL pointer dereference in the auxtrace option parser

 - Fix access to PID in an array when setting a PID filter in 'perf ftrace'

 - Fix error return code in the 'perf data' tool and in maps__clone(), found using a static analysis tool from Huawei

* tag 'perf-tools-fixes-for-v5.12-2021-04-25' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf map: Fix error return code in maps__clone()
  perf ftrace: Fix access to pid in array when setting a pid filter
  perf auxtrace: Fix potential NULL pointer dereference
  perf data: Fix error return code in perf_data__create_dir()
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull x86 perf fixes from Borislav Petkov:

 - Fix Broadwell Xeon's stepping in the PEBS isolation table of CPUs

 - Fix a panic when initializing perf uncore machinery on Haswell and Broadwell servers

* tag 'perf_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/kvm: Fix Broadwell Xeon stepping in isolation_ucodes[]
  perf/x86/intel/uncore: Remove uncore extra PCI dev HSWEP_PCI_PCU_3
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull locking fix from Borislav Petkov:
 "Fix ordering in the queued writer lock's slowpath"

* tag 'locking_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/qrwlock: Fix ordering in queued_write_lock_slowpath()
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull scheduler fix from Borislav Petkov:
 "Fix a typo in a macro ifdeffery"

* tag 'sched_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  preempt/dynamic: Fix typo in macro conditional statement
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull x86 fix from Borislav Petkov:
 "Fix an out-of-bounds memory access when setting up a crash kernel with kexec"

* tag 'x86_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/crash: Fix crash_setup_memmap_entries() out-of-bounds access
-
Jin Yao authored
The kernel already supports COMETLAKE/COMETLAKE_L using the SKYLAKE events and TIGERLAKE_L/TIGERLAKE/ROCKETLAKE using the ICELAKE events, but the pmu-events mapfile.csv is missing these model numbers.

Now add the missing model numbers to mapfile.csv.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210329070903.8894-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 24 Apr, 2021 3 commits
-
-
git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds authored
Pull kvm fix from Paolo Bonzini:
 "Fix SRCU bug introduced in the merge window"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86/xen: Take srcu lock when accessing kvm_memslots()
-
Linus Torvalds authored
This reverts commit 0c85a7e8.

The games with 'rm' are on (two separate instances) of a local variable, and make no difference.

Quoting Aditya Pakki:
 "I was the author of the patch and it was the cause of the giant UMN revert.

  The patch is garbage and I was unaware of the steps involved in retracting it. I *believed* the maintainers would pull it, given it was already under Greg's list. The patch does not introduce any bugs but is pointless and is stupid. I accept my incompetence and for not requesting a revert earlier."

Link: https://lwn.net/Articles/854319/
Requested-by: Aditya Pakki <pakki001@umn.edu>
Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Linus Torvalds authored
Pull pin control fixes from Linus Walleij: "Late pin control fixes, would have been in the main pull request normally but hey I got lucky and we got another week to polish up v5.12 so here we go. One driver fix and one making the core debugfs work: - Fix the number of pins in the community of the Intel Lewisburg SoC - Show pin numbers for controllers with base = 0 in the new debugfs feature" * tag 'pinctrl-v5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: core: Show pin numbers for the controllers with base = 0 pinctrl: lewisburg: Update number of pins in community
-
- 23 Apr, 2021 8 commits
-
-
Linus Torvalds authored
Merge misc fixes from Andrew Morton: "5 patches. Subsystems affected by this patch series: coda, overlayfs, and mm (pagecache and memcg)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: tools/cgroup/slabinfo.py: updated to work on current kernel mm/filemap: fix mapping_seek_hole_data on THP & 32-bit mm/filemap: fix find_lock_entries hang on 32-bit THP ovl: fix reference counting in ovl_mmap error path coda: fix reference counting in coda_file_mmap error path
-
git://git.kernel.dk/linux-block
Linus Torvalds authored
Pull block fix from Jens Axboe: "A single fix for a behavioral regression in this series, when re-reading the partition table with partitions open" * tag 'block-5.12-2021-04-23' of git://git.kernel.dk/linux-block: block: return -EBUSY when there are open partitions in blkdev_reread_part
-
Vasily Averin authored
The slabinfo.py script does not work with the current kernel version. First, it was unable to recognise the SLUB subsystem, and when I specified it manually it failed again with AttributeError: 'struct page' has no member 'obj_cgroups' .. and then again with File "tools/cgroup/memcg_slabinfo.py", line 221, in main memcg.kmem_caches.address_of_(), AttributeError: 'struct mem_cgroup' has no member 'kmem_caches' Link: https://lkml.kernel.org/r/cec1a75e-43b4-3d64-2084-d9f98fda037f@virtuozzo.com Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Tested-by: Roman Gushchin <guro@fb.com> Acked-by: Roman Gushchin <guro@fb.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Hugh Dickins authored
No problem on 64-bit, or without huge pages, but xfstests generic/285 and other SEEK_HOLE/SEEK_DATA tests have regressed on huge tmpfs, and on 32-bit architectures, with the new mapping_seek_hole_data(). Several different bugs turned out to need fixing. u64 cast to stop losing bits when converting unsigned long to loff_t (and let's use shifts throughout, rather than mixed with * and /). Use round_up() when advancing pos, to stop assuming that pos was already THP-aligned when advancing it by THP-size. (This use of round_up() assumes that any THP has THP-aligned index: true at present and true going forward, but could be recoded to avoid the assumption.) Use xas_set() when iterating away from a THP, so that xa_index stays in synch with start, instead of drifting away to return bogus offset. Check start against end to avoid wrapping 32-bit xa_index to 0 (and to handle these additional cases, seek_data or not, it's easier to break the loop than goto: so rearrange exit from the function). [hughd@google.com: remove unneeded u64 casts, per Matthew] Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2104221347240.1170@eggly.anvils Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2104211737410.3299@eggly.anvils Fixes: 41139aa4 ("mm/filemap: add mapping_seek_hole_data") Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Chinner <dchinner@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: William Kucharski <william.kucharski@oracle.com> Cc: Yang Shi <yang.shi@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
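Two of the bug classes named in this message, bits lost when a 32-bit page index is shifted before being widened to a 64-bit offset, and advancing by the huge-page size from a position that is not huge-page aligned, can be illustrated outside the kernel. The following is a minimal sketch and not the patch itself: it assumes 4 KiB pages and 2 MiB huge pages, uses uint32_t to stand in for a 32-bit unsigned long, and all names (SKETCH_PAGE_SHIFT, SKETCH_THP_SIZE, offset_truncating, offset_widened, round_up_pow2, advance_naive, advance_rounded) are hypothetical.

```c
#include <stdint.h>

/*
 * Illustrative assumptions, not the kernel's definitions: 4 KiB pages,
 * 2 MiB huge pages, and uint32_t standing in for a 32-bit unsigned long.
 */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_THP_SIZE   (UINT64_C(1) << 21)

/* Shift done in 32 bits: the high bits are lost before widening. */
static int64_t offset_truncating(uint32_t index)
{
    uint32_t shifted = index << SKETCH_PAGE_SHIFT;
    return (int64_t)shifted;
}

/* Widen first, then shift: no bits are lost. */
static int64_t offset_widened(uint32_t index)
{
    return (int64_t)index << SKETCH_PAGE_SHIFT;
}

/* Round x up to a multiple of align; align must be a power of two. */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
    return (x + align - 1) & ~(align - 1);
}

/* Advance by the huge-page size, which is only right if pos was aligned. */
static uint64_t advance_naive(uint64_t pos)
{
    return pos + SKETCH_THP_SIZE;
}

/* Advance to the next huge-page boundary strictly after pos. */
static uint64_t advance_rounded(uint64_t pos)
{
    return round_up_pow2(pos + 1, SKETCH_THP_SIZE);
}
```

With these assumptions, page index 0x100000 (the 4 GiB mark) maps to offset 0 in the truncating variant and to 0x100000000 in the widened one. Starting from the unaligned position 0x201000, the naive advance lands at 0x401000, past the next boundary, while the rounded advance lands at 0x400000.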
-
Hugh Dickins authored
No problem on 64-bit, or without huge pages, but xfstests generic/308 hung uninterruptibly on 32-bit huge tmpfs. Since commit 0cc3b0ec ("Clarify (and fix) in 4.13 MAX_LFS_FILESIZE macros"), MAX_LFS_FILESIZE is only a PAGE_SIZE away from wrapping 32-bit xa_index to 0, so the new find_lock_entries() has to be extra careful when handling a THP. Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2104211735430.3299@eggly.anvils Fixes: 5c211ba2 ("mm: add and use find_lock_entries") Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: William Kucharski <william.kucharski@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Cc: Dave Chinner <dchinner@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Yang Shi <yang.shi@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
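The wrap described here can be reproduced with plain unsigned arithmetic. This is a minimal sketch of a generic overflow guard, not the code from the patch: it assumes a 32-bit index modelled as uint32_t and a huge page spanning 512 base pages (2 MiB with 4 KiB pages), and the helper names advance_index_wrapping and advance_index are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unchecked advance: unsigned 32-bit addition silently wraps past 0. */
static uint32_t advance_index_wrapping(uint32_t index, uint32_t nr_pages)
{
    return index + nr_pages;
}

/*
 * Checked advance: stores index + nr_pages in *next and returns true
 * when the sum fits in 32 bits; returns false (leaving *next alone)
 * when the addition would wrap.
 */
static bool advance_index(uint32_t index, uint32_t nr_pages, uint32_t *next)
{
    if (nr_pages > UINT32_MAX - index)
        return false;
    *next = index + nr_pages;
    return true;
}
```

For a huge page starting at index 0xFFFFFE00, the last 512-page slot below 2^32, the unchecked helper returns 0, so a loop that continues from that value would restart at the beginning of the file. The checked helper reports the wrap instead, letting the caller stop iterating.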
-
Christian König authored
mmap_region() now calls fput() on the vma->vm_file. Fix this by using vma_set_file() so it doesn't need to be handled manually here any more. Link: https://lkml.kernel.org/r/20210421132012.82354-2-christian.koenig@amd.com Fixes: 1527f926 ("mm: mmap: fix fput in error path v2") Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jan Harkes <jaharkes@cs.cmu.edu> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: <stable@vger.kernel.org> [5.11+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Christian König authored
mmap_region() now calls fput() on the vma->vm_file. So we need to drop the extra reference on the coda file instead of the host file. Link: https://lkml.kernel.org/r/20210421132012.82354-1-christian.koenig@amd.com Fixes: 1527f926 ("mm: mmap: fix fput in error path v2") Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Jan Harkes <jaharkes@cs.cmu.edu> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: <stable@vger.kernel.org> [5.11+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Wanpeng Li authored
kvm_memslots() will be called by kvm_write_guest_offset_cached(), so we should take the srcu lock. Let's pull the srcu lock operation out of kvm_steal_time_set_preempted() again to fix the Xen part. Fixes: 30b5c851 ("KVM: x86/xen: Add support for vCPU runstate information") Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1619166200-9215-1-git-send-email-wanpengli@tencent.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-