Commit f0bb4c0a authored by Linus Torvalds

Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf updates from Ingo Molnar:
 "Kernel improvements:

   - watchdog driver improvements by Li Zefan
   - Power7 CPI stack events related improvements by Sukadev Bhattiprolu
   - event multiplexing via hrtimers and other improvements by Stephane
     Eranian
   - kernel stack use optimization by Andrew Hunter
   - AMD IOMMU uncore PMU support by Suravee Suthikulpanit
   - NMI handling rate-limits by Dave Hansen
   - various hw_breakpoint fixes by Oleg Nesterov
   - hw_breakpoint overflow period sampling and related signal handling
     fixes by Jiri Olsa
   - Intel Haswell PMU support by Andi Kleen

  Tooling improvements:

   - Reset SIGTERM handler in workload child process, fix from David
     Ahern.
   - Makefile reorganization, prep work for Kconfig patches, from Jiri
     Olsa.
   - Add automated make test suite, from Jiri Olsa.
   - Add --percent-limit option to 'top' and 'report', from Namhyung
     Kim.
   - Sorting improvements, from Namhyung Kim.
   - Expand definition of sysfs format attribute, from Michael Ellerman.

  Tooling fixes:

   - 'perf tests' fixes from Jiri Olsa.
   - Make Power7 CPI stack events available in sysfs, from Sukadev
     Bhattiprolu.
   - Handle death by SIGTERM in 'perf record', fix from David Ahern.
   - Fix printing of perf_event_paranoid message, from David Ahern.
   - Handle realloc failures in 'perf kvm', from David Ahern.
   - Fix divide by 0 in variance, from David Ahern.
   - Save parent pid in thread struct, from David Ahern.
   - Handle JITed code in shared memory, from Andi Kleen.
   - Fixes for 'perf diff', from Jiri Olsa.
   - Remove some unused struct members, from Jiri Olsa.
   - Add missing liblk.a dependency for python/perf.so, fix from Jiri
     Olsa.
   - Respect CROSS_COMPILE in liblk.a, from Rabin Vincent.
   - No need to do locking when adding hists in perf report, only 'top'
     needs that, from Namhyung Kim.
   - Fix alignment of symbol column in the hists browser (top,
     report) when -v is given, from Namhyung Kim.
   - Fix 'perf top' -E option behavior, from Namhyung Kim.
   - Fix bug in isupper() and islower(), from Sukadev Bhattiprolu.
   - Fix compile errors in bp_signal 'perf test', from Sukadev
     Bhattiprolu.

  ... and more things"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (102 commits)
  perf/x86: Disable PEBS-LL in intel_pmu_pebs_disable()
  perf/x86: Fix shared register mutual exclusion enforcement
  perf/x86/intel: Support full width counting
  x86: Add NMI duration tracepoints
  perf: Drop sample rate when sampling is too slow
  x86: Warn when NMI handlers take large amounts of time
  hw_breakpoint: Introduce "struct bp_cpuinfo"
  hw_breakpoint: Simplify *register_wide_hw_breakpoint()
  hw_breakpoint: Introduce cpumask_of_bp()
  hw_breakpoint: Simplify the "weight" usage in toggle_bp_slot() paths
  hw_breakpoint: Simplify list/idx mess in toggle_bp_slot() paths
  perf/x86/intel: Add mem-loads/stores support for Haswell
  perf/x86/intel: Support Haswell/v4 LBR format
  perf/x86/intel: Move NMI clearing to end of PMI handler
  perf/x86/intel: Add Haswell PEBS support
  perf/x86/intel: Add simple Haswell PMU support
  perf/x86/intel: Add Haswell PEBS record support
  perf/x86/intel: Fix sparse warning
  perf/x86/amd: AMD IOMMU Performance Counter PERF uncore PMU implementation
  perf/x86/amd: Add IOMMU Performance Counter resource management
  ...
parents a4883ef6 983433b5
...@@ -27,14 +27,36 @@ Description: Generic performance monitoring events ...@@ -27,14 +27,36 @@ Description: Generic performance monitoring events
"basename". "basename".
What: /sys/devices/cpu/events/PM_LD_MISS_L1 What: /sys/devices/cpu/events/PM_1PLUS_PPC_CMPL
/sys/devices/cpu/events/PM_LD_REF_L1
/sys/devices/cpu/events/PM_CYC
/sys/devices/cpu/events/PM_BRU_FIN /sys/devices/cpu/events/PM_BRU_FIN
/sys/devices/cpu/events/PM_GCT_NOSLOT_CYC
/sys/devices/cpu/events/PM_BRU_MPRED /sys/devices/cpu/events/PM_BRU_MPRED
/sys/devices/cpu/events/PM_INST_CMPL
/sys/devices/cpu/events/PM_CMPLU_STALL /sys/devices/cpu/events/PM_CMPLU_STALL
/sys/devices/cpu/events/PM_CMPLU_STALL_BRU
/sys/devices/cpu/events/PM_CMPLU_STALL_DCACHE_MISS
/sys/devices/cpu/events/PM_CMPLU_STALL_DFU
/sys/devices/cpu/events/PM_CMPLU_STALL_DIV
/sys/devices/cpu/events/PM_CMPLU_STALL_ERAT_MISS
/sys/devices/cpu/events/PM_CMPLU_STALL_FXU
/sys/devices/cpu/events/PM_CMPLU_STALL_IFU
/sys/devices/cpu/events/PM_CMPLU_STALL_LSU
/sys/devices/cpu/events/PM_CMPLU_STALL_REJECT
/sys/devices/cpu/events/PM_CMPLU_STALL_SCALAR
/sys/devices/cpu/events/PM_CMPLU_STALL_SCALAR_LONG
/sys/devices/cpu/events/PM_CMPLU_STALL_STORE
/sys/devices/cpu/events/PM_CMPLU_STALL_THRD
/sys/devices/cpu/events/PM_CMPLU_STALL_VECTOR
/sys/devices/cpu/events/PM_CMPLU_STALL_VECTOR_LONG
/sys/devices/cpu/events/PM_CYC
/sys/devices/cpu/events/PM_GCT_NOSLOT_BR_MPRED
/sys/devices/cpu/events/PM_GCT_NOSLOT_BR_MPRED_IC_MISS
/sys/devices/cpu/events/PM_GCT_NOSLOT_CYC
/sys/devices/cpu/events/PM_GCT_NOSLOT_IC_MISS
/sys/devices/cpu/events/PM_GRP_CMPL
/sys/devices/cpu/events/PM_INST_CMPL
/sys/devices/cpu/events/PM_LD_MISS_L1
/sys/devices/cpu/events/PM_LD_REF_L1
/sys/devices/cpu/events/PM_RUN_CYC
/sys/devices/cpu/events/PM_RUN_INST_CMPL
Date: 2013/01/08 Date: 2013/01/08
......
...@@ -9,6 +9,12 @@ Description: ...@@ -9,6 +9,12 @@ Description:
we want to export, so that userspace can deal with sane we want to export, so that userspace can deal with sane
name/value pairs. name/value pairs.
Userspace must be prepared for the possibility that attributes
define overlapping bit ranges. For example:
attr1 = 'config:0-23'
attr2 = 'config:0-7'
attr3 = 'config:12-35'
Example: 'config1:1,6-10,44' Example: 'config1:1,6-10,44'
Defines contents of attribute that occupies bits 1,6-10,44 of Defines contents of attribute that occupies bits 1,6-10,44 of
perf_event_attr::config1. perf_event_attr::config1.
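The format strings described in this hunk can be turned into bit masks with a few lines of C. The following is a minimal sketch, not code from this commit or from the perf tool: `parse_format()` is a hypothetical helper name, and it assumes the string has already been read from the sysfs file. Comparing two masks of the same field with `&` shows which bits overlap, which is the case the new text warns about.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the kernel or the perf tool): parse a
 * sysfs format string such as "config1:1,6-10,44" into the field name and
 * a 64-bit mask of the bits it occupies.
 * Returns 0 on success, -1 on a malformed string.
 */
static int parse_format(const char *spec, char *field, size_t field_len,
                        uint64_t *mask)
{
    const char *colon = strchr(spec, ':');
    const char *p;
    size_t name_len;

    if (!colon)
        return -1;
    name_len = (size_t)(colon - spec);
    if (name_len == 0 || name_len >= field_len)
        return -1;
    memcpy(field, spec, name_len);
    field[name_len] = '\0';

    *mask = 0;
    p = colon + 1;
    while (*p) {
        char *end;
        unsigned long lo, hi, bit;

        /* Each item is either a single bit "N" or a range "N-M". */
        lo = strtoul(p, &end, 10);
        if (end == p)
            return -1;
        hi = lo;
        if (*end == '-') {
            p = end + 1;
            hi = strtoul(p, &end, 10);
            if (end == p)
                return -1;
        }
        if (lo > hi || hi > 63)
            return -1;
        for (bit = lo; bit <= hi; bit++)
            *mask |= 1ULL << bit;

        if (*end == ',')
            p = end + 1;
        else if (*end == '\0' || *end == '\n')
            break;
        else
            return -1;
    }
    return 0;
}
```

With the overlapping attributes from the example, the masks for 'config:0-23' and 'config:12-35' share bits 12-23, so a caller that sets both attributes has to decide how to combine the values rather than assume the ranges are disjoint.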
...@@ -70,12 +70,12 @@ show up in /proc/sys/kernel: ...@@ -70,12 +70,12 @@ show up in /proc/sys/kernel:
- shmall - shmall
- shmmax [ sysv ipc ] - shmmax [ sysv ipc ]
- shmmni - shmmni
- softlockup_thresh
- stop-a [ SPARC only ] - stop-a [ SPARC only ]
- sysrq ==> Documentation/sysrq.txt - sysrq ==> Documentation/sysrq.txt
- tainted - tainted
- threads-max - threads-max
- unknown_nmi_panic - unknown_nmi_panic
- watchdog_thresh
- version - version
============================================================== ==============================================================
...@@ -427,6 +427,32 @@ This file shows up if CONFIG_DEBUG_STACKOVERFLOW is enabled. ...@@ -427,6 +427,32 @@ This file shows up if CONFIG_DEBUG_STACKOVERFLOW is enabled.
============================================================== ==============================================================
perf_cpu_time_max_percent:
Hints to the kernel how much CPU time it should be allowed to
use to handle perf sampling events. If the perf subsystem
is informed that its samples are exceeding this limit, it
will drop its sampling frequency to attempt to reduce its CPU
usage.
Some perf sampling happens in NMIs. If these samples
unexpectedly take too long to execute, the NMIs can become
stacked up next to each other so much that nothing else is
allowed to execute.
0: disable the mechanism. Do not monitor or correct perf's
sampling rate no matter how CPU time it takes.
1-100: attempt to throttle perf's sample rate to this
percentage of CPU. Note: the kernel calculates an
"expected" length of each sample event. 100 here means
100% of that expected length. Even if this is set to
100, you may still see sample throttling if this
length is exceeded. Set to 0 if you truly do not care
how much CPU is consumed.
==============================================================
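The behaviour documented for perf_cpu_time_max_percent can be illustrated with a small userspace sketch. This is not the kernel implementation: the struct, the function name `sample_took()`, and the choice to halve the rate on each violation are all assumptions made for illustration; the text above only states that the sampling frequency is dropped when samples exceed the limit, and that 0 disables the mechanism.

```c
#include <stdint.h>

/* Hypothetical state for illustration; names are not taken from the kernel. */
struct sample_throttle {
    uint64_t expected_ns;   /* assumed "expected" length of one sample event */
    unsigned int percent;   /* perf_cpu_time_max_percent: 0 disables, 1-100 */
    unsigned int rate_hz;   /* current sampling rate */
};

/*
 * Report how long one sample took.
 * Returns 1 if the sample exceeded the allowed budget, 0 otherwise.
 * Assumption: the rate is halved on each violation (never below 1 Hz).
 */
static int sample_took(struct sample_throttle *t, uint64_t took_ns)
{
    uint64_t allowed_ns;

    if (t->percent == 0)
        return 0;   /* mechanism disabled: never monitor or correct */

    /* The budget is the given percentage of the expected sample length. */
    allowed_ns = t->expected_ns * t->percent / 100;
    if (took_ns <= allowed_ns)
        return 0;

    if (t->rate_hz > 1)
        t->rate_hz /= 2;
    return 1;
}
```

For example, with an expected length of 10000 ns and a setting of 25, the budget is 2500 ns; a 3000 ns sample exceeds it and the sketch lowers the rate, while the same sample is ignored entirely when the setting is 0.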
pid_max: pid_max:
...@@ -604,15 +630,6 @@ without users and with a dead originative process will be destroyed. ...@@ -604,15 +630,6 @@ without users and with a dead originative process will be destroyed.
============================================================== ==============================================================
softlockup_thresh:
This value can be used to lower the softlockup tolerance threshold. The
default threshold is 60 seconds. If a cpu is locked up for 60 seconds,
the kernel complains. Valid values are 1-60 seconds. Setting this
tunable to zero will disable the softlockup detection altogether.
==============================================================
tainted: tainted:
Non-zero if the kernel has been tainted. Numeric values, which Non-zero if the kernel has been tainted. Numeric values, which
...@@ -648,3 +665,16 @@ that time, kernel debugging information is displayed on console. ...@@ -648,3 +665,16 @@ that time, kernel debugging information is displayed on console.
NMI switch that most IA32 servers have fires unknown NMI up, for NMI switch that most IA32 servers have fires unknown NMI up, for
example. If a system hangs up, try pressing the NMI switch. example. If a system hangs up, try pressing the NMI switch.
==============================================================
watchdog_thresh:
This value can be used to control the frequency of hrtimer and NMI
events and the soft and hard lockup thresholds. The default threshold
is 10 seconds.
The softlockup threshold is (2 * watchdog_thresh). Setting this
tunable to zero will disable lockup detection altogether.
==============================================================
NMI Trace Events
These events normally show up here:
/sys/kernel/debug/tracing/events/nmi
--
nmi_handler:
You might want to use this tracepoint if you suspect that your
NMI handlers are hogging large amounts of CPU time. The kernel
will warn if it sees long-running handlers:
INFO: NMI handler took too long to run: 9.207 msecs
and this tracepoint will allow you to drill down and get some
more details.
Let's say you suspect that perf_event_nmi_handler() is causing
you some problems and you only want to trace that handler
specifically. You need to find its address:
$ grep perf_event_nmi_handler /proc/kallsyms
ffffffff81625600 t perf_event_nmi_handler
Let's also say you are only interested in when that function is
really hogging a lot of CPU time, like a millisecond at a time.
Note that the kernel's output is in milliseconds, but the input
to the filter is in nanoseconds! You can filter on 'delta_ns':
cd /sys/kernel/debug/tracing/events/nmi/nmi_handler
echo 'handler==0xffffffff81625600 && delta_ns>1000000' > filter
echo 1 > enable
Your output would then look like:
$ cat /sys/kernel/debug/tracing/trace_pipe
<idle>-0 [000] d.h3 505.397558: nmi_handler: perf_event_nmi_handler() delta_ns: 3236765 handled: 1
<idle>-0 [000] d.h3 505.805893: nmi_handler: perf_event_nmi_handler() delta_ns: 3174234 handled: 1
<idle>-0 [000] d.h3 506.158206: nmi_handler: perf_event_nmi_handler() delta_ns: 3084642 handled: 1
<idle>-0 [000] d.h3 506.334346: nmi_handler: perf_event_nmi_handler() delta_ns: 3080351 handled: 1
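Because the kernel warning reports milliseconds while the filter takes nanoseconds, it is easy to get the threshold wrong by a factor of a million. The sketch below builds the filter expression from a handler address and a millisecond threshold. `nmi_filter()` is a hypothetical helper, not part of any kernel or tracing tool; the resulting string would still have to be written to the `filter` file as shown above.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format an nmi_handler tracepoint filter that matches
 * one handler address and only events longer than min_ms milliseconds.
 * Returns the snprintf() result (length of the full expression).
 */
static int nmi_filter(char *buf, size_t len, uint64_t handler_addr,
                      uint64_t min_ms)
{
    /* The filter field delta_ns is in nanoseconds: 1 ms = 1000000 ns. */
    uint64_t min_ns = min_ms * 1000000ULL;

    return snprintf(buf, len,
                    "handler==0x%" PRIx64 " && delta_ns>%" PRIu64,
                    handler_addr, min_ns);
}
```

Called with the address found in /proc/kallsyms in the example (0xffffffff81625600) and a threshold of 1 ms, it yields the same expression used in the echo command above.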
...@@ -882,7 +882,7 @@ static int __init init_hw_perf_events(void) ...@@ -882,7 +882,7 @@ static int __init init_hw_perf_events(void)
} }
register_cpu_notifier(&metag_pmu_notifier); register_cpu_notifier(&metag_pmu_notifier);
ret = perf_pmu_register(&pmu, (char *)metag_pmu->name, PERF_TYPE_RAW); ret = perf_pmu_register(&pmu, metag_pmu->name, PERF_TYPE_RAW);
out: out:
return ret; return ret;
} }
......
...@@ -62,6 +62,29 @@ ...@@ -62,6 +62,29 @@
#define PME_PM_BRU_FIN 0x10068 #define PME_PM_BRU_FIN 0x10068
#define PME_PM_BRU_MPRED 0x400f6 #define PME_PM_BRU_MPRED 0x400f6
#define PME_PM_CMPLU_STALL_FXU 0x20014
#define PME_PM_CMPLU_STALL_DIV 0x40014
#define PME_PM_CMPLU_STALL_SCALAR 0x40012
#define PME_PM_CMPLU_STALL_SCALAR_LONG 0x20018
#define PME_PM_CMPLU_STALL_VECTOR 0x2001c
#define PME_PM_CMPLU_STALL_VECTOR_LONG 0x4004a
#define PME_PM_CMPLU_STALL_LSU 0x20012
#define PME_PM_CMPLU_STALL_REJECT 0x40016
#define PME_PM_CMPLU_STALL_ERAT_MISS 0x40018
#define PME_PM_CMPLU_STALL_DCACHE_MISS 0x20016
#define PME_PM_CMPLU_STALL_STORE 0x2004a
#define PME_PM_CMPLU_STALL_THRD 0x1001c
#define PME_PM_CMPLU_STALL_IFU 0x4004c
#define PME_PM_CMPLU_STALL_BRU 0x4004e
#define PME_PM_GCT_NOSLOT_IC_MISS 0x2001a
#define PME_PM_GCT_NOSLOT_BR_MPRED 0x4001a
#define PME_PM_GCT_NOSLOT_BR_MPRED_IC_MISS 0x4001c
#define PME_PM_GRP_CMPL 0x30004
#define PME_PM_1PLUS_PPC_CMPL 0x100f2
#define PME_PM_CMPLU_STALL_DFU 0x2003c
#define PME_PM_RUN_CYC 0x200f4
#define PME_PM_RUN_INST_CMPL 0x400fa
/* /*
* Layout of constraint bits: * Layout of constraint bits:
* 6666555555555544444444443333333333222222222211111111110000000000 * 6666555555555544444444443333333333222222222211111111110000000000
...@@ -393,6 +416,31 @@ POWER_EVENT_ATTR(LD_MISS_L1, LD_MISS_L1); ...@@ -393,6 +416,31 @@ POWER_EVENT_ATTR(LD_MISS_L1, LD_MISS_L1);
POWER_EVENT_ATTR(BRU_FIN, BRU_FIN) POWER_EVENT_ATTR(BRU_FIN, BRU_FIN)
POWER_EVENT_ATTR(BRU_MPRED, BRU_MPRED); POWER_EVENT_ATTR(BRU_MPRED, BRU_MPRED);
POWER_EVENT_ATTR(CMPLU_STALL_FXU, CMPLU_STALL_FXU);
POWER_EVENT_ATTR(CMPLU_STALL_DIV, CMPLU_STALL_DIV);
POWER_EVENT_ATTR(CMPLU_STALL_SCALAR, CMPLU_STALL_SCALAR);
POWER_EVENT_ATTR(CMPLU_STALL_SCALAR_LONG, CMPLU_STALL_SCALAR_LONG);
POWER_EVENT_ATTR(CMPLU_STALL_VECTOR, CMPLU_STALL_VECTOR);
POWER_EVENT_ATTR(CMPLU_STALL_VECTOR_LONG, CMPLU_STALL_VECTOR_LONG);
POWER_EVENT_ATTR(CMPLU_STALL_LSU, CMPLU_STALL_LSU);
POWER_EVENT_ATTR(CMPLU_STALL_REJECT, CMPLU_STALL_REJECT);
POWER_EVENT_ATTR(CMPLU_STALL_ERAT_MISS, CMPLU_STALL_ERAT_MISS);
POWER_EVENT_ATTR(CMPLU_STALL_DCACHE_MISS, CMPLU_STALL_DCACHE_MISS);
POWER_EVENT_ATTR(CMPLU_STALL_STORE, CMPLU_STALL_STORE);
POWER_EVENT_ATTR(CMPLU_STALL_THRD, CMPLU_STALL_THRD);
POWER_EVENT_ATTR(CMPLU_STALL_IFU, CMPLU_STALL_IFU);
POWER_EVENT_ATTR(CMPLU_STALL_BRU, CMPLU_STALL_BRU);
POWER_EVENT_ATTR(GCT_NOSLOT_IC_MISS, GCT_NOSLOT_IC_MISS);
POWER_EVENT_ATTR(GCT_NOSLOT_BR_MPRED, GCT_NOSLOT_BR_MPRED);
POWER_EVENT_ATTR(GCT_NOSLOT_BR_MPRED_IC_MISS, GCT_NOSLOT_BR_MPRED_IC_MISS);
POWER_EVENT_ATTR(GRP_CMPL, GRP_CMPL);
POWER_EVENT_ATTR(1PLUS_PPC_CMPL, 1PLUS_PPC_CMPL);
POWER_EVENT_ATTR(CMPLU_STALL_DFU, CMPLU_STALL_DFU);
POWER_EVENT_ATTR(RUN_CYC, RUN_CYC);
POWER_EVENT_ATTR(RUN_INST_CMPL, RUN_INST_CMPL);
static struct attribute *power7_events_attr[] = { static struct attribute *power7_events_attr[] = {
GENERIC_EVENT_PTR(CYC), GENERIC_EVENT_PTR(CYC),
GENERIC_EVENT_PTR(GCT_NOSLOT_CYC), GENERIC_EVENT_PTR(GCT_NOSLOT_CYC),
...@@ -411,6 +459,31 @@ static struct attribute *power7_events_attr[] = { ...@@ -411,6 +459,31 @@ static struct attribute *power7_events_attr[] = {
POWER_EVENT_PTR(LD_MISS_L1), POWER_EVENT_PTR(LD_MISS_L1),
POWER_EVENT_PTR(BRU_FIN), POWER_EVENT_PTR(BRU_FIN),
POWER_EVENT_PTR(BRU_MPRED), POWER_EVENT_PTR(BRU_MPRED),
POWER_EVENT_PTR(CMPLU_STALL_FXU),
POWER_EVENT_PTR(CMPLU_STALL_DIV),
POWER_EVENT_PTR(CMPLU_STALL_SCALAR),
POWER_EVENT_PTR(CMPLU_STALL_SCALAR_LONG),
POWER_EVENT_PTR(CMPLU_STALL_VECTOR),
POWER_EVENT_PTR(CMPLU_STALL_VECTOR_LONG),
POWER_EVENT_PTR(CMPLU_STALL_LSU),
POWER_EVENT_PTR(CMPLU_STALL_REJECT),
POWER_EVENT_PTR(CMPLU_STALL_ERAT_MISS),
POWER_EVENT_PTR(CMPLU_STALL_DCACHE_MISS),
POWER_EVENT_PTR(CMPLU_STALL_STORE),
POWER_EVENT_PTR(CMPLU_STALL_THRD),
POWER_EVENT_PTR(CMPLU_STALL_IFU),
POWER_EVENT_PTR(CMPLU_STALL_BRU),
POWER_EVENT_PTR(GCT_NOSLOT_IC_MISS),
POWER_EVENT_PTR(GCT_NOSLOT_BR_MPRED),
POWER_EVENT_PTR(GCT_NOSLOT_BR_MPRED_IC_MISS),
POWER_EVENT_PTR(GRP_CMPL),
POWER_EVENT_PTR(1PLUS_PPC_CMPL),
POWER_EVENT_PTR(CMPLU_STALL_DFU),
POWER_EVENT_PTR(RUN_CYC),
POWER_EVENT_PTR(RUN_INST_CMPL),
NULL NULL
}; };
......
...@@ -34,8 +34,6 @@ ...@@ -34,8 +34,6 @@
#include <asm/sys_ia32.h> #include <asm/sys_ia32.h>
#include <asm/smap.h> #include <asm/smap.h>
#define FIX_EFLAGS __FIX_EFLAGS
int copy_siginfo_to_user32(compat_siginfo_t __user *to, siginfo_t *from) int copy_siginfo_to_user32(compat_siginfo_t __user *to, siginfo_t *from)
{ {
int err = 0; int err = 0;
......
...@@ -29,6 +29,9 @@ ...@@ -29,6 +29,9 @@
#define ARCH_PERFMON_EVENTSEL_INV (1ULL << 23) #define ARCH_PERFMON_EVENTSEL_INV (1ULL << 23)
#define ARCH_PERFMON_EVENTSEL_CMASK 0xFF000000ULL #define ARCH_PERFMON_EVENTSEL_CMASK 0xFF000000ULL
#define HSW_IN_TX (1ULL << 32)
#define HSW_IN_TX_CHECKPOINTED (1ULL << 33)
#define AMD64_EVENTSEL_INT_CORE_ENABLE (1ULL << 36) #define AMD64_EVENTSEL_INT_CORE_ENABLE (1ULL << 36)
#define AMD64_EVENTSEL_GUESTONLY (1ULL << 40) #define AMD64_EVENTSEL_GUESTONLY (1ULL << 40)
#define AMD64_EVENTSEL_HOSTONLY (1ULL << 41) #define AMD64_EVENTSEL_HOSTONLY (1ULL << 41)
......
...@@ -7,10 +7,10 @@ ...@@ -7,10 +7,10 @@
#include <asm/processor-flags.h> #include <asm/processor-flags.h>
#define __FIX_EFLAGS (X86_EFLAGS_AC | X86_EFLAGS_OF | \ #define FIX_EFLAGS (X86_EFLAGS_AC | X86_EFLAGS_OF | \
X86_EFLAGS_DF | X86_EFLAGS_TF | X86_EFLAGS_SF | \ X86_EFLAGS_DF | X86_EFLAGS_TF | X86_EFLAGS_SF | \
X86_EFLAGS_ZF | X86_EFLAGS_AF | X86_EFLAGS_PF | \ X86_EFLAGS_ZF | X86_EFLAGS_AF | X86_EFLAGS_PF | \
X86_EFLAGS_CF) X86_EFLAGS_CF | X86_EFLAGS_RF)
void signal_fault(struct pt_regs *regs, void __user *frame, char *where); void signal_fault(struct pt_regs *regs, void __user *frame, char *where);
......
...@@ -170,6 +170,9 @@ ...@@ -170,6 +170,9 @@
#define MSR_KNC_EVNTSEL0 0x00000028 #define MSR_KNC_EVNTSEL0 0x00000028
#define MSR_KNC_EVNTSEL1 0x00000029 #define MSR_KNC_EVNTSEL1 0x00000029
/* Alternative perfctr range with full access. */
#define MSR_IA32_PMC0 0x000004c1
/* AMD64 MSRs. Not complete. See the architecture manual for a more /* AMD64 MSRs. Not complete. See the architecture manual for a more
complete list. */ complete list. */
......
...@@ -31,11 +31,15 @@ obj-$(CONFIG_PERF_EVENTS) += perf_event.o ...@@ -31,11 +31,15 @@ obj-$(CONFIG_PERF_EVENTS) += perf_event.o
ifdef CONFIG_PERF_EVENTS ifdef CONFIG_PERF_EVENTS
obj-$(CONFIG_CPU_SUP_AMD) += perf_event_amd.o perf_event_amd_uncore.o obj-$(CONFIG_CPU_SUP_AMD) += perf_event_amd.o perf_event_amd_uncore.o
ifdef CONFIG_AMD_IOMMU
obj-$(CONFIG_CPU_SUP_AMD) += perf_event_amd_iommu.o
endif
obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_p6.o perf_event_knc.o perf_event_p4.o obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_p6.o perf_event_knc.o perf_event_p4.o
obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_lbr.o perf_event_intel_ds.o perf_event_intel.o obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_lbr.o perf_event_intel_ds.o perf_event_intel.o
obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_uncore.o obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_uncore.o
endif endif
obj-$(CONFIG_X86_MCE) += mcheck/ obj-$(CONFIG_X86_MCE) += mcheck/
obj-$(CONFIG_MTRR) += mtrr/ obj-$(CONFIG_MTRR) += mtrr/
......
...@@ -403,7 +403,8 @@ int x86_pmu_hw_config(struct perf_event *event) ...@@ -403,7 +403,8 @@ int x86_pmu_hw_config(struct perf_event *event)
* check that PEBS LBR correction does not conflict with * check that PEBS LBR correction does not conflict with
* whatever the user is asking with attr->branch_sample_type * whatever the user is asking with attr->branch_sample_type
*/ */
if (event->attr.precise_ip > 1) { if (event->attr.precise_ip > 1 &&
x86_pmu.intel_cap.pebs_format < 2) {
u64 *br_type = &event->attr.branch_sample_type; u64 *br_type = &event->attr.branch_sample_type;
if (has_branch_stack(event)) { if (has_branch_stack(event)) {
...@@ -568,7 +569,7 @@ struct sched_state { ...@@ -568,7 +569,7 @@ struct sched_state {
struct perf_sched { struct perf_sched {
int max_weight; int max_weight;
int max_events; int max_events;
struct event_constraint **constraints; struct perf_event **events;
struct sched_state state; struct sched_state state;
int saved_states; int saved_states;
struct sched_state saved[SCHED_STATES_MAX]; struct sched_state saved[SCHED_STATES_MAX];
...@@ -577,7 +578,7 @@ struct perf_sched { ...@@ -577,7 +578,7 @@ struct perf_sched {
/* /*
* Initialize interator that runs through all events and counters. * Initialize interator that runs through all events and counters.
*/ */
static void perf_sched_init(struct perf_sched *sched, struct event_constraint **c, static void perf_sched_init(struct perf_sched *sched, struct perf_event **events,
int num, int wmin, int wmax) int num, int wmin, int wmax)
{ {
int idx; int idx;
...@@ -585,10 +586,10 @@ static void perf_sched_init(struct perf_sched *sched, struct event_constraint ** ...@@ -585,10 +586,10 @@ static void perf_sched_init(struct perf_sched *sched, struct event_constraint **
memset(sched, 0, sizeof(*sched)); memset(sched, 0, sizeof(*sched));
sched->max_events = num; sched->max_events = num;
sched->max_weight = wmax; sched->max_weight = wmax;
sched->constraints = c; sched->events = events;
for (idx = 0; idx < num; idx++) { for (idx = 0; idx < num; idx++) {
if (c[idx]->weight == wmin) if (events[idx]->hw.constraint->weight == wmin)
break; break;
} }
...@@ -635,8 +636,7 @@ static bool __perf_sched_find_counter(struct perf_sched *sched) ...@@ -635,8 +636,7 @@ static bool __perf_sched_find_counter(struct perf_sched *sched)
if (sched->state.event >= sched->max_events) if (sched->state.event >= sched->max_events)
return false; return false;
c = sched->constraints[sched->state.event]; c = sched->events[sched->state.event]->hw.constraint;
/* Prefer fixed purpose counters */ /* Prefer fixed purpose counters */
if (c->idxmsk64 & (~0ULL << INTEL_PMC_IDX_FIXED)) { if (c->idxmsk64 & (~0ULL << INTEL_PMC_IDX_FIXED)) {
idx = INTEL_PMC_IDX_FIXED; idx = INTEL_PMC_IDX_FIXED;
...@@ -694,7 +694,7 @@ static bool perf_sched_next_event(struct perf_sched *sched) ...@@ -694,7 +694,7 @@ static bool perf_sched_next_event(struct perf_sched *sched)
if (sched->state.weight > sched->max_weight) if (sched->state.weight > sched->max_weight)
return false; return false;
} }
c = sched->constraints[sched->state.event]; c = sched->events[sched->state.event]->hw.constraint;
} while (c->weight != sched->state.weight); } while (c->weight != sched->state.weight);
sched->state.counter = 0; /* start with first counter */ sched->state.counter = 0; /* start with first counter */
...@@ -705,12 +705,12 @@ static bool perf_sched_next_event(struct perf_sched *sched) ...@@ -705,12 +705,12 @@ static bool perf_sched_next_event(struct perf_sched *sched)
/* /*
* Assign a counter for each event. * Assign a counter for each event.
*/ */
int perf_assign_events(struct event_constraint **constraints, int n, int perf_assign_events(struct perf_event **events, int n,
int wmin, int wmax, int *assign) int wmin, int wmax, int *assign)
{ {
struct perf_sched sched; struct perf_sched sched;
perf_sched_init(&sched, constraints, n, wmin, wmax); perf_sched_init(&sched, events, n, wmin, wmax);
do { do {
if (!perf_sched_find_counter(&sched)) if (!perf_sched_find_counter(&sched))
...@@ -724,16 +724,19 @@ int perf_assign_events(struct event_constraint **constraints, int n, ...@@ -724,16 +724,19 @@ int perf_assign_events(struct event_constraint **constraints, int n,
int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign) int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
{ {
struct event_constraint *c, *constraints[X86_PMC_IDX_MAX]; struct event_constraint *c;
unsigned long used_mask[BITS_TO_LONGS(X86_PMC_IDX_MAX)]; unsigned long used_mask[BITS_TO_LONGS(X86_PMC_IDX_MAX)];
struct perf_event *e;
int i, wmin, wmax, num = 0; int i, wmin, wmax, num = 0;
struct hw_perf_event *hwc; struct hw_perf_event *hwc;
bitmap_zero(used_mask, X86_PMC_IDX_MAX); bitmap_zero(used_mask, X86_PMC_IDX_MAX);
for (i = 0, wmin = X86_PMC_IDX_MAX, wmax = 0; i < n; i++) { for (i = 0, wmin = X86_PMC_IDX_MAX, wmax = 0; i < n; i++) {
hwc = &cpuc->event_list[i]->hw;
c = x86_pmu.get_event_constraints(cpuc, cpuc->event_list[i]); c = x86_pmu.get_event_constraints(cpuc, cpuc->event_list[i]);
constraints[i] = c; hwc->constraint = c;
wmin = min(wmin, c->weight); wmin = min(wmin, c->weight);
wmax = max(wmax, c->weight); wmax = max(wmax, c->weight);
} }
...@@ -743,7 +746,7 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign) ...@@ -743,7 +746,7 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
*/ */
for (i = 0; i < n; i++) { for (i = 0; i < n; i++) {
hwc = &cpuc->event_list[i]->hw; hwc = &cpuc->event_list[i]->hw;
c = constraints[i]; c = hwc->constraint;
/* never assigned */ /* never assigned */
if (hwc->idx == -1) if (hwc->idx == -1)
...@@ -764,16 +767,35 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign) ...@@ -764,16 +767,35 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
/* slow path */ /* slow path */
if (i != n) if (i != n)
num = perf_assign_events(constraints, n, wmin, wmax, assign); num = perf_assign_events(cpuc->event_list, n, wmin,
wmax, assign);
/*
* Mark the event as committed, so we do not put_constraint()
* in case new events are added and fail scheduling.
*/
if (!num && assign) {
for (i = 0; i < n; i++) {
e = cpuc->event_list[i];
e->hw.flags |= PERF_X86_EVENT_COMMITTED;
}
}
/* /*
* scheduling failed or is just a simulation, * scheduling failed or is just a simulation,
* free resources if necessary * free resources if necessary
*/ */
if (!assign || num) { if (!assign || num) {
for (i = 0; i < n; i++) { for (i = 0; i < n; i++) {
e = cpuc->event_list[i];
/*
* do not put_constraint() on comitted events,
* because they are good to go
*/
if ((e->hw.flags & PERF_X86_EVENT_COMMITTED))
continue;
if (x86_pmu.put_event_constraints) if (x86_pmu.put_event_constraints)
x86_pmu.put_event_constraints(cpuc, cpuc->event_list[i]); x86_pmu.put_event_constraints(cpuc, e);
} }
} }
return num ? -EINVAL : 0; return num ? -EINVAL : 0;
...@@ -1152,6 +1174,11 @@ static void x86_pmu_del(struct perf_event *event, int flags) ...@@ -1152,6 +1174,11 @@ static void x86_pmu_del(struct perf_event *event, int flags)
struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events); struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
int i; int i;
/*
* event is descheduled
*/
event->hw.flags &= ~PERF_X86_EVENT_COMMITTED;
/* /*
* If we're called during a txn, we don't need to do anything. * If we're called during a txn, we don't need to do anything.
* The events never got scheduled and ->cancel_txn will truncate * The events never got scheduled and ->cancel_txn will truncate
...@@ -1249,10 +1276,20 @@ void perf_events_lapic_init(void) ...@@ -1249,10 +1276,20 @@ void perf_events_lapic_init(void)
static int __kprobes static int __kprobes
perf_event_nmi_handler(unsigned int cmd, struct pt_regs *regs) perf_event_nmi_handler(unsigned int cmd, struct pt_regs *regs)
{ {
int ret;
u64 start_clock;
u64 finish_clock;
if (!atomic_read(&active_events)) if (!atomic_read(&active_events))
return NMI_DONE; return NMI_DONE;
return x86_pmu.handle_irq(regs); start_clock = local_clock();
ret = x86_pmu.handle_irq(regs);
finish_clock = local_clock();
perf_sample_event_took(finish_clock - start_clock);
return ret;
} }
struct event_constraint emptyconstraint; struct event_constraint emptyconstraint;
......
...@@ -63,10 +63,12 @@ struct event_constraint { ...@@ -63,10 +63,12 @@ struct event_constraint {
int flags; int flags;
}; };
/* /*
* struct event_constraint flags * struct hw_perf_event.flags flags
*/ */
#define PERF_X86_EVENT_PEBS_LDLAT 0x1 /* ld+ldlat data address sampling */ #define PERF_X86_EVENT_PEBS_LDLAT 0x1 /* ld+ldlat data address sampling */
#define PERF_X86_EVENT_PEBS_ST 0x2 /* st data address sampling */ #define PERF_X86_EVENT_PEBS_ST 0x2 /* st data address sampling */
#define PERF_X86_EVENT_PEBS_ST_HSW 0x4 /* haswell style st data sampling */
#define PERF_X86_EVENT_COMMITTED 0x8 /* event passed commit_txn */
struct amd_nb { struct amd_nb {
int nb_id; /* NorthBridge id */ int nb_id; /* NorthBridge id */
...@@ -227,11 +229,14 @@ struct cpu_hw_events { ...@@ -227,11 +229,14 @@ struct cpu_hw_events {
* - inv * - inv
* - edge * - edge
* - cnt-mask * - cnt-mask
* - in_tx
* - in_tx_checkpointed
* The other filters are supported by fixed counters. * The other filters are supported by fixed counters.
* The any-thread option is supported starting with v3. * The any-thread option is supported starting with v3.
*/ */
#define FIXED_EVENT_FLAGS (X86_RAW_EVENT_MASK|HSW_IN_TX|HSW_IN_TX_CHECKPOINTED)
#define FIXED_EVENT_CONSTRAINT(c, n) \ #define FIXED_EVENT_CONSTRAINT(c, n) \
EVENT_CONSTRAINT(c, (1ULL << (32+n)), X86_RAW_EVENT_MASK) EVENT_CONSTRAINT(c, (1ULL << (32+n)), FIXED_EVENT_FLAGS)
/* /*
* Constraint on the Event code + UMask * Constraint on the Event code + UMask
...@@ -247,6 +252,11 @@ struct cpu_hw_events { ...@@ -247,6 +252,11 @@ struct cpu_hw_events {
__EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK, \ __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK, \
HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_ST) HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_ST)
/* DataLA version of store sampling without extra enable bit. */
#define INTEL_PST_HSW_CONSTRAINT(c, n) \
__EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK, \
HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_ST_HSW)
#define EVENT_CONSTRAINT_END \ #define EVENT_CONSTRAINT_END \
EVENT_CONSTRAINT(0, 0, 0) EVENT_CONSTRAINT(0, 0, 0)
...@@ -301,6 +311,11 @@ union perf_capabilities { ...@@ -301,6 +311,11 @@ union perf_capabilities {
u64 pebs_arch_reg:1; u64 pebs_arch_reg:1;
u64 pebs_format:4; u64 pebs_format:4;
u64 smm_freeze:1; u64 smm_freeze:1;
/*
* PMU supports separate counter range for writing
* values > 32bit.
*/
u64 full_width_write:1;
}; };
u64 capabilities; u64 capabilities;
}; };
...@@ -375,6 +390,7 @@ struct x86_pmu { ...@@ -375,6 +390,7 @@ struct x86_pmu {
struct event_constraint *event_constraints; struct event_constraint *event_constraints;
struct x86_pmu_quirk *quirks; struct x86_pmu_quirk *quirks;
int perfctr_second_write; int perfctr_second_write;
bool late_ack;
/* /*
* sysfs attrs * sysfs attrs
...@@ -528,7 +544,7 @@ static inline void __x86_pmu_enable_event(struct hw_perf_event *hwc, ...@@ -528,7 +544,7 @@ static inline void __x86_pmu_enable_event(struct hw_perf_event *hwc,
void x86_pmu_enable_all(int added); void x86_pmu_enable_all(int added);
int perf_assign_events(struct event_constraint **constraints, int n, int perf_assign_events(struct perf_event **events, int n,
int wmin, int wmax, int *assign); int wmin, int wmax, int *assign);
int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign); int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign);
...@@ -633,6 +649,8 @@ extern struct event_constraint intel_snb_pebs_event_constraints[]; ...@@ -633,6 +649,8 @@ extern struct event_constraint intel_snb_pebs_event_constraints[];
extern struct event_constraint intel_ivb_pebs_event_constraints[]; extern struct event_constraint intel_ivb_pebs_event_constraints[];
extern struct event_constraint intel_hsw_pebs_event_constraints[];
struct event_constraint *intel_pebs_constraints(struct perf_event *event); struct event_constraint *intel_pebs_constraints(struct perf_event *event);
void intel_pmu_pebs_enable(struct perf_event *event); void intel_pmu_pebs_enable(struct perf_event *event);
......
...@@ -648,48 +648,48 @@ static __initconst const struct x86_pmu amd_pmu = { ...@@ -648,48 +648,48 @@ static __initconst const struct x86_pmu amd_pmu = {
.cpu_dead = amd_pmu_cpu_dead, .cpu_dead = amd_pmu_cpu_dead,
}; };
static int setup_event_constraints(void) static int __init amd_core_pmu_init(void)
{ {
if (boot_cpu_data.x86 == 0x15) if (!cpu_has_perfctr_core)
return 0;
switch (boot_cpu_data.x86) {
case 0x15:
pr_cont("Fam15h ");
x86_pmu.get_event_constraints = amd_get_event_constraints_f15h; x86_pmu.get_event_constraints = amd_get_event_constraints_f15h;
return 0; break;
}
static int setup_perfctr_core(void) default:
{ pr_err("core perfctr but no constraints; unknown hardware!\n");
if (!cpu_has_perfctr_core) {
WARN(x86_pmu.get_event_constraints == amd_get_event_constraints_f15h,
KERN_ERR "Odd, counter constraints enabled but no core perfctrs detected!");
return -ENODEV; return -ENODEV;
} }
WARN(x86_pmu.get_event_constraints == amd_get_event_constraints,
KERN_ERR "hw perf events core counters need constraints handler!");
/* /*
* If core performance counter extensions exist, we must use * If core performance counter extensions exist, we must use
* MSR_F15H_PERF_CTL/MSR_F15H_PERF_CTR msrs. See also * MSR_F15H_PERF_CTL/MSR_F15H_PERF_CTR msrs. See also
* x86_pmu_addr_offset(). * amd_pmu_addr_offset().
*/ */
x86_pmu.eventsel = MSR_F15H_PERF_CTL; x86_pmu.eventsel = MSR_F15H_PERF_CTL;
x86_pmu.perfctr = MSR_F15H_PERF_CTR; x86_pmu.perfctr = MSR_F15H_PERF_CTR;
x86_pmu.num_counters = AMD64_NUM_COUNTERS_CORE; x86_pmu.num_counters = AMD64_NUM_COUNTERS_CORE;
printk(KERN_INFO "perf: AMD core performance counters detected\n"); pr_cont("core perfctr, ");
return 0; return 0;
} }
__init int amd_pmu_init(void) __init int amd_pmu_init(void)
{ {
int ret;
/* Performance-monitoring supported from K7 and later: */ /* Performance-monitoring supported from K7 and later: */
if (boot_cpu_data.x86 < 6) if (boot_cpu_data.x86 < 6)
return -ENODEV; return -ENODEV;
x86_pmu = amd_pmu; x86_pmu = amd_pmu;
setup_event_constraints(); ret = amd_core_pmu_init();
setup_perfctr_core(); if (ret)
return ret;
/* Events are common for all AMDs */ /* Events are common for all AMDs */
memcpy(hw_cache_event_ids, amd_hw_cache_event_ids, memcpy(hw_cache_event_ids, amd_hw_cache_event_ids,
......
/*
* Copyright (C) 2013 Advanced Micro Devices, Inc.
*
* Author: Steven Kinney <Steven.Kinney@amd.com>
* Author: Suravee Suthikulpanit <Suraveee.Suthikulpanit@amd.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#ifndef _PERF_EVENT_AMD_IOMMU_H_
#define _PERF_EVENT_AMD_IOMMU_H_
/* iommu pc mmio region register indexes */
#define IOMMU_PC_COUNTER_REG 0x00
#define IOMMU_PC_COUNTER_SRC_REG 0x08
#define IOMMU_PC_PASID_MATCH_REG 0x10
#define IOMMU_PC_DOMID_MATCH_REG 0x18
#define IOMMU_PC_DEVID_MATCH_REG 0x20
#define IOMMU_PC_COUNTER_REPORT_REG 0x28
/* maximum specified bank/counters */
#define PC_MAX_SPEC_BNKS 64
#define PC_MAX_SPEC_CNTRS 16
/* iommu pc reg masks*/
#define IOMMU_BASE_DEVID 0x0000
/* amd_iommu_init.c external support functions */
extern bool amd_iommu_pc_supported(void);
extern u8 amd_iommu_pc_get_max_banks(u16 devid);
extern u8 amd_iommu_pc_get_max_counters(u16 devid);
extern int amd_iommu_pc_get_set_reg_val(u16 devid, u8 bank, u8 cntr,
u8 fxn, u64 *value, bool is_write);
#endif /*_PERF_EVENT_AMD_IOMMU_H_*/
...@@ -13,6 +13,7 @@ ...@@ -13,6 +13,7 @@
#include <linux/slab.h> #include <linux/slab.h>
#include <linux/export.h> #include <linux/export.h>
#include <asm/cpufeature.h>
#include <asm/hardirq.h> #include <asm/hardirq.h>
#include <asm/apic.h> #include <asm/apic.h>
...@@ -190,6 +191,22 @@ struct attribute *snb_events_attrs[] = { ...@@ -190,6 +191,22 @@ struct attribute *snb_events_attrs[] = {
NULL, NULL,
}; };
static struct event_constraint intel_hsw_event_constraints[] = {
FIXED_EVENT_CONSTRAINT(0x00c0, 0), /* INST_RETIRED.ANY */
FIXED_EVENT_CONSTRAINT(0x003c, 1), /* CPU_CLK_UNHALTED.CORE */
FIXED_EVENT_CONSTRAINT(0x0300, 2), /* CPU_CLK_UNHALTED.REF */
INTEL_EVENT_CONSTRAINT(0x48, 0x4), /* L1D_PEND_MISS.* */
INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PREC_DIST */
INTEL_EVENT_CONSTRAINT(0xcd, 0x8), /* MEM_TRANS_RETIRED.LOAD_LATENCY */
/* CYCLE_ACTIVITY.CYCLES_L1D_PENDING */
INTEL_EVENT_CONSTRAINT(0x08a3, 0x4),
/* CYCLE_ACTIVITY.STALLS_L1D_PENDING */
INTEL_EVENT_CONSTRAINT(0x0ca3, 0x4),
/* CYCLE_ACTIVITY.CYCLES_NO_EXECUTE */
INTEL_EVENT_CONSTRAINT(0x04a3, 0xf),
EVENT_CONSTRAINT_END
};
static u64 intel_pmu_event_map(int hw_event) static u64 intel_pmu_event_map(int hw_event)
{ {
return intel_perfmon_event_map[hw_event]; return intel_perfmon_event_map[hw_event];
...@@ -872,7 +889,8 @@ static inline bool intel_pmu_needs_lbr_smpl(struct perf_event *event) ...@@ -872,7 +889,8 @@ static inline bool intel_pmu_needs_lbr_smpl(struct perf_event *event)
return true; return true;
/* implicit branch sampling to correct PEBS skid */ /* implicit branch sampling to correct PEBS skid */
if (x86_pmu.intel_cap.pebs_trap && event->attr.precise_ip > 1) if (x86_pmu.intel_cap.pebs_trap && event->attr.precise_ip > 1 &&
x86_pmu.intel_cap.pebs_format < 2)
return true; return true;
return false; return false;
...@@ -1167,15 +1185,11 @@ static int intel_pmu_handle_irq(struct pt_regs *regs) ...@@ -1167,15 +1185,11 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
cpuc = &__get_cpu_var(cpu_hw_events); cpuc = &__get_cpu_var(cpu_hw_events);
/* /*
* Some chipsets need to unmask the LVTPC in a particular spot * No known reason to not always do late ACK,
* inside the nmi handler. As a result, the unmasking was pushed * but just in case do it opt-in.
* into all the nmi handlers.
*
* This handler doesn't seem to have any issues with the unmasking
* so it was left at the top.
*/ */
apic_write(APIC_LVTPC, APIC_DM_NMI); if (!x86_pmu.late_ack)
apic_write(APIC_LVTPC, APIC_DM_NMI);
intel_pmu_disable_all(); intel_pmu_disable_all();
handled = intel_pmu_drain_bts_buffer(); handled = intel_pmu_drain_bts_buffer();
status = intel_pmu_get_status(); status = intel_pmu_get_status();
...@@ -1188,8 +1202,12 @@ static int intel_pmu_handle_irq(struct pt_regs *regs) ...@@ -1188,8 +1202,12 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
again: again:
intel_pmu_ack_status(status); intel_pmu_ack_status(status);
if (++loops > 100) { if (++loops > 100) {
WARN_ONCE(1, "perfevents: irq loop stuck!\n"); static bool warned = false;
perf_event_print_debug(); if (!warned) {
WARN(1, "perfevents: irq loop stuck!\n");
perf_event_print_debug();
warned = true;
}
intel_pmu_reset(); intel_pmu_reset();
goto done; goto done;
} }
...@@ -1235,6 +1253,13 @@ static int intel_pmu_handle_irq(struct pt_regs *regs) ...@@ -1235,6 +1253,13 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
done: done:
intel_pmu_enable_all(0); intel_pmu_enable_all(0);
/*
* Only unmask the NMI after the overflow counters
* have been reset. This avoids spurious NMIs on
* Haswell CPUs.
*/
if (x86_pmu.late_ack)
apic_write(APIC_LVTPC, APIC_DM_NMI);
return handled; return handled;
} }
...@@ -1425,7 +1450,6 @@ x86_get_event_constraints(struct cpu_hw_events *cpuc, struct perf_event *event) ...@@ -1425,7 +1450,6 @@ x86_get_event_constraints(struct cpu_hw_events *cpuc, struct perf_event *event)
if (x86_pmu.event_constraints) { if (x86_pmu.event_constraints) {
for_each_event_constraint(c, x86_pmu.event_constraints) { for_each_event_constraint(c, x86_pmu.event_constraints) {
if ((event->hw.config & c->cmask) == c->code) { if ((event->hw.config & c->cmask) == c->code) {
/* hw.flags zeroed at initialization */
event->hw.flags |= c->flags; event->hw.flags |= c->flags;
return c; return c;
} }
...@@ -1473,7 +1497,6 @@ intel_put_shared_regs_event_constraints(struct cpu_hw_events *cpuc, ...@@ -1473,7 +1497,6 @@ intel_put_shared_regs_event_constraints(struct cpu_hw_events *cpuc,
static void intel_put_event_constraints(struct cpu_hw_events *cpuc, static void intel_put_event_constraints(struct cpu_hw_events *cpuc,
struct perf_event *event) struct perf_event *event)
{ {
event->hw.flags = 0;
intel_put_shared_regs_event_constraints(cpuc, event); intel_put_shared_regs_event_constraints(cpuc, event);
} }
...@@ -1646,6 +1669,47 @@ static void core_pmu_enable_all(int added) ...@@ -1646,6 +1669,47 @@ static void core_pmu_enable_all(int added)
} }
} }
static int hsw_hw_config(struct perf_event *event)
{
int ret = intel_pmu_hw_config(event);
if (ret)
return ret;
if (!boot_cpu_has(X86_FEATURE_RTM) && !boot_cpu_has(X86_FEATURE_HLE))
return 0;
event->hw.config |= event->attr.config & (HSW_IN_TX|HSW_IN_TX_CHECKPOINTED);
/*
* IN_TX/IN_TX-CP filters are not supported by the Haswell PMU with
* PEBS or in ANY thread mode. Since the results are non-sensical forbid
* this combination.
*/
if ((event->hw.config & (HSW_IN_TX|HSW_IN_TX_CHECKPOINTED)) &&
((event->hw.config & ARCH_PERFMON_EVENTSEL_ANY) ||
event->attr.precise_ip > 0))
return -EOPNOTSUPP;
return 0;
}
static struct event_constraint counter2_constraint =
EVENT_CONSTRAINT(0, 0x4, 0);
static struct event_constraint *
hsw_get_event_constraints(struct cpu_hw_events *cpuc, struct perf_event *event)
{
struct event_constraint *c = intel_get_event_constraints(cpuc, event);
/* Handle special quirk on in_tx_checkpointed only in counter 2 */
if (event->hw.config & HSW_IN_TX_CHECKPOINTED) {
if (c->idxmsk64 & (1U << 2))
return &counter2_constraint;
return &emptyconstraint;
}
return c;
}
PMU_FORMAT_ATTR(event, "config:0-7" ); PMU_FORMAT_ATTR(event, "config:0-7" );
PMU_FORMAT_ATTR(umask, "config:8-15" ); PMU_FORMAT_ATTR(umask, "config:8-15" );
PMU_FORMAT_ATTR(edge, "config:18" ); PMU_FORMAT_ATTR(edge, "config:18" );
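The two Haswell helpers added in this hunk can be read as pure functions of the event configuration. The following is a minimal user-space sketch of that logic, not the kernel code: the `sk_` names are hypothetical, and the bit positions are taken from the format attributes shown in this diff (`any` = config:21, `in_tx` = config:32, `in_tx_cp` = config:33).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Bit positions assumed from the PMU_FORMAT_ATTR lines in this diff. */
#define SK_EVENTSEL_ANY        (1ULL << 21)
#define SK_IN_TX               (1ULL << 32)
#define SK_IN_TX_CHECKPOINTED  (1ULL << 33)

/*
 * Hypothetical helper: returns 0 when the combination is acceptable and
 * -EOPNOTSUPP when a TSX filter is combined with ANY-thread mode or PEBS.
 */
int sk_check_tx_filter(uint64_t config, int precise_ip)
{
	uint64_t tx = config & (SK_IN_TX | SK_IN_TX_CHECKPOINTED);

	assert(precise_ip >= 0 && precise_ip <= 3); /* precise_ip is a 2-bit field */
	if (tx && ((config & SK_EVENTSEL_ANY) || precise_ip > 0))
		return -EOPNOTSUPP;
	return 0;
}

/*
 * Hypothetical helper for the counter-2 quirk. idxmsk is the set of
 * counters the base constraint allows; the return value is the narrowed
 * set (counter 2 only, or empty) when in_tx_cp is requested.
 */
uint64_t sk_tx_cp_counter_mask(uint64_t config, uint64_t idxmsk)
{
	if (!(config & SK_IN_TX_CHECKPOINTED))
		return idxmsk;
	return (idxmsk & (1ULL << 2)) ? (1ULL << 2) : 0;
}
```

An event with `in_tx_cp` set and a base constraint of `0xf` is narrowed to `0x4`. The same event with a base constraint of `0x3` gets the empty mask and cannot be scheduled.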
...@@ -1653,6 +1717,8 @@ PMU_FORMAT_ATTR(pc, "config:19" ); ...@@ -1653,6 +1717,8 @@ PMU_FORMAT_ATTR(pc, "config:19" );
PMU_FORMAT_ATTR(any, "config:21" ); /* v3 + */ PMU_FORMAT_ATTR(any, "config:21" ); /* v3 + */
PMU_FORMAT_ATTR(inv, "config:23" ); PMU_FORMAT_ATTR(inv, "config:23" );
PMU_FORMAT_ATTR(cmask, "config:24-31" ); PMU_FORMAT_ATTR(cmask, "config:24-31" );
PMU_FORMAT_ATTR(in_tx, "config:32");
PMU_FORMAT_ATTR(in_tx_cp, "config:33");
static struct attribute *intel_arch_formats_attr[] = { static struct attribute *intel_arch_formats_attr[] = {
&format_attr_event.attr, &format_attr_event.attr,
...@@ -1807,6 +1873,8 @@ static struct attribute *intel_arch3_formats_attr[] = { ...@@ -1807,6 +1873,8 @@ static struct attribute *intel_arch3_formats_attr[] = {
&format_attr_any.attr, &format_attr_any.attr,
&format_attr_inv.attr, &format_attr_inv.attr,
&format_attr_cmask.attr, &format_attr_cmask.attr,
&format_attr_in_tx.attr,
&format_attr_in_tx_cp.attr,
&format_attr_offcore_rsp.attr, /* XXX do NHM/WSM + SNB breakout */ &format_attr_offcore_rsp.attr, /* XXX do NHM/WSM + SNB breakout */
&format_attr_ldlat.attr, /* PEBS load latency */ &format_attr_ldlat.attr, /* PEBS load latency */
...@@ -1966,6 +2034,15 @@ static __init void intel_nehalem_quirk(void) ...@@ -1966,6 +2034,15 @@ static __init void intel_nehalem_quirk(void)
} }
} }
EVENT_ATTR_STR(mem-loads, mem_ld_hsw, "event=0xcd,umask=0x1,ldlat=3");
EVENT_ATTR_STR(mem-stores, mem_st_hsw, "event=0xd0,umask=0x82")

static struct attribute *hsw_events_attrs[] = {
EVENT_PTR(mem_ld_hsw),
EVENT_PTR(mem_st_hsw),
NULL
};
__init int intel_pmu_init(void) __init int intel_pmu_init(void)
{ {
union cpuid10_edx edx; union cpuid10_edx edx;
...@@ -2189,6 +2266,30 @@ __init int intel_pmu_init(void) ...@@ -2189,6 +2266,30 @@ __init int intel_pmu_init(void)
break; break;
case 60: /* Haswell Client */
case 70:
case 71:
case 63:
x86_pmu.late_ack = true;
memcpy(hw_cache_event_ids, snb_hw_cache_event_ids, sizeof(hw_cache_event_ids));
memcpy(hw_cache_extra_regs, snb_hw_cache_extra_regs, sizeof(hw_cache_extra_regs));
intel_pmu_lbr_init_snb();
x86_pmu.event_constraints = intel_hsw_event_constraints;
x86_pmu.pebs_constraints = intel_hsw_pebs_event_constraints;
x86_pmu.extra_regs = intel_snb_extra_regs;
x86_pmu.pebs_aliases = intel_pebs_aliases_snb;
/* all extra regs are per-cpu when HT is on */
x86_pmu.er_flags |= ERF_HAS_RSP_1;
x86_pmu.er_flags |= ERF_NO_HT_SHARING;
x86_pmu.hw_config = hsw_hw_config;
x86_pmu.get_event_constraints = hsw_get_event_constraints;
x86_pmu.cpu_events = hsw_events_attrs;
pr_cont("Haswell events, ");
break;
default: default:
switch (x86_pmu.version) { switch (x86_pmu.version) {
case 1: case 1:
...@@ -2227,7 +2328,7 @@ __init int intel_pmu_init(void) ...@@ -2227,7 +2328,7 @@ __init int intel_pmu_init(void)
* counter, so do not extend mask to generic counters * counter, so do not extend mask to generic counters
*/ */
for_each_event_constraint(c, x86_pmu.event_constraints) { for_each_event_constraint(c, x86_pmu.event_constraints) {
if (c->cmask != X86_RAW_EVENT_MASK if (c->cmask != FIXED_EVENT_FLAGS
|| c->idxmsk64 == INTEL_PMC_MSK_FIXED_REF_CYCLES) { || c->idxmsk64 == INTEL_PMC_MSK_FIXED_REF_CYCLES) {
continue; continue;
} }
...@@ -2237,5 +2338,12 @@ __init int intel_pmu_init(void) ...@@ -2237,5 +2338,12 @@ __init int intel_pmu_init(void)
} }
} }
/* Support full width counters using alternative MSR range */
if (x86_pmu.intel_cap.full_width_write) {
x86_pmu.max_period = x86_pmu.cntval_mask;
x86_pmu.perfctr = MSR_IA32_PMC0;
pr_cont("full-width counters, ");
}
return 0; return 0;
} }
...@@ -107,6 +107,19 @@ static u64 precise_store_data(u64 status) ...@@ -107,6 +107,19 @@ static u64 precise_store_data(u64 status)
return val; return val;
} }
static u64 precise_store_data_hsw(u64 status)
{
union perf_mem_data_src dse;
dse.val = 0;
dse.mem_op = PERF_MEM_OP_STORE;
dse.mem_lvl = PERF_MEM_LVL_NA;
if (status & 1)
dse.mem_lvl = PERF_MEM_LVL_L1;
/* Nothing else supported. Sorry. */
return dse.val;
}
static u64 load_latency_data(u64 status) static u64 load_latency_data(u64 status)
{ {
union intel_x86_pebs_dse dse; union intel_x86_pebs_dse dse;
...@@ -165,6 +178,22 @@ struct pebs_record_nhm { ...@@ -165,6 +178,22 @@ struct pebs_record_nhm {
u64 status, dla, dse, lat; u64 status, dla, dse, lat;
}; };
/*
* Same as pebs_record_nhm, with two additional fields.
*/
struct pebs_record_hsw {
struct pebs_record_nhm nhm;
/*
* Real IP of the event. In the Intel documentation this
* is called eventingrip.
*/
u64 real_ip;
/*
* TSX tuning information field: abort cycles and abort flags.
*/
u64 tsx_tuning;
};
void init_debug_store_on_cpu(int cpu) void init_debug_store_on_cpu(int cpu)
{ {
struct debug_store *ds = per_cpu(cpu_hw_events, cpu).ds; struct debug_store *ds = per_cpu(cpu_hw_events, cpu).ds;
...@@ -548,6 +577,42 @@ struct event_constraint intel_ivb_pebs_event_constraints[] = { ...@@ -548,6 +577,42 @@ struct event_constraint intel_ivb_pebs_event_constraints[] = {
EVENT_CONSTRAINT_END EVENT_CONSTRAINT_END
}; };
struct event_constraint intel_hsw_pebs_event_constraints[] = {
INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */
INTEL_PST_HSW_CONSTRAINT(0x01c2, 0xf), /* UOPS_RETIRED.ALL */
INTEL_UEVENT_CONSTRAINT(0x02c2, 0xf), /* UOPS_RETIRED.RETIRE_SLOTS */
INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */
INTEL_UEVENT_CONSTRAINT(0x01c5, 0xf), /* BR_MISP_RETIRED.CONDITIONAL */
INTEL_UEVENT_CONSTRAINT(0x04c5, 0xf), /* BR_MISP_RETIRED.ALL_BRANCHES */
INTEL_UEVENT_CONSTRAINT(0x20c5, 0xf), /* BR_MISP_RETIRED.NEAR_TAKEN */
INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.* */
/* MEM_UOPS_RETIRED.STLB_MISS_LOADS */
INTEL_UEVENT_CONSTRAINT(0x11d0, 0xf),
/* MEM_UOPS_RETIRED.STLB_MISS_STORES */
INTEL_UEVENT_CONSTRAINT(0x12d0, 0xf),
INTEL_UEVENT_CONSTRAINT(0x21d0, 0xf), /* MEM_UOPS_RETIRED.LOCK_LOADS */
INTEL_UEVENT_CONSTRAINT(0x41d0, 0xf), /* MEM_UOPS_RETIRED.SPLIT_LOADS */
/* MEM_UOPS_RETIRED.SPLIT_STORES */
INTEL_UEVENT_CONSTRAINT(0x42d0, 0xf),
INTEL_UEVENT_CONSTRAINT(0x81d0, 0xf), /* MEM_UOPS_RETIRED.ALL_LOADS */
INTEL_PST_HSW_CONSTRAINT(0x82d0, 0xf), /* MEM_UOPS_RETIRED.ALL_STORES */
INTEL_UEVENT_CONSTRAINT(0x01d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L1_HIT */
INTEL_UEVENT_CONSTRAINT(0x02d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L2_HIT */
INTEL_UEVENT_CONSTRAINT(0x04d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L3_HIT */
/* MEM_LOAD_UOPS_RETIRED.HIT_LFB */
INTEL_UEVENT_CONSTRAINT(0x40d1, 0xf),
/* MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_MISS */
INTEL_UEVENT_CONSTRAINT(0x01d2, 0xf),
/* MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HIT */
INTEL_UEVENT_CONSTRAINT(0x02d2, 0xf),
/* MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM */
INTEL_UEVENT_CONSTRAINT(0x01d3, 0xf),
INTEL_UEVENT_CONSTRAINT(0x04c8, 0xf), /* HLE_RETIRED.Abort */
INTEL_UEVENT_CONSTRAINT(0x04c9, 0xf), /* RTM_RETIRED.Abort */
EVENT_CONSTRAINT_END
};
struct event_constraint *intel_pebs_constraints(struct perf_event *event) struct event_constraint *intel_pebs_constraints(struct perf_event *event)
{ {
struct event_constraint *c; struct event_constraint *c;
...@@ -588,6 +653,12 @@ void intel_pmu_pebs_disable(struct perf_event *event) ...@@ -588,6 +653,12 @@ void intel_pmu_pebs_disable(struct perf_event *event)
struct hw_perf_event *hwc = &event->hw; struct hw_perf_event *hwc = &event->hw;
cpuc->pebs_enabled &= ~(1ULL << hwc->idx); cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
if (event->hw.constraint->flags & PERF_X86_EVENT_PEBS_LDLAT)
cpuc->pebs_enabled &= ~(1ULL << (hwc->idx + 32));
else if (event->hw.constraint->flags & PERF_X86_EVENT_PEBS_ST)
cpuc->pebs_enabled &= ~(1ULL << 63);
if (cpuc->enabled) if (cpuc->enabled)
wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled); wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
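The disable path above clears up to two bits in the PEBS enable word: the per-counter bit, plus either bit `idx + 32` for a load-latency event or bit 63 for a precise-store event. A minimal sketch of that bit arithmetic follows, assuming a hypothetical `sk_pebs_kind` enum in place of the kernel's constraint flags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the PERF_X86_EVENT_PEBS_* constraint flags. */
enum sk_pebs_kind { SK_PEBS_PLAIN, SK_PEBS_LDLAT, SK_PEBS_ST };

/* Returns the enable word with the bits for counter idx cleared. */
uint64_t sk_pebs_disable_bits(uint64_t enabled, int idx, enum sk_pebs_kind kind)
{
	assert(idx >= 0 && idx < 32);

	enabled &= ~(1ULL << idx);                 /* per-counter PEBS enable */
	if (kind == SK_PEBS_LDLAT)
		enabled &= ~(1ULL << (idx + 32));  /* load-latency enable */
	else if (kind == SK_PEBS_ST)
		enabled &= ~(1ULL << 63);          /* precise-store enable */
	return enabled;
}
```

The `1ULL` constants matter: shifting a plain `int` by 32 or more would be undefined, which is why the kernel code uses `1ULL` as well.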
...@@ -697,6 +768,7 @@ static void __intel_pmu_pebs_event(struct perf_event *event, ...@@ -697,6 +768,7 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
*/ */
struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events); struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
struct pebs_record_nhm *pebs = __pebs; struct pebs_record_nhm *pebs = __pebs;
struct pebs_record_hsw *pebs_hsw = __pebs;
struct perf_sample_data data; struct perf_sample_data data;
struct pt_regs regs; struct pt_regs regs;
u64 sample_type; u64 sample_type;
...@@ -706,7 +778,8 @@ static void __intel_pmu_pebs_event(struct perf_event *event, ...@@ -706,7 +778,8 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
return; return;
fll = event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT; fll = event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT;
fst = event->hw.flags & PERF_X86_EVENT_PEBS_ST; fst = event->hw.flags & (PERF_X86_EVENT_PEBS_ST |
PERF_X86_EVENT_PEBS_ST_HSW);
perf_sample_data_init(&data, 0, event->hw.last_period); perf_sample_data_init(&data, 0, event->hw.last_period);
...@@ -717,9 +790,6 @@ static void __intel_pmu_pebs_event(struct perf_event *event, ...@@ -717,9 +790,6 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
* if PEBS-LL or PreciseStore * if PEBS-LL or PreciseStore
*/ */
if (fll || fst) { if (fll || fst) {
if (sample_type & PERF_SAMPLE_ADDR)
data.addr = pebs->dla;
/* /*
* Use latency for weight (only avail with PEBS-LL) * Use latency for weight (only avail with PEBS-LL)
*/ */
...@@ -732,6 +802,9 @@ static void __intel_pmu_pebs_event(struct perf_event *event, ...@@ -732,6 +802,9 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
if (sample_type & PERF_SAMPLE_DATA_SRC) { if (sample_type & PERF_SAMPLE_DATA_SRC) {
if (fll) if (fll)
data.data_src.val = load_latency_data(pebs->dse); data.data_src.val = load_latency_data(pebs->dse);
else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST_HSW)
data.data_src.val =
precise_store_data_hsw(pebs->dse);
else else
data.data_src.val = precise_store_data(pebs->dse); data.data_src.val = precise_store_data(pebs->dse);
} }
...@@ -753,11 +826,18 @@ static void __intel_pmu_pebs_event(struct perf_event *event, ...@@ -753,11 +826,18 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
regs.bp = pebs->bp; regs.bp = pebs->bp;
regs.sp = pebs->sp; regs.sp = pebs->sp;
if (event->attr.precise_ip > 1 && intel_pmu_pebs_fixup_ip(&regs)) if (event->attr.precise_ip > 1 && x86_pmu.intel_cap.pebs_format >= 2) {
regs.ip = pebs_hsw->real_ip;
regs.flags |= PERF_EFLAGS_EXACT;
} else if (event->attr.precise_ip > 1 && intel_pmu_pebs_fixup_ip(&regs))
regs.flags |= PERF_EFLAGS_EXACT; regs.flags |= PERF_EFLAGS_EXACT;
else else
regs.flags &= ~PERF_EFLAGS_EXACT; regs.flags &= ~PERF_EFLAGS_EXACT;
if ((event->attr.sample_type & PERF_SAMPLE_ADDR) &&
x86_pmu.intel_cap.pebs_format >= 1)
data.addr = pebs->dla;
if (has_branch_stack(event)) if (has_branch_stack(event))
data.br_stack = &cpuc->lbr_stack; data.br_stack = &cpuc->lbr_stack;
...@@ -806,35 +886,22 @@ static void intel_pmu_drain_pebs_core(struct pt_regs *iregs) ...@@ -806,35 +886,22 @@ static void intel_pmu_drain_pebs_core(struct pt_regs *iregs)
__intel_pmu_pebs_event(event, iregs, at); __intel_pmu_pebs_event(event, iregs, at);
} }
static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs) static void __intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, void *at,
void *top)
{ {
struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events); struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
struct debug_store *ds = cpuc->ds; struct debug_store *ds = cpuc->ds;
struct pebs_record_nhm *at, *top;
struct perf_event *event = NULL; struct perf_event *event = NULL;
u64 status = 0; u64 status = 0;
int bit, n; int bit;
if (!x86_pmu.pebs_active)
return;
at = (struct pebs_record_nhm *)(unsigned long)ds->pebs_buffer_base;
top = (struct pebs_record_nhm *)(unsigned long)ds->pebs_index;
ds->pebs_index = ds->pebs_buffer_base; ds->pebs_index = ds->pebs_buffer_base;
n = top - at; for (; at < top; at += x86_pmu.pebs_record_size) {
if (n <= 0) struct pebs_record_nhm *p = at;
return;
/*
* Should not happen, we program the threshold at 1 and do not
* set a reset value.
*/
WARN_ONCE(n > x86_pmu.max_pebs_events, "Unexpected number of pebs records %d\n", n);
for ( ; at < top; at++) { for_each_set_bit(bit, (unsigned long *)&p->status,
for_each_set_bit(bit, (unsigned long *)&at->status, x86_pmu.max_pebs_events) { x86_pmu.max_pebs_events) {
event = cpuc->events[bit]; event = cpuc->events[bit];
if (!test_bit(bit, cpuc->active_mask)) if (!test_bit(bit, cpuc->active_mask))
continue; continue;
...@@ -857,6 +924,61 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs) ...@@ -857,6 +924,61 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
} }
} }
static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
{
struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
struct debug_store *ds = cpuc->ds;
struct pebs_record_nhm *at, *top;
int n;
if (!x86_pmu.pebs_active)
return;
at = (struct pebs_record_nhm *)(unsigned long)ds->pebs_buffer_base;
top = (struct pebs_record_nhm *)(unsigned long)ds->pebs_index;
ds->pebs_index = ds->pebs_buffer_base;
n = top - at;
if (n <= 0)
return;
/*
* Should not happen, we program the threshold at 1 and do not
* set a reset value.
*/
WARN_ONCE(n > x86_pmu.max_pebs_events,
"Unexpected number of pebs records %d\n", n);
return __intel_pmu_drain_pebs_nhm(iregs, at, top);
}
static void intel_pmu_drain_pebs_hsw(struct pt_regs *iregs)
{
struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
struct debug_store *ds = cpuc->ds;
struct pebs_record_hsw *at, *top;
int n;
if (!x86_pmu.pebs_active)
return;
at = (struct pebs_record_hsw *)(unsigned long)ds->pebs_buffer_base;
top = (struct pebs_record_hsw *)(unsigned long)ds->pebs_index;
n = top - at;
if (n <= 0)
return;
/*
* Should not happen, we program the threshold at 1 and do not
* set a reset value.
*/
WARN_ONCE(n > x86_pmu.max_pebs_events,
"Unexpected number of pebs records %d\n", n);
return __intel_pmu_drain_pebs_nhm(iregs, at, top);
}
/* /*
* BTS, PEBS probe and setup * BTS, PEBS probe and setup
*/ */
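The refactored drain loop walks the buffer with `at += x86_pmu.pebs_record_size` on a `void *`, so one loop can serve both the Nehalem record and the larger Haswell record. Arithmetic on `void *` is a GNU C extension; the sketch below shows the same byte-stride walk in portable C using `unsigned char *`. The record structs are simplified, hypothetical stand-ins that only assume the larger record begins with the smaller one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins; the real PEBS records carry many more fields. */
struct sk_rec_base { uint64_t status; };
struct sk_rec_ext  { struct sk_rec_base base; uint64_t real_ip, tsx_tuning; };

/*
 * OR together the status word of every record in [at, top), stepping by
 * record_size bytes so that one loop handles both record layouts.
 */
uint64_t sk_drain(const void *at, const void *top, size_t record_size)
{
	const unsigned char *p = at;
	const unsigned char *end = top;
	uint64_t seen = 0;

	assert(record_size >= sizeof(struct sk_rec_base));
	for (; p < end; p += record_size) {
		struct sk_rec_base hdr;

		memcpy(&hdr, p, sizeof(hdr)); /* the common prefix of any record */
		seen |= hdr.status;
	}
	return seen;
}

/* Usage sketch: three extended records, drained with the extended stride. */
uint64_t sk_demo_ext(void)
{
	struct sk_rec_ext recs[3] = {
		{ { 0x1 }, 0, 0 }, { { 0x4 }, 0, 0 }, { { 0x8 }, 0, 0 },
	};
	return sk_drain(recs, recs + 3, sizeof(recs[0]));
}

/* Usage sketch: two base records, drained with the base stride. */
uint64_t sk_demo_base(void)
{
	struct sk_rec_base recs[2] = { { 0x2 }, { 0x10 } };

	return sk_drain(recs, recs + 2, sizeof(recs[0]));
}
```

Passing the stride as data rather than baking it into the pointer type is what lets the format-2 drain function reuse the format-1 loop body.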
...@@ -888,6 +1010,12 @@ void intel_ds_init(void) ...@@ -888,6 +1010,12 @@ void intel_ds_init(void)
x86_pmu.drain_pebs = intel_pmu_drain_pebs_nhm; x86_pmu.drain_pebs = intel_pmu_drain_pebs_nhm;
break; break;
case 2:
pr_cont("PEBS fmt2%c, ", pebs_type);
x86_pmu.pebs_record_size = sizeof(struct pebs_record_hsw);
x86_pmu.drain_pebs = intel_pmu_drain_pebs_hsw;
break;
default: default:
printk(KERN_CONT "no PEBS fmt%d%c, ", format, pebs_type); printk(KERN_CONT "no PEBS fmt%d%c, ", format, pebs_type);
x86_pmu.pebs = 0; x86_pmu.pebs = 0;
......
...@@ -12,6 +12,16 @@ enum { ...@@ -12,6 +12,16 @@ enum {
LBR_FORMAT_LIP = 0x01, LBR_FORMAT_LIP = 0x01,
LBR_FORMAT_EIP = 0x02, LBR_FORMAT_EIP = 0x02,
LBR_FORMAT_EIP_FLAGS = 0x03, LBR_FORMAT_EIP_FLAGS = 0x03,
LBR_FORMAT_EIP_FLAGS2 = 0x04,
LBR_FORMAT_MAX_KNOWN = LBR_FORMAT_EIP_FLAGS2,
};
static enum {
LBR_EIP_FLAGS = 1,
LBR_TSX = 2,
} lbr_desc[LBR_FORMAT_MAX_KNOWN + 1] = {
[LBR_FORMAT_EIP_FLAGS] = LBR_EIP_FLAGS,
[LBR_FORMAT_EIP_FLAGS2] = LBR_EIP_FLAGS | LBR_TSX,
}; };
/* /*
...@@ -56,6 +66,8 @@ enum { ...@@ -56,6 +66,8 @@ enum {
LBR_FAR) LBR_FAR)
#define LBR_FROM_FLAG_MISPRED (1ULL << 63) #define LBR_FROM_FLAG_MISPRED (1ULL << 63)
#define LBR_FROM_FLAG_IN_TX (1ULL << 62)
#define LBR_FROM_FLAG_ABORT (1ULL << 61)
#define for_each_branch_sample_type(x) \ #define for_each_branch_sample_type(x) \
for ((x) = PERF_SAMPLE_BRANCH_USER; \ for ((x) = PERF_SAMPLE_BRANCH_USER; \
...@@ -81,9 +93,13 @@ enum { ...@@ -81,9 +93,13 @@ enum {
X86_BR_JMP = 1 << 9, /* jump */ X86_BR_JMP = 1 << 9, /* jump */
X86_BR_IRQ = 1 << 10,/* hw interrupt or trap or fault */ X86_BR_IRQ = 1 << 10,/* hw interrupt or trap or fault */
X86_BR_IND_CALL = 1 << 11,/* indirect calls */ X86_BR_IND_CALL = 1 << 11,/* indirect calls */
X86_BR_ABORT = 1 << 12,/* transaction abort */
X86_BR_IN_TX = 1 << 13,/* in transaction */
X86_BR_NO_TX = 1 << 14,/* not in transaction */
}; };
#define X86_BR_PLM (X86_BR_USER | X86_BR_KERNEL) #define X86_BR_PLM (X86_BR_USER | X86_BR_KERNEL)
#define X86_BR_ANYTX (X86_BR_NO_TX | X86_BR_IN_TX)
#define X86_BR_ANY \ #define X86_BR_ANY \
(X86_BR_CALL |\ (X86_BR_CALL |\
...@@ -95,6 +111,7 @@ enum { ...@@ -95,6 +111,7 @@ enum {
X86_BR_JCC |\ X86_BR_JCC |\
X86_BR_JMP |\ X86_BR_JMP |\
X86_BR_IRQ |\ X86_BR_IRQ |\
X86_BR_ABORT |\
X86_BR_IND_CALL) X86_BR_IND_CALL)
#define X86_BR_ALL (X86_BR_PLM | X86_BR_ANY) #define X86_BR_ALL (X86_BR_PLM | X86_BR_ANY)
...@@ -270,21 +287,31 @@ static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc) ...@@ -270,21 +287,31 @@ static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
for (i = 0; i < x86_pmu.lbr_nr; i++) { for (i = 0; i < x86_pmu.lbr_nr; i++) {
unsigned long lbr_idx = (tos - i) & mask; unsigned long lbr_idx = (tos - i) & mask;
u64 from, to, mis = 0, pred = 0; u64 from, to, mis = 0, pred = 0, in_tx = 0, abort = 0;
int skip = 0;
int lbr_flags = lbr_desc[lbr_format];
rdmsrl(x86_pmu.lbr_from + lbr_idx, from); rdmsrl(x86_pmu.lbr_from + lbr_idx, from);
rdmsrl(x86_pmu.lbr_to + lbr_idx, to); rdmsrl(x86_pmu.lbr_to + lbr_idx, to);
if (lbr_format == LBR_FORMAT_EIP_FLAGS) { if (lbr_flags & LBR_EIP_FLAGS) {
mis = !!(from & LBR_FROM_FLAG_MISPRED); mis = !!(from & LBR_FROM_FLAG_MISPRED);
pred = !mis; pred = !mis;
from = (u64)((((s64)from) << 1) >> 1); skip = 1;
}
if (lbr_flags & LBR_TSX) {
in_tx = !!(from & LBR_FROM_FLAG_IN_TX);
abort = !!(from & LBR_FROM_FLAG_ABORT);
skip = 3;
} }
from = (u64)((((s64)from) << skip) >> skip);
cpuc->lbr_entries[i].from = from; cpuc->lbr_entries[i].from = from;
cpuc->lbr_entries[i].to = to; cpuc->lbr_entries[i].to = to;
cpuc->lbr_entries[i].mispred = mis; cpuc->lbr_entries[i].mispred = mis;
cpuc->lbr_entries[i].predicted = pred; cpuc->lbr_entries[i].predicted = pred;
cpuc->lbr_entries[i].in_tx = in_tx;
cpuc->lbr_entries[i].abort = abort;
cpuc->lbr_entries[i].reserved = 0; cpuc->lbr_entries[i].reserved = 0;
} }
cpuc->lbr_stack.nr = i; cpuc->lbr_stack.nr = i;
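The new read path strips one or three flag bits from the top of the LBR "from" value and sign-extends what remains, so that kernel addresses keep their leading ones. A minimal user-space sketch of that decode follows. The `sk_` names are hypothetical, and the left shift is done on the unsigned value to avoid shifting a negative signed integer. The code assumes, as GCC and Clang provide, that converting to `int64_t` wraps and that `>>` on a negative value is an arithmetic shift.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bit positions as defined in this diff. */
#define SK_FLAG_MISPRED (1ULL << 63)
#define SK_FLAG_IN_TX   (1ULL << 62)
#define SK_FLAG_ABORT   (1ULL << 61)

/* Mirrors the lbr_desc[] capability bits. */
enum { SK_LBR_EIP_FLAGS = 1, SK_LBR_TSX = 2 };

struct sk_lbr_entry {
	uint64_t from;
	unsigned mispred, in_tx, aborted;
};

struct sk_lbr_entry sk_lbr_decode(uint64_t raw, int lbr_flags)
{
	struct sk_lbr_entry e = { 0 };
	int skip = 0;

	assert((lbr_flags & ~(SK_LBR_EIP_FLAGS | SK_LBR_TSX)) == 0);
	if (lbr_flags & SK_LBR_EIP_FLAGS) {
		e.mispred = !!(raw & SK_FLAG_MISPRED);
		skip = 1;
	}
	if (lbr_flags & SK_LBR_TSX) {
		e.in_tx = !!(raw & SK_FLAG_IN_TX);
		e.aborted = !!(raw & SK_FLAG_ABORT);
		skip = 3;
	}
	/* Drop the flag bits, then sign-extend from the highest address bit. */
	e.from = (uint64_t)((int64_t)(raw << skip) >> skip);
	return e;
}
```

For example, a raw value of `0x5fffffff81000000` in the TSX format decodes to the kernel address `0xffffffff81000000` with `in_tx` set.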
...@@ -310,7 +337,7 @@ void intel_pmu_lbr_read(void) ...@@ -310,7 +337,7 @@ void intel_pmu_lbr_read(void)
* - in case there is no HW filter * - in case there is no HW filter
* - in case the HW filter has errata or limitations * - in case the HW filter has errata or limitations
*/ */
static int intel_pmu_setup_sw_lbr_filter(struct perf_event *event) static void intel_pmu_setup_sw_lbr_filter(struct perf_event *event)
{ {
u64 br_type = event->attr.branch_sample_type; u64 br_type = event->attr.branch_sample_type;
int mask = 0; int mask = 0;
...@@ -318,11 +345,8 @@ static int intel_pmu_setup_sw_lbr_filter(struct perf_event *event) ...@@ -318,11 +345,8 @@ static int intel_pmu_setup_sw_lbr_filter(struct perf_event *event)
if (br_type & PERF_SAMPLE_BRANCH_USER) if (br_type & PERF_SAMPLE_BRANCH_USER)
mask |= X86_BR_USER; mask |= X86_BR_USER;
if (br_type & PERF_SAMPLE_BRANCH_KERNEL) { if (br_type & PERF_SAMPLE_BRANCH_KERNEL)
if (perf_paranoid_kernel() && !capable(CAP_SYS_ADMIN))
return -EACCES;
mask |= X86_BR_KERNEL; mask |= X86_BR_KERNEL;
}
/* we ignore BRANCH_HV here */ /* we ignore BRANCH_HV here */
...@@ -337,13 +361,21 @@ static int intel_pmu_setup_sw_lbr_filter(struct perf_event *event) ...@@ -337,13 +361,21 @@ static int intel_pmu_setup_sw_lbr_filter(struct perf_event *event)
if (br_type & PERF_SAMPLE_BRANCH_IND_CALL) if (br_type & PERF_SAMPLE_BRANCH_IND_CALL)
mask |= X86_BR_IND_CALL; mask |= X86_BR_IND_CALL;
if (br_type & PERF_SAMPLE_BRANCH_ABORT_TX)
mask |= X86_BR_ABORT;
if (br_type & PERF_SAMPLE_BRANCH_IN_TX)
mask |= X86_BR_IN_TX;
if (br_type & PERF_SAMPLE_BRANCH_NO_TX)
mask |= X86_BR_NO_TX;
/* /*
* stash actual user request into reg, it may * stash actual user request into reg, it may
* be used by fixup code for some CPU * be used by fixup code for some CPU
*/ */
event->hw.branch_reg.reg = mask; event->hw.branch_reg.reg = mask;
return 0;
} }
/* /*
...@@ -391,9 +423,7 @@ int intel_pmu_setup_lbr_filter(struct perf_event *event) ...@@ -391,9 +423,7 @@ int intel_pmu_setup_lbr_filter(struct perf_event *event)
/* /*
* setup SW LBR filter * setup SW LBR filter
*/ */
ret = intel_pmu_setup_sw_lbr_filter(event); intel_pmu_setup_sw_lbr_filter(event);
if (ret)
return ret;
/* /*
* setup HW LBR filter, if any * setup HW LBR filter, if any
...@@ -415,7 +445,7 @@ int intel_pmu_setup_lbr_filter(struct perf_event *event) ...@@ -415,7 +445,7 @@ int intel_pmu_setup_lbr_filter(struct perf_event *event)
* decoded (e.g., text page not present), then X86_BR_NONE is * decoded (e.g., text page not present), then X86_BR_NONE is
* returned. * returned.
*/ */
static int branch_type(unsigned long from, unsigned long to) static int branch_type(unsigned long from, unsigned long to, int abort)
{ {
struct insn insn; struct insn insn;
void *addr; void *addr;
...@@ -435,6 +465,9 @@ static int branch_type(unsigned long from, unsigned long to) ...@@ -435,6 +465,9 @@ static int branch_type(unsigned long from, unsigned long to)
if (from == 0 || to == 0) if (from == 0 || to == 0)
return X86_BR_NONE; return X86_BR_NONE;
if (abort)
return X86_BR_ABORT | to_plm;
if (from_plm == X86_BR_USER) { if (from_plm == X86_BR_USER) {
/* /*
* can happen if measuring at the user level only * can happen if measuring at the user level only
...@@ -581,7 +614,13 @@ intel_pmu_lbr_filter(struct cpu_hw_events *cpuc) ...@@ -581,7 +614,13 @@ intel_pmu_lbr_filter(struct cpu_hw_events *cpuc)
from = cpuc->lbr_entries[i].from; from = cpuc->lbr_entries[i].from;
to = cpuc->lbr_entries[i].to; to = cpuc->lbr_entries[i].to;
type = branch_type(from, to); type = branch_type(from, to, cpuc->lbr_entries[i].abort);
if (type != X86_BR_NONE && (br_sel & X86_BR_ANYTX)) {
if (cpuc->lbr_entries[i].in_tx)
type |= X86_BR_IN_TX;
else
type |= X86_BR_NO_TX;
}
/* if type does not correspond, then discard */ /* if type does not correspond, then discard */
if (type == X86_BR_NONE || (br_sel & type) != type) { if (type == X86_BR_NONE || (br_sel & type) != type) {
......
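The software filter hunk above tags each decoded branch with IN_TX or NO_TX only when the user asked for a transaction filter, then keeps the entry only if every bit of its type is present in the selection mask. The sketch below restates that rule as a standalone predicate. The `sk_` names are hypothetical, and the USER and KERNEL bit values are illustrative because the diff does not show them.

```c
#include <assert.h>
#include <stdbool.h>

enum {
	SK_BR_NONE   = 0,
	SK_BR_USER   = 1 << 0,  /* illustrative value, not shown in the diff */
	SK_BR_KERNEL = 1 << 1,  /* illustrative value, not shown in the diff */
	SK_BR_JMP    = 1 << 9,
	SK_BR_ABORT  = 1 << 12,
	SK_BR_IN_TX  = 1 << 13,
	SK_BR_NO_TX  = 1 << 14,
};
#define SK_BR_ANYTX (SK_BR_NO_TX | SK_BR_IN_TX)

/*
 * br_sel: the selection mask built from the user's request
 * type:   the branch class of one LBR entry, without TX bits
 * in_tx:  whether the entry was recorded inside a transaction
 */
bool sk_lbr_keep(int br_sel, int type, bool in_tx)
{
	assert(!(type & SK_BR_ANYTX)); /* TX bits are added here, not by the caller */

	if (type != SK_BR_NONE && (br_sel & SK_BR_ANYTX))
		type |= in_tx ? SK_BR_IN_TX : SK_BR_NO_TX;

	/* Keep only entries whose every type bit was requested. */
	return type != SK_BR_NONE && (br_sel & type) == type;
}
```

With this rule, a request without any TX bit keeps branches regardless of their transaction state, while a request for IN_TX alone drops every branch recorded outside a transaction.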
...@@ -536,7 +536,7 @@ __snbep_cbox_get_constraint(struct intel_uncore_box *box, struct perf_event *eve ...@@ -536,7 +536,7 @@ __snbep_cbox_get_constraint(struct intel_uncore_box *box, struct perf_event *eve
if (!uncore_box_is_fake(box)) if (!uncore_box_is_fake(box))
reg1->alloc |= alloc; reg1->alloc |= alloc;
return 0; return NULL;
fail: fail:
for (; i >= 0; i--) { for (; i >= 0; i--) {
if (alloc & (0x1 << i)) if (alloc & (0x1 << i))
...@@ -644,7 +644,7 @@ snbep_pcu_get_constraint(struct intel_uncore_box *box, struct perf_event *event) ...@@ -644,7 +644,7 @@ snbep_pcu_get_constraint(struct intel_uncore_box *box, struct perf_event *event)
(!uncore_box_is_fake(box) && reg1->alloc)) (!uncore_box_is_fake(box) && reg1->alloc))
return NULL; return NULL;
again: again:
mask = 0xff << (idx * 8); mask = 0xffULL << (idx * 8);
raw_spin_lock_irqsave(&er->lock, flags); raw_spin_lock_irqsave(&er->lock, flags);
if (!__BITS_VALUE(atomic_read(&er->ref), idx, 8) || if (!__BITS_VALUE(atomic_read(&er->ref), idx, 8) ||
!((config1 ^ er->config) & mask)) { !((config1 ^ er->config) & mask)) {
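The one-character change from `0xff` to `0xffULL` matters once the shift reaches bit 31. The hunk does not show the type of `mask` or the range of `idx`, so the sketch below assumes a 64-bit mask and byte-wide fields. The second function only emulates what a 32-bit `int` shift typically produces after widening on GCC and Clang; the real expression is undefined in ISO C once it shifts into the sign bit.

```c
#include <assert.h>
#include <stdint.h>

/* The fixed form: the constant is 64 bits wide before it is shifted. */
uint64_t sk_byte_mask(unsigned idx)
{
	assert(idx < 8); /* eight byte-wide fields fit in 64 bits */
	return 0xffULL << (idx * 8);
}

/*
 * Emulation of the old form on common compilers: the shift happens in
 * 32 bits and the result is sign-extended when widened to 64 bits.
 * The conversion to int32_t is implementation-defined.
 */
uint64_t sk_byte_mask_int_emulated(unsigned idx)
{
	assert(idx < 4); /* a 32-bit value only has four byte fields */
	return (uint64_t)(int64_t)(int32_t)(0xffu << (idx * 8));
}
```

For `idx == 3`, the fixed form yields `0x00000000ff000000`, while the emulated old form yields `0xffffffffff000000`, which would also compare bits belonging to fields above the intended byte.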
...@@ -1923,7 +1923,7 @@ static u64 nhmex_mbox_alter_er(struct perf_event *event, int new_idx, bool modif ...@@ -1923,7 +1923,7 @@ static u64 nhmex_mbox_alter_er(struct perf_event *event, int new_idx, bool modif
{ {
struct hw_perf_event *hwc = &event->hw; struct hw_perf_event *hwc = &event->hw;
struct hw_perf_event_extra *reg1 = &hwc->extra_reg; struct hw_perf_event_extra *reg1 = &hwc->extra_reg;
int idx, orig_idx = __BITS_VALUE(reg1->idx, 0, 8); u64 idx, orig_idx = __BITS_VALUE(reg1->idx, 0, 8);
u64 config = reg1->config; u64 config = reg1->config;
/* get the non-shared control bits and shift them */ /* get the non-shared control bits and shift them */
...@@ -2723,15 +2723,16 @@ static void uncore_put_event_constraint(struct intel_uncore_box *box, struct per ...@@ -2723,15 +2723,16 @@ static void uncore_put_event_constraint(struct intel_uncore_box *box, struct per
static int uncore_assign_events(struct intel_uncore_box *box, int assign[], int n) static int uncore_assign_events(struct intel_uncore_box *box, int assign[], int n)
{ {
unsigned long used_mask[BITS_TO_LONGS(UNCORE_PMC_IDX_MAX)]; unsigned long used_mask[BITS_TO_LONGS(UNCORE_PMC_IDX_MAX)];
struct event_constraint *c, *constraints[UNCORE_PMC_IDX_MAX]; struct event_constraint *c;
int i, wmin, wmax, ret = 0; int i, wmin, wmax, ret = 0;
struct hw_perf_event *hwc; struct hw_perf_event *hwc;
bitmap_zero(used_mask, UNCORE_PMC_IDX_MAX); bitmap_zero(used_mask, UNCORE_PMC_IDX_MAX);
for (i = 0, wmin = UNCORE_PMC_IDX_MAX, wmax = 0; i < n; i++) { for (i = 0, wmin = UNCORE_PMC_IDX_MAX, wmax = 0; i < n; i++) {
hwc = &box->event_list[i]->hw;
c = uncore_get_event_constraint(box, box->event_list[i]); c = uncore_get_event_constraint(box, box->event_list[i]);
constraints[i] = c; hwc->constraint = c;
wmin = min(wmin, c->weight); wmin = min(wmin, c->weight);
wmax = max(wmax, c->weight); wmax = max(wmax, c->weight);
} }
...@@ -2739,7 +2740,7 @@ static int uncore_assign_events(struct intel_uncore_box *box, int assign[], int ...@@ -2739,7 +2740,7 @@ static int uncore_assign_events(struct intel_uncore_box *box, int assign[], int
/* fastpath, try to reuse previous register */ /* fastpath, try to reuse previous register */
for (i = 0; i < n; i++) { for (i = 0; i < n; i++) {
hwc = &box->event_list[i]->hw; hwc = &box->event_list[i]->hw;
c = constraints[i]; c = hwc->constraint;
/* never assigned */ /* never assigned */
if (hwc->idx == -1) if (hwc->idx == -1)
...@@ -2759,7 +2760,8 @@ static int uncore_assign_events(struct intel_uncore_box *box, int assign[], int ...@@ -2759,7 +2760,8 @@ static int uncore_assign_events(struct intel_uncore_box *box, int assign[], int
} }
/* slow path */ /* slow path */
if (i != n) if (i != n)
ret = perf_assign_events(constraints, n, wmin, wmax, assign); ret = perf_assign_events(box->event_list, n,
wmin, wmax, assign);
if (!assign || ret) { if (!assign || ret) {
for (i = 0; i < n; i++) for (i = 0; i < n; i++)
......
...@@ -337,10 +337,10 @@ ...@@ -337,10 +337,10 @@
NHMEX_M_PMON_CTL_SET_FLAG_SEL_MASK) NHMEX_M_PMON_CTL_SET_FLAG_SEL_MASK)
#define NHMEX_M_PMON_ZDP_CTL_FVC_MASK (((1 << 11) - 1) | (1 << 23)) #define NHMEX_M_PMON_ZDP_CTL_FVC_MASK (((1 << 11) - 1) | (1 << 23))
#define NHMEX_M_PMON_ZDP_CTL_FVC_EVENT_MASK(n) (0x7 << (11 + 3 * (n))) #define NHMEX_M_PMON_ZDP_CTL_FVC_EVENT_MASK(n) (0x7ULL << (11 + 3 * (n)))
#define WSMEX_M_PMON_ZDP_CTL_FVC_MASK (((1 << 12) - 1) | (1 << 24)) #define WSMEX_M_PMON_ZDP_CTL_FVC_MASK (((1 << 12) - 1) | (1 << 24))
#define WSMEX_M_PMON_ZDP_CTL_FVC_EVENT_MASK(n) (0x7 << (12 + 3 * (n))) #define WSMEX_M_PMON_ZDP_CTL_FVC_EVENT_MASK(n) (0x7ULL << (12 + 3 * (n)))
/* /*
* use the 9~13 bits to select event If the 7th bit is not set, * use the 9~13 bits to select event If the 7th bit is not set,
......
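The `0xff` to `0xffULL` and `0x7` to `0x7ULL` changes in the uncore hunks above matter because an unsuffixed constant has type `int`, and shifting it by 32 or more is undefined in C, so a mask for the upper half of a 64-bit register cannot be built from it. The sketch below illustrates the width problem under that assumption; `lane_mask64` and `lane_mask32` are hypothetical helper names, and the 32-bit variant truncates a 64-bit result rather than performing the undefined shift.

```c
#include <assert.h>
#include <stdint.h>

/* Byte-lane mask for lane idx (0..7), computed in 64 bits. */
static uint64_t lane_mask64(unsigned int idx)
{
	assert(idx < 8);
	return UINT64_C(0xff) << (idx * 8);
}

/* What is left if the mask only has 32 bits: lanes 4..7 vanish. */
static uint32_t lane_mask32(unsigned int idx)
{
	return (uint32_t)lane_mask64(idx);
}
```

For lanes 0 to 3 both helpers agree; for lane 5 the 64-bit mask is `0x0000ff0000000000` while the 32-bit one is zero, which would make a comparison such as `(config1 ^ er->config) & mask` always report a match.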
...@@ -14,6 +14,7 @@ ...@@ -14,6 +14,7 @@
#include <linux/kprobes.h> #include <linux/kprobes.h>
#include <linux/kdebug.h> #include <linux/kdebug.h>
#include <linux/nmi.h> #include <linux/nmi.h>
#include <linux/debugfs.h>
#include <linux/delay.h> #include <linux/delay.h>
#include <linux/hardirq.h> #include <linux/hardirq.h>
#include <linux/slab.h> #include <linux/slab.h>
...@@ -29,6 +30,9 @@ ...@@ -29,6 +30,9 @@
#include <asm/nmi.h> #include <asm/nmi.h>
#include <asm/x86_init.h> #include <asm/x86_init.h>
#define CREATE_TRACE_POINTS
#include <trace/events/nmi.h>
struct nmi_desc { struct nmi_desc {
spinlock_t lock; spinlock_t lock;
struct list_head head; struct list_head head;
...@@ -82,6 +86,15 @@ __setup("unknown_nmi_panic", setup_unknown_nmi_panic); ...@@ -82,6 +86,15 @@ __setup("unknown_nmi_panic", setup_unknown_nmi_panic);
#define nmi_to_desc(type) (&nmi_desc[type]) #define nmi_to_desc(type) (&nmi_desc[type])
static u64 nmi_longest_ns = 1 * NSEC_PER_MSEC;
static int __init nmi_warning_debugfs(void)
{
debugfs_create_u64("nmi_longest_ns", 0644,
arch_debugfs_dir, &nmi_longest_ns);
return 0;
}
fs_initcall(nmi_warning_debugfs);
static int __kprobes nmi_handle(unsigned int type, struct pt_regs *regs, bool b2b) static int __kprobes nmi_handle(unsigned int type, struct pt_regs *regs, bool b2b)
{ {
struct nmi_desc *desc = nmi_to_desc(type); struct nmi_desc *desc = nmi_to_desc(type);
...@@ -96,8 +109,27 @@ static int __kprobes nmi_handle(unsigned int type, struct pt_regs *regs, bool b2 ...@@ -96,8 +109,27 @@ static int __kprobes nmi_handle(unsigned int type, struct pt_regs *regs, bool b2
* can be latched at any given time. Walk the whole list * can be latched at any given time. Walk the whole list
* to handle those situations. * to handle those situations.
*/ */
list_for_each_entry_rcu(a, &desc->head, list) list_for_each_entry_rcu(a, &desc->head, list) {
handled += a->handler(type, regs); u64 before, delta, whole_msecs;
int decimal_msecs, thishandled;
before = local_clock();
thishandled = a->handler(type, regs);
handled += thishandled;
delta = local_clock() - before;
trace_nmi_handler(a->handler, (int)delta, thishandled);
if (delta < nmi_longest_ns)
continue;
nmi_longest_ns = delta;
whole_msecs = do_div(delta, (1000 * 1000));
decimal_msecs = do_div(delta, 1000) % 1000;
printk_ratelimited(KERN_INFO
"INFO: NMI handler (%ps) took too long to run: "
"%lld.%03d msecs\n", a->handler, whole_msecs,
decimal_msecs);
}
rcu_read_unlock(); rcu_read_unlock();
......
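The warning in `nmi_handle()` prints the handler's run time as whole milliseconds plus a three-digit fraction. In the kernel, `do_div(n, base)` leaves the quotient in `n` and returns the remainder, which is easy to get backwards. The userspace sketch below uses plain 64-bit division to make the intended split explicit; `struct msec_parts` and `ns_to_msec_parts` are hypothetical names, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

struct msec_parts {
	uint64_t whole;     /* whole milliseconds */
	unsigned int frac;  /* remaining microseconds, 0..999 */
};

/* Split a nanosecond duration for printing with "%llu.%03u msecs". */
static struct msec_parts ns_to_msec_parts(uint64_t delta_ns)
{
	struct msec_parts p;

	p.whole = delta_ns / 1000000u;
	p.frac = (unsigned int)((delta_ns % 1000000u) / 1000u);
	assert(p.frac < 1000);  /* always fits three digits */
	return p;
}
```

For example, 1234567 ns splits into 1 whole millisecond and a fraction of 234.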
...@@ -43,12 +43,6 @@ ...@@ -43,12 +43,6 @@
#include <asm/sigframe.h> #include <asm/sigframe.h>
#ifdef CONFIG_X86_32
# define FIX_EFLAGS (__FIX_EFLAGS | X86_EFLAGS_RF)
#else
# define FIX_EFLAGS __FIX_EFLAGS
#endif
#define COPY(x) do { \ #define COPY(x) do { \
get_user_ex(regs->x, &sc->x); \ get_user_ex(regs->x, &sc->x); \
} while (0) } while (0)
...@@ -668,15 +662,17 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs) ...@@ -668,15 +662,17 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
if (!failed) { if (!failed) {
/* /*
* Clear the direction flag as per the ABI for function entry. * Clear the direction flag as per the ABI for function entry.
*/ *
regs->flags &= ~X86_EFLAGS_DF; * Clear RF when entering the signal handler, because
/* * it might disable possible debug exception from the
* signal handler.
*
* Clear TF when entering the signal handler, but * Clear TF when entering the signal handler, but
* notify any tracer that was single-stepping it. * notify any tracer that was single-stepping it.
* The tracer may want to single-step inside the * The tracer may want to single-step inside the
* handler too. * handler too.
*/ */
regs->flags &= ~X86_EFLAGS_TF; regs->flags &= ~(X86_EFLAGS_DF|X86_EFLAGS_RF|X86_EFLAGS_TF);
} }
signal_setup_done(failed, ksig, test_thread_flag(TIF_SINGLESTEP)); signal_setup_done(failed, ksig, test_thread_flag(TIF_SINGLESTEP));
} }
......
...@@ -99,7 +99,7 @@ struct ivhd_header { ...@@ -99,7 +99,7 @@ struct ivhd_header {
u64 mmio_phys; u64 mmio_phys;
u16 pci_seg; u16 pci_seg;
u16 info; u16 info;
u32 reserved; u32 efr;
} __attribute__((packed)); } __attribute__((packed));
/* /*
...@@ -154,6 +154,7 @@ bool amd_iommu_iotlb_sup __read_mostly = true; ...@@ -154,6 +154,7 @@ bool amd_iommu_iotlb_sup __read_mostly = true;
u32 amd_iommu_max_pasids __read_mostly = ~0; u32 amd_iommu_max_pasids __read_mostly = ~0;
bool amd_iommu_v2_present __read_mostly; bool amd_iommu_v2_present __read_mostly;
bool amd_iommu_pc_present __read_mostly;
bool amd_iommu_force_isolation __read_mostly; bool amd_iommu_force_isolation __read_mostly;
...@@ -369,23 +370,23 @@ static void iommu_disable(struct amd_iommu *iommu) ...@@ -369,23 +370,23 @@ static void iommu_disable(struct amd_iommu *iommu)
* mapping and unmapping functions for the IOMMU MMIO space. Each AMD IOMMU in * mapping and unmapping functions for the IOMMU MMIO space. Each AMD IOMMU in
* the system has one. * the system has one.
*/ */
static u8 __iomem * __init iommu_map_mmio_space(u64 address) static u8 __iomem * __init iommu_map_mmio_space(u64 address, u64 end)
{ {
if (!request_mem_region(address, MMIO_REGION_LENGTH, "amd_iommu")) { if (!request_mem_region(address, end, "amd_iommu")) {
pr_err("AMD-Vi: Can not reserve memory region %llx for mmio\n", pr_err("AMD-Vi: Can not reserve memory region %llx-%llx for mmio\n",
address); address, end);
pr_err("AMD-Vi: This is a BIOS bug. Please contact your hardware vendor\n"); pr_err("AMD-Vi: This is a BIOS bug. Please contact your hardware vendor\n");
return NULL; return NULL;
} }
return (u8 __iomem *)ioremap_nocache(address, MMIO_REGION_LENGTH); return (u8 __iomem *)ioremap_nocache(address, end);
} }
static void __init iommu_unmap_mmio_space(struct amd_iommu *iommu) static void __init iommu_unmap_mmio_space(struct amd_iommu *iommu)
{ {
if (iommu->mmio_base) if (iommu->mmio_base)
iounmap(iommu->mmio_base); iounmap(iommu->mmio_base);
release_mem_region(iommu->mmio_phys, MMIO_REGION_LENGTH); release_mem_region(iommu->mmio_phys, iommu->mmio_phys_end);
} }
/**************************************************************************** /****************************************************************************
...@@ -1085,7 +1086,18 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h) ...@@ -1085,7 +1086,18 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h)
iommu->cap_ptr = h->cap_ptr; iommu->cap_ptr = h->cap_ptr;
iommu->pci_seg = h->pci_seg; iommu->pci_seg = h->pci_seg;
iommu->mmio_phys = h->mmio_phys; iommu->mmio_phys = h->mmio_phys;
iommu->mmio_base = iommu_map_mmio_space(h->mmio_phys);
/* Check if IVHD EFR contains proper max banks/counters */
if ((h->efr != 0) &&
((h->efr & (0xF << 13)) != 0) &&
((h->efr & (0x3F << 17)) != 0)) {
iommu->mmio_phys_end = MMIO_REG_END_OFFSET;
} else {
iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET;
}
iommu->mmio_base = iommu_map_mmio_space(iommu->mmio_phys,
iommu->mmio_phys_end);
if (!iommu->mmio_base) if (!iommu->mmio_base)
return -ENOMEM; return -ENOMEM;
...@@ -1160,6 +1172,33 @@ static int __init init_iommu_all(struct acpi_table_header *table) ...@@ -1160,6 +1172,33 @@ static int __init init_iommu_all(struct acpi_table_header *table)
return 0; return 0;
} }
static void init_iommu_perf_ctr(struct amd_iommu *iommu)
{
u64 val = 0xabcd, val2 = 0;
if (!iommu_feature(iommu, FEATURE_PC))
return;
amd_iommu_pc_present = true;
/* Check if the performance counters can be written to */
if ((0 != amd_iommu_pc_get_set_reg_val(0, 0, 0, 0, &val, true)) ||
(0 != amd_iommu_pc_get_set_reg_val(0, 0, 0, 0, &val2, false)) ||
(val != val2)) {
pr_err("AMD-Vi: Unable to write to IOMMU perf counter.\n");
amd_iommu_pc_present = false;
return;
}
pr_info("AMD-Vi: IOMMU performance counters supported\n");
val = readl(iommu->mmio_base + MMIO_CNTR_CONF_OFFSET);
iommu->max_banks = (u8) ((val >> 12) & 0x3f);
iommu->max_counters = (u8) ((val >> 7) & 0xf);
}
static int iommu_init_pci(struct amd_iommu *iommu) static int iommu_init_pci(struct amd_iommu *iommu)
{ {
int cap_ptr = iommu->cap_ptr; int cap_ptr = iommu->cap_ptr;
...@@ -1226,6 +1265,8 @@ static int iommu_init_pci(struct amd_iommu *iommu) ...@@ -1226,6 +1265,8 @@ static int iommu_init_pci(struct amd_iommu *iommu)
if (iommu->cap & (1UL << IOMMU_CAP_NPCACHE)) if (iommu->cap & (1UL << IOMMU_CAP_NPCACHE))
amd_iommu_np_cache = true; amd_iommu_np_cache = true;
init_iommu_perf_ctr(iommu);
if (is_rd890_iommu(iommu->dev)) { if (is_rd890_iommu(iommu->dev)) {
int i, j; int i, j;
...@@ -1278,7 +1319,7 @@ static void print_iommu_info(void) ...@@ -1278,7 +1319,7 @@ static void print_iommu_info(void)
if (iommu_feature(iommu, (1ULL << i))) if (iommu_feature(iommu, (1ULL << i)))
pr_cont(" %s", feat_str[i]); pr_cont(" %s", feat_str[i]);
} }
pr_cont("\n"); pr_cont("\n");
} }
} }
if (irq_remapping_enabled) if (irq_remapping_enabled)
...@@ -2232,3 +2273,84 @@ bool amd_iommu_v2_supported(void) ...@@ -2232,3 +2273,84 @@ bool amd_iommu_v2_supported(void)
return amd_iommu_v2_present; return amd_iommu_v2_present;
} }
EXPORT_SYMBOL(amd_iommu_v2_supported); EXPORT_SYMBOL(amd_iommu_v2_supported);
/****************************************************************************
*
* IOMMU EFR Performance Counter support functionality. This code allows
* access to the IOMMU PC functionality.
*
****************************************************************************/
u8 amd_iommu_pc_get_max_banks(u16 devid)
{
struct amd_iommu *iommu;
u8 ret = 0;
/* locate the iommu governing the devid */
iommu = amd_iommu_rlookup_table[devid];
if (iommu)
ret = iommu->max_banks;
return ret;
}
EXPORT_SYMBOL(amd_iommu_pc_get_max_banks);
bool amd_iommu_pc_supported(void)
{
return amd_iommu_pc_present;
}
EXPORT_SYMBOL(amd_iommu_pc_supported);
u8 amd_iommu_pc_get_max_counters(u16 devid)
{
struct amd_iommu *iommu;
u8 ret = 0;
/* locate the iommu governing the devid */
iommu = amd_iommu_rlookup_table[devid];
if (iommu)
ret = iommu->max_counters;
return ret;
}
EXPORT_SYMBOL(amd_iommu_pc_get_max_counters);
int amd_iommu_pc_get_set_reg_val(u16 devid, u8 bank, u8 cntr, u8 fxn,
u64 *value, bool is_write)
{
struct amd_iommu *iommu;
u32 offset;
u32 max_offset_lim;
/* Make sure the IOMMU PC resource is available */
if (!amd_iommu_pc_present)
return -ENODEV;
/* Locate the iommu associated with the device ID */
iommu = amd_iommu_rlookup_table[devid];
/* Check for valid iommu and pc register indexing */
if (WARN_ON((iommu == NULL) || (fxn > 0x28) || (fxn & 7)))
return -ENODEV;
offset = (u32)(((0x40|bank) << 12) | (cntr << 8) | fxn);
/* Limit the offset to the hw defined mmio region aperture */
max_offset_lim = (u32)(((0x40|iommu->max_banks) << 12) |
(iommu->max_counters << 8) | 0x28);
if ((offset < MMIO_CNTR_REG_OFFSET) ||
(offset > max_offset_lim))
return -EINVAL;
if (is_write) {
writel((u32)*value, iommu->mmio_base + offset);
writel((*value >> 32), iommu->mmio_base + offset + 4);
} else {
*value = readl(iommu->mmio_base + offset + 4);
*value <<= 32;
*value = readl(iommu->mmio_base + offset);
}
return 0;
}
EXPORT_SYMBOL(amd_iommu_pc_get_set_reg_val);
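The register addressing in `amd_iommu_pc_get_set_reg_val()` packs bank, counter and function into one MMIO offset, and a 64-bit counter value is transferred as two 32-bit words. The sketch below reproduces the offset arithmetic from the hunk above and shows one way to assemble a 64-bit value from the two halves, combining them with OR so the shifted high word is preserved. The helper names are hypothetical and no MMIO access is performed.

```c
#include <assert.h>
#include <stdint.h>

/* fxn must be a multiple of 8 and at most 0x28, as checked in the patch. */
static int pc_fxn_valid(uint8_t fxn)
{
	return fxn <= 0x28 && (fxn & 7) == 0;
}

/* Offset layout: (0x40 | bank) in bits 12 and up, cntr in bits 8..11. */
static uint32_t pc_reg_offset(uint8_t bank, uint8_t cntr, uint8_t fxn)
{
	assert(pc_fxn_valid(fxn));
	return ((uint32_t)(0x40u | bank) << 12) | ((uint32_t)cntr << 8) | fxn;
}

/* Build a 64-bit value from the low word and the word at offset + 4. */
static uint64_t combine_halves(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}
```

Bank 0, counter 0, function 0 yields `0x40000`, which matches `MMIO_CNTR_REG_OFFSET` in the header hunk further down.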
...@@ -56,6 +56,13 @@ extern int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, int pasid, ...@@ -56,6 +56,13 @@ extern int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, int pasid,
extern int amd_iommu_domain_clear_gcr3(struct iommu_domain *dom, int pasid); extern int amd_iommu_domain_clear_gcr3(struct iommu_domain *dom, int pasid);
extern struct iommu_domain *amd_iommu_get_v2_domain(struct pci_dev *pdev); extern struct iommu_domain *amd_iommu_get_v2_domain(struct pci_dev *pdev);
/* IOMMU Performance Counter functions */
extern bool amd_iommu_pc_supported(void);
extern u8 amd_iommu_pc_get_max_banks(u16 devid);
extern u8 amd_iommu_pc_get_max_counters(u16 devid);
extern int amd_iommu_pc_get_set_reg_val(u16 devid, u8 bank, u8 cntr, u8 fxn,
u64 *value, bool is_write);
#define PPR_SUCCESS 0x0 #define PPR_SUCCESS 0x0
#define PPR_INVALID 0x1 #define PPR_INVALID 0x1
#define PPR_FAILURE 0xf #define PPR_FAILURE 0xf
......
...@@ -38,9 +38,6 @@ ...@@ -38,9 +38,6 @@
#define ALIAS_TABLE_ENTRY_SIZE 2 #define ALIAS_TABLE_ENTRY_SIZE 2
#define RLOOKUP_TABLE_ENTRY_SIZE (sizeof(void *)) #define RLOOKUP_TABLE_ENTRY_SIZE (sizeof(void *))
/* Length of the MMIO region for the AMD IOMMU */
#define MMIO_REGION_LENGTH 0x4000
/* Capability offsets used by the driver */ /* Capability offsets used by the driver */
#define MMIO_CAP_HDR_OFFSET 0x00 #define MMIO_CAP_HDR_OFFSET 0x00
#define MMIO_RANGE_OFFSET 0x0c #define MMIO_RANGE_OFFSET 0x0c
...@@ -78,6 +75,10 @@ ...@@ -78,6 +75,10 @@
#define MMIO_STATUS_OFFSET 0x2020 #define MMIO_STATUS_OFFSET 0x2020
#define MMIO_PPR_HEAD_OFFSET 0x2030 #define MMIO_PPR_HEAD_OFFSET 0x2030
#define MMIO_PPR_TAIL_OFFSET 0x2038 #define MMIO_PPR_TAIL_OFFSET 0x2038
#define MMIO_CNTR_CONF_OFFSET 0x4000
#define MMIO_CNTR_REG_OFFSET 0x40000
#define MMIO_REG_END_OFFSET 0x80000
/* Extended Feature Bits */ /* Extended Feature Bits */
...@@ -507,6 +508,10 @@ struct amd_iommu { ...@@ -507,6 +508,10 @@ struct amd_iommu {
/* physical address of MMIO space */ /* physical address of MMIO space */
u64 mmio_phys; u64 mmio_phys;
/* physical end address of MMIO space */
u64 mmio_phys_end;
/* virtual address of MMIO space */ /* virtual address of MMIO space */
u8 __iomem *mmio_base; u8 __iomem *mmio_base;
...@@ -584,6 +589,10 @@ struct amd_iommu { ...@@ -584,6 +589,10 @@ struct amd_iommu {
/* The l2 indirect registers */ /* The l2 indirect registers */
u32 stored_l2[0x83]; u32 stored_l2[0x83];
/* The maximum PC banks and counters/bank (PCSup=1) */
u8 max_banks;
u8 max_counters;
}; };
struct devid_map { struct devid_map {
......
...@@ -73,13 +73,18 @@ struct perf_raw_record { ...@@ -73,13 +73,18 @@ struct perf_raw_record {
* *
* support for mispred, predicted is optional. In case it * support for mispred, predicted is optional. In case it
* is not supported mispred = predicted = 0. * is not supported mispred = predicted = 0.
*
* in_tx: running in a hardware transaction
* abort: aborting a hardware transaction
*/ */
struct perf_branch_entry { struct perf_branch_entry {
__u64 from; __u64 from;
__u64 to; __u64 to;
__u64 mispred:1, /* target mispredicted */ __u64 mispred:1, /* target mispredicted */
predicted:1,/* target predicted */ predicted:1,/* target predicted */
reserved:62; in_tx:1, /* in transaction */
abort:1, /* transaction abort */
reserved:60;
}; };
/* /*
...@@ -113,6 +118,8 @@ struct hw_perf_event_extra { ...@@ -113,6 +118,8 @@ struct hw_perf_event_extra {
int idx; /* index in shared_regs->regs[] */ int idx; /* index in shared_regs->regs[] */
}; };
struct event_constraint;
/** /**
* struct hw_perf_event - performance event hardware details: * struct hw_perf_event - performance event hardware details:
*/ */
...@@ -131,6 +138,8 @@ struct hw_perf_event { ...@@ -131,6 +138,8 @@ struct hw_perf_event {
struct hw_perf_event_extra extra_reg; struct hw_perf_event_extra extra_reg;
struct hw_perf_event_extra branch_reg; struct hw_perf_event_extra branch_reg;
struct event_constraint *constraint;
}; };
struct { /* software */ struct { /* software */
struct hrtimer hrtimer; struct hrtimer hrtimer;
...@@ -188,12 +197,13 @@ struct pmu { ...@@ -188,12 +197,13 @@ struct pmu {
struct device *dev; struct device *dev;
const struct attribute_group **attr_groups; const struct attribute_group **attr_groups;
char *name; const char *name;
int type; int type;
int * __percpu pmu_disable_count; int * __percpu pmu_disable_count;
struct perf_cpu_context * __percpu pmu_cpu_context; struct perf_cpu_context * __percpu pmu_cpu_context;
int task_ctx_nr; int task_ctx_nr;
int hrtimer_interval_ms;
/* /*
* Fully disable/enable this PMU, can be used to protect from the PMI * Fully disable/enable this PMU, can be used to protect from the PMI
...@@ -500,8 +510,9 @@ struct perf_cpu_context { ...@@ -500,8 +510,9 @@ struct perf_cpu_context {
struct perf_event_context *task_ctx; struct perf_event_context *task_ctx;
int active_oncpu; int active_oncpu;
int exclusive; int exclusive;
struct hrtimer hrtimer;
ktime_t hrtimer_interval;
struct list_head rotation_list; struct list_head rotation_list;
int jiffies_interval;
struct pmu *unique_pmu; struct pmu *unique_pmu;
struct perf_cgroup *cgrp; struct perf_cgroup *cgrp;
}; };
...@@ -517,7 +528,7 @@ struct perf_output_handle { ...@@ -517,7 +528,7 @@ struct perf_output_handle {
#ifdef CONFIG_PERF_EVENTS #ifdef CONFIG_PERF_EVENTS
extern int perf_pmu_register(struct pmu *pmu, char *name, int type); extern int perf_pmu_register(struct pmu *pmu, const char *name, int type);
extern void perf_pmu_unregister(struct pmu *pmu); extern void perf_pmu_unregister(struct pmu *pmu);
extern int perf_num_counters(void); extern int perf_num_counters(void);
...@@ -695,10 +706,17 @@ static inline void perf_callchain_store(struct perf_callchain_entry *entry, u64 ...@@ -695,10 +706,17 @@ static inline void perf_callchain_store(struct perf_callchain_entry *entry, u64
extern int sysctl_perf_event_paranoid; extern int sysctl_perf_event_paranoid;
extern int sysctl_perf_event_mlock; extern int sysctl_perf_event_mlock;
extern int sysctl_perf_event_sample_rate; extern int sysctl_perf_event_sample_rate;
extern int sysctl_perf_cpu_time_max_percent;
extern void perf_sample_event_took(u64 sample_len_ns);
extern int perf_proc_update_handler(struct ctl_table *table, int write, extern int perf_proc_update_handler(struct ctl_table *table, int write,
void __user *buffer, size_t *lenp, void __user *buffer, size_t *lenp,
loff_t *ppos); loff_t *ppos);
extern int perf_cpu_time_max_percent_handler(struct ctl_table *table, int write,
void __user *buffer, size_t *lenp,
loff_t *ppos);
static inline bool perf_paranoid_tracepoint_raw(void) static inline bool perf_paranoid_tracepoint_raw(void)
{ {
...@@ -742,6 +760,7 @@ extern unsigned int perf_output_skip(struct perf_output_handle *handle, ...@@ -742,6 +760,7 @@ extern unsigned int perf_output_skip(struct perf_output_handle *handle,
unsigned int len); unsigned int len);
extern int perf_swevent_get_recursion_context(void); extern int perf_swevent_get_recursion_context(void);
extern void perf_swevent_put_recursion_context(int rctx); extern void perf_swevent_put_recursion_context(int rctx);
extern u64 perf_swevent_set_period(struct perf_event *event);
extern void perf_event_enable(struct perf_event *event); extern void perf_event_enable(struct perf_event *event);
extern void perf_event_disable(struct perf_event *event); extern void perf_event_disable(struct perf_event *event);
extern int __perf_event_disable(void *info); extern int __perf_event_disable(void *info);
...@@ -781,6 +800,7 @@ static inline void perf_event_fork(struct task_struct *tsk) { } ...@@ -781,6 +800,7 @@ static inline void perf_event_fork(struct task_struct *tsk) { }
static inline void perf_event_init(void) { } static inline void perf_event_init(void) { }
static inline int perf_swevent_get_recursion_context(void) { return -1; } static inline int perf_swevent_get_recursion_context(void) { return -1; }
static inline void perf_swevent_put_recursion_context(int rctx) { } static inline void perf_swevent_put_recursion_context(int rctx) { }
static inline u64 perf_swevent_set_period(struct perf_event *event) { return 0; }
static inline void perf_event_enable(struct perf_event *event) { } static inline void perf_event_enable(struct perf_event *event) { }
static inline void perf_event_disable(struct perf_event *event) { } static inline void perf_event_disable(struct perf_event *event) { }
static inline int __perf_event_disable(void *info) { return -1; } static inline int __perf_event_disable(void *info) { return -1; }
......
#undef TRACE_SYSTEM
#define TRACE_SYSTEM nmi
#if !defined(_TRACE_NMI_H) || defined(TRACE_HEADER_MULTI_READ)
#define _TRACE_NMI_H
#include <linux/ktime.h>
#include <linux/tracepoint.h>
TRACE_EVENT(nmi_handler,
TP_PROTO(void *handler, s64 delta_ns, int handled),
TP_ARGS(handler, delta_ns, handled),
TP_STRUCT__entry(
__field( void *, handler )
__field( s64, delta_ns)
__field( int, handled )
),
TP_fast_assign(
__entry->handler = handler;
__entry->delta_ns = delta_ns;
__entry->handled = handled;
),
TP_printk("%ps() delta_ns: %lld handled: %d",
__entry->handler,
__entry->delta_ns,
__entry->handled)
);
#endif /* _TRACE_NMI_H */
/* This part must be outside protection */
#include <trace/define_trace.h>
...@@ -157,8 +157,11 @@ enum perf_branch_sample_type { ...@@ -157,8 +157,11 @@ enum perf_branch_sample_type {
PERF_SAMPLE_BRANCH_ANY_CALL = 1U << 4, /* any call branch */ PERF_SAMPLE_BRANCH_ANY_CALL = 1U << 4, /* any call branch */
PERF_SAMPLE_BRANCH_ANY_RETURN = 1U << 5, /* any return branch */ PERF_SAMPLE_BRANCH_ANY_RETURN = 1U << 5, /* any return branch */
PERF_SAMPLE_BRANCH_IND_CALL = 1U << 6, /* indirect calls */ PERF_SAMPLE_BRANCH_IND_CALL = 1U << 6, /* indirect calls */
PERF_SAMPLE_BRANCH_ABORT_TX = 1U << 7, /* transaction aborts */
PERF_SAMPLE_BRANCH_IN_TX = 1U << 8, /* in transaction */
PERF_SAMPLE_BRANCH_NO_TX = 1U << 9, /* not in transaction */
PERF_SAMPLE_BRANCH_MAX = 1U << 7, /* non-ABI */ PERF_SAMPLE_BRANCH_MAX = 1U << 10, /* non-ABI */
}; };
#define PERF_SAMPLE_BRANCH_PLM_ALL \ #define PERF_SAMPLE_BRANCH_PLM_ALL \
......
...@@ -542,7 +542,6 @@ asmlinkage void __init start_kernel(void) ...@@ -542,7 +542,6 @@ asmlinkage void __init start_kernel(void)
if (WARN(!irqs_disabled(), "Interrupts were enabled *very* early, fixing it\n")) if (WARN(!irqs_disabled(), "Interrupts were enabled *very* early, fixing it\n"))
local_irq_disable(); local_irq_disable();
idr_init_cache(); idr_init_cache();
perf_event_init();
rcu_init(); rcu_init();
tick_nohz_init(); tick_nohz_init();
radix_tree_init(); radix_tree_init();
...@@ -555,6 +554,7 @@ asmlinkage void __init start_kernel(void) ...@@ -555,6 +554,7 @@ asmlinkage void __init start_kernel(void)
softirq_init(); softirq_init();
timekeeping_init(); timekeeping_init();
time_init(); time_init();
perf_event_init();
profile_init(); profile_init();
call_function_init(); call_function_init();
WARN(!irqs_disabled(), "Interrupts were enabled early\n"); WARN(!irqs_disabled(), "Interrupts were enabled early\n");
......
...@@ -120,7 +120,6 @@ extern int blk_iopoll_enabled; ...@@ -120,7 +120,6 @@ extern int blk_iopoll_enabled;
/* Constants used for minimum and maximum */ /* Constants used for minimum and maximum */
#ifdef CONFIG_LOCKUP_DETECTOR #ifdef CONFIG_LOCKUP_DETECTOR
static int sixty = 60; static int sixty = 60;
static int neg_one = -1;
#endif #endif
static int zero; static int zero;
...@@ -814,7 +813,7 @@ static struct ctl_table kern_table[] = { ...@@ -814,7 +813,7 @@ static struct ctl_table kern_table[] = {
.maxlen = sizeof(int), .maxlen = sizeof(int),
.mode = 0644, .mode = 0644,
.proc_handler = proc_dowatchdog, .proc_handler = proc_dowatchdog,
.extra1 = &neg_one, .extra1 = &zero,
.extra2 = &sixty, .extra2 = &sixty,
}, },
{ {
...@@ -1044,6 +1043,15 @@ static struct ctl_table kern_table[] = { ...@@ -1044,6 +1043,15 @@ static struct ctl_table kern_table[] = {
.mode = 0644, .mode = 0644,
.proc_handler = perf_proc_update_handler, .proc_handler = perf_proc_update_handler,
}, },
{
.procname = "perf_cpu_time_max_percent",
.data = &sysctl_perf_cpu_time_max_percent,
.maxlen = sizeof(sysctl_perf_cpu_time_max_percent),
.mode = 0644,
.proc_handler = perf_cpu_time_max_percent_handler,
.extra1 = &zero,
.extra2 = &one_hundred,
},
#endif #endif
#ifdef CONFIG_KMEMCHECK #ifdef CONFIG_KMEMCHECK
{ {
......
include ../../scripts/Makefile.include include ../../scripts/Makefile.include
CC = $(CROSS_COMPILE)gcc
AR = $(CROSS_COMPILE)ar
# guard against environment variables # guard against environment variables
LIB_H= LIB_H=
LIB_OBJS= LIB_OBJS=
......
...@@ -13,7 +13,7 @@ SYNOPSIS ...@@ -13,7 +13,7 @@ SYNOPSIS
DESCRIPTION DESCRIPTION
----------- -----------
This command runs runs perf-buildid-list --with-hits, and collects the files This command runs runs perf-buildid-list --with-hits, and collects the files
with the buildids found so that analisys of perf.data contents can be possible with the buildids found so that analysis of perf.data contents can be possible
on another machine. on another machine.
......
...@@ -210,6 +210,10 @@ OPTIONS ...@@ -210,6 +210,10 @@ OPTIONS
Demangle symbol names to human readable form. It's enabled by default, Demangle symbol names to human readable form. It's enabled by default,
disable with --no-demangle. disable with --no-demangle.
--percent-limit::
Do not show entries which have an overhead under that percent.
(Default: 0).
SEE ALSO SEE ALSO
-------- --------
linkperf:perf-stat[1], linkperf:perf-annotate[1] linkperf:perf-stat[1], linkperf:perf-annotate[1]
...@@ -155,6 +155,10 @@ Default is to monitor all CPUS. ...@@ -155,6 +155,10 @@ Default is to monitor all CPUS.
Default: fractal,0.5,callee. Default: fractal,0.5,callee.
--percent-limit::
Do not show entries which have an overhead under that percent.
(Default: 0).
INTERACTIVE PROMPTING KEYS INTERACTIVE PROMPTING KEYS
-------------------------- --------------------------
......
...@@ -323,13 +323,20 @@ static void hists__baseline_only(struct hists *hists) ...@@ -323,13 +323,20 @@ static void hists__baseline_only(struct hists *hists)
static void hists__precompute(struct hists *hists) static void hists__precompute(struct hists *hists)
{ {
struct rb_node *next = rb_first(&hists->entries); struct rb_root *root;
struct rb_node *next;
if (sort__need_collapse)
root = &hists->entries_collapsed;
else
root = hists->entries_in;
next = rb_first(root);
while (next != NULL) { while (next != NULL) {
struct hist_entry *he = rb_entry(next, struct hist_entry, rb_node); struct hist_entry *he = rb_entry(next, struct hist_entry, rb_node_in);
struct hist_entry *pair = hist_entry__next_pair(he); struct hist_entry *pair = hist_entry__next_pair(he);
next = rb_next(&he->rb_node); next = rb_next(&he->rb_node_in);
if (!pair) if (!pair)
continue; continue;
...@@ -457,7 +464,7 @@ static void hists__process(struct hists *old, struct hists *new) ...@@ -457,7 +464,7 @@ static void hists__process(struct hists *old, struct hists *new)
hists__output_resort(new); hists__output_resort(new);
} }
hists__fprintf(new, true, 0, 0, stdout); hists__fprintf(new, true, 0, 0, 0, stdout);
} }
static int __cmd_diff(void) static int __cmd_diff(void)
...@@ -611,9 +618,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix __maybe_unused) ...@@ -611,9 +618,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix __maybe_unused)
setup_pager(); setup_pager();
sort_entry__setup_elide(&sort_dso, symbol_conf.dso_list, "dso", NULL); sort__setup_elide(NULL);
sort_entry__setup_elide(&sort_comm, symbol_conf.comm_list, "comm", NULL);
sort_entry__setup_elide(&sort_sym, symbol_conf.sym_list, "symbol", NULL);
return __cmd_diff(); return __cmd_diff();
} }
...@@ -328,6 +328,7 @@ static int kvm_events_hash_fn(u64 key) ...@@ -328,6 +328,7 @@ static int kvm_events_hash_fn(u64 key)
static bool kvm_event_expand(struct kvm_event *event, int vcpu_id) static bool kvm_event_expand(struct kvm_event *event, int vcpu_id)
{ {
int old_max_vcpu = event->max_vcpu; int old_max_vcpu = event->max_vcpu;
void *prev;
if (vcpu_id < event->max_vcpu) if (vcpu_id < event->max_vcpu)
return true; return true;
...@@ -335,9 +336,11 @@ static bool kvm_event_expand(struct kvm_event *event, int vcpu_id) ...@@ -335,9 +336,11 @@ static bool kvm_event_expand(struct kvm_event *event, int vcpu_id)
while (event->max_vcpu <= vcpu_id) while (event->max_vcpu <= vcpu_id)
event->max_vcpu += DEFAULT_VCPU_NUM; event->max_vcpu += DEFAULT_VCPU_NUM;
prev = event->vcpu;
event->vcpu = realloc(event->vcpu, event->vcpu = realloc(event->vcpu,
event->max_vcpu * sizeof(*event->vcpu)); event->max_vcpu * sizeof(*event->vcpu));
if (!event->vcpu) { if (!event->vcpu) {
free(prev);
pr_err("Not enough memory\n"); pr_err("Not enough memory\n");
return false; return false;
} }
......
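The `kvm_event_expand()` hunk above keeps the old pointer in `prev` so it can be freed when `realloc()` fails, since a failed `realloc()` returns NULL and leaves the original block allocated. A common variation, sketched below, assigns the result to a temporary so the caller's buffer stays valid on failure. `struct vec` and `vec_grow` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct vec {
	int *data;
	size_t cap;
};

/* Grow v to new_cap elements; returns 0 on success, -1 on failure. */
static int vec_grow(struct vec *v, size_t new_cap)
{
	int *tmp;

	assert(v != NULL);

	/* Reject sizes whose byte count would overflow size_t. */
	if (new_cap > SIZE_MAX / sizeof(*tmp))
		return -1;

	tmp = realloc(v->data, new_cap * sizeof(*tmp));
	if (!tmp)
		return -1;  /* v->data is untouched and still owned by v */

	v->data = tmp;
	v->cap = new_cap;
	return 0;
}
```

Which variant fits depends on whether the caller can continue with the smaller buffer; the patch frees it because the caller treats the failure as fatal for that event.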
@@ -198,7 +198,6 @@ static void perf_record__sig_exit(int exit_status __maybe_unused, void *arg)
 		return;
 	signal(signr, SIG_DFL);
-	kill(getpid(), signr);
 }
 static bool perf_evlist__equal(struct perf_evlist *evlist,
@@ -404,6 +403,7 @@ static int __cmd_record(struct perf_record *rec, int argc, const char **argv)
 	signal(SIGCHLD, sig_handler);
 	signal(SIGINT, sig_handler);
 	signal(SIGUSR1, sig_handler);
+	signal(SIGTERM, sig_handler);
 	if (!output_name) {
 		if (!fstat(STDOUT_FILENO, &st) && S_ISFIFO(st.st_mode))
......
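The hunk above registers the same handler for SIGTERM that 'perf record' already uses for SIGINT and SIGUSR1. The handler body is not part of this diff, so the sketch below assumes a handler that only records which signal arrived and sets a flag for the main loop to poll. The `signr` name appears in the diff; `done` and `install_handlers` are illustrative assumptions.

```c
#include <signal.h>

/* Flags written from the handler; sig_atomic_t keeps the stores async-signal-safe. */
static volatile sig_atomic_t done;
static volatile sig_atomic_t signr = -1;

/* Assumed handler shape: remember the signal and ask the main loop to stop. */
static void sig_handler(int sig)
{
	done = 1;
	signr = sig;
}

/* Route SIGTERM through the same handler as the other termination signals. */
static int install_handlers(void)
{
	if (signal(SIGINT, sig_handler) == SIG_ERR)
		return -1;
	if (signal(SIGUSR1, sig_handler) == SIG_ERR)
		return -1;
	if (signal(SIGTERM, sig_handler) == SIG_ERR)
		return -1;
	return 0;
}
```

With this in place a plain `kill <pid>` lets the main loop see `done` and finish writing its output file rather than dying mid-write.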
@@ -52,6 +52,7 @@ struct perf_report {
 	symbol_filter_t		annotate_init;
 	const char		*cpu_list;
 	const char		*symbol_filter_str;
+	float			min_percent;
 	DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
 };
@@ -61,6 +62,11 @@ static int perf_report_config(const char *var, const char *value, void *cb)
 		symbol_conf.event_group = perf_config_bool(var, value);
 		return 0;
 	}
+	if (!strcmp(var, "report.percent-limit")) {
+		struct perf_report *rep = cb;
+		rep->min_percent = strtof(value, NULL);
+		return 0;
+	}
 	return perf_default_config(var, value, cb);
 }
@@ -187,6 +193,9 @@ static int perf_report__add_branch_hist_entry(struct perf_tool *tool,
 	for (i = 0; i < sample->branch_stack->nr; i++) {
 		if (rep->hide_unresolved && !(bi[i].from.sym && bi[i].to.sym))
 			continue;
+		err = -ENOMEM;
 		/*
 		 * The report shows the percentage of total branches captured
 		 * and not events sampled. Thus we use a pseudo period of 1.
@@ -195,7 +204,6 @@ static int perf_report__add_branch_hist_entry(struct perf_tool *tool,
 				&bi[i], 1, 1);
 		if (he) {
 			struct annotation *notes;
-			err = -ENOMEM;
 			bx = he->branch_info;
 			if (bx->from.sym && use_browser == 1 && sort__has_sym) {
 				notes = symbol__annotation(bx->from.sym);
@@ -226,11 +234,12 @@ static int perf_report__add_branch_hist_entry(struct perf_tool *tool,
 			}
 			evsel->hists.stats.total_period += 1;
 			hists__inc_nr_events(&evsel->hists, PERF_RECORD_SAMPLE);
-			err = 0;
 		} else
-			return -ENOMEM;
+			goto out;
 	}
+	err = 0;
 out:
+	free(bi);
 	return err;
 }
@@ -294,6 +303,7 @@ static int process_sample_event(struct perf_tool *tool,
 {
 	struct perf_report *rep = container_of(tool, struct perf_report, tool);
 	struct addr_location al;
+	int ret;
 	if (perf_event__preprocess_sample(event, machine, &al, sample,
 					  rep->annotate_init) < 0) {
@@ -308,28 +318,25 @@ static int process_sample_event(struct perf_tool *tool,
 	if (rep->cpu_list && !test_bit(sample->cpu, rep->cpu_bitmap))
 		return 0;
-	if (sort__branch_mode == 1) {
-		if (perf_report__add_branch_hist_entry(tool, &al, sample,
-						       evsel, machine)) {
+	if (sort__mode == SORT_MODE__BRANCH) {
+		ret = perf_report__add_branch_hist_entry(tool, &al, sample,
+							 evsel, machine);
+		if (ret < 0)
 			pr_debug("problem adding lbr entry, skipping event\n");
-			return -1;
-		}
 	} else if (rep->mem_mode == 1) {
-		if (perf_report__add_mem_hist_entry(tool, &al, sample,
-						    evsel, machine, event)) {
+		ret = perf_report__add_mem_hist_entry(tool, &al, sample,
+						      evsel, machine, event);
+		if (ret < 0)
 			pr_debug("problem adding mem entry, skipping event\n");
-			return -1;
-		}
 	} else {
 		if (al.map != NULL)
 			al.map->dso->hit = 1;
-		if (perf_evsel__add_hist_entry(evsel, &al, sample, machine)) {
+		ret = perf_evsel__add_hist_entry(evsel, &al, sample, machine);
+		if (ret < 0)
 			pr_debug("problem incrementing symbol period, skipping event\n");
-			return -1;
-		}
 	}
-	return 0;
+	return ret;
 }
 static int process_read_event(struct perf_tool *tool,
@@ -384,7 +391,7 @@ static int perf_report__setup_sample_type(struct perf_report *rep)
 		}
 	}
-	if (sort__branch_mode == 1) {
+	if (sort__mode == SORT_MODE__BRANCH) {
 		if (!self->fd_pipe &&
 		    !(sample_type & PERF_SAMPLE_BRANCH_STACK)) {
 			ui__error("Selected -b but no branch data. "
@@ -455,7 +462,7 @@ static int perf_evlist__tty_browse_hists(struct perf_evlist *evlist,
 			continue;
 		hists__fprintf_nr_sample_events(rep, hists, evname, stdout);
-		hists__fprintf(hists, true, 0, 0, stdout);
+		hists__fprintf(hists, true, 0, 0, rep->min_percent, stdout);
 		fprintf(stdout, "\n\n");
 	}
@@ -574,8 +581,8 @@ static int __cmd_report(struct perf_report *rep)
 	if (use_browser > 0) {
 		if (use_browser == 1) {
 			ret = perf_evlist__tui_browse_hists(session->evlist,
-							help,
-							NULL,
+							help, NULL,
+							rep->min_percent,
 							&session->header.env);
 			/*
 			 * Usually "ret" is the last pressed key, and we only
@@ -586,7 +593,7 @@ static int __cmd_report(struct perf_report *rep)
 		} else if (use_browser == 2) {
 			perf_evlist__gtk_browse_hists(session->evlist, help,
-						      NULL);
+						      NULL, rep->min_percent);
 		}
 	} else
 		perf_evlist__tty_browse_hists(session->evlist, rep, help);
@@ -691,7 +698,19 @@ static int
 parse_branch_mode(const struct option *opt __maybe_unused,
 		  const char *str __maybe_unused, int unset)
 {
-	sort__branch_mode = !unset;
+	int *branch_mode = opt->value;
+	*branch_mode = !unset;
+	return 0;
+}
+static int
+parse_percent_limit(const struct option *opt, const char *str,
+		    int unset __maybe_unused)
+{
+	struct perf_report *rep = opt->value;
+	rep->min_percent = strtof(str, NULL);
 	return 0;
 }
@@ -700,6 +719,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 	struct perf_session *session;
 	struct stat st;
 	bool has_br_stack = false;
+	int branch_mode = -1;
 	int ret = -1;
 	char callchain_default_opt[] = "fractal,0.5,callee";
 	const char * const report_usage[] = {
@@ -796,17 +816,19 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 		    "Show a column with the sum of periods"),
 	OPT_BOOLEAN(0, "group", &symbol_conf.event_group,
 		    "Show event group information together"),
-	OPT_CALLBACK_NOOPT('b', "branch-stack", &sort__branch_mode, "",
+	OPT_CALLBACK_NOOPT('b', "branch-stack", &branch_mode, "",
 		    "use branch records for histogram filling", parse_branch_mode),
 	OPT_STRING(0, "objdump", &objdump_path, "path",
 		   "objdump binary to use for disassembly and annotations"),
 	OPT_BOOLEAN(0, "demangle", &symbol_conf.demangle,
 		    "Disable symbol demangling"),
 	OPT_BOOLEAN(0, "mem-mode", &report.mem_mode, "mem access profile"),
+	OPT_CALLBACK(0, "percent-limit", &report, "percent",
+		     "Don't show entries under that percent", parse_percent_limit),
 	OPT_END()
 	};
-	perf_config(perf_report_config, NULL);
+	perf_config(perf_report_config, &report);
 	argc = parse_options(argc, argv, options, report_usage, 0);
@@ -846,11 +868,11 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 	has_br_stack = perf_header__has_feat(&session->header,
 					     HEADER_BRANCH_STACK);
-	if (sort__branch_mode == -1 && has_br_stack)
-		sort__branch_mode = 1;
+	if (branch_mode == -1 && has_br_stack)
+		sort__mode = SORT_MODE__BRANCH;
-	/* sort__branch_mode could be 0 if --no-branch-stack */
-	if (sort__branch_mode == 1) {
+	/* sort__mode could be NORMAL if --no-branch-stack */
+	if (sort__mode == SORT_MODE__BRANCH) {
 		/*
 		 * if no sort_order is provided, then specify
 		 * branch-mode specific order
@@ -861,10 +883,12 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 	}
 	if (report.mem_mode) {
-		if (sort__branch_mode == 1) {
+		if (sort__mode == SORT_MODE__BRANCH) {
 			fprintf(stderr, "branch and mem mode incompatible\n");
 			goto error;
 		}
+		sort__mode = SORT_MODE__MEMORY;
 		/*
 		 * if no sort_order is provided, then specify
 		 * branch-mode specific order
@@ -929,25 +953,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 		report.symbol_filter_str = argv[0];
 	}
-	sort_entry__setup_elide(&sort_comm, symbol_conf.comm_list, "comm", stdout);
-	if (sort__branch_mode == 1) {
-		sort_entry__setup_elide(&sort_dso_from, symbol_conf.dso_from_list, "dso_from", stdout);
-		sort_entry__setup_elide(&sort_dso_to, symbol_conf.dso_to_list, "dso_to", stdout);
-		sort_entry__setup_elide(&sort_sym_from, symbol_conf.sym_from_list, "sym_from", stdout);
-		sort_entry__setup_elide(&sort_sym_to, symbol_conf.sym_to_list, "sym_to", stdout);
-	} else {
-		if (report.mem_mode) {
-			sort_entry__setup_elide(&sort_dso, symbol_conf.dso_list, "symbol_daddr", stdout);
-			sort_entry__setup_elide(&sort_dso, symbol_conf.dso_list, "dso_daddr", stdout);
-			sort_entry__setup_elide(&sort_dso, symbol_conf.dso_list, "mem", stdout);
-			sort_entry__setup_elide(&sort_dso, symbol_conf.dso_list, "local_weight", stdout);
-			sort_entry__setup_elide(&sort_dso, symbol_conf.dso_list, "tlb", stdout);
-			sort_entry__setup_elide(&sort_dso, symbol_conf.dso_list, "snoop", stdout);
-		}
-		sort_entry__setup_elide(&sort_dso, symbol_conf.dso_list, "dso", stdout);
-		sort_entry__setup_elide(&sort_sym, symbol_conf.sym_list, "symbol", stdout);
-	}
+	sort__setup_elide(stdout);
 	ret = __cmd_report(&report);
 	if (ret == K_SWITCH_INPUT_DATA) {
......
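The --percent-limit option added to 'perf report' above parses its argument with strtof() and hands `min_percent` to the histogram printers. The filtering itself lives in files whose diffs are collapsed on this page, so the sketch below is an assumption about how such a threshold is typically applied. `report_opts`, `set_percent_limit` and `entry_visible` are hypothetical names, not perf functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the min_percent field added to struct perf_report. */
struct report_opts {
	float min_percent;
};

/* Parse the option string the same way the patch does, with strtof(). */
static void set_percent_limit(struct report_opts *opts, const char *str)
{
	assert(opts != NULL && str != NULL);
	opts->min_percent = strtof(str, NULL);
}

/*
 * Decide whether an entry that accounts for `period` out of `total`
 * should be shown. Entries strictly under the limit are hidden, which
 * matches the help text "Don't show entries under that percent".
 */
static bool entry_visible(const struct report_opts *opts,
			  unsigned long long period,
			  unsigned long long total)
{
	double percent;

	if (total == 0)
		return true; /* nothing to compare against; arbitrary choice */

	percent = 100.0 * (double)period / (double)total;
	return percent >= opts->min_percent;
}
```

Because strtof() is called with a NULL end pointer, malformed input such as "abc" silently becomes 0, which disables the filter rather than reporting an error.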
@@ -70,10 +70,11 @@
 static volatile int done;
+#define HEADER_LINE_NR  5
 static void perf_top__update_print_entries(struct perf_top *top)
 {
-	if (top->print_entries > 9)
-		top->print_entries -= 9;
+	top->print_entries = top->winsize.ws_row - HEADER_LINE_NR;
 }
 static void perf_top__sig_winch(int sig __maybe_unused,
@@ -82,13 +83,6 @@ static void perf_top__sig_winch(int sig __maybe_unused,
 	struct perf_top *top = arg;
 	get_term_dimensions(&top->winsize);
-	if (!top->print_entries
-	    || (top->print_entries+4) > top->winsize.ws_row) {
-		top->print_entries = top->winsize.ws_row;
-	} else {
-		top->print_entries += 4;
-		top->winsize.ws_row = top->print_entries;
-	}
 	perf_top__update_print_entries(top);
 }
@@ -251,8 +245,11 @@ static struct hist_entry *perf_evsel__add_hist_entry(struct perf_evsel *evsel,
 {
 	struct hist_entry *he;
+	pthread_mutex_lock(&evsel->hists.lock);
 	he = __hists__add_entry(&evsel->hists, al, NULL, sample->period,
 				sample->weight);
+	pthread_mutex_unlock(&evsel->hists.lock);
 	if (he == NULL)
 		return NULL;
@@ -290,16 +287,17 @@ static void perf_top__print_sym_table(struct perf_top *top)
 		return;
 	}
-	hists__collapse_resort_threaded(&top->sym_evsel->hists);
-	hists__output_resort_threaded(&top->sym_evsel->hists);
-	hists__decay_entries_threaded(&top->sym_evsel->hists,
+	hists__collapse_resort(&top->sym_evsel->hists);
+	hists__output_resort(&top->sym_evsel->hists);
+	hists__decay_entries(&top->sym_evsel->hists,
 			     top->hide_user_symbols,
 			     top->hide_kernel_symbols);
 	hists__output_recalc_col_len(&top->sym_evsel->hists,
-				     top->winsize.ws_row - 3);
+				     top->print_entries - printed);
 	putchar('\n');
 	hists__fprintf(&top->sym_evsel->hists, false,
-		       top->winsize.ws_row - 4 - printed, win_width, stdout);
+		       top->print_entries - printed, win_width,
+		       top->min_percent, stdout);
 }
 static void prompt_integer(int *target, const char *msg)
static void prompt_integer(int *target, const char *msg) static void prompt_integer(int *target, const char *msg)
@@ -477,7 +475,6 @@ static bool perf_top__handle_keypress(struct perf_top *top, int c)
 				perf_top__sig_winch(SIGWINCH, NULL, top);
 				sigaction(SIGWINCH, &act, NULL);
 			} else {
-				perf_top__sig_winch(SIGWINCH, NULL, top);
 				signal(SIGWINCH, SIG_DFL);
 			}
 			break;
@@ -556,11 +553,11 @@ static void perf_top__sort_new_samples(void *arg)
 	if (t->evlist->selected != NULL)
 		t->sym_evsel = t->evlist->selected;
-	hists__collapse_resort_threaded(&t->sym_evsel->hists);
-	hists__output_resort_threaded(&t->sym_evsel->hists);
-	hists__decay_entries_threaded(&t->sym_evsel->hists,
+	hists__collapse_resort(&t->sym_evsel->hists);
+	hists__output_resort(&t->sym_evsel->hists);
+	hists__decay_entries(&t->sym_evsel->hists,
 			     t->hide_user_symbols,
 			     t->hide_kernel_symbols);
 }
 static void *display_thread_tui(void *arg)
@@ -584,7 +581,7 @@ static void *display_thread_tui(void *arg)
 	list_for_each_entry(pos, &top->evlist->entries, node)
 		pos->hists.uid_filter_str = top->record_opts.target.uid_str;
-	perf_evlist__tui_browse_hists(top->evlist, help, &hbt,
+	perf_evlist__tui_browse_hists(top->evlist, help, &hbt, top->min_percent,
 				      &top->session->header.env);
 	done = 1;
@@ -794,7 +791,7 @@ static void perf_event__process_sample(struct perf_tool *tool,
 			return;
 		}
-		if (top->sort_has_symbols)
+		if (sort__has_sym)
 			perf_top__record_precise_ip(top, he, evsel->idx, ip);
 	}
@@ -912,9 +909,9 @@ static int perf_top__start_counters(struct perf_top *top)
 	return -1;
 }
-static int perf_top__setup_sample_type(struct perf_top *top)
+static int perf_top__setup_sample_type(struct perf_top *top __maybe_unused)
 {
-	if (!top->sort_has_symbols) {
+	if (!sort__has_sym) {
 		if (symbol_conf.use_callchain) {
 			ui__error("Selected -g but \"sym\" not present in --sort/-s.");
 			return -EINVAL;
@@ -1025,6 +1022,16 @@ parse_callchain_opt(const struct option *opt, const char *arg, int unset)
 	return record_parse_callchain_opt(opt, arg, unset);
 }
+static int
+parse_percent_limit(const struct option *opt, const char *arg,
+		    int unset __maybe_unused)
+{
+	struct perf_top *top = opt->value;
+	top->min_percent = strtof(arg, NULL);
+	return 0;
+}
 int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
 {
 	int status;
@@ -1110,6 +1117,8 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
 	OPT_STRING('M', "disassembler-style", &disassembler_style, "disassembler style",
 		   "Specify disassembler style (e.g. -M intel for intel syntax)"),
 	OPT_STRING('u', "uid", &target->uid_str, "user", "user to profile"),
+	OPT_CALLBACK(0, "percent-limit", &top, "percent",
+		     "Don't show entries under that percent", parse_percent_limit),
 	OPT_END()
 	};
 	const char * const top_usage[] = {
@@ -1133,6 +1142,9 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
 	if (setup_sorting() < 0)
 		usage_with_options(top_usage, options);
+	/* display thread wants entries to be collapsed in a different tree */
+	sort__need_collapse = 1;
 	if (top.use_stdio)
 		use_browser = 0;
 	else if (top.use_tui)
@@ -1200,15 +1212,7 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
 	if (symbol__init() < 0)
 		return -1;
-	sort_entry__setup_elide(&sort_dso, symbol_conf.dso_list, "dso", stdout);
-	sort_entry__setup_elide(&sort_comm, symbol_conf.comm_list, "comm", stdout);
-	sort_entry__setup_elide(&sort_sym, symbol_conf.sym_list, "symbol", stdout);
-	/*
-	 * Avoid annotation data structures overhead when symbols aren't on the
-	 * sort list.
-	 */
-	top.sort_has_symbols = sort_sym.list.next != NULL;
+	sort__setup_elide(stdout);
 	get_term_dimensions(&top.winsize);
 	if (top.print_entries == 0) {
......
This diff is collapsed.
@@ -27,8 +27,8 @@ watermark=0
 precise_ip=0
 mmap_data=0
 sample_id_all=1
-exclude_host=0
-exclude_guest=1
+exclude_host=0|1
+exclude_guest=0|1
 exclude_callchain_kernel=0
 exclude_callchain_user=0
 wakeup_events=0
......
@@ -27,8 +27,8 @@ watermark=0
 precise_ip=0
 mmap_data=0
 sample_id_all=0
-exclude_host=0
-exclude_guest=1
+exclude_host=0|1
+exclude_guest=0|1
 exclude_callchain_kernel=0
 exclude_callchain_user=0
 wakeup_events=0
......
@@ -4,5 +4,8 @@ args = -d kill >/dev/null 2>&1
 [event:base-record]
 sample_period=4000
-sample_type=271
+# sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_TIME |
+# PERF_SAMPLE_ADDR | PERF_SAMPLE_PERIOD | PERF_SAMPLE_DATA_SRC
+sample_type=33039
 mmap_data=1
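The change from sample_type=271 to sample_type=33039 in the test file above is just the extra PERF_SAMPLE_DATA_SRC bit. A small worked check follows. The constants are redefined locally with the bit positions from `enum perf_event_sample_format` in `<linux/perf_event.h>` so that the sketch builds without kernel headers; the `SAMPLE_*` names and the two helper functions are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the relevant perf_event_sample_format bits. */
enum {
	SAMPLE_IP       = 1U << 0,  /*     1 */
	SAMPLE_TID      = 1U << 1,  /*     2 */
	SAMPLE_TIME     = 1U << 2,  /*     4 */
	SAMPLE_ADDR     = 1U << 3,  /*     8 */
	SAMPLE_PERIOD   = 1U << 8,  /*   256 */
	SAMPLE_DATA_SRC = 1U << 15, /* 32768 */
};

/* Compile-time check: the old value is the first five flags OR-ed together. */
static_assert((SAMPLE_IP | SAMPLE_TID | SAMPLE_TIME |
	       SAMPLE_ADDR | SAMPLE_PERIOD) == 271,
	      "old sample_type should be 271");

/* Old expected value from the test file. */
static uint64_t old_sample_type(void)
{
	return SAMPLE_IP | SAMPLE_TID | SAMPLE_TIME |
	       SAMPLE_ADDR | SAMPLE_PERIOD;
}

/* New expected value: the old mask plus the data-source bit. */
static uint64_t new_sample_type(void)
{
	return old_sample_type() | SAMPLE_DATA_SRC;
}
```

In numbers: 1 + 2 + 4 + 8 + 256 = 271, and 271 + 32768 = 33039.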
@@ -4,6 +4,12 @@
  * (git://github.com/deater/perf_event_tests)
  */
+/*
+ * Powerpc needs __SANE_USERSPACE_TYPES__ before <linux/types.h> to select
+ * 'int-ll64.h' and avoid compile warnings when printing __u64 with %llu.
+ */
+#define __SANE_USERSPACE_TYPES__
 #include <stdlib.h>
 #include <stdio.h>
 #include <unistd.h>
......
@@ -3,6 +3,12 @@
  * perf_event_tests (git://github.com/deater/perf_event_tests)
  */
+/*
+ * Powerpc needs __SANE_USERSPACE_TYPES__ before <linux/types.h> to select
+ * 'int-ll64.h' and avoid compile warnings when printing __u64 with %llu.
+ */
+#define __SANE_USERSPACE_TYPES__
 #include <stdlib.h>
 #include <stdio.h>
 #include <unistd.h>
......
This diff is collapsed.
@@ -84,7 +84,6 @@ struct perf_session_env {
 };
 struct perf_header {
-	int			frozen;
 	bool			needs_swap;
 	s64			attr_offset;
 	u64			data_offset;
......
This diff is collapsed.