Commit 6635df2f authored by Ian Rogers's avatar Ian Rogers Committed by Arnaldo Carvalho de Melo

perf vendor events intel: Refresh cascadelakex events

Update the cascadelakex events from 1.16 to 1.17. Generation was done
using https://github.com/intel/perfmon.

Notable changes are new events and event descriptions, TMA metrics are
updated to version 4.5, TMA info metrics are renamed from their node
name to be lower case and prefixed by tma_info_, MetricThreshold
expressions are added, "Sample with" documentation is added to many
TMA metrics, smi_cost and transaction metric groups are added
replicating existing hard coded metrics in stat-shadow.
Signed-off-by: default avatarIan Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Link: https://lore.kernel.org/r/20230219092848.639226-16-irogers@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
parent 46db21af
......@@ -234,20 +234,22 @@
"UMask": "0x4f"
},
{
"BriefDescription": "All retired load instructions.",
"BriefDescription": "Retired load instructions.",
"Data_LA": "1",
"EventCode": "0xD0",
"EventName": "MEM_INST_RETIRED.ALL_LOADS",
"PEBS": "1",
"PublicDescription": "Counts all retired load instructions. This event accounts for SW prefetch instructions of PREFETCHNTA or PREFETCHT0/1/2 or PREFETCHW.",
"SampleAfterValue": "2000003",
"UMask": "0x81"
},
{
"BriefDescription": "All retired store instructions.",
"BriefDescription": "Retired store instructions.",
"Data_LA": "1",
"EventCode": "0xD0",
"EventName": "MEM_INST_RETIRED.ALL_STORES",
"PEBS": "1",
"PublicDescription": "Counts all retired store instructions.",
"SampleAfterValue": "2000003",
"UMask": "0x82"
},
......@@ -388,12 +390,12 @@
"UMask": "0x4"
},
{
"BriefDescription": "Retired load instructions with remote Intel(R) Optane(TM) DC persistent memory as the data source where the data request missed all caches. Precise event.",
"BriefDescription": "Retired load instructions with remote Intel(R) Optane(TM) DC persistent memory as the data source where the data request missed all caches.",
"Data_LA": "1",
"EventCode": "0xD3",
"EventName": "MEM_LOAD_L3_MISS_RETIRED.REMOTE_PMM",
"PEBS": "1",
"PublicDescription": "Counts retired load instructions with remote Intel(R) Optane(TM) DC persistent memory as the data source and the data request missed L3 (AppDirect or Memory Mode) and DRAM cache(Memory Mode). Precise event",
"PublicDescription": "Counts retired load instructions with remote Intel(R) Optane(TM) DC persistent memory as the data source and the data request missed L3 (AppDirect or Memory Mode) and DRAM cache(Memory Mode).",
"SampleAfterValue": "100007",
"UMask": "0x10"
},
......@@ -477,12 +479,12 @@
"UMask": "0x20"
},
{
"BriefDescription": "Retired load instructions with local Intel(R) Optane(TM) DC persistent memory as the data source where the data request missed all caches. Precise event.",
"BriefDescription": "Retired load instructions with local Intel(R) Optane(TM) DC persistent memory as the data source where the data request missed all caches.",
"Data_LA": "1",
"EventCode": "0xD1",
"EventName": "MEM_LOAD_RETIRED.LOCAL_PMM",
"PEBS": "1",
"PublicDescription": "Counts retired load instructions with local Intel(R) Optane(TM) DC persistent memory as the data source and the data request missed L3 (AppDirect or Memory Mode) and DRAM cache(Memory Mode). Precise event",
"PublicDescription": "Counts retired load instructions with local Intel(R) Optane(TM) DC persistent memory as the data source and the data request missed L3 (AppDirect or Memory Mode) and DRAM cache(Memory Mode).",
"SampleAfterValue": "100003",
"UMask": "0x80"
},
......@@ -5039,7 +5041,7 @@
"UMask": "0x80"
},
{
"BriefDescription": "Cacheable and noncachaeble code read requests",
"BriefDescription": "Cacheable and non-cacheable code read requests",
"EventCode": "0xB0",
"EventName": "OFFCORE_REQUESTS.DEMAND_CODE_RD",
"PublicDescription": "Counts both cacheable and non-cacheable code read requests.",
......@@ -5146,14 +5148,6 @@
"SampleAfterValue": "2000003",
"UMask": "0x4"
},
{
"BriefDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE",
"PublicDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
"SampleAfterValue": "100003",
"UMask": "0x1"
},
{
"BriefDescription": "This event is deprecated. Refer to new event OCR.ALL_DATA_RD.ANY_RESPONSE",
"Deprecated": "1",
......
This source diff could not be displayed because it is too large. You can view the blob instead.
......@@ -322,7 +322,7 @@
"UMask": "0x4"
},
{
"BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
"BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy",
"CounterMask": "1",
"EventCode": "0x79",
"EventName": "IDQ.MS_CYCLES",
......@@ -331,7 +331,7 @@
"UMask": "0x30"
},
{
"BriefDescription": "Cycles when uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
"BriefDescription": "Cycles when uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy",
"CounterMask": "1",
"EventCode": "0x79",
"EventName": "IDQ.MS_DSB_CYCLES",
......@@ -340,7 +340,7 @@
"UMask": "0x10"
},
{
"BriefDescription": "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
"BriefDescription": "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy",
"EventCode": "0x79",
"EventName": "IDQ.MS_MITE_UOPS",
"PublicDescription": "Counts the number of uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may 'bypass' the IDQ.",
......@@ -358,7 +358,7 @@
"UMask": "0x30"
},
{
"BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
"BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy",
"EventCode": "0x79",
"EventName": "IDQ.MS_UOPS",
"PublicDescription": "Counts the total number of uops delivered by the Microcode Sequencer (MS). Any instruction over 4 uops will be delivered by the MS. Some instructions such as transcendentals may additionally generate uops from the MS.",
......
......@@ -93,6 +93,22 @@
"SampleAfterValue": "400009",
"UMask": "0x10"
},
{
"BriefDescription": "Speculative and retired mispredicted macro conditional branches",
"EventCode": "0x89",
"EventName": "BR_MISP_EXEC.ALL_BRANCHES",
"PublicDescription": "This event counts both taken and not taken speculative and retired mispredicted branch instructions.",
"SampleAfterValue": "200003",
"UMask": "0xff"
},
{
"BriefDescription": "Speculative mispredicted indirect branches",
"EventCode": "0x89",
"EventName": "BR_MISP_EXEC.INDIRECT",
"PublicDescription": "Counts speculatively miss-predicted indirect branches at execution time. Counts for indirect near CALL or JMP instructions (RET excluded).",
"SampleAfterValue": "200003",
"UMask": "0xe4"
},
{
"BriefDescription": "All mispredicted macro branch instructions retired.",
"EventCode": "0xC5",
......
......@@ -192,7 +192,7 @@
"EventCode": "0x9",
"EventName": "UNC_M_ECC_CORRECTABLE_ERRORS",
"PerPkg": "1",
"PublicDescription": "Counts the number of ECC errors detected and corrected by the iMC on this channel. This counter is only useful with ECC DRAM devices. This count will increment one time for each correction regardless of the number of bits corrected. The iMC can correct up to 4 bit errors in independent channel mode and 8 bit erros in lockstep mode.",
"PublicDescription": "Counts the number of ECC errors detected and corrected by the iMC on this channel. This counter is only useful with ECC DRAM devices. This count will increment one time for each correction regardless of the number of bits corrected. The iMC can correct up to 4 bit errors in independent channel mode and 8 bit errors in lockstep mode.",
"Unit": "iMC"
},
{
......@@ -212,7 +212,7 @@
"Unit": "iMC"
},
{
"BriefDescription": "UNC_M_MAJMODE2.PMM_CYC",
"BriefDescription": "Major Mode 2 : Cycles in PMM major mode",
"EventCode": "0xED",
"EventName": "UNC_M_MAJMODE2.PMM_CYC",
"PerPkg": "1",
......@@ -220,7 +220,7 @@
"Unit": "iMC"
},
{
"BriefDescription": "UNC_M_MAJMODE2.PMM_ENTER",
"BriefDescription": "Major Mode 2 : Entered PMM major mode",
"EventCode": "0xED",
"EventName": "UNC_M_MAJMODE2.PMM_ENTER",
"PerPkg": "1",
......@@ -290,7 +290,7 @@
"Unit": "iMC"
},
{
"BriefDescription": "All commands for Intel Optane DC persistent memory",
"BriefDescription": "All commands for Intel(R) Optane(TM) DC persistent memory",
"EventCode": "0xEA",
"EventName": "UNC_M_PMM_CMD1.ALL",
"PerPkg": "1",
......@@ -314,7 +314,7 @@
"Unit": "iMC"
},
{
"BriefDescription": "Regular reads(RPQ) commands for Intel Optane DC persistent memory",
"BriefDescription": "Regular reads(RPQ) commands for Intel(R) Optane(TM) DC persistent memory",
"EventCode": "0xEA",
"EventName": "UNC_M_PMM_CMD1.RD",
"PerPkg": "1",
......@@ -331,7 +331,7 @@
"Unit": "iMC"
},
{
"BriefDescription": "Underfill read commands for Intel Optane DC persistent memory",
"BriefDescription": "Underfill read commands for Intel(R) Optane(TM) DC persistent memory",
"EventCode": "0xEA",
"EventName": "UNC_M_PMM_CMD1.UFILL_RD",
"PerPkg": "1",
......@@ -348,7 +348,7 @@
"Unit": "iMC"
},
{
"BriefDescription": "Write commands for Intel Optane DC persistent memory",
"BriefDescription": "Write commands for Intel(R) Optane(TM) DC persistent memory",
"EventCode": "0xEA",
"EventName": "UNC_M_PMM_CMD1.WR",
"PerPkg": "1",
......@@ -522,7 +522,7 @@
"Unit": "iMC"
},
{
"BriefDescription": "Write Pending Queue Occupancy of all write requests for Intel Optane DC persistent memory",
"BriefDescription": "Write Pending Queue Occupancy of all write requests for Intel(R) Optane(TM) DC persistent memory",
"EventCode": "0xE4",
"EventName": "UNC_M_PMM_WPQ_OCCUPANCY.ALL",
"PerPkg": "1",
......@@ -2735,7 +2735,7 @@
"EventCode": "0x81",
"EventName": "UNC_M_WPQ_OCCUPANCY",
"PerPkg": "1",
"PublicDescription": "Counts the number of entries in the Write Pending Queue (WPQ) at each cycle. This can then be used to calculate both the average queue occupancy (in conjunction with the number of cycles not empty) and the average latency (in conjunction with the number of allocations). The WPQ is used to schedule writes out to the memory controller and to track the requests. Requests allocate into the WPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the CHA to the iMC (memory controller). They deallocate after being issued to DRAM. Write requests themselves are able to complete (from the perspective of the rest of the system) as soon they have 'posted' to the iMC. This is not to be confused with actually performing the write to DRAM. Therefore, the average latency for this queue is actually not useful for deconstruction intermediate write latencies. So, we provide filtering based on if the request has posted or not. By using the 'not posted' filter, we can track how long writes spent in the iMC before completions were sent to the HA. The 'posted' filter, on the other hand, provides information about how much queueing is actually happenning in the iMC for writes before they are actually issued to memory. High average occupancies will generally coincide with high write major mode counts. Is there a filter of sorts???",
"PublicDescription": "Counts the number of entries in the Write Pending Queue (WPQ) at each cycle. This can then be used to calculate both the average queue occupancy (in conjunction with the number of cycles not empty) and the average latency (in conjunction with the number of allocations). The WPQ is used to schedule writes out to the memory controller and to track the requests. Requests allocate into the WPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the CHA to the iMC (memory controller). They deallocate after being issued to DRAM. Write requests themselves are able to complete (from the perspective of the rest of the system) as soon they have 'posted' to the iMC. This is not to be confused with actually performing the write to DRAM. Therefore, the average latency for this queue is actually not useful for deconstruction intermediate write latencies. So, we provide filtering based on if the request has posted or not. By using the 'not posted' filter, we can track how long writes spent in the iMC before completions were sent to the HA. The 'posted' filter, on the other hand, provides information about how much queueing is actually happening in the iMC for writes before they are actually issued to memory. High average occupancies will generally coincide with high write major mode counts. Is there a filter of sorts???",
"Unit": "iMC"
},
{
......
......@@ -143,7 +143,7 @@
"EventCode": "0x80",
"EventName": "UNC_P_POWER_STATE_OCCUPANCY.CORES_C0",
"PerPkg": "1",
"PublicDescription": "This is an occupancy event that tracks the number of cores that are in the chosen C-State. It can be used by itself to get the average number of cores in that C-state with threshholding to generate histograms, or with other PCU events and occupancy triggering to capture other details.",
"PublicDescription": "This is an occupancy event that tracks the number of cores that are in the chosen C-State. It can be used by itself to get the average number of cores in that C-state with thresholding to generate histograms, or with other PCU events and occupancy triggering to capture other details.",
"Unit": "PCU"
},
{
......@@ -151,7 +151,7 @@
"EventCode": "0x80",
"EventName": "UNC_P_POWER_STATE_OCCUPANCY.CORES_C3",
"PerPkg": "1",
"PublicDescription": "This is an occupancy event that tracks the number of cores that are in the chosen C-State. It can be used by itself to get the average number of cores in that C-state with threshholding to generate histograms, or with other PCU events and occupancy triggering to capture other details.",
"PublicDescription": "This is an occupancy event that tracks the number of cores that are in the chosen C-State. It can be used by itself to get the average number of cores in that C-state with thresholding to generate histograms, or with other PCU events and occupancy triggering to capture other details.",
"Unit": "PCU"
},
{
......@@ -159,7 +159,7 @@
"EventCode": "0x80",
"EventName": "UNC_P_POWER_STATE_OCCUPANCY.CORES_C6",
"PerPkg": "1",
"PublicDescription": "This is an occupancy event that tracks the number of cores that are in the chosen C-State. It can be used by itself to get the average number of cores in that C-state with threshholding to generate histograms, or with other PCU events and occupancy triggering to capture other details.",
"PublicDescription": "This is an occupancy event that tracks the number of cores that are in the chosen C-State. It can be used by itself to get the average number of cores in that C-state with thresholding to generate histograms, or with other PCU events and occupancy triggering to capture other details.",
"Unit": "PCU"
},
{
......@@ -175,7 +175,7 @@
"EventCode": "0x9",
"EventName": "UNC_P_PROCHOT_INTERNAL_CYCLES",
"PerPkg": "1",
"PublicDescription": "Counts the number of cycles that we are in Interal PROCHOT mode. This mode is triggered when a sensor on the die determines that we are too hot and must throttle to avoid damaging the chip.",
"PublicDescription": "Counts the number of cycles that we are in Internal PROCHOT mode. This mode is triggered when a sensor on the die determines that we are too hot and must throttle to avoid damaging the chip.",
"Unit": "PCU"
},
{
......
......@@ -5,7 +5,7 @@ GenuineIntel-6-(1C|26|27|35|36),v4,bonnell,core
GenuineIntel-6-(3D|47),v26,broadwell,core
GenuineIntel-6-56,v7,broadwellde,core
GenuineIntel-6-4F,v19,broadwellx,core
GenuineIntel-6-55-[56789ABCDEF],v1.16,cascadelakex,core
GenuineIntel-6-55-[56789ABCDEF],v1.17,cascadelakex,core
GenuineIntel-6-9[6C],v1.03,elkhartlake,core
GenuineIntel-6-5[CF],v13,goldmont,core
GenuineIntel-6-7A,v1.01,goldmontplus,core
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment