Commit aa8c3db4 authored by Linus Torvalds

Merge tag 'x86_cache_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 resource control updates from Borislav Petkov:

 - Add support for a new AMD feature called slow memory bandwidth
   allocation. Its goal is to control resource allocation in external
   slow memory connected to the machine, for example through CXL
   devices, accelerators, etc.

* tag 'x86_cache_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/resctrl: Fix a silly -Wunused-but-set-variable warning
  Documentation/x86: Update resctrl.rst for new features
  x86/resctrl: Add interface to write mbm_local_bytes_config
  x86/resctrl: Add interface to write mbm_total_bytes_config
  x86/resctrl: Add interface to read mbm_local_bytes_config
  x86/resctrl: Add interface to read mbm_total_bytes_config
  x86/resctrl: Support monitor configuration
  x86/resctrl: Add __init attribute to rdt_get_mon_l3_config()
  x86/resctrl: Detect and configure Slow Memory Bandwidth Allocation
  x86/resctrl: Include new features in command line options
  x86/cpufeatures: Add Bandwidth Monitoring Event Configuration feature flag
  x86/resctrl: Add a new resource type RDT_RESOURCE_SMBA
  x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag
  x86/resctrl: Replace smp_call_function_many() with on_each_cpu_mask()
parents 1adce1b9 793207ba
......@@ -5221,7 +5221,7 @@
rdt= [HW,X86,RDT]
Turn on/off individual RDT features. List is:
cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, l2cdp,
mba.
mba, smba, bmec.
E.g. to turn on cmt and turn off mba use:
rdt=cmt,!mba
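The new smba and bmec options follow the same syntax; for example,
rdt=!smba,!bmec would turn off both of the new features.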
......
......@@ -17,14 +17,21 @@ AMD refers to this feature as AMD Platform Quality of Service (AMD QoS).
This feature is enabled by the CONFIG_X86_CPU_RESCTRL and the x86 /proc/cpuinfo
flag bits:
============================================= ================================
=============================================== ================================
RDT (Resource Director Technology) Allocation "rdt_a"
CAT (Cache Allocation Technology) "cat_l3", "cat_l2"
CDP (Code and Data Prioritization) "cdp_l3", "cdp_l2"
CQM (Cache QoS Monitoring) "cqm_llc", "cqm_occup_llc"
MBM (Memory Bandwidth Monitoring) "cqm_mbm_total", "cqm_mbm_local"
MBA (Memory Bandwidth Allocation) "mba"
============================================= ================================
SMBA (Slow Memory Bandwidth Allocation) ""
BMEC (Bandwidth Monitoring Event Configuration) ""
=============================================== ================================
Historically, new features were made visible by default in /proc/cpuinfo. This
resulted in the feature flags becoming hard to parse by humans. Adding a new
flag to /proc/cpuinfo should be avoided if user space can obtain information
about the feature from resctrl's info directory.
To use the feature, mount the file system::
......@@ -161,6 +168,83 @@ with the following files:
"mon_features":
Lists the monitoring events if
monitoring is enabled for the resource.
Example::
# cat /sys/fs/resctrl/info/L3_MON/mon_features
llc_occupancy
mbm_total_bytes
mbm_local_bytes
If the system supports Bandwidth Monitoring Event
Configuration (BMEC), then the bandwidth events will
be configurable. The output will be::
# cat /sys/fs/resctrl/info/L3_MON/mon_features
llc_occupancy
mbm_total_bytes
mbm_total_bytes_config
mbm_local_bytes
mbm_local_bytes_config
"mbm_total_bytes_config", "mbm_local_bytes_config":
Read/write files containing the configuration for the mbm_total_bytes
and mbm_local_bytes events, respectively, when the Bandwidth
Monitoring Event Configuration (BMEC) feature is supported.
The event configuration settings are domain specific and affect
all the CPUs in the domain. When either event configuration is
changed, the bandwidth counters for all RMIDs of both events
(mbm_total_bytes as well as mbm_local_bytes) are cleared for that
domain. The next read for every RMID will report "Unavailable"
and subsequent reads will report the valid value.
Following are the types of events supported:
==== ========================================================
Bits Description
==== ========================================================
6 Dirty Victims from the QOS domain to all types of memory
5 Reads to slow memory in the non-local NUMA domain
4 Reads to slow memory in the local NUMA domain
3 Non-temporal writes to non-local NUMA domain
2 Non-temporal writes to local NUMA domain
1 Reads to memory in the non-local NUMA domain
0 Reads to memory in the local NUMA domain
==== ========================================================
By default, the mbm_total_bytes configuration is set to 0x7f to count
all the event types and the mbm_local_bytes configuration is set to
0x15 to count all the local memory events.
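That is, the default 0x7f is 1111111b, with bits 0-6 set so that every
event type in the table above is counted, while 0x15 is 0010101b, with
bits 0, 2 and 4 set so that only the local events (reads to local memory,
non-temporal writes to the local NUMA domain, and reads to local slow
memory) are counted.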
Examples:
* To view the current configuration:
::
# cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
0=0x7f;1=0x7f;2=0x7f;3=0x7f
# cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
0=0x15;1=0x15;3=0x15;4=0x15
* To change mbm_total_bytes to count only reads on domain 0,
bits 0, 1, 4 and 5 need to be set, which is 110011b in binary
(0x33 in hexadecimal):
::
# echo "0=0x33" > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
# cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
0=0x33;1=0x7f;2=0x7f;3=0x7f
* To change mbm_local_bytes to count all the slow memory reads on
domains 0 and 1, bits 4 and 5 need to be set, which is 110000b
in binary (0x30 in hexadecimal):
::
# echo "0=0x30;1=0x30" > /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
# cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
0=0x30;1=0x30;3=0x15;4=0x15
"max_threshold_occupancy":
Read/write file provides the largest value (in
......@@ -464,6 +548,25 @@ Memory bandwidth domain is L3 cache.
MB:<cache_id0>=bw_MBps0;<cache_id1>=bw_MBps1;...
Slow Memory Bandwidth Allocation (SMBA)
---------------------------------------
AMD hardware supports Slow Memory Bandwidth Allocation (SMBA).
CXL.memory is the only supported "slow" memory device. With SMBA
support, the hardware enables bandwidth allocation on the slow memory
devices. If there are multiple such devices in
the system, the throttling logic groups all the slow sources
together and applies the limit on them as a whole.
The presence of SMBA (with CXL.memory) is independent of whether slow
memory devices are present. If there are no such devices on the system, then
configuring SMBA will have no impact on the performance of the system.
The bandwidth domain for slow memory is L3 cache. Its schemata file
is formatted as:
::
SMBA:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...
Reading/writing the schemata file
---------------------------------
Reading the schemata file will show the state of all resources
......@@ -479,6 +582,46 @@ which you wish to change. E.g.
L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
Reading/writing the schemata file (on AMD systems)
--------------------------------------------------
Reading the schemata file will show the current bandwidth limit on all
domains. The allocated resources are in multiples of one eighth GB/s.
When writing to the file, you need to specify the cache id for which you
wish to configure the bandwidth limit.
For example, to allocate a 2GB/s limit on cache id 1:
::
# cat schemata
MB:0=2048;1=2048;2=2048;3=2048
L3:0=ffff;1=ffff;2=ffff;3=ffff
# echo "MB:1=16" > schemata
# cat schemata
MB:0=2048;1= 16;2=2048;3=2048
L3:0=ffff;1=ffff;2=ffff;3=ffff
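Since the allocations are in multiples of one eighth GB/s, the written
value 16 corresponds to 16 * 1/8 GB/s = 2 GB/s on cache id 1.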
Reading/writing the schemata file (on AMD systems) with SMBA feature
--------------------------------------------------------------------
Reading and writing the schemata file works the same as described for
systems without SMBA in the section above.
For example, to allocate an 8GB/s limit on cache id 1:
::
# cat schemata
SMBA:0=2048;1=2048;2=2048;3=2048
MB:0=2048;1=2048;2=2048;3=2048
L3:0=ffff;1=ffff;2=ffff;3=ffff
# echo "SMBA:1=64" > schemata
# cat schemata
SMBA:0=2048;1= 64;2=2048;3=2048
MB:0=2048;1=2048;2=2048;3=2048
L3:0=ffff;1=ffff;2=ffff;3=ffff
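As with the MB resource, the value is in multiples of one eighth GB/s,
so writing 64 corresponds to 64 * 1/8 GB/s = 8 GB/s on cache id 1.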
Cache Pseudo-Locking
====================
CAT enables a user to specify the amount of cache space that an
......
......@@ -307,6 +307,8 @@
#define X86_FEATURE_SGX_EDECCSSA (11*32+18) /* "" SGX EDECCSSA user leaf function */
#define X86_FEATURE_CALL_DEPTH (11*32+19) /* "" Call depth tracking for RSB stuffing */
#define X86_FEATURE_MSR_TSX_CTRL (11*32+20) /* "" MSR IA32_TSX_CTRL (Intel) implemented */
#define X86_FEATURE_SMBA (11*32+21) /* "" Slow Memory Bandwidth Allocation */
#define X86_FEATURE_BMEC (11*32+22) /* "" Bandwidth Monitoring Event Configuration */
/* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
#define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */
......
......@@ -1084,6 +1084,8 @@
/* - AMD: */
#define MSR_IA32_MBA_BW_BASE 0xc0000200
#define MSR_IA32_SMBA_BW_BASE 0xc0000280
#define MSR_IA32_EVT_CFG_BASE 0xc0000400
/* MSR_IA32_VMX_MISC bits */
#define MSR_IA32_VMX_MISC_INTEL_PT (1ULL << 14)
......
......@@ -68,6 +68,8 @@ static const struct cpuid_dep cpuid_deps[] = {
{ X86_FEATURE_CQM_OCCUP_LLC, X86_FEATURE_CQM_LLC },
{ X86_FEATURE_CQM_MBM_TOTAL, X86_FEATURE_CQM_LLC },
{ X86_FEATURE_CQM_MBM_LOCAL, X86_FEATURE_CQM_LLC },
{ X86_FEATURE_BMEC, X86_FEATURE_CQM_MBM_TOTAL },
{ X86_FEATURE_BMEC, X86_FEATURE_CQM_MBM_LOCAL },
{ X86_FEATURE_AVX512_BF16, X86_FEATURE_AVX512VL },
{ X86_FEATURE_AVX512_FP16, X86_FEATURE_AVX512BW },
{ X86_FEATURE_ENQCMD, X86_FEATURE_XSAVES },
......
......@@ -100,6 +100,18 @@ struct rdt_hw_resource rdt_resources_all[] = {
.fflags = RFTYPE_RES_MB,
},
},
[RDT_RESOURCE_SMBA] =
{
.r_resctrl = {
.rid = RDT_RESOURCE_SMBA,
.name = "SMBA",
.cache_level = 3,
.domains = domain_init(RDT_RESOURCE_SMBA),
.parse_ctrlval = parse_bw,
.format_str = "%d=%*u",
.fflags = RFTYPE_RES_MB,
},
},
};
/*
......@@ -150,6 +162,13 @@ bool is_mba_sc(struct rdt_resource *r)
if (!r)
return rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl.membw.mba_sc;
/*
* The software controller support is only applicable to MBA resource.
* Make sure to check for resource type.
*/
if (r->rid != RDT_RESOURCE_MBA)
return false;
return r->membw.mba_sc;
}
......@@ -213,9 +232,15 @@ static bool __rdt_get_mem_config_amd(struct rdt_resource *r)
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
union cpuid_0x10_3_eax eax;
union cpuid_0x10_x_edx edx;
u32 ebx, ecx;
u32 ebx, ecx, subleaf;
/*
* Query CPUID_Fn80000020_EDX_x01 for MBA and
* CPUID_Fn80000020_EDX_x02 for SMBA
*/
subleaf = (r->rid == RDT_RESOURCE_SMBA) ? 2 : 1;
cpuid_count(0x80000020, 1, &eax.full, &ebx, &ecx, &edx.full);
cpuid_count(0x80000020, subleaf, &eax.full, &ebx, &ecx, &edx.full);
hw_res->num_closid = edx.split.cos_max + 1;
r->default_ctrl = MAX_MBA_BW_AMD;
......@@ -647,6 +672,8 @@ enum {
RDT_FLAG_L2_CAT,
RDT_FLAG_L2_CDP,
RDT_FLAG_MBA,
RDT_FLAG_SMBA,
RDT_FLAG_BMEC,
};
#define RDT_OPT(idx, n, f) \
......@@ -670,6 +697,8 @@ static struct rdt_options rdt_options[] __initdata = {
RDT_OPT(RDT_FLAG_L2_CAT, "l2cat", X86_FEATURE_CAT_L2),
RDT_OPT(RDT_FLAG_L2_CDP, "l2cdp", X86_FEATURE_CDP_L2),
RDT_OPT(RDT_FLAG_MBA, "mba", X86_FEATURE_MBA),
RDT_OPT(RDT_FLAG_SMBA, "smba", X86_FEATURE_SMBA),
RDT_OPT(RDT_FLAG_BMEC, "bmec", X86_FEATURE_BMEC),
};
#define NUM_RDT_OPTIONS ARRAY_SIZE(rdt_options)
......@@ -699,7 +728,7 @@ static int __init set_rdt_options(char *str)
}
__setup("rdt", set_rdt_options);
static bool __init rdt_cpu_has(int flag)
bool __init rdt_cpu_has(int flag)
{
bool ret = boot_cpu_has(flag);
struct rdt_options *o;
......@@ -734,6 +763,19 @@ static __init bool get_mem_config(void)
return false;
}
static __init bool get_slow_mem_config(void)
{
struct rdt_hw_resource *hw_res = &rdt_resources_all[RDT_RESOURCE_SMBA];
if (!rdt_cpu_has(X86_FEATURE_SMBA))
return false;
if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
return __rdt_get_mem_config_amd(&hw_res->r_resctrl);
return false;
}
static __init bool get_rdt_alloc_resources(void)
{
struct rdt_resource *r;
......@@ -764,6 +806,9 @@ static __init bool get_rdt_alloc_resources(void)
if (get_mem_config())
ret = true;
if (get_slow_mem_config())
ret = true;
return ret;
}
......@@ -853,6 +898,9 @@ static __init void rdt_init_res_defs_amd(void)
} else if (r->rid == RDT_RESOURCE_MBA) {
hw_res->msr_base = MSR_IA32_MBA_BW_BASE;
hw_res->msr_update = mba_wrmsr_amd;
} else if (r->rid == RDT_RESOURCE_SMBA) {
hw_res->msr_base = MSR_IA32_SMBA_BW_BASE;
hw_res->msr_update = mba_wrmsr_amd;
}
}
}
......
......@@ -209,7 +209,7 @@ static int parse_line(char *line, struct resctrl_schema *s,
unsigned long dom_id;
if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP &&
r->rid == RDT_RESOURCE_MBA) {
(r->rid == RDT_RESOURCE_MBA || r->rid == RDT_RESOURCE_SMBA)) {
rdt_last_cmd_puts("Cannot pseudo-lock MBA resource\n");
return -EINVAL;
}
......@@ -310,7 +310,6 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
enum resctrl_conf_type t;
cpumask_var_t cpu_mask;
struct rdt_domain *d;
int cpu;
u32 idx;
if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
......@@ -341,13 +340,9 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
if (cpumask_empty(cpu_mask))
goto done;
cpu = get_cpu();
/* Update resource control msr on this CPU if it's in cpu_mask. */
if (cpumask_test_cpu(cpu, cpu_mask))
rdt_ctrl_update(&msr_param);
/* Update resource control msr on other CPUs. */
smp_call_function_many(cpu_mask, rdt_ctrl_update, &msr_param, 1);
put_cpu();
/* Update resource control msr on all the CPUs. */
on_each_cpu_mask(cpu_mask, rdt_ctrl_update, &msr_param, 1);
done:
free_cpumask_var(cpu_mask);
......
......@@ -30,6 +30,29 @@
*/
#define MBM_CNTR_WIDTH_OFFSET_MAX (62 - MBM_CNTR_WIDTH_BASE)
/* Reads to Local DRAM Memory */
#define READS_TO_LOCAL_MEM BIT(0)
/* Reads to Remote DRAM Memory */
#define READS_TO_REMOTE_MEM BIT(1)
/* Non-Temporal Writes to Local Memory */
#define NON_TEMP_WRITE_TO_LOCAL_MEM BIT(2)
/* Non-Temporal Writes to Remote Memory */
#define NON_TEMP_WRITE_TO_REMOTE_MEM BIT(3)
/* Reads to Local Memory the system identifies as "Slow Memory" */
#define READS_TO_LOCAL_S_MEM BIT(4)
/* Reads to Remote Memory the system identifies as "Slow Memory" */
#define READS_TO_REMOTE_S_MEM BIT(5)
/* Dirty Victims to All Types of Memory */
#define DIRTY_VICTIMS_TO_ALL_MEM BIT(6)
/* Max event bits supported */
#define MAX_EVT_CONFIG_BITS GENMASK(6, 0)
struct rdt_fs_context {
struct kernfs_fs_context kfc;
......@@ -52,11 +75,13 @@ DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key);
* struct mon_evt - Entry in the event list of a resource
* @evtid: event id
* @name: name of the event
* @configurable: true if the event is configurable
* @list: entry in &rdt_resource->evt_list
*/
struct mon_evt {
enum resctrl_event_id evtid;
char *name;
bool configurable;
struct list_head list;
};
......@@ -409,6 +434,7 @@ enum resctrl_res_level {
RDT_RESOURCE_L3,
RDT_RESOURCE_L2,
RDT_RESOURCE_MBA,
RDT_RESOURCE_SMBA,
/* Must be the last */
RDT_NUM_RESOURCES,
......@@ -511,6 +537,7 @@ void closid_free(int closid);
int alloc_rmid(void);
void free_rmid(u32 rmid);
int rdt_get_mon_l3_config(struct rdt_resource *r);
bool __init rdt_cpu_has(int flag);
void mon_event_count(void *info);
int rdtgroup_mondata_show(struct seq_file *m, void *arg);
void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
......@@ -527,5 +554,6 @@ bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d);
void __check_limbo(struct rdt_domain *d, bool force_free);
void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
void __init thread_throttle_mode_init(void);
void __init mbm_config_rftype_init(const char *config);
#endif /* _ASM_X86_RESCTRL_INTERNAL_H */
......@@ -204,6 +204,23 @@ void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d,
}
}
/*
* Assumes that hardware counters are also reset and thus that there is
* no need to record initial non-zero counts.
*/
void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_domain *d)
{
struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
if (is_mbm_total_enabled())
memset(hw_dom->arch_mbm_total, 0,
sizeof(*hw_dom->arch_mbm_total) * r->num_rmid);
if (is_mbm_local_enabled())
memset(hw_dom->arch_mbm_local, 0,
sizeof(*hw_dom->arch_mbm_local) * r->num_rmid);
}
static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
{
u64 shift = 64 - width, chunks;
......@@ -763,7 +780,7 @@ static void l3_mon_evt_init(struct rdt_resource *r)
list_add_tail(&mbm_local_event.list, &r->evt_list);
}
int rdt_get_mon_l3_config(struct rdt_resource *r)
int __init rdt_get_mon_l3_config(struct rdt_resource *r)
{
unsigned int mbm_offset = boot_cpu_data.x86_cache_mbm_width_offset;
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
......@@ -800,6 +817,17 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
if (ret)
return ret;
if (rdt_cpu_has(X86_FEATURE_BMEC)) {
if (rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL)) {
mbm_total_event.configurable = true;
mbm_config_rftype_init("mbm_total_bytes_config");
}
if (rdt_cpu_has(X86_FEATURE_CQM_MBM_LOCAL)) {
mbm_local_event.configurable = true;
mbm_config_rftype_init("mbm_local_bytes_config");
}
}
l3_mon_evt_init(r);
r->mon_capable = true;
......
......@@ -325,12 +325,7 @@ static void update_cpu_closid_rmid(void *info)
static void
update_closid_rmid(const struct cpumask *cpu_mask, struct rdtgroup *r)
{
int cpu = get_cpu();
if (cpumask_test_cpu(cpu, cpu_mask))
update_cpu_closid_rmid(r);
smp_call_function_many(cpu_mask, update_cpu_closid_rmid, r, 1);
put_cpu();
on_each_cpu_mask(cpu_mask, update_cpu_closid_rmid, r, 1);
}
static int cpus_mon_write(struct rdtgroup *rdtgrp, cpumask_var_t newmask,
......@@ -1003,8 +998,11 @@ static int rdt_mon_features_show(struct kernfs_open_file *of,
struct rdt_resource *r = of->kn->parent->priv;
struct mon_evt *mevt;
list_for_each_entry(mevt, &r->evt_list, list)
list_for_each_entry(mevt, &r->evt_list, list) {
seq_printf(seq, "%s\n", mevt->name);
if (mevt->configurable)
seq_printf(seq, "%s_config\n", mevt->name);
}
return 0;
}
......@@ -1215,7 +1213,7 @@ static bool rdtgroup_mode_test_exclusive(struct rdtgroup *rdtgrp)
list_for_each_entry(s, &resctrl_schema_all, list) {
r = s->res;
if (r->rid == RDT_RESOURCE_MBA)
if (r->rid == RDT_RESOURCE_MBA || r->rid == RDT_RESOURCE_SMBA)
continue;
has_cache = true;
list_for_each_entry(d, &r->domains, list) {
......@@ -1404,7 +1402,8 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
ctrl = resctrl_arch_get_config(r, d,
closid,
type);
if (r->rid == RDT_RESOURCE_MBA)
if (r->rid == RDT_RESOURCE_MBA ||
r->rid == RDT_RESOURCE_SMBA)
size = ctrl;
else
size = rdtgroup_cbm_to_size(r, d, ctrl);
......@@ -1421,6 +1420,248 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
return ret;
}
struct mon_config_info {
u32 evtid;
u32 mon_config;
};
#define INVALID_CONFIG_INDEX UINT_MAX
/**
* mon_event_config_index_get - get the hardware index for the
* configurable event
* @evtid: event id.
*
* Return: 0 for evtid == QOS_L3_MBM_TOTAL_EVENT_ID
* 1 for evtid == QOS_L3_MBM_LOCAL_EVENT_ID
* INVALID_CONFIG_INDEX for invalid evtid
*/
static inline unsigned int mon_event_config_index_get(u32 evtid)
{
switch (evtid) {
case QOS_L3_MBM_TOTAL_EVENT_ID:
return 0;
case QOS_L3_MBM_LOCAL_EVENT_ID:
return 1;
default:
/* Should never reach here */
return INVALID_CONFIG_INDEX;
}
}
static void mon_event_config_read(void *info)
{
struct mon_config_info *mon_info = info;
unsigned int index;
u64 msrval;
index = mon_event_config_index_get(mon_info->evtid);
if (index == INVALID_CONFIG_INDEX) {
pr_warn_once("Invalid event id %d\n", mon_info->evtid);
return;
}
rdmsrl(MSR_IA32_EVT_CFG_BASE + index, msrval);
/* Report only the valid event configuration bits */
mon_info->mon_config = msrval & MAX_EVT_CONFIG_BITS;
}
static void mondata_config_read(struct rdt_domain *d, struct mon_config_info *mon_info)
{
smp_call_function_any(&d->cpu_mask, mon_event_config_read, mon_info, 1);
}
static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid)
{
struct mon_config_info mon_info = {0};
struct rdt_domain *dom;
bool sep = false;
mutex_lock(&rdtgroup_mutex);
list_for_each_entry(dom, &r->domains, list) {
if (sep)
seq_puts(s, ";");
memset(&mon_info, 0, sizeof(struct mon_config_info));
mon_info.evtid = evtid;
mondata_config_read(dom, &mon_info);
seq_printf(s, "%d=0x%02x", dom->id, mon_info.mon_config);
sep = true;
}
seq_puts(s, "\n");
mutex_unlock(&rdtgroup_mutex);
return 0;
}
static int mbm_total_bytes_config_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
struct rdt_resource *r = of->kn->parent->priv;
mbm_config_show(seq, r, QOS_L3_MBM_TOTAL_EVENT_ID);
return 0;
}
static int mbm_local_bytes_config_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
struct rdt_resource *r = of->kn->parent->priv;
mbm_config_show(seq, r, QOS_L3_MBM_LOCAL_EVENT_ID);
return 0;
}
static void mon_event_config_write(void *info)
{
struct mon_config_info *mon_info = info;
unsigned int index;
index = mon_event_config_index_get(mon_info->evtid);
if (index == INVALID_CONFIG_INDEX) {
pr_warn_once("Invalid event id %d\n", mon_info->evtid);
return;
}
wrmsr(MSR_IA32_EVT_CFG_BASE + index, mon_info->mon_config, 0);
}
static int mbm_config_write_domain(struct rdt_resource *r,
struct rdt_domain *d, u32 evtid, u32 val)
{
struct mon_config_info mon_info = {0};
int ret = 0;
/* mon_config cannot be more than the supported set of events */
if (val > MAX_EVT_CONFIG_BITS) {
rdt_last_cmd_puts("Invalid event configuration\n");
return -EINVAL;
}
/*
* Read the current config value first. If both are the same then
* no need to write it again.
*/
mon_info.evtid = evtid;
mondata_config_read(d, &mon_info);
if (mon_info.mon_config == val)
goto out;
mon_info.mon_config = val;
/*
* Update MSR_IA32_EVT_CFG_BASE MSR on one of the CPUs in the
* domain. The MSRs offset from MSR_IA32_EVT_CFG_BASE
* are scoped at the domain level. Writing any of these MSRs
* on one CPU is observed by all the CPUs in the domain.
*/
smp_call_function_any(&d->cpu_mask, mon_event_config_write,
&mon_info, 1);
/*
* When an Event Configuration is changed, the bandwidth counters
* for all RMIDs and Events will be cleared by the hardware. The
* hardware also sets MSR_IA32_QM_CTR.Unavailable (bit 62) for
* every RMID on the next read to any event for every RMID.
* Subsequent reads will have MSR_IA32_QM_CTR.Unavailable (bit 62)
* cleared while it is tracked by the hardware. Clear the
* mbm_local and mbm_total counts for all the RMIDs.
*/
resctrl_arch_reset_rmid_all(r, d);
out:
return ret;
}
static int mon_config_write(struct rdt_resource *r, char *tok, u32 evtid)
{
char *dom_str = NULL, *id_str;
unsigned long dom_id, val;
struct rdt_domain *d;
int ret = 0;
next:
if (!tok || tok[0] == '\0')
return 0;
/* Start processing the strings for each domain */
dom_str = strim(strsep(&tok, ";"));
id_str = strsep(&dom_str, "=");
if (!id_str || kstrtoul(id_str, 10, &dom_id)) {
rdt_last_cmd_puts("Missing '=' or non-numeric domain id\n");
return -EINVAL;
}
if (!dom_str || kstrtoul(dom_str, 16, &val)) {
rdt_last_cmd_puts("Non-numeric event configuration value\n");
return -EINVAL;
}
list_for_each_entry(d, &r->domains, list) {
if (d->id == dom_id) {
ret = mbm_config_write_domain(r, d, evtid, val);
if (ret)
return -EINVAL;
goto next;
}
}
return -EINVAL;
}
static ssize_t mbm_total_bytes_config_write(struct kernfs_open_file *of,
char *buf, size_t nbytes,
loff_t off)
{
struct rdt_resource *r = of->kn->parent->priv;
int ret;
/* Valid input requires a trailing newline */
if (nbytes == 0 || buf[nbytes - 1] != '\n')
return -EINVAL;
mutex_lock(&rdtgroup_mutex);
rdt_last_cmd_clear();
buf[nbytes - 1] = '\0';
ret = mon_config_write(r, buf, QOS_L3_MBM_TOTAL_EVENT_ID);
mutex_unlock(&rdtgroup_mutex);
return ret ?: nbytes;
}
static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
char *buf, size_t nbytes,
loff_t off)
{
struct rdt_resource *r = of->kn->parent->priv;
int ret;
/* Valid input requires a trailing newline */
if (nbytes == 0 || buf[nbytes - 1] != '\n')
return -EINVAL;
mutex_lock(&rdtgroup_mutex);
rdt_last_cmd_clear();
buf[nbytes - 1] = '\0';
ret = mon_config_write(r, buf, QOS_L3_MBM_LOCAL_EVENT_ID);
mutex_unlock(&rdtgroup_mutex);
return ret ?: nbytes;
}
/* rdtgroup information files for one cache resource. */
static struct rftype res_common_files[] = {
{
......@@ -1519,6 +1760,20 @@ static struct rftype res_common_files[] = {
.seq_show = max_threshold_occ_show,
.fflags = RF_MON_INFO | RFTYPE_RES_CACHE,
},
{
.name = "mbm_total_bytes_config",
.mode = 0644,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = mbm_total_bytes_config_show,
.write = mbm_total_bytes_config_write,
},
{
.name = "mbm_local_bytes_config",
.mode = 0644,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = mbm_local_bytes_config_show,
.write = mbm_local_bytes_config_write,
},
{
.name = "cpus",
.mode = 0644,
......@@ -1625,6 +1880,15 @@ void __init thread_throttle_mode_init(void)
rft->fflags = RF_CTRL_INFO | RFTYPE_RES_MB;
}
void __init mbm_config_rftype_init(const char *config)
{
struct rftype *rft;
rft = rdtgroup_get_rftype_by_name(config);
if (rft)
rft->fflags = RF_MON_INFO | RFTYPE_RES_CACHE;
}
/**
* rdtgroup_kn_mode_restrict - Restrict user access to named resctrl file
* @r: The resource group with which the file is associated.
......@@ -1866,13 +2130,9 @@ static int set_cache_qos_cfg(int level, bool enable)
/* Pick one CPU from each domain instance to update MSR */
cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
}
cpu = get_cpu();
/* Update QOS_CFG MSR on this cpu if it's in cpu_mask. */
if (cpumask_test_cpu(cpu, cpu_mask))
update(&enable);
/* Update QOS_CFG MSR on all other cpus in cpu_mask. */
smp_call_function_many(cpu_mask, update, &enable, 1);
put_cpu();
/* Update QOS_CFG MSR on all the CPUs in cpu_mask */
on_each_cpu_mask(cpu_mask, update, &enable, 1);
free_cpumask_var(cpu_mask);
......@@ -2349,7 +2609,7 @@ static int reset_all_ctrls(struct rdt_resource *r)
struct msr_param msr_param;
cpumask_var_t cpu_mask;
struct rdt_domain *d;
int i, cpu;
int i;
if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
return -ENOMEM;
......@@ -2370,13 +2630,9 @@ static int reset_all_ctrls(struct rdt_resource *r)
for (i = 0; i < hw_res->num_closid; i++)
hw_dom->ctrl_val[i] = r->default_ctrl;
}
cpu = get_cpu();
/* Update CBM on this cpu if it's in cpu_mask. */
if (cpumask_test_cpu(cpu, cpu_mask))
rdt_ctrl_update(&msr_param);
/* Update CBM on all other cpus in cpu_mask. */
smp_call_function_many(cpu_mask, rdt_ctrl_update, &msr_param, 1);
put_cpu();
/* Update CBM on all the CPUs in cpu_mask */
on_each_cpu_mask(cpu_mask, rdt_ctrl_update, &msr_param, 1);
free_cpumask_var(cpu_mask);
......@@ -2855,7 +3111,8 @@ static int rdtgroup_init_alloc(struct rdtgroup *rdtgrp)
list_for_each_entry(s, &resctrl_schema_all, list) {
r = s->res;
if (r->rid == RDT_RESOURCE_MBA) {
if (r->rid == RDT_RESOURCE_MBA ||
r->rid == RDT_RESOURCE_SMBA) {
rdtgroup_init_mba(r, rdtgrp->closid);
if (is_mba_sc(r))
continue;
......
......@@ -45,6 +45,8 @@ static const struct cpuid_bit cpuid_bits[] = {
{ X86_FEATURE_CPB, CPUID_EDX, 9, 0x80000007, 0 },
{ X86_FEATURE_PROC_FEEDBACK, CPUID_EDX, 11, 0x80000007, 0 },
{ X86_FEATURE_MBA, CPUID_EBX, 6, 0x80000008, 0 },
{ X86_FEATURE_SMBA, CPUID_EBX, 2, 0x80000020, 0 },
{ X86_FEATURE_BMEC, CPUID_EBX, 3, 0x80000020, 0 },
{ X86_FEATURE_PERFMON_V2, CPUID_EAX, 0, 0x80000022, 0 },
{ X86_FEATURE_AMD_LBR_V2, CPUID_EAX, 1, 0x80000022, 0 },
{ 0, 0, 0, 0, 0 }
......
......@@ -250,6 +250,17 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d,
u32 rmid, enum resctrl_event_id eventid);
/**
* resctrl_arch_reset_rmid_all() - Reset all private state associated with
* all rmids and eventids.
* @r: The resctrl resource.
* @d: The domain for which all architectural counter state will
* be cleared.
*
* This can be called from any CPU.
*/
void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_domain *d);
extern unsigned int resctrl_rmid_realloc_threshold;
extern unsigned int resctrl_rmid_realloc_limit;
......