Commit af65545a authored by Borislav Petkov (AMD)

Merge remote-tracking branches 'ras/edac-drivers', 'ras/edac-misc' and 'ras/edac-amd-atl' into edac-updates-for-v6.9

* ras/edac-drivers:
  EDAC/i10nm: Add Intel Grand Ridge micro-server support
  EDAC/igen6: Add one more Intel Alder Lake-N SoC support

* ras/edac-misc:
  EDAC/versal: Convert to platform remove callback returning void
  EDAC/versal: Make the bit position of injected errors configurable
  EDAC/synopsys: Convert to devm_platform_ioremap_resource()

* ras/edac-amd-atl:
  RAS/AMD/FMPM: Fix off by one when unwinding on error
  RAS/AMD/FMPM: Add debugfs interface to print record entries
  RAS/AMD/FMPM: Save SPA values
  RAS: Export helper to get ras_debugfs_dir
  RAS/AMD/ATL: Fix bit overflow in denorm_addr_df4_np2()
  RAS: Introduce a FRU memory poison manager
  RAS/AMD/ATL: Add MI300 row retirement support
  Documentation: Move RAS section to admin-guide
  RAS/AMD/ATL: Add MI300 DRAM to normalized address translation support
  RAS/AMD/ATL: Fix array overflow in get_logical_coh_st_fabric_id_mi300()
  RAS/AMD/ATL: Add MI300 support
  Documentation: RAS: Add index and address translation section
  EDAC/amd64: Use new AMD Address Translation Library
  RAS: Introduce AMD Address Translation Library
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
.. SPDX-License-Identifier: GPL-2.0
Address translation
===================
x86 AMD
-------
Zen-based AMD systems include a Data Fabric that manages the layout of
physical memory. Devices attached to the Fabric, like memory controllers,
I/O, etc., may not have a complete view of the system physical memory map.
These devices may provide a "normalized", i.e. device physical, address
when reporting memory errors. Normalized addresses must be translated to
a system physical address for the kernel to act on the memory.
AMD Address Translation Library (CONFIG_AMD_ATL) provides translation for
this case.
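
A minimal usage sketch (not part of the library; the helper name below is made
up for illustration) built on the interface the library exports::

	#include <linux/ras.h>

	/*
	 * Hypothetical helper: translate a normalized UMC address reported
	 * via MCA into a system physical address. Callers must check the
	 * result with IS_ERR_VALUE() before acting on it.
	 */
	static unsigned long umc_norm_to_spa(u64 norm_addr, u64 ipid, u32 cpu)
	{
		struct atl_err a_err = {
			.addr = norm_addr,	/* normalized address from MCA_ADDR */
			.ipid = ipid,		/* identifies the reporting UMC instance */
			.cpu  = cpu,		/* CPU that logged the error */
		};

		return amd_convert_umc_mca_addr_to_sys_addr(&a_err);
	}
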
Glossary of acronyms used in address translation for Zen-based systems
* CCM = Cache Coherent Moderator
* COD = Cluster-on-Die
* COH_ST = Coherent Station
* DF = Data Fabric
.. SPDX-License-Identifier: GPL-2.0
Reliability, Availability and Serviceability features
=====================================================
This documents different aspects of the RAS functionality present in the
kernel.
Error decoding
==============
x86
---
Error decoding on AMD systems should be done using the rasdaemon tool:
https://github.com/mchehab/rasdaemon/
......
.. SPDX-License-Identifier: GPL-2.0
.. toctree::
:maxdepth: 2
main
error-decoding
address-translation
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>
==================================================
Reliability, Availability and Serviceability (RAS)
==================================================
This documents different aspects of the RAS functionality present in the
kernel.
RAS concepts
************
......
...@@ -122,7 +122,7 @@ configure specific aspects of kernel behavior to your liking.
pmf
pnp
rapidio
ras
RAS/index
rtc
serial-console
svga
......
...@@ -113,7 +113,6 @@ to ReStructured Text format, or are simply too old.
:maxdepth: 1
staging/index
RAS/ras
Translations
......
...@@ -897,6 +897,12 @@ Q: https://patchwork.kernel.org/project/linux-rdma/list/
F: drivers/infiniband/hw/efa/
F: include/uapi/rdma/efa-abi.h
AMD ADDRESS TRANSLATION LIBRARY (ATL)
M: Yazen Ghannam <Yazen.Ghannam@amd.com>
L: linux-edac@vger.kernel.org
S: Supported
F: drivers/ras/amd/atl/*
AMD AXI W1 DRIVER
M: Kris Chaplin <kris.chaplin@amd.com>
R: Thomas Delev <thomas.delev@amd.com>
...@@ -7578,7 +7584,6 @@ R: Robert Richter <rric@kernel.org>
L: linux-edac@vger.kernel.org
S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git edac-for-next
F: Documentation/admin-guide/ras.rst
F: Documentation/driver-api/edac.rst
F: drivers/edac/
F: include/linux/edac.h
...@@ -18353,11 +18358,17 @@ M: Tony Luck <tony.luck@intel.com>
M: Borislav Petkov <bp@alien8.de>
L: linux-edac@vger.kernel.org
S: Maintained
F: Documentation/admin-guide/ras.rst
F: Documentation/admin-guide/RAS
F: drivers/ras/
F: include/linux/ras.h
F: include/ras/ras_event.h
RAS FRU MEMORY POISON MANAGER (FMPM)
M: Yazen Ghannam <Yazen.Ghannam@amd.com>
L: linux-edac@vger.kernel.org
S: Maintained
F: drivers/ras/amd/fmpm.c
RC-CORE / LIRC FRAMEWORK
M: Sean Young <sean@mess.org>
L: linux-media@vger.kernel.org
......
...@@ -78,6 +78,7 @@ config EDAC_GHES
config EDAC_AMD64
tristate "AMD64 (Opteron, Athlon64)"
depends on AMD_NB && EDAC_DECODE_MCE
imply AMD_ATL
help
Support for error detection and correction of DRAM ECC errors on
the AMD64 families (>= K8) of memory controllers.
......
// SPDX-License-Identifier: GPL-2.0-only
#include <linux/ras.h>
#include "amd64_edac.h"
#include <asm/amd_nb.h>
...@@ -1051,281 +1052,6 @@ static int fixup_node_id(int node_id, struct mce *m)
return nid - gpu_node_map.base_node_id + 1;
}
/* Protect the PCI config register pairs used for DF indirect access. */
static DEFINE_MUTEX(df_indirect_mutex);
/*
* Data Fabric Indirect Access uses FICAA/FICAD.
*
* Fabric Indirect Configuration Access Address (FICAA): Constructed based
* on the device's Instance Id and the PCI function and register offset of
* the desired register.
*
* Fabric Indirect Configuration Access Data (FICAD): There are FICAD LO
* and FICAD HI registers but so far we only need the LO register.
*
* Use Instance Id 0xFF to indicate a broadcast read.
*/
#define DF_BROADCAST 0xFF
static int __df_indirect_read(u16 node, u8 func, u16 reg, u8 instance_id, u32 *lo)
{
struct pci_dev *F4;
u32 ficaa;
int err = -ENODEV;
if (node >= amd_nb_num())
goto out;
F4 = node_to_amd_nb(node)->link;
if (!F4)
goto out;
ficaa = (instance_id == DF_BROADCAST) ? 0 : 1;
ficaa |= reg & 0x3FC;
ficaa |= (func & 0x7) << 11;
ficaa |= instance_id << 16;
mutex_lock(&df_indirect_mutex);
err = pci_write_config_dword(F4, 0x5C, ficaa);
if (err) {
pr_warn("Error writing DF Indirect FICAA, FICAA=0x%x\n", ficaa);
goto out_unlock;
}
err = pci_read_config_dword(F4, 0x98, lo);
if (err)
pr_warn("Error reading DF Indirect FICAD LO, FICAA=0x%x.\n", ficaa);
out_unlock:
mutex_unlock(&df_indirect_mutex);
out:
return err;
}
static int df_indirect_read_instance(u16 node, u8 func, u16 reg, u8 instance_id, u32 *lo)
{
return __df_indirect_read(node, func, reg, instance_id, lo);
}
static int df_indirect_read_broadcast(u16 node, u8 func, u16 reg, u32 *lo)
{
return __df_indirect_read(node, func, reg, DF_BROADCAST, lo);
}
struct addr_ctx {
u64 ret_addr;
u32 tmp;
u16 nid;
u8 inst_id;
};
static int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr)
{
u64 dram_base_addr, dram_limit_addr, dram_hole_base;
u8 die_id_shift, die_id_mask, socket_id_shift, socket_id_mask;
u8 intlv_num_dies, intlv_num_chan, intlv_num_sockets;
u8 intlv_addr_sel, intlv_addr_bit;
u8 num_intlv_bits, hashed_bit;
u8 lgcy_mmio_hole_en, base = 0;
u8 cs_mask, cs_id = 0;
bool hash_enabled = false;
struct addr_ctx ctx;
memset(&ctx, 0, sizeof(ctx));
/* Start from the normalized address */
ctx.ret_addr = norm_addr;
ctx.nid = nid;
ctx.inst_id = umc;
/* Read D18F0x1B4 (DramOffset), check if base 1 is used. */
if (df_indirect_read_instance(nid, 0, 0x1B4, umc, &ctx.tmp))
goto out_err;
/* Remove HiAddrOffset from normalized address, if enabled: */
if (ctx.tmp & BIT(0)) {
u64 hi_addr_offset = (ctx.tmp & GENMASK_ULL(31, 20)) << 8;
if (norm_addr >= hi_addr_offset) {
ctx.ret_addr -= hi_addr_offset;
base = 1;
}
}
/* Read D18F0x110 (DramBaseAddress). */
if (df_indirect_read_instance(nid, 0, 0x110 + (8 * base), umc, &ctx.tmp))
goto out_err;
/* Check if address range is valid. */
if (!(ctx.tmp & BIT(0))) {
pr_err("%s: Invalid DramBaseAddress range: 0x%x.\n",
__func__, ctx.tmp);
goto out_err;
}
lgcy_mmio_hole_en = ctx.tmp & BIT(1);
intlv_num_chan = (ctx.tmp >> 4) & 0xF;
intlv_addr_sel = (ctx.tmp >> 8) & 0x7;
dram_base_addr = (ctx.tmp & GENMASK_ULL(31, 12)) << 16;
/* {0, 1, 2, 3} map to address bits {8, 9, 10, 11} respectively */
if (intlv_addr_sel > 3) {
pr_err("%s: Invalid interleave address select %d.\n",
__func__, intlv_addr_sel);
goto out_err;
}
/* Read D18F0x114 (DramLimitAddress). */
if (df_indirect_read_instance(nid, 0, 0x114 + (8 * base), umc, &ctx.tmp))
goto out_err;
intlv_num_sockets = (ctx.tmp >> 8) & 0x1;
intlv_num_dies = (ctx.tmp >> 10) & 0x3;
dram_limit_addr = ((ctx.tmp & GENMASK_ULL(31, 12)) << 16) | GENMASK_ULL(27, 0);
intlv_addr_bit = intlv_addr_sel + 8;
/* Re-use intlv_num_chan by setting it equal to log2(#channels) */
switch (intlv_num_chan) {
case 0: intlv_num_chan = 0; break;
case 1: intlv_num_chan = 1; break;
case 3: intlv_num_chan = 2; break;
case 5: intlv_num_chan = 3; break;
case 7: intlv_num_chan = 4; break;
case 8: intlv_num_chan = 1;
hash_enabled = true;
break;
default:
pr_err("%s: Invalid number of interleaved channels %d.\n",
__func__, intlv_num_chan);
goto out_err;
}
num_intlv_bits = intlv_num_chan;
if (intlv_num_dies > 2) {
pr_err("%s: Invalid number of interleaved nodes/dies %d.\n",
__func__, intlv_num_dies);
goto out_err;
}
num_intlv_bits += intlv_num_dies;
/* Add a bit if sockets are interleaved. */
num_intlv_bits += intlv_num_sockets;
/* Assert num_intlv_bits <= 4 */
if (num_intlv_bits > 4) {
pr_err("%s: Invalid interleave bits %d.\n",
__func__, num_intlv_bits);
goto out_err;
}
if (num_intlv_bits > 0) {
u64 temp_addr_x, temp_addr_i, temp_addr_y;
u8 die_id_bit, sock_id_bit, cs_fabric_id;
/*
* Read FabricBlockInstanceInformation3_CS[BlockFabricID].
* This is the fabric id for this coherent slave. Use
* umc/channel# as instance id of the coherent slave
* for FICAA.
*/
if (df_indirect_read_instance(nid, 0, 0x50, umc, &ctx.tmp))
goto out_err;
cs_fabric_id = (ctx.tmp >> 8) & 0xFF;
die_id_bit = 0;
/* If interleaved over more than 1 channel: */
if (intlv_num_chan) {
die_id_bit = intlv_num_chan;
cs_mask = (1 << die_id_bit) - 1;
cs_id = cs_fabric_id & cs_mask;
}
sock_id_bit = die_id_bit;
/* Read D18F1x208 (SystemFabricIdMask). */
if (intlv_num_dies || intlv_num_sockets)
if (df_indirect_read_broadcast(nid, 1, 0x208, &ctx.tmp))
goto out_err;
/* If interleaved over more than 1 die. */
if (intlv_num_dies) {
sock_id_bit = die_id_bit + intlv_num_dies;
die_id_shift = (ctx.tmp >> 24) & 0xF;
die_id_mask = (ctx.tmp >> 8) & 0xFF;
cs_id |= ((cs_fabric_id & die_id_mask) >> die_id_shift) << die_id_bit;
}
/* If interleaved over more than 1 socket. */
if (intlv_num_sockets) {
socket_id_shift = (ctx.tmp >> 28) & 0xF;
socket_id_mask = (ctx.tmp >> 16) & 0xFF;
cs_id |= ((cs_fabric_id & socket_id_mask) >> socket_id_shift) << sock_id_bit;
}
/*
* The pre-interleaved address consists of XXXXXXIIIYYYYY
* where III is the ID for this CS, and XXXXXXYYYYY are the
* address bits from the post-interleaved address.
* "num_intlv_bits" has been calculated to tell us how many "I"
* bits there are. "intlv_addr_bit" tells us how many "Y" bits
* there are (where "I" starts).
*/
temp_addr_y = ctx.ret_addr & GENMASK_ULL(intlv_addr_bit - 1, 0);
temp_addr_i = (cs_id << intlv_addr_bit);
temp_addr_x = (ctx.ret_addr & GENMASK_ULL(63, intlv_addr_bit)) << num_intlv_bits;
ctx.ret_addr = temp_addr_x | temp_addr_i | temp_addr_y;
}
/* Add dram base address */
ctx.ret_addr += dram_base_addr;
/* If legacy MMIO hole enabled */
if (lgcy_mmio_hole_en) {
if (df_indirect_read_broadcast(nid, 0, 0x104, &ctx.tmp))
goto out_err;
dram_hole_base = ctx.tmp & GENMASK(31, 24);
if (ctx.ret_addr >= dram_hole_base)
ctx.ret_addr += (BIT_ULL(32) - dram_hole_base);
}
if (hash_enabled) {
/* Save some parentheses and grab ls-bit at the end. */
hashed_bit = (ctx.ret_addr >> 12) ^
(ctx.ret_addr >> 18) ^
(ctx.ret_addr >> 21) ^
(ctx.ret_addr >> 30) ^
cs_id;
hashed_bit &= BIT(0);
if (hashed_bit != ((ctx.ret_addr >> intlv_addr_bit) & BIT(0)))
ctx.ret_addr ^= BIT(intlv_addr_bit);
}
/* Is calculated system address above DRAM limit address? */
if (ctx.ret_addr > dram_limit_addr)
goto out_err;
*sys_addr = ctx.ret_addr;
return 0;
out_err:
return -EINVAL;
}
static int get_channel_from_ecc_syndrome(struct mem_ctl_info *, u16);
/*
...@@ -3073,9 +2799,10 @@ static void decode_umc_error(int node_id, struct mce *m)
{
u8 ecc_type = (m->status >> 45) & 0x3;
struct mem_ctl_info *mci;
unsigned long sys_addr;
struct amd64_pvt *pvt;
struct atl_err a_err;
struct err_info err;
u64 sys_addr;
node_id = fixup_node_id(node_id, m);
...@@ -3106,7 +2833,12 @@ static void decode_umc_error(int node_id, struct mce *m)
pvt->ops->get_err_info(m, &err);
if (umc_normaddr_to_sysaddr(m->addr, pvt->mc_node_id, err.channel, &sys_addr)) {
a_err.addr = m->addr;
a_err.ipid = m->ipid;
a_err.cpu = m->extcpu;
sys_addr = amd_convert_umc_mca_addr_to_sys_addr(&a_err);
if (IS_ERR_VALUE(sys_addr)) {
err.err_code = ERR_NORM_ADDR;
goto log_error;
}
......
...@@ -1324,11 +1324,9 @@ static int mc_probe(struct platform_device *pdev)
struct synps_edac_priv *priv;
struct mem_ctl_info *mci;
void __iomem *baseaddr;
struct resource *res;
int rc;
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
baseaddr = devm_ioremap_resource(&pdev->dev, res);
baseaddr = devm_platform_ioremap_resource(pdev, 0);
if (IS_ERR(baseaddr))
return PTR_ERR(baseaddr);
......
This diff is collapsed.
...@@ -32,5 +32,18 @@ menuconfig RAS
if RAS
source "arch/x86/ras/Kconfig"
source "drivers/ras/amd/atl/Kconfig"
config RAS_FMPM
tristate "FRU Memory Poison Manager"
default m
depends on AMD_ATL && ACPI_APEI
help
Support saving and restoring memory error information across reboot
using ACPI ERST as persistent storage. Error information is saved with
the UEFI CPER "FRU Memory Poison" section format.
Memory will be retired during boot time and run time depending on
platform-specific policies.
endif
...@@ -2,3 +2,6 @@
obj-$(CONFIG_RAS) += ras.o
obj-$(CONFIG_DEBUG_FS) += debugfs.o
obj-$(CONFIG_RAS_CEC) += cec.o
obj-$(CONFIG_RAS_FMPM) += amd/fmpm.o
obj-y += amd/atl/
# SPDX-License-Identifier: GPL-2.0-or-later
#
# AMD Address Translation Library Kconfig
#
# Copyright (c) 2023, Advanced Micro Devices, Inc.
# All Rights Reserved.
#
# Author: Yazen Ghannam <Yazen.Ghannam@amd.com>
config AMD_ATL
tristate "AMD Address Translation Library"
depends on AMD_NB && X86_64 && RAS
depends on MEMORY_FAILURE
default N
help
This library includes support for implementation-specific
address translation procedures needed for various error
handling cases.
Enable this option if using DRAM ECC on Zen-based systems
and OS-based error handling.
# SPDX-License-Identifier: GPL-2.0-or-later
#
# AMD Address Translation Library Makefile
#
# Copyright (c) 2023, Advanced Micro Devices, Inc.
# All Rights Reserved.
#
# Author: Yazen Ghannam <Yazen.Ghannam@amd.com>
amd_atl-y := access.o
amd_atl-y += core.o
amd_atl-y += dehash.o
amd_atl-y += denormalize.o
amd_atl-y += map.o
amd_atl-y += system.o
amd_atl-y += umc.o
obj-$(CONFIG_AMD_ATL) += amd_atl.o
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* AMD Address Translation Library
*
* access.c : DF Indirect Access functions
*
* Copyright (c) 2023, Advanced Micro Devices, Inc.
* All Rights Reserved.
*
* Author: Yazen Ghannam <Yazen.Ghannam@amd.com>
*/
#include "internal.h"
/* Protect the PCI config register pairs used for DF indirect access. */
static DEFINE_MUTEX(df_indirect_mutex);
/*
* Data Fabric Indirect Access uses FICAA/FICAD.
*
* Fabric Indirect Configuration Access Address (FICAA): constructed based
* on the device's Instance Id and the PCI function and register offset of
* the desired register.
*
* Fabric Indirect Configuration Access Data (FICAD): there are FICAD
* low and high registers but so far only the low register is needed.
*
* Use Instance Id 0xFF to indicate a broadcast read.
*/
#define DF_BROADCAST 0xFF
#define DF_FICAA_INST_EN BIT(0)
#define DF_FICAA_REG_NUM GENMASK(10, 1)
#define DF_FICAA_FUNC_NUM GENMASK(13, 11)
#define DF_FICAA_INST_ID GENMASK(23, 16)
#define DF_FICAA_REG_NUM_LEGACY GENMASK(10, 2)
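/*
 * Worked example (added for illustration): a broadcast read of D18F1x208
 * uses func = 1 and reg = 0x208. On systems with the current (non-legacy)
 * layout, reg >>= 2 gives 0x82 and the constructed value is
 *
 *	ficaa = FIELD_PREP(DF_FICAA_REG_NUM, 0x82) |
 *		FIELD_PREP(DF_FICAA_FUNC_NUM, 1)
 *	      = 0x104 | 0x800 = 0x904
 *
 * An instance read additionally sets DF_FICAA_INST_EN and DF_FICAA_INST_ID.
 */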
static u16 get_accessible_node(u16 node)
{
/*
* On heterogeneous systems, not all AMD Nodes are accessible
* through software-visible registers. The Node ID needs to be
* adjusted for register accesses. But its value should not be
* changed for the translation methods.
*/
if (df_cfg.flags.heterogeneous) {
/* Only Node 0 is accessible on DF3.5 systems. */
if (df_cfg.rev == DF3p5)
node = 0;
/*
* Only the first Node in each Socket is accessible on
* DF4.5 systems, and this is visible to software as one
* Fabric per Socket. The Socket ID can be derived from
* the Node ID and global shift values.
*/
if (df_cfg.rev == DF4p5)
node >>= df_cfg.socket_id_shift - df_cfg.node_id_shift;
}
return node;
}
static int __df_indirect_read(u16 node, u8 func, u16 reg, u8 instance_id, u32 *lo)
{
u32 ficaa_addr = 0x8C, ficad_addr = 0xB8;
struct pci_dev *F4;
int err = -ENODEV;
u32 ficaa = 0;
node = get_accessible_node(node);
if (node >= amd_nb_num())
goto out;
F4 = node_to_amd_nb(node)->link;
if (!F4)
goto out;
/* Enable instance-specific access. */
if (instance_id != DF_BROADCAST) {
ficaa |= FIELD_PREP(DF_FICAA_INST_EN, 1);
ficaa |= FIELD_PREP(DF_FICAA_INST_ID, instance_id);
}
/*
* The two least-significant bits are masked when inputting the
* register offset to FICAA.
*/
reg >>= 2;
if (df_cfg.flags.legacy_ficaa) {
ficaa_addr = 0x5C;
ficad_addr = 0x98;
ficaa |= FIELD_PREP(DF_FICAA_REG_NUM_LEGACY, reg);
} else {
ficaa |= FIELD_PREP(DF_FICAA_REG_NUM, reg);
}
ficaa |= FIELD_PREP(DF_FICAA_FUNC_NUM, func);
mutex_lock(&df_indirect_mutex);
err = pci_write_config_dword(F4, ficaa_addr, ficaa);
if (err) {
pr_warn("Error writing DF Indirect FICAA, FICAA=0x%x\n", ficaa);
goto out_unlock;
}
err = pci_read_config_dword(F4, ficad_addr, lo);
if (err)
pr_warn("Error reading DF Indirect FICAD LO, FICAA=0x%x.\n", ficaa);
pr_debug("node=%u inst=0x%x func=0x%x reg=0x%x val=0x%x",
node, instance_id, func, reg << 2, *lo);
out_unlock:
mutex_unlock(&df_indirect_mutex);
out:
return err;
}
int df_indirect_read_instance(u16 node, u8 func, u16 reg, u8 instance_id, u32 *lo)
{
return __df_indirect_read(node, func, reg, instance_id, lo);
}
int df_indirect_read_broadcast(u16 node, u8 func, u16 reg, u32 *lo)
{
return __df_indirect_read(node, func, reg, DF_BROADCAST, lo);
}
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* AMD Address Translation Library
*
* core.c : Module init and base translation functions
*
* Copyright (c) 2023, Advanced Micro Devices, Inc.
* All Rights Reserved.
*
* Author: Yazen Ghannam <Yazen.Ghannam@amd.com>
*/
#include <linux/module.h>
#include <asm/cpu_device_id.h>
#include "internal.h"
struct df_config df_cfg __read_mostly;
static int addr_over_limit(struct addr_ctx *ctx)
{
u64 dram_limit_addr;
if (df_cfg.rev >= DF4)
dram_limit_addr = FIELD_GET(DF4_DRAM_LIMIT_ADDR, ctx->map.limit);
else
dram_limit_addr = FIELD_GET(DF2_DRAM_LIMIT_ADDR, ctx->map.limit);
dram_limit_addr <<= DF_DRAM_BASE_LIMIT_LSB;
dram_limit_addr |= GENMASK(DF_DRAM_BASE_LIMIT_LSB - 1, 0);
/* Is calculated system address above DRAM limit address? */
if (ctx->ret_addr > dram_limit_addr) {
atl_debug(ctx, "Calculated address (0x%016llx) > DRAM limit (0x%016llx)",
ctx->ret_addr, dram_limit_addr);
return -EINVAL;
}
return 0;
}
static bool legacy_hole_en(struct addr_ctx *ctx)
{
u32 reg = ctx->map.base;
if (df_cfg.rev >= DF4)
reg = ctx->map.ctl;
return FIELD_GET(DF_LEGACY_MMIO_HOLE_EN, reg);
}
static int add_legacy_hole(struct addr_ctx *ctx)
{
u32 dram_hole_base;
u8 func = 0;
if (!legacy_hole_en(ctx))
return 0;
if (df_cfg.rev >= DF4)
func = 7;
if (df_indirect_read_broadcast(ctx->node_id, func, 0x104, &dram_hole_base))
return -EINVAL;
dram_hole_base &= DF_DRAM_HOLE_BASE_MASK;
if (ctx->ret_addr >= dram_hole_base)
ctx->ret_addr += (BIT_ULL(32) - dram_hole_base);
return 0;
}
static u64 get_base_addr(struct addr_ctx *ctx)
{
u64 base_addr;
if (df_cfg.rev >= DF4)
base_addr = FIELD_GET(DF4_BASE_ADDR, ctx->map.base);
else
base_addr = FIELD_GET(DF2_BASE_ADDR, ctx->map.base);
return base_addr << DF_DRAM_BASE_LIMIT_LSB;
}
static int add_base_and_hole(struct addr_ctx *ctx)
{
ctx->ret_addr += get_base_addr(ctx);
if (add_legacy_hole(ctx))
return -EINVAL;
return 0;
}
static bool late_hole_remove(struct addr_ctx *ctx)
{
if (df_cfg.rev == DF3p5)
return true;
if (df_cfg.rev == DF4)
return true;
if (ctx->map.intlv_mode == DF3_6CHAN)
return true;
return false;
}
unsigned long norm_to_sys_addr(u8 socket_id, u8 die_id, u8 coh_st_inst_id, unsigned long addr)
{
struct addr_ctx ctx;
if (df_cfg.rev == UNKNOWN)
return -EINVAL;
memset(&ctx, 0, sizeof(ctx));
/* Start from the normalized address */
ctx.ret_addr = addr;
ctx.inst_id = coh_st_inst_id;
ctx.inputs.norm_addr = addr;
ctx.inputs.socket_id = socket_id;
ctx.inputs.die_id = die_id;
ctx.inputs.coh_st_inst_id = coh_st_inst_id;
if (determine_node_id(&ctx, socket_id, die_id))
return -EINVAL;
if (get_address_map(&ctx))
return -EINVAL;
if (denormalize_address(&ctx))
return -EINVAL;
if (!late_hole_remove(&ctx) && add_base_and_hole(&ctx))
return -EINVAL;
if (dehash_address(&ctx))
return -EINVAL;
if (late_hole_remove(&ctx) && add_base_and_hole(&ctx))
return -EINVAL;
if (addr_over_limit(&ctx))
return -EINVAL;
return ctx.ret_addr;
}
static void check_for_legacy_df_access(void)
{
/*
* All Zen-based systems before Family 19h use the legacy
* DF Indirect Access (FICAA/FICAD) offsets.
*/
if (boot_cpu_data.x86 < 0x19) {
df_cfg.flags.legacy_ficaa = true;
return;
}
/* All systems after Family 19h use the current offsets. */
if (boot_cpu_data.x86 > 0x19)
return;
/* Some Family 19h systems use the legacy offsets. */
switch (boot_cpu_data.x86_model) {
case 0x00 ... 0x0f:
case 0x20 ... 0x5f:
df_cfg.flags.legacy_ficaa = true;
}
}
/*
* This library provides functionality for AMD-based systems with a Data Fabric.
* The set of systems with a Data Fabric is equivalent to the set of Zen-based systems
* and the set of systems with the Scalable MCA feature at this time. However, these
* are technically independent things.
*
* It's possible to match on the PCI IDs of the Data Fabric devices, but this will be
* an ever expanding list. Instead, match on the SMCA and Zen features to cover all
* relevant systems.
*/
static const struct x86_cpu_id amd_atl_cpuids[] = {
X86_MATCH_FEATURE(X86_FEATURE_SMCA, NULL),
X86_MATCH_FEATURE(X86_FEATURE_ZEN, NULL),
{ }
};
MODULE_DEVICE_TABLE(x86cpu, amd_atl_cpuids);
static int __init amd_atl_init(void)
{
if (!x86_match_cpu(amd_atl_cpuids))
return -ENODEV;
if (!amd_nb_num())
return -ENODEV;
check_for_legacy_df_access();
if (get_df_system_info())
return -ENODEV;
/* Increment this module's refcount so that it can't be easily unloaded. */
__module_get(THIS_MODULE);
amd_atl_register_decoder(convert_umc_mca_addr_to_sys_addr);
pr_info("AMD Address Translation Library initialized");
return 0;
}
/*
* Exit function is only needed for testing and debug. Module unload must be
* forced to override refcount check.
*/
static void __exit amd_atl_exit(void)
{
amd_atl_unregister_decoder();
}
module_init(amd_atl_init);
module_exit(amd_atl_exit);
MODULE_LICENSE("GPL");
This diff is collapsed.
This diff is collapsed.
/* SPDX-License-Identifier: GPL-2.0 */
/*
* AMD Address Translation Library
*
* internal.h : Helper functions and common defines
*
* Copyright (c) 2023, Advanced Micro Devices, Inc.
* All Rights Reserved.
*
* Author: Yazen Ghannam <Yazen.Ghannam@amd.com>
*/
#ifndef __AMD_ATL_INTERNAL_H__
#define __AMD_ATL_INTERNAL_H__
#include <linux/bitfield.h>
#include <linux/bitops.h>
#include <linux/ras.h>
#include <asm/amd_nb.h>
#include "reg_fields.h"
/* Maximum possible number of Coherent Stations within a single Data Fabric. */
#define MAX_COH_ST_CHANNELS 32
/* PCI ID for Zen4 Server DF Function 0. */
#define DF_FUNC0_ID_ZEN4_SERVER 0x14AD1022
/* PCI IDs for MI300 DF Function 0. */
#define DF_FUNC0_ID_MI300 0x15281022
/* Shift needed for adjusting register values to true values. */
#define DF_DRAM_BASE_LIMIT_LSB 28
#define MI300_DRAM_LIMIT_LSB 20
enum df_revisions {
UNKNOWN,
DF2,
DF3,
DF3p5,
DF4,
DF4p5,
};
/* These are mapped 1:1 to the hardware values. Special cases are set at > 0x20. */
enum intlv_modes {
NONE = 0x00,
NOHASH_2CHAN = 0x01,
NOHASH_4CHAN = 0x03,
NOHASH_8CHAN = 0x05,
DF3_6CHAN = 0x06,
NOHASH_16CHAN = 0x07,
NOHASH_32CHAN = 0x08,
DF3_COD4_2CHAN_HASH = 0x0C,
DF3_COD2_4CHAN_HASH = 0x0D,
DF3_COD1_8CHAN_HASH = 0x0E,
DF4_NPS4_2CHAN_HASH = 0x10,
DF4_NPS2_4CHAN_HASH = 0x11,
DF4_NPS1_8CHAN_HASH = 0x12,
DF4_NPS4_3CHAN_HASH = 0x13,
DF4_NPS2_6CHAN_HASH = 0x14,
DF4_NPS1_12CHAN_HASH = 0x15,
DF4_NPS2_5CHAN_HASH = 0x16,
DF4_NPS1_10CHAN_HASH = 0x17,
MI3_HASH_8CHAN = 0x18,
MI3_HASH_16CHAN = 0x19,
MI3_HASH_32CHAN = 0x1A,
DF2_2CHAN_HASH = 0x21,
/* DF4.5 modes are all IntLvNumChan + 0x20 */
DF4p5_NPS1_16CHAN_1K_HASH = 0x2C,
DF4p5_NPS0_24CHAN_1K_HASH = 0x2E,
DF4p5_NPS4_2CHAN_1K_HASH = 0x30,
DF4p5_NPS2_4CHAN_1K_HASH = 0x31,
DF4p5_NPS1_8CHAN_1K_HASH = 0x32,
DF4p5_NPS4_3CHAN_1K_HASH = 0x33,
DF4p5_NPS2_6CHAN_1K_HASH = 0x34,
DF4p5_NPS1_12CHAN_1K_HASH = 0x35,
DF4p5_NPS2_5CHAN_1K_HASH = 0x36,
DF4p5_NPS1_10CHAN_1K_HASH = 0x37,
DF4p5_NPS4_2CHAN_2K_HASH = 0x40,
DF4p5_NPS2_4CHAN_2K_HASH = 0x41,
DF4p5_NPS1_8CHAN_2K_HASH = 0x42,
DF4p5_NPS1_16CHAN_2K_HASH = 0x43,
DF4p5_NPS4_3CHAN_2K_HASH = 0x44,
DF4p5_NPS2_6CHAN_2K_HASH = 0x45,
DF4p5_NPS1_12CHAN_2K_HASH = 0x46,
DF4p5_NPS0_24CHAN_2K_HASH = 0x47,
DF4p5_NPS2_5CHAN_2K_HASH = 0x48,
DF4p5_NPS1_10CHAN_2K_HASH = 0x49,
};
struct df_flags {
__u8 legacy_ficaa : 1,
socket_id_shift_quirk : 1,
heterogeneous : 1,
__reserved_0 : 5;
};
struct df_config {
enum df_revisions rev;
/*
* These masks operate on the 16-bit Coherent Station IDs,
* e.g. Instance, Fabric, Destination, etc.
*/
u16 component_id_mask;
u16 die_id_mask;
u16 node_id_mask;
u16 socket_id_mask;
/*
* Least-significant bit of Node ID portion of the
* system-wide Coherent Station Fabric ID.
*/
u8 node_id_shift;
/*
* Least-significant bit of Die portion of the Node ID.
* Adjusted to include the Node ID shift in order to apply
* to the Coherent Station Fabric ID.
*/
u8 die_id_shift;
/*
* Least-significant bit of Socket portion of the Node ID.
* Adjusted to include the Node ID shift in order to apply
* to the Coherent Station Fabric ID.
*/
u8 socket_id_shift;
/* Number of DRAM Address maps visible in a Coherent Station. */
u8 num_coh_st_maps;
/* Global flags to handle special cases. */
struct df_flags flags;
};
extern struct df_config df_cfg;
struct dram_addr_map {
/*
* Each DRAM Address Map can operate independently
* in different interleaving modes.
*/
enum intlv_modes intlv_mode;
/* System-wide number for this address map. */
u8 num;
/* Raw register values */
u32 base;
u32 limit;
u32 ctl;
u32 intlv;
/*
* Logical to Physical Coherent Station Remapping array
*
* Index: Logical Coherent Station Instance ID
* Value: Physical Coherent Station Instance ID
*
* phys_coh_st_inst_id = remap_array[log_coh_st_inst_id]
*/
u8 remap_array[MAX_COH_ST_CHANNELS];
/*
* Number of bits covering DRAM Address map 0
* when interleaving is non-power-of-2.
*
* Used only for DF3_6CHAN.
*/
u8 np2_bits;
/* Position of the 'interleave bit'. */
u8 intlv_bit_pos;
/* Number of channels interleaved in this map. */
u8 num_intlv_chan;
/* Number of dies interleaved in this map. */
u8 num_intlv_dies;
/* Number of sockets interleaved in this map. */
u8 num_intlv_sockets;
/*
* Total number of channels interleaved accounting
* for die and socket interleaving.
*/
u8 total_intlv_chan;
/* Total bits needed to cover 'total_intlv_chan'. */
u8 total_intlv_bits;
};
/* Original input values cached for debug printing. */
struct addr_ctx_inputs {
u64 norm_addr;
u8 socket_id;
u8 die_id;
u8 coh_st_inst_id;
};
struct addr_ctx {
u64 ret_addr;
struct addr_ctx_inputs inputs;
struct dram_addr_map map;
/* AMD Node ID calculated from Socket and Die IDs. */
u8 node_id;
/*
* Coherent Station Instance ID
* Local ID used within a 'node'.
*/
u16 inst_id;
/*
* Coherent Station Fabric ID
* System-wide ID that includes 'node' bits.
*/
u16 coh_st_fabric_id;
};
int df_indirect_read_instance(u16 node, u8 func, u16 reg, u8 instance_id, u32 *lo);
int df_indirect_read_broadcast(u16 node, u8 func, u16 reg, u32 *lo);
int get_df_system_info(void);
int determine_node_id(struct addr_ctx *ctx, u8 socket_num, u8 die_num);
int get_addr_hash_mi300(void);
int get_address_map(struct addr_ctx *ctx);
int denormalize_address(struct addr_ctx *ctx);
int dehash_address(struct addr_ctx *ctx);
unsigned long norm_to_sys_addr(u8 socket_id, u8 die_id, u8 coh_st_inst_id, unsigned long addr);
unsigned long convert_umc_mca_addr_to_sys_addr(struct atl_err *err);
/*
* Make a gap in @data that is @num_bits long starting at @bit_num.
* e.g. data = 11111111'b
* bit_num = 3
* num_bits = 2
* result = 1111100111'b
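*      i.e. expand_bits(3, 2, 0xff) == 0x3e7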
*/
static inline u64 expand_bits(u8 bit_num, u8 num_bits, u64 data)
{
u64 temp1, temp2;
if (!num_bits)
return data;
if (!bit_num) {
WARN_ON_ONCE(num_bits >= BITS_PER_LONG);
return data << num_bits;
}
WARN_ON_ONCE(bit_num >= BITS_PER_LONG);
temp1 = data & GENMASK_ULL(bit_num - 1, 0);
temp2 = data & GENMASK_ULL(63, bit_num);
temp2 <<= num_bits;
return temp1 | temp2;
}
/*
* Remove bits in @data between @low_bit and @high_bit inclusive.
* e.g. data = XXXYYZZZ'b
* low_bit = 3
* high_bit = 4
* result = XXXZZZ'b
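*      i.e. remove_bits(3, 4, 0xe7) == 0x3f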
*/
static inline u64 remove_bits(u8 low_bit, u8 high_bit, u64 data)
{
u64 temp1, temp2;
WARN_ON_ONCE(high_bit >= BITS_PER_LONG);
WARN_ON_ONCE(low_bit >= BITS_PER_LONG);
WARN_ON_ONCE(low_bit > high_bit);
if (!low_bit)
return data >> (high_bit + 1);
temp1 = GENMASK_ULL(low_bit - 1, 0) & data;
temp2 = GENMASK_ULL(63, high_bit + 1) & data;
temp2 >>= high_bit - low_bit + 1;
return temp1 | temp2;
}
#define atl_debug(ctx, fmt, arg...) \
pr_debug("socket_id=%u die_id=%u coh_st_inst_id=%u norm_addr=0x%016llx: " fmt,\
(ctx)->inputs.socket_id, (ctx)->inputs.die_id,\
(ctx)->inputs.coh_st_inst_id, (ctx)->inputs.norm_addr, ##arg)
static inline void atl_debug_on_bad_df_rev(void)
{
pr_debug("Unrecognized DF rev: %u", df_cfg.rev);
}
static inline void atl_debug_on_bad_intlv_mode(struct addr_ctx *ctx)
{
atl_debug(ctx, "Unrecognized interleave mode: %u", ctx->map.intlv_mode);
}
#endif /* __AMD_ATL_INTERNAL_H__ */
This diff is collapsed.
This diff is collapsed.
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* AMD Address Translation Library
*
* system.c : Functions to read and save system-wide data
*
* Copyright (c) 2023, Advanced Micro Devices, Inc.
* All Rights Reserved.
*
* Author: Yazen Ghannam <Yazen.Ghannam@amd.com>
*/
#include "internal.h"
int determine_node_id(struct addr_ctx *ctx, u8 socket_id, u8 die_id)
{
u16 socket_id_bits, die_id_bits;
if (socket_id > 0 && df_cfg.socket_id_mask == 0) {
atl_debug(ctx, "Invalid socket inputs: socket_id=%u socket_id_mask=0x%x",
socket_id, df_cfg.socket_id_mask);
return -EINVAL;
}
/* Do each step independently to avoid shift out-of-bounds issues. */
socket_id_bits = socket_id;
socket_id_bits <<= df_cfg.socket_id_shift;
socket_id_bits &= df_cfg.socket_id_mask;
if (die_id > 0 && df_cfg.die_id_mask == 0) {
atl_debug(ctx, "Invalid die inputs: die_id=%u die_id_mask=0x%x",
die_id, df_cfg.die_id_mask);
return -EINVAL;
}
/* Do each step independently to avoid shift out-of-bounds issues. */
die_id_bits = die_id;
die_id_bits <<= df_cfg.die_id_shift;
die_id_bits &= df_cfg.die_id_mask;
ctx->node_id = (socket_id_bits | die_id_bits) >> df_cfg.node_id_shift;
return 0;
}
static void df2_get_masks_shifts(u32 mask0)
{
df_cfg.socket_id_shift = FIELD_GET(DF2_SOCKET_ID_SHIFT, mask0);
df_cfg.socket_id_mask = FIELD_GET(DF2_SOCKET_ID_MASK, mask0);
df_cfg.die_id_shift = FIELD_GET(DF2_DIE_ID_SHIFT, mask0);
df_cfg.die_id_mask = FIELD_GET(DF2_DIE_ID_MASK, mask0);
df_cfg.node_id_shift = df_cfg.die_id_shift;
df_cfg.node_id_mask = df_cfg.socket_id_mask | df_cfg.die_id_mask;
df_cfg.component_id_mask = ~df_cfg.node_id_mask;
}
static void df3_get_masks_shifts(u32 mask0, u32 mask1)
{
df_cfg.component_id_mask = FIELD_GET(DF3_COMPONENT_ID_MASK, mask0);
df_cfg.node_id_mask = FIELD_GET(DF3_NODE_ID_MASK, mask0);
df_cfg.node_id_shift = FIELD_GET(DF3_NODE_ID_SHIFT, mask1);
df_cfg.socket_id_shift = FIELD_GET(DF3_SOCKET_ID_SHIFT, mask1);
df_cfg.socket_id_mask = FIELD_GET(DF3_SOCKET_ID_MASK, mask1);
df_cfg.die_id_mask = FIELD_GET(DF3_DIE_ID_MASK, mask1);
}
static void df3p5_get_masks_shifts(u32 mask0, u32 mask1, u32 mask2)
{
df_cfg.component_id_mask = FIELD_GET(DF4_COMPONENT_ID_MASK, mask0);
df_cfg.node_id_mask = FIELD_GET(DF4_NODE_ID_MASK, mask0);
df_cfg.node_id_shift = FIELD_GET(DF3_NODE_ID_SHIFT, mask1);
df_cfg.socket_id_shift = FIELD_GET(DF4_SOCKET_ID_SHIFT, mask1);
df_cfg.socket_id_mask = FIELD_GET(DF4_SOCKET_ID_MASK, mask2);
df_cfg.die_id_mask = FIELD_GET(DF4_DIE_ID_MASK, mask2);
}
static void df4_get_masks_shifts(u32 mask0, u32 mask1, u32 mask2)
{
df3p5_get_masks_shifts(mask0, mask1, mask2);
if (!(df_cfg.flags.socket_id_shift_quirk && df_cfg.socket_id_shift == 1))
return;
df_cfg.socket_id_shift = 0;
df_cfg.socket_id_mask = 1;
df_cfg.die_id_shift = 0;
df_cfg.die_id_mask = 0;
df_cfg.node_id_shift = 8;
df_cfg.node_id_mask = 0x100;
}
static int df4_get_fabric_id_mask_registers(void)
{
u32 mask0, mask1, mask2;
/* Read D18F4x1B0 (SystemFabricIdMask0) */
if (df_indirect_read_broadcast(0, 4, 0x1B0, &mask0))
return -EINVAL;
/* Read D18F4x1B4 (SystemFabricIdMask1) */
if (df_indirect_read_broadcast(0, 4, 0x1B4, &mask1))
return -EINVAL;
/* Read D18F4x1B8 (SystemFabricIdMask2) */
if (df_indirect_read_broadcast(0, 4, 0x1B8, &mask2))
return -EINVAL;
df4_get_masks_shifts(mask0, mask1, mask2);
return 0;
}
static int df4_determine_df_rev(u32 reg)
{
df_cfg.rev = FIELD_GET(DF_MINOR_REVISION, reg) < 5 ? DF4 : DF4p5;
/* Check for special cases or quirks based on Device/Vendor IDs. */
/* Read D18F0x000 (DeviceVendorId0) */
if (df_indirect_read_broadcast(0, 0, 0, &reg))
return -EINVAL;
if (reg == DF_FUNC0_ID_ZEN4_SERVER)
df_cfg.flags.socket_id_shift_quirk = 1;
if (reg == DF_FUNC0_ID_MI300) {
df_cfg.flags.heterogeneous = 1;
if (get_addr_hash_mi300())
return -EINVAL;
}
return df4_get_fabric_id_mask_registers();
}
static int determine_df_rev_legacy(void)
{
u32 fabric_id_mask0, fabric_id_mask1, fabric_id_mask2;
/*
* Check for DF3.5.
*
* Component ID Mask must be non-zero. Register D18F1x150 is
* reserved pre-DF3.5, so value will be Read-as-Zero.
*/
/* Read D18F1x150 (SystemFabricIdMask0). */
if (df_indirect_read_broadcast(0, 1, 0x150, &fabric_id_mask0))
return -EINVAL;
if (FIELD_GET(DF4_COMPONENT_ID_MASK, fabric_id_mask0)) {
df_cfg.rev = DF3p5;
/* Read D18F1x154 (SystemFabricIdMask1) */
if (df_indirect_read_broadcast(0, 1, 0x154, &fabric_id_mask1))
return -EINVAL;
/* Read D18F1x158 (SystemFabricIdMask2) */
if (df_indirect_read_broadcast(0, 1, 0x158, &fabric_id_mask2))
return -EINVAL;
df3p5_get_masks_shifts(fabric_id_mask0, fabric_id_mask1, fabric_id_mask2);
return 0;
}
/*
* Check for DF3.
*
* Component ID Mask must be non-zero. Field is Read-as-Zero on DF2.
*/
/* Read D18F1x208 (SystemFabricIdMask). */
if (df_indirect_read_broadcast(0, 1, 0x208, &fabric_id_mask0))
return -EINVAL;
if (FIELD_GET(DF3_COMPONENT_ID_MASK, fabric_id_mask0)) {
df_cfg.rev = DF3;
/* Read D18F1x20C (SystemFabricIdMask1) */
if (df_indirect_read_broadcast(0, 1, 0x20C, &fabric_id_mask1))
return -EINVAL;
df3_get_masks_shifts(fabric_id_mask0, fabric_id_mask1);
return 0;
}
/* Default to DF2. */
df_cfg.rev = DF2;
df2_get_masks_shifts(fabric_id_mask0);
return 0;
}
static int determine_df_rev(void)
{
u32 reg;
u8 rev;
if (df_cfg.rev != UNKNOWN)
return 0;
/* Read D18F0x40 (FabricBlockInstanceCount). */
if (df_indirect_read_broadcast(0, 0, 0x40, &reg))
return -EINVAL;
/*
* Revision fields added for DF4 and later.
*
* Major revision of '0' is found pre-DF4. Field is Read-as-Zero.
*/
rev = FIELD_GET(DF_MAJOR_REVISION, reg);
if (!rev)
return determine_df_rev_legacy();
/*
* Fail out for major revisions other than '4'.
*
* Explicit support should be added for newer systems to avoid issues.
*/
if (rev == 4)
return df4_determine_df_rev(reg);
return -EINVAL;
}
static void get_num_maps(void)
{
switch (df_cfg.rev) {
case DF2:
case DF3:
case DF3p5:
df_cfg.num_coh_st_maps = 2;
break;
case DF4:
case DF4p5:
df_cfg.num_coh_st_maps = 4;
break;
default:
atl_debug_on_bad_df_rev();
}
}
static void apply_node_id_shift(void)
{
if (df_cfg.rev == DF2)
return;
df_cfg.die_id_shift = df_cfg.node_id_shift;
df_cfg.die_id_mask <<= df_cfg.node_id_shift;
df_cfg.socket_id_mask <<= df_cfg.node_id_shift;
df_cfg.socket_id_shift += df_cfg.node_id_shift;
}
static void dump_df_cfg(void)
{
pr_debug("rev=0x%x", df_cfg.rev);
pr_debug("component_id_mask=0x%x", df_cfg.component_id_mask);
pr_debug("die_id_mask=0x%x", df_cfg.die_id_mask);
pr_debug("node_id_mask=0x%x", df_cfg.node_id_mask);
pr_debug("socket_id_mask=0x%x", df_cfg.socket_id_mask);
pr_debug("die_id_shift=0x%x", df_cfg.die_id_shift);
pr_debug("node_id_shift=0x%x", df_cfg.node_id_shift);
pr_debug("socket_id_shift=0x%x", df_cfg.socket_id_shift);
pr_debug("num_coh_st_maps=%u", df_cfg.num_coh_st_maps);
pr_debug("flags.legacy_ficaa=%u", df_cfg.flags.legacy_ficaa);
pr_debug("flags.socket_id_shift_quirk=%u", df_cfg.flags.socket_id_shift_quirk);
}
int get_df_system_info(void)
{
if (determine_df_rev()) {
pr_warn("amd_atl: Failed to determine DF Revision");
df_cfg.rev = UNKNOWN;
return -EINVAL;
}
apply_node_id_shift();
get_num_maps();
dump_df_cfg();
return 0;
}
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* AMD Address Translation Library
*
* umc.c : Unified Memory Controller (UMC) topology helpers
*
* Copyright (c) 2023, Advanced Micro Devices, Inc.
* All Rights Reserved.
*
* Author: Yazen Ghannam <Yazen.Ghannam@amd.com>
*/
#include "internal.h"
/*
* MI300 has a fixed, model-specific mapping between a UMC instance and
* its related Data Fabric Coherent Station instance.
*
* The MCA_IPID_UMC[InstanceId] field holds a unique identifier for the
* UMC instance within a Node. Use this to find the appropriate Coherent
* Station ID.
*
* Redundant bits were removed from the map below.
*/
static const u16 umc_coh_st_map[32] = {
0x393, 0x293, 0x193, 0x093,
0x392, 0x292, 0x192, 0x092,
0x391, 0x291, 0x191, 0x091,
0x390, 0x290, 0x190, 0x090,
0x793, 0x693, 0x593, 0x493,
0x792, 0x692, 0x592, 0x492,
0x791, 0x691, 0x591, 0x491,
0x790, 0x690, 0x590, 0x490,
};
#define UMC_ID_MI300 GENMASK(23, 12)
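/*
 * For illustration: an MCA_IPID value whose bits [23:12] are 0x193 matches
 * index 2 in umc_coh_st_map[] above, i.e. Coherent Station instance 2.
 */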
static u8 get_coh_st_inst_id_mi300(struct atl_err *err)
{
u16 umc_id = FIELD_GET(UMC_ID_MI300, err->ipid);
u8 i;
for (i = 0; i < ARRAY_SIZE(umc_coh_st_map); i++) {
if (umc_id == umc_coh_st_map[i])
break;
}
WARN_ON_ONCE(i >= ARRAY_SIZE(umc_coh_st_map));
return i;
}
/* XOR the bits in @val. */
static u16 bitwise_xor_bits(u16 val)
{
u16 tmp = 0;
u8 i;
for (i = 0; i < 16; i++)
tmp ^= (val >> i) & 0x1;
return tmp;
}
struct xor_bits {
bool xor_enable;
u16 col_xor;
u32 row_xor;
};
#define NUM_BANK_BITS 4
static struct {
/* UMC::CH::AddrHashBank */
struct xor_bits bank[NUM_BANK_BITS];
/* UMC::CH::AddrHashPC */
struct xor_bits pc;
/* UMC::CH::AddrHashPC2 */
u8 bank_xor;
} addr_hash;
#define MI300_UMC_CH_BASE 0x90000
#define MI300_ADDR_HASH_BANK0 (MI300_UMC_CH_BASE + 0xC8)
#define MI300_ADDR_HASH_PC (MI300_UMC_CH_BASE + 0xE0)
#define MI300_ADDR_HASH_PC2 (MI300_UMC_CH_BASE + 0xE4)
#define ADDR_HASH_XOR_EN BIT(0)
#define ADDR_HASH_COL_XOR GENMASK(13, 1)
#define ADDR_HASH_ROW_XOR GENMASK(31, 14)
#define ADDR_HASH_BANK_XOR GENMASK(5, 0)
/*
* Read UMC::CH::AddrHash{Bank,PC,PC2} registers to get XOR bits used
* for hashing. Do this during module init, since the values will not
* change during run time.
*
* These registers are instantiated for each UMC across each AMD Node.
* However, they should be identically programmed due to the fixed hardware
* design of MI300 systems. So read the values from Node 0 UMC 0 and keep a
* single global structure for simplicity.
*/
int get_addr_hash_mi300(void)
{
u32 temp;
int ret;
u8 i;
for (i = 0; i < NUM_BANK_BITS; i++) {
ret = amd_smn_read(0, MI300_ADDR_HASH_BANK0 + (i * 4), &temp);
if (ret)
return ret;
addr_hash.bank[i].xor_enable = FIELD_GET(ADDR_HASH_XOR_EN, temp);
addr_hash.bank[i].col_xor = FIELD_GET(ADDR_HASH_COL_XOR, temp);
addr_hash.bank[i].row_xor = FIELD_GET(ADDR_HASH_ROW_XOR, temp);
}
ret = amd_smn_read(0, MI300_ADDR_HASH_PC, &temp);
if (ret)
return ret;
addr_hash.pc.xor_enable = FIELD_GET(ADDR_HASH_XOR_EN, temp);
addr_hash.pc.col_xor = FIELD_GET(ADDR_HASH_COL_XOR, temp);
addr_hash.pc.row_xor = FIELD_GET(ADDR_HASH_ROW_XOR, temp);
ret = amd_smn_read(0, MI300_ADDR_HASH_PC2, &temp);
if (ret)
return ret;
addr_hash.bank_xor = FIELD_GET(ADDR_HASH_BANK_XOR, temp);
return 0;
}
/*
* MI300 systems report a DRAM address in MCA_ADDR for DRAM ECC errors. This must
* be converted to the intermediate normalized address (NA) before translating to a
* system physical address.
*
* The DRAM address includes bank, row, and column. Also included are bits for
* pseudochannel (PC) and stack ID (SID).
*
* Abbreviations: (S)tack ID, (P)seudochannel, (R)ow, (B)ank, (C)olumn, (Z)ero
*
* The MCA address format is as follows:
* MCA_ADDR[27:0] = {S[1:0], P[0], R[14:0], B[3:0], C[4:0], Z[0]}
*
* The normalized address format is fixed in hardware and is as follows:
* NA[30:0] = {S[1:0], R[13:0], C4, B[1:0], B[3:2], C[3:2], P, C[1:0], Z[4:0]}
*
* Additionally, the PC and Bank bits may be hashed. This must be accounted for before
* reconstructing the normalized address.
*/
#define MI300_UMC_MCA_COL GENMASK(5, 1)
#define MI300_UMC_MCA_BANK GENMASK(9, 6)
#define MI300_UMC_MCA_ROW GENMASK(24, 10)
#define MI300_UMC_MCA_PC BIT(25)
#define MI300_UMC_MCA_SID GENMASK(27, 26)
#define MI300_NA_COL_1_0 GENMASK(6, 5)
#define MI300_NA_PC BIT(7)
#define MI300_NA_COL_3_2 GENMASK(9, 8)
#define MI300_NA_BANK_3_2 GENMASK(11, 10)
#define MI300_NA_BANK_1_0 GENMASK(13, 12)
#define MI300_NA_COL_4 BIT(14)
#define MI300_NA_ROW GENMASK(28, 15)
#define MI300_NA_SID GENMASK(30, 29)
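/*
 * Worked example (added for illustration), assuming a hypothetical register
 * state with all AddrHash XOR enables clear, i.e. no PC or Bank hashing:
 *
 *	MCA_ADDR = 0x2000C8A  ->  SID = 0, PC = 1, Row = 3, Bank = 2, Column = 5
 *
 * Reassembling those fields per the NA layout gives:
 *
 *	NA = (1 << 5) | (1 << 7) | (1 << 8) | (2 << 12) | (3 << 15) = 0x1A1A0
 */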
static unsigned long convert_dram_to_norm_addr_mi300(unsigned long addr)
{
u16 i, col, row, bank, pc, sid, temp;
col = FIELD_GET(MI300_UMC_MCA_COL, addr);
bank = FIELD_GET(MI300_UMC_MCA_BANK, addr);
row = FIELD_GET(MI300_UMC_MCA_ROW, addr);
pc = FIELD_GET(MI300_UMC_MCA_PC, addr);
sid = FIELD_GET(MI300_UMC_MCA_SID, addr);
/* Calculate hash for each Bank bit. */
for (i = 0; i < NUM_BANK_BITS; i++) {
if (!addr_hash.bank[i].xor_enable)
continue;
temp = bitwise_xor_bits(col & addr_hash.bank[i].col_xor);
temp ^= bitwise_xor_bits(row & addr_hash.bank[i].row_xor);
bank ^= temp << i;
}
/* Calculate hash for PC bit. */
if (addr_hash.pc.xor_enable) {
/* Bits SID[1:0] act as Bank[6:5] for PC hash, so apply them here. */
bank |= sid << 5;
temp = bitwise_xor_bits(col & addr_hash.pc.col_xor);
temp ^= bitwise_xor_bits(row & addr_hash.pc.row_xor);
temp ^= bitwise_xor_bits(bank & addr_hash.bank_xor);
pc ^= temp;
/* Drop SID bits for the sake of debug printing later. */
bank &= 0x1F;
}
/* Reconstruct the normalized address starting with NA[4:0] = 0 */
addr = 0;
/* NA[6:5] = Column[1:0] */
temp = col & 0x3;
addr |= FIELD_PREP(MI300_NA_COL_1_0, temp);
/* NA[7] = PC */
addr |= FIELD_PREP(MI300_NA_PC, pc);
/* NA[9:8] = Column[3:2] */
temp = (col >> 2) & 0x3;
addr |= FIELD_PREP(MI300_NA_COL_3_2, temp);
/* NA[11:10] = Bank[3:2] */
temp = (bank >> 2) & 0x3;
addr |= FIELD_PREP(MI300_NA_BANK_3_2, temp);
/* NA[13:12] = Bank[1:0] */
temp = bank & 0x3;
addr |= FIELD_PREP(MI300_NA_BANK_1_0, temp);
/* NA[14] = Column[4] */
temp = (col >> 4) & 0x1;
addr |= FIELD_PREP(MI300_NA_COL_4, temp);
/* NA[28:15] = Row[13:0] */
addr |= FIELD_PREP(MI300_NA_ROW, row);
/* NA[30:29] = SID[1:0] */
addr |= FIELD_PREP(MI300_NA_SID, sid);
pr_debug("Addr=0x%016lx", addr);
pr_debug("Bank=%u Row=%u Column=%u PC=%u SID=%u", bank, row, col, pc, sid);
return addr;
}
/*
* When a DRAM ECC error occurs on MI300 systems, it is recommended to retire
* all memory within that DRAM row. This applies to the memory within a DRAM
* bank.
*
* To find the memory addresses, loop through permutations of the DRAM column
* bits and find the System Physical address of each. The column bits are used
* to calculate the intermediate Normalized address, so all permutations should
* be checked.
*
* See amd_atl::convert_dram_to_norm_addr_mi300() for MI300 address formats.
*/
#define MI300_NUM_COL BIT(HWEIGHT(MI300_UMC_MCA_COL))
static void retire_row_mi300(struct atl_err *a_err)
{
unsigned long addr;
struct page *p;
u8 col;
for (col = 0; col < MI300_NUM_COL; col++) {
a_err->addr &= ~MI300_UMC_MCA_COL;
a_err->addr |= FIELD_PREP(MI300_UMC_MCA_COL, col);
addr = amd_convert_umc_mca_addr_to_sys_addr(a_err);
if (IS_ERR_VALUE(addr))
continue;
addr = PHYS_PFN(addr);
/*
* Skip invalid or already poisoned pages to avoid unnecessary
* error messages from memory_failure().
*/
p = pfn_to_online_page(addr);
if (!p)
continue;
if (PageHWPoison(p))
continue;
memory_failure(addr, 0);
}
}
void amd_retire_dram_row(struct atl_err *a_err)
{
if (df_cfg.rev == DF4p5 && df_cfg.flags.heterogeneous)
return retire_row_mi300(a_err);
}
EXPORT_SYMBOL_GPL(amd_retire_dram_row);
static unsigned long get_addr(unsigned long addr)
{
if (df_cfg.rev == DF4p5 && df_cfg.flags.heterogeneous)
return convert_dram_to_norm_addr_mi300(addr);
return addr;
}
#define MCA_IPID_INST_ID_HI GENMASK_ULL(47, 44)
static u8 get_die_id(struct atl_err *err)
{
/*
* AMD Node ID is provided in MCA_IPID[InstanceIdHi], and this
* needs to be divided by 4 to get the internal Die ID.
*/
if (df_cfg.rev == DF4p5 && df_cfg.flags.heterogeneous) {
u8 node_id = FIELD_GET(MCA_IPID_INST_ID_HI, err->ipid);
return node_id >> 2;
}
/*
* For CPUs, this is the AMD Node ID modulo the number
* of AMD Nodes per socket.
*/
return topology_die_id(err->cpu) % amd_get_nodes_per_socket();
}
#define UMC_CHANNEL_NUM GENMASK(31, 20)
static u8 get_coh_st_inst_id(struct atl_err *err)
{
if (df_cfg.rev == DF4p5 && df_cfg.flags.heterogeneous)
return get_coh_st_inst_id_mi300(err);
return FIELD_GET(UMC_CHANNEL_NUM, err->ipid);
}
unsigned long convert_umc_mca_addr_to_sys_addr(struct atl_err *err)
{
u8 socket_id = topology_physical_package_id(err->cpu);
u8 coh_st_inst_id = get_coh_st_inst_id(err);
unsigned long addr = get_addr(err->addr);
u8 die_id = get_die_id(err);
pr_debug("socket_id=0x%x die_id=0x%x coh_st_inst_id=0x%x addr=0x%016lx",
socket_id, die_id, coh_st_inst_id, addr);
return norm_to_sys_addr(socket_id, die_id, coh_st_inst_id, addr);
}
This diff is collapsed.
...@@ -480,9 +480,15 @@ DEFINE_SHOW_ATTRIBUTE(array);
static int __init create_debugfs_nodes(void)
{
struct dentry *d, *pfn, *decay, *count, *array;
struct dentry *d, *pfn, *decay, *count, *array, *dfs;
d = debugfs_create_dir("cec", ras_debugfs_dir);
dfs = ras_get_debugfs_root();
if (!dfs) {
pr_warn("Error getting RAS debugfs root!\n");
return -1;
}
d = debugfs_create_dir("cec", dfs);
if (!d) {
pr_warn("Error creating cec debugfs node!\n");
return -1;
......
...@@ -3,10 +3,16 @@
#include <linux/ras.h>
#include "debugfs.h"
struct dentry *ras_debugfs_dir;
static struct dentry *ras_debugfs_dir;
static atomic_t trace_count = ATOMIC_INIT(0);
struct dentry *ras_get_debugfs_root(void)
{
return ras_debugfs_dir;
}
EXPORT_SYMBOL_GPL(ras_get_debugfs_root);
int ras_userspace_consumers(void)
{
return atomic_read(&trace_count);
......
...@@ -4,6 +4,6 @@
#include <linux/debugfs.h>
extern struct dentry *ras_debugfs_dir;
struct dentry *ras_get_debugfs_root(void);
#endif /* __RAS_DEBUGFS_H__ */
...@@ -10,6 +10,37 @@
#include <linux/ras.h>
#include <linux/uuid.h>
#if IS_ENABLED(CONFIG_AMD_ATL)
/*
* Once set, this function pointer should never be unset.
*
* The library module will set this pointer if it successfully loads. The module
* should not be unloaded except for testing and debug purposes.
*/
static unsigned long (*amd_atl_umc_na_to_spa)(struct atl_err *err);
void amd_atl_register_decoder(unsigned long (*f)(struct atl_err *))
{
amd_atl_umc_na_to_spa = f;
}
EXPORT_SYMBOL_GPL(amd_atl_register_decoder);
void amd_atl_unregister_decoder(void)
{
amd_atl_umc_na_to_spa = NULL;
}
EXPORT_SYMBOL_GPL(amd_atl_unregister_decoder);
unsigned long amd_convert_umc_mca_addr_to_sys_addr(struct atl_err *err)
{
if (!amd_atl_umc_na_to_spa)
return -EINVAL;
return amd_atl_umc_na_to_spa(err);
}
EXPORT_SYMBOL_GPL(amd_convert_umc_mca_addr_to_sys_addr);
#endif /* CONFIG_AMD_ATL */
#define CREATE_TRACE_POINTS
#define TRACE_INCLUDE_PATH ../../include/ras
#include <ras/ras_event.h>
......
...@@ -25,6 +25,7 @@ void log_non_standard_event(const guid_t *sec_type,
const guid_t *fru_id, const char *fru_text,
const u8 sev, const u8 *err, const u32 len);
void log_arm_hw_error(struct cper_sec_proc_arm *err);
#else
static inline void
log_non_standard_event(const guid_t *sec_type,
...@@ -35,4 +36,21 @@ static inline void
log_arm_hw_error(struct cper_sec_proc_arm *err) { return; }
#endif
struct atl_err {
u64 addr;
u64 ipid;
u32 cpu;
};
#if IS_ENABLED(CONFIG_AMD_ATL)
void amd_atl_register_decoder(unsigned long (*f)(struct atl_err *));
void amd_atl_unregister_decoder(void);
void amd_retire_dram_row(struct atl_err *err);
unsigned long amd_convert_umc_mca_addr_to_sys_addr(struct atl_err *err);
#else
static inline void amd_retire_dram_row(struct atl_err *err) { }
static inline unsigned long
amd_convert_umc_mca_addr_to_sys_addr(struct atl_err *err) { return -EINVAL; }
#endif /* CONFIG_AMD_ATL */
#endif /* __RAS_H__ */