Commit 275b103a authored by Linus Torvalds

Merge tag 'edac_for_5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp

Pull EDAC updates from Borislav Petkov:

 - amd64_edac: Family 0x17, models 0x30-.. enablement (Yazen Ghannam)

 - skx_*: Librarize it so that it can be shared between drivers (Qiuxu Zhuo)

 - altera: Stratix10 improvements (Thor Thayer)

 - The usual round of fixes, fixlets and cleanups

* tag 'edac_for_5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  Revert "EDAC/amd64: Support more than two controllers for chip select handling"
  arm64: dts: stratix10: Use new Stratix10 EDAC bindings
  Documentation: dt: edac: Add Stratix10 Peripheral bindings
  Documentation: dt: edac: Fix Stratix10 IRQ bindings
  EDAC/altera, firmware/intel: Add Stratix10 ECC DBE SMC call
  EDAC/altera: Initialize peripheral FIFOs in probe()
  EDAC/altera: Do less intrusive error injection
  EDAC/amd64: Adjust printed chip select sizes when interleaved
  EDAC/amd64: Support more than two controllers for chip select handling
  EDAC/amd64: Recognize x16 symbol size
  EDAC/amd64: Set maximum channel layer size depending on family
  EDAC/amd64: Support more than two Unified Memory Controllers
  EDAC/amd64: Use a macro for iterating over Unified Memory Controllers
  EDAC/amd64: Add Family 17h Model 30h PCI IDs
  MAINTAINERS: Add entry for EDAC-I10NM
  MAINTAINERS: Update entry for EDAC-SKYLAKE
  EDAC, altera: Fix S10 Double Bit Error Notification
  EDAC, skx, i10nm: Make skx_common.c a pure library
parents 4dd2ab9a 8de9930a
...@@ -232,37 +232,152 @@ Example: ...@@ -232,37 +232,152 @@ Example:
}; };
}; };
Stratix10 SoCFPGA ECC Manager Stratix10 SoCFPGA ECC Manager (ARM64)
The Stratix10 SoC ECC Manager handles the IRQs for each peripheral The Stratix10 SoC ECC Manager handles the IRQs for each peripheral
in a shared register similar to the Arria10. However, ECC requires in a shared register similar to the Arria10. However, Stratix10 ECC
access to registers that can only be read from Secure Monitor with requires access to registers that can only be read from Secure Monitor
SMC calls. Therefore the device tree is slightly different. with SMC calls. Therefore the device tree is slightly different. Note
that only 1 interrupt is sent in Stratix10 because the double bit errors
are treated as SErrors in ARM64 instead of IRQs in ARM32.
Required Properties: Required Properties:
- compatible : Should be "altr,socfpga-s10-ecc-manager" - compatible : Should be "altr,socfpga-s10-ecc-manager"
- interrupts : Should be single bit error interrupt, then double bit error - altr,sysmgr-syscon : phandle to Stratix10 System Manager Block
interrupt. containing the ECC manager registers.
- interrupts : Should be single bit error interrupt.
- interrupt-controller : boolean indicator that ECC Manager is an interrupt controller - interrupt-controller : boolean indicator that ECC Manager is an interrupt controller
- #interrupt-cells : must be set to 2. - #interrupt-cells : must be set to 2.
- #address-cells: must be 1
- #size-cells: must be 1
- ranges : standard definition, should translate from local addresses
Subcomponents: Subcomponents:
SDRAM ECC SDRAM ECC
Required Properties: Required Properties:
- compatible : Should be "altr,sdram-edac-s10" - compatible : Should be "altr,sdram-edac-s10"
- interrupts : Should be single bit error interrupt, then double bit error - interrupts : Should be single bit error interrupt.
interrupt, in this order.
On-Chip RAM ECC
Required Properties:
- compatible : Should be "altr,socfpga-s10-ocram-ecc"
- reg : Address and size for ECC block registers.
- altr,ecc-parent : phandle to parent OCRAM node.
- interrupts : Should be single bit error interrupt.
Ethernet FIFO ECC
Required Properties:
- compatible : Should be "altr,socfpga-s10-eth-mac-ecc"
- reg : Address and size for ECC block registers.
- altr,ecc-parent : phandle to parent Ethernet node.
- interrupts : Should be single bit error interrupt.
NAND FIFO ECC
Required Properties:
- compatible : Should be "altr,socfpga-s10-nand-ecc"
- reg : Address and size for ECC block registers.
- altr,ecc-parent : phandle to parent NAND node.
- interrupts : Should be single bit error interrupt.
DMA FIFO ECC
Required Properties:
- compatible : Should be "altr,socfpga-s10-dma-ecc"
- reg : Address and size for ECC block registers.
- altr,ecc-parent : phandle to parent DMA node.
- interrupts : Should be single bit error interrupt.
USB FIFO ECC
Required Properties:
- compatible : Should be "altr,socfpga-s10-usb-ecc"
- reg : Address and size for ECC block registers.
- altr,ecc-parent : phandle to parent USB node.
- interrupts : Should be single bit error interrupt.
SDMMC FIFO ECC
Required Properties:
- compatible : Should be "altr,socfpga-s10-sdmmc-ecc"
- reg : Address and size for ECC block registers.
- altr,ecc-parent : phandle to parent SD/MMC node.
- interrupts : Should be single bit error interrupt for port A
and then single bit error interrupt for port B.
Example: Example:
eccmgr { eccmgr {
compatible = "altr,socfpga-s10-ecc-manager"; compatible = "altr,socfpga-s10-ecc-manager";
interrupts = <0 15 4>, <0 95 4>; altr,sysmgr-syscon = <&sysmgr>;
#address-cells = <1>;
#size-cells = <1>;
interrupts = <0 15 4>;
interrupt-controller; interrupt-controller;
#interrupt-cells = <2>; #interrupt-cells = <2>;
ranges;
sdramedac { sdramedac {
compatible = "altr,sdram-edac-s10"; compatible = "altr,sdram-edac-s10";
interrupts = <16 4>, <48 4>; interrupts = <16 IRQ_TYPE_LEVEL_HIGH>;
};
ocram-ecc@ff8cc000 {
compatible = "altr,socfpga-s10-ocram-ecc";
reg = <0xff8cc000 0x100>;
altr,ecc-parent = <&ocram>;
interrupts = <1 IRQ_TYPE_LEVEL_HIGH>;
};
emac0-rx-ecc@ff8c0000 {
compatible = "altr,socfpga-s10-eth-mac-ecc";
reg = <0xff8c0000 0x100>;
altr,ecc-parent = <&gmac0>;
interrupts = <4 IRQ_TYPE_LEVEL_HIGH>;
};
emac0-tx-ecc@ff8c0400 {
compatible = "altr,socfpga-s10-eth-mac-ecc";
reg = <0xff8c0400 0x100>;
altr,ecc-parent = <&gmac0>;
interrupts = <5 IRQ_TYPE_LEVEL_HIGH>;
};
nand-buf-ecc@ff8c8000 {
compatible = "altr,socfpga-s10-nand-ecc";
reg = <0xff8c8000 0x100>;
altr,ecc-parent = <&nand>;
interrupts = <11 IRQ_TYPE_LEVEL_HIGH>;
};
nand-rd-ecc@ff8c8400 {
compatible = "altr,socfpga-s10-nand-ecc";
reg = <0xff8c8400 0x100>;
altr,ecc-parent = <&nand>;
interrupts = <13 IRQ_TYPE_LEVEL_HIGH>;
};
nand-wr-ecc@ff8c8800 {
compatible = "altr,socfpga-s10-nand-ecc";
reg = <0xff8c8800 0x100>;
altr,ecc-parent = <&nand>;
interrupts = <12 IRQ_TYPE_LEVEL_HIGH>;
};
dma-ecc@ff8c9000 {
compatible = "altr,socfpga-s10-dma-ecc";
reg = <0xff8c9000 0x100>;
altr,ecc-parent = <&pdma>;
interrupts = <10 IRQ_TYPE_LEVEL_HIGH>;
};
usb0-ecc@ff8c4000 {
compatible = "altr,socfpga-s10-usb-ecc";
reg = <0xff8c4000 0x100>;
altr,ecc-parent = <&usb0>;
interrupts = <2 IRQ_TYPE_LEVEL_HIGH>;
};
sdmmc-ecc@ff8c8c00 {
compatible = "altr,socfpga-s10-sdmmc-ecc";
reg = <0xff8c8c00 0x100>;
altr,ecc-parent = <&mmc>;
interrupts = <14 IRQ_TYPE_LEVEL_HIGH>,
<15 IRQ_TYPE_LEVEL_HIGH>;
}; };
}; };
...@@ -5599,6 +5599,12 @@ L: linux-edac@vger.kernel.org ...@@ -5599,6 +5599,12 @@ L: linux-edac@vger.kernel.org
S: Maintained S: Maintained
F: drivers/edac/ghes_edac.c F: drivers/edac/ghes_edac.c
EDAC-I10NM
M: Tony Luck <tony.luck@intel.com>
L: linux-edac@vger.kernel.org
S: Maintained
F: drivers/edac/i10nm_base.c
EDAC-I3000 EDAC-I3000
L: linux-edac@vger.kernel.org L: linux-edac@vger.kernel.org
S: Orphan S: Orphan
...@@ -5680,7 +5686,7 @@ EDAC-SKYLAKE ...@@ -5680,7 +5686,7 @@ EDAC-SKYLAKE
M: Tony Luck <tony.luck@intel.com> M: Tony Luck <tony.luck@intel.com>
L: linux-edac@vger.kernel.org L: linux-edac@vger.kernel.org
S: Maintained S: Maintained
F: drivers/edac/skx_edac.c F: drivers/edac/skx_*.c
EDAC-TI EDAC-TI
M: Tero Kristo <t-kristo@ti.com> M: Tero Kristo <t-kristo@ti.com>
......
...@@ -534,11 +534,12 @@ sdr: sdr@f8011100 { ...@@ -534,11 +534,12 @@ sdr: sdr@f8011100 {
}; };
eccmgr { eccmgr {
compatible = "altr,socfpga-a10-ecc-manager"; compatible = "altr,socfpga-s10-ecc-manager",
"altr,socfpga-a10-ecc-manager";
altr,sysmgr-syscon = <&sysmgr>; altr,sysmgr-syscon = <&sysmgr>;
#address-cells = <1>; #address-cells = <1>;
#size-cells = <1>; #size-cells = <1>;
interrupts = <0 15 4>, <0 95 4>; interrupts = <0 15 4>;
interrupt-controller; interrupt-controller;
#interrupt-cells = <2>; #interrupt-cells = <2>;
ranges; ranges;
...@@ -546,31 +547,31 @@ eccmgr { ...@@ -546,31 +547,31 @@ eccmgr {
sdramedac { sdramedac {
compatible = "altr,sdram-edac-s10"; compatible = "altr,sdram-edac-s10";
altr,sdr-syscon = <&sdr>; altr,sdr-syscon = <&sdr>;
interrupts = <16 4>, <48 4>; interrupts = <16 4>;
}; };
usb0-ecc@ff8c4000 { usb0-ecc@ff8c4000 {
compatible = "altr,socfpga-usb-ecc"; compatible = "altr,socfpga-s10-usb-ecc",
"altr,socfpga-usb-ecc";
reg = <0xff8c4000 0x100>; reg = <0xff8c4000 0x100>;
altr,ecc-parent = <&usb0>; altr,ecc-parent = <&usb0>;
interrupts = <2 4>, interrupts = <2 4>;
<34 4>;
}; };
emac0-rx-ecc@ff8c0000 { emac0-rx-ecc@ff8c0000 {
compatible = "altr,socfpga-eth-mac-ecc"; compatible = "altr,socfpga-s10-eth-mac-ecc",
"altr,socfpga-eth-mac-ecc";
reg = <0xff8c0000 0x100>; reg = <0xff8c0000 0x100>;
altr,ecc-parent = <&gmac0>; altr,ecc-parent = <&gmac0>;
interrupts = <4 4>, interrupts = <4 4>;
<36 4>;
}; };
emac0-tx-ecc@ff8c0400 { emac0-tx-ecc@ff8c0400 {
compatible = "altr,socfpga-eth-mac-ecc"; compatible = "altr,socfpga-s10-eth-mac-ecc",
"altr,socfpga-eth-mac-ecc";
reg = <0xff8c0400 0x100>; reg = <0xff8c0400 0x100>;
altr,ecc-parent = <&gmac0>; altr,ecc-parent = <&gmac0>;
interrupts = <5 4>, interrupts = <5 4>;
<37 4>;
}; };
}; };
......
...@@ -289,6 +289,7 @@ struct altr_sdram_mc_data { ...@@ -289,6 +289,7 @@ struct altr_sdram_mc_data {
#define ALTR_A10_ECC_INIT_WATCHDOG_10US 10000 #define ALTR_A10_ECC_INIT_WATCHDOG_10US 10000
/************* Stratix10 Defines **************/ /************* Stratix10 Defines **************/
#define ALTR_S10_DERR_ADDRA_OFST 0x2C
/* Stratix10 ECC Manager Defines */ /* Stratix10 ECC Manager Defines */
#define S10_SYSMGR_ECC_INTMASK_CLR_OFST 0x98 #define S10_SYSMGR_ECC_INTMASK_CLR_OFST 0x98
...@@ -299,6 +300,7 @@ struct altr_sdram_mc_data { ...@@ -299,6 +300,7 @@ struct altr_sdram_mc_data {
#define S10_SYSMGR_UE_ADDR_OFST 0x224 #define S10_SYSMGR_UE_ADDR_OFST 0x224
#define S10_DDR0_IRQ_MASK BIT(16) #define S10_DDR0_IRQ_MASK BIT(16)
#define S10_DBE_IRQ_MASK 0x3FE
/* Define ECC Block Offsets for peripherals */ /* Define ECC Block Offsets for peripherals */
#define ECC_BLK_ADDRESS_OFST 0x40 #define ECC_BLK_ADDRESS_OFST 0x40
...@@ -319,7 +321,7 @@ struct altr_sdram_mc_data { ...@@ -319,7 +321,7 @@ struct altr_sdram_mc_data {
#define ECC_BLK_STARTACC_OFST 0x7C #define ECC_BLK_STARTACC_OFST 0x7C
#define ECC_XACT_KICK 0x10000 #define ECC_XACT_KICK 0x10000
#define ECC_WORD_WRITE 0xF #define ECC_WORD_WRITE 0xFF
#define ECC_WRITE_DOVR 0x101 #define ECC_WRITE_DOVR 0x101
#define ECC_WRITE_EDOVR 0x103 #define ECC_WRITE_EDOVR 0x103
#define ECC_READ_EOVR 0x2 #define ECC_READ_EOVR 0x2
...@@ -370,69 +372,4 @@ struct altr_arria10_edac { ...@@ -370,69 +372,4 @@ struct altr_arria10_edac {
struct notifier_block panic_notifier; struct notifier_block panic_notifier;
}; };
/*
* Functions specified by ARM SMC Calling convention:
*
* FAST call executes atomic operations, returns when the requested operation
* has completed.
* STD call starts a operation which can be preempted by a non-secure
* interrupt. The call can return before the requested operation has
* completed.
*
* a0..a7 is used as register names in the descriptions below, on arm32
* that translates to r0..r7 and on arm64 to w0..w7.
*/
#define INTEL_SIP_SMC_STD_CALL_VAL(func_num) \
ARM_SMCCC_CALL_VAL(ARM_SMCCC_STD_CALL, ARM_SMCCC_SMC_64, \
ARM_SMCCC_OWNER_SIP, (func_num))
#define INTEL_SIP_SMC_FAST_CALL_VAL(func_num) \
ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, ARM_SMCCC_SMC_64, \
ARM_SMCCC_OWNER_SIP, (func_num))
#define INTEL_SIP_SMC_RETURN_UNKNOWN_FUNCTION 0xFFFFFFFF
#define INTEL_SIP_SMC_STATUS_OK 0x0
#define INTEL_SIP_SMC_REG_ERROR 0x5
/*
* Request INTEL_SIP_SMC_REG_READ
*
* Read a protected register using SMCCC
*
* Call register usage:
* a0: INTEL_SIP_SMC_REG_READ.
* a1: register address.
* a2-7: not used.
*
* Return status:
* a0: INTEL_SIP_SMC_STATUS_OK, INTEL_SIP_SMC_REG_ERROR, or
* INTEL_SIP_SMC_RETURN_UNKNOWN_FUNCTION
* a1: Value in the register
* a2-3: not used.
*/
#define INTEL_SIP_SMC_FUNCID_REG_READ 7
#define INTEL_SIP_SMC_REG_READ \
INTEL_SIP_SMC_FAST_CALL_VAL(INTEL_SIP_SMC_FUNCID_REG_READ)
/*
* Request INTEL_SIP_SMC_REG_WRITE
*
* Write a protected register using SMCCC
*
* Call register usage:
* a0: INTEL_SIP_SMC_REG_WRITE.
* a1: register address
* a2: value to program into register.
* a3-7: not used.
*
* Return status:
* a0: INTEL_SIP_SMC_STATUS_OK, INTEL_SIP_SMC_REG_ERROR, or
* INTEL_SIP_SMC_RETURN_UNKNOWN_FUNCTION
* a1-3: not used.
*/
#define INTEL_SIP_SMC_FUNCID_REG_WRITE 8
#define INTEL_SIP_SMC_REG_WRITE \
INTEL_SIP_SMC_FAST_CALL_VAL(INTEL_SIP_SMC_FUNCID_REG_WRITE)
#endif /* #ifndef _ALTERA_EDAC_H */ #endif /* #ifndef _ALTERA_EDAC_H */
...@@ -18,6 +18,9 @@ static struct msr __percpu *msrs; ...@@ -18,6 +18,9 @@ static struct msr __percpu *msrs;
/* Per-node stuff */ /* Per-node stuff */
static struct ecc_settings **ecc_stngs; static struct ecc_settings **ecc_stngs;
/* Number of Unified Memory Controllers */
static u8 num_umcs;
/* /*
* Valid scrub rates for the K8 hardware memory scrubber. We map the scrubbing * Valid scrub rates for the K8 hardware memory scrubber. We map the scrubbing
* bandwidth to a valid bit pattern. The 'set' operation finds the 'matching- * bandwidth to a valid bit pattern. The 'set' operation finds the 'matching-
...@@ -449,6 +452,9 @@ static void get_cs_base_and_mask(struct amd64_pvt *pvt, int csrow, u8 dct, ...@@ -449,6 +452,9 @@ static void get_cs_base_and_mask(struct amd64_pvt *pvt, int csrow, u8 dct,
#define for_each_chip_select_mask(i, dct, pvt) \ #define for_each_chip_select_mask(i, dct, pvt) \
for (i = 0; i < pvt->csels[dct].m_cnt; i++) for (i = 0; i < pvt->csels[dct].m_cnt; i++)
#define for_each_umc(i) \
for (i = 0; i < num_umcs; i++)
/* /*
* @input_addr is an InputAddr associated with the node given by mci. Return the * @input_addr is an InputAddr associated with the node given by mci. Return the
* csrow that input_addr maps to, or -1 on failure (no csrow claims input_addr). * csrow that input_addr maps to, or -1 on failure (no csrow claims input_addr).
...@@ -722,7 +728,7 @@ static unsigned long determine_edac_cap(struct amd64_pvt *pvt) ...@@ -722,7 +728,7 @@ static unsigned long determine_edac_cap(struct amd64_pvt *pvt)
if (pvt->umc) { if (pvt->umc) {
u8 i, umc_en_mask = 0, dimm_ecc_en_mask = 0; u8 i, umc_en_mask = 0, dimm_ecc_en_mask = 0;
for (i = 0; i < NUM_UMCS; i++) { for_each_umc(i) {
if (!(pvt->umc[i].sdp_ctrl & UMC_SDP_INIT)) if (!(pvt->umc[i].sdp_ctrl & UMC_SDP_INIT))
continue; continue;
...@@ -781,6 +787,22 @@ static void debug_dump_dramcfg_low(struct amd64_pvt *pvt, u32 dclr, int chan) ...@@ -781,6 +787,22 @@ static void debug_dump_dramcfg_low(struct amd64_pvt *pvt, u32 dclr, int chan)
(dclr & BIT(15)) ? "yes" : "no"); (dclr & BIT(15)) ? "yes" : "no");
} }
/*
* The Address Mask should be a contiguous set of bits in the non-interleaved
* case. So to check for CS interleaving, find the most- and least-significant
* bits of the mask, generate a contiguous bitmask, and compare the two.
*/
static bool f17_cs_interleaved(struct amd64_pvt *pvt, u8 ctrl, int cs)
{
u32 mask = pvt->csels[ctrl].csmasks[cs >> 1];
u32 msb = fls(mask) - 1, lsb = ffs(mask) - 1;
u32 test_mask = GENMASK(msb, lsb);
edac_dbg(1, "mask=0x%08x test_mask=0x%08x\n", mask, test_mask);
return mask ^ test_mask;
}
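Reviewer note: the contiguity test in f17_cs_interleaved() can be tried outside the kernel. The following is a minimal userspace sketch, not the driver code. The helper names (msb_index, lsb_index, contiguous_mask, cs_mask_has_holes) are hypothetical stand-ins for fls(), ffs() and GENMASK(), and the sketch treats an all-zero mask as "not interleaved", which the kernel function does not handle explicitly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Index of the most significant set bit; the caller guarantees mask != 0. */
static int msb_index(uint32_t mask)
{
	int i = 31;

	while (!(mask & (UINT32_C(1) << i)))
		i--;
	return i;
}

/* Index of the least significant set bit; the caller guarantees mask != 0. */
static int lsb_index(uint32_t mask)
{
	int i = 0;

	while (!(mask & (UINT32_C(1) << i)))
		i++;
	return i;
}

/* Run of ones from bit lo up to bit hi, inclusive (same idea as GENMASK). */
static uint32_t contiguous_mask(int hi, int lo)
{
	return (UINT32_MAX >> (31 - hi)) & (UINT32_MAX << lo);
}

/*
 * A non-interleaved address mask is one contiguous run of ones. Any bit
 * that differs from the contiguous reference mask is a hole, which is
 * taken as the sign of chip select interleaving.
 */
static bool cs_mask_has_holes(uint32_t mask)
{
	if (mask == 0)
		return false;	/* assumption of this sketch: empty mask has no holes */

	return (mask ^ contiguous_mask(msb_index(mask), lsb_index(mask))) != 0;
}
```

For example, cs_mask_has_holes(0x0000fffe) is false because bits 15..1 form one run, while cs_mask_has_holes(0x0001fdfe) is true because bit 9 is missing from the run 16..1.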
static void debug_display_dimm_sizes_df(struct amd64_pvt *pvt, u8 ctrl) static void debug_display_dimm_sizes_df(struct amd64_pvt *pvt, u8 ctrl)
{ {
int dimm, size0, size1, cs0, cs1; int dimm, size0, size1, cs0, cs1;
...@@ -797,8 +819,19 @@ static void debug_display_dimm_sizes_df(struct amd64_pvt *pvt, u8 ctrl) ...@@ -797,8 +819,19 @@ static void debug_display_dimm_sizes_df(struct amd64_pvt *pvt, u8 ctrl)
size1 = 0; size1 = 0;
cs1 = dimm * 2 + 1; cs1 = dimm * 2 + 1;
if (csrow_enabled(cs1, ctrl, pvt)) if (csrow_enabled(cs1, ctrl, pvt)) {
/*
* CS interleaving is only supported if both CSes have
* the same amount of memory. Because they are
* interleaved, it will look like both CSes have the
* full amount of memory. Save the size for both as
* half the amount we found on CS0, if interleaved.
*/
if (f17_cs_interleaved(pvt, ctrl, cs1))
size1 = size0 = (size0 >> 1);
else
size1 = pvt->ops->dbam_to_cs(pvt, ctrl, 0, cs1); size1 = pvt->ops->dbam_to_cs(pvt, ctrl, 0, cs1);
}
amd64_info(EDAC_MC ": %d: %5dMB %d: %5dMB\n", amd64_info(EDAC_MC ": %d: %5dMB %d: %5dMB\n",
cs0, size0, cs0, size0,
...@@ -811,7 +844,7 @@ static void __dump_misc_regs_df(struct amd64_pvt *pvt) ...@@ -811,7 +844,7 @@ static void __dump_misc_regs_df(struct amd64_pvt *pvt)
struct amd64_umc *umc; struct amd64_umc *umc;
u32 i, tmp, umc_base; u32 i, tmp, umc_base;
for (i = 0; i < NUM_UMCS; i++) { for_each_umc(i) {
umc_base = get_umc_base(i); umc_base = get_umc_base(i);
umc = &pvt->umc[i]; umc = &pvt->umc[i];
...@@ -894,8 +927,7 @@ static void dump_misc_regs(struct amd64_pvt *pvt) ...@@ -894,8 +927,7 @@ static void dump_misc_regs(struct amd64_pvt *pvt)
edac_dbg(1, " DramHoleValid: %s\n", dhar_valid(pvt) ? "yes" : "no"); edac_dbg(1, " DramHoleValid: %s\n", dhar_valid(pvt) ? "yes" : "no");
amd64_info("using %s syndromes.\n", amd64_info("using x%u syndromes.\n", pvt->ecc_sym_sz);
((pvt->ecc_sym_sz == 8) ? "x8" : "x4"));
} }
/* /*
...@@ -1388,7 +1420,7 @@ static int f17_early_channel_count(struct amd64_pvt *pvt) ...@@ -1388,7 +1420,7 @@ static int f17_early_channel_count(struct amd64_pvt *pvt)
int i, channels = 0; int i, channels = 0;
/* SDP Control bit 31 (SdpInit) is clear for unused UMC channels */ /* SDP Control bit 31 (SdpInit) is clear for unused UMC channels */
for (i = 0; i < NUM_UMCS; i++) for_each_umc(i)
channels += !!(pvt->umc[i].sdp_ctrl & UMC_SDP_INIT); channels += !!(pvt->umc[i].sdp_ctrl & UMC_SDP_INIT);
amd64_info("MCT channel count: %d\n", channels); amd64_info("MCT channel count: %d\n", channels);
...@@ -2211,6 +2243,15 @@ static struct amd64_family_type family_types[] = { ...@@ -2211,6 +2243,15 @@ static struct amd64_family_type family_types[] = {
.dbam_to_cs = f17_base_addr_to_cs_size, .dbam_to_cs = f17_base_addr_to_cs_size,
} }
}, },
[F17_M30H_CPUS] = {
.ctl_name = "F17h_M30h",
.f0_id = PCI_DEVICE_ID_AMD_17H_M30H_DF_F0,
.f6_id = PCI_DEVICE_ID_AMD_17H_M30H_DF_F6,
.ops = {
.early_channel_count = f17_early_channel_count,
.dbam_to_cs = f17_base_addr_to_cs_size,
}
},
}; };
/* /*
...@@ -2464,18 +2505,14 @@ static inline void decode_bus_error(int node_id, struct mce *m) ...@@ -2464,18 +2505,14 @@ static inline void decode_bus_error(int node_id, struct mce *m)
* To find the UMC channel represented by this bank we need to match on its * To find the UMC channel represented by this bank we need to match on its
* instance_id. The instance_id of a bank is held in the lower 32 bits of its * instance_id. The instance_id of a bank is held in the lower 32 bits of its
* IPID. * IPID.
*
* Currently, we can derive the channel number by looking at the 6th nibble in
* the instance_id. For example, instance_id=0xYXXXXX where Y is the channel
* number.
*/ */
static int find_umc_channel(struct amd64_pvt *pvt, struct mce *m) static int find_umc_channel(struct mce *m)
{ {
u32 umc_instance_id[] = {0x50f00, 0x150f00}; return (m->ipid & GENMASK(31, 0)) >> 20;
u32 instance_id = m->ipid & GENMASK(31, 0);
int i, channel = -1;
for (i = 0; i < ARRAY_SIZE(umc_instance_id); i++)
if (umc_instance_id[i] == instance_id)
channel = i;
return channel;
} }
static void decode_umc_error(int node_id, struct mce *m) static void decode_umc_error(int node_id, struct mce *m)
...@@ -2497,11 +2534,7 @@ static void decode_umc_error(int node_id, struct mce *m) ...@@ -2497,11 +2534,7 @@ static void decode_umc_error(int node_id, struct mce *m)
if (m->status & MCI_STATUS_DEFERRED) if (m->status & MCI_STATUS_DEFERRED)
ecc_type = 3; ecc_type = 3;
err.channel = find_umc_channel(pvt, m); err.channel = find_umc_channel(m);
if (err.channel < 0) {
err.err_code = ERR_CHANNEL;
goto log_error;
}
if (umc_normaddr_to_sysaddr(m->addr, pvt->mc_node_id, err.channel, &sys_addr)) { if (umc_normaddr_to_sysaddr(m->addr, pvt->mc_node_id, err.channel, &sys_addr)) {
err.err_code = ERR_NORM_ADDR; err.err_code = ERR_NORM_ADDR;
...@@ -2603,19 +2636,19 @@ static void determine_ecc_sym_sz(struct amd64_pvt *pvt) ...@@ -2603,19 +2636,19 @@ static void determine_ecc_sym_sz(struct amd64_pvt *pvt)
if (pvt->umc) { if (pvt->umc) {
u8 i; u8 i;
for (i = 0; i < NUM_UMCS; i++) { for_each_umc(i) {
/* Check enabled channels only: */ /* Check enabled channels only: */
if ((pvt->umc[i].sdp_ctrl & UMC_SDP_INIT) && if (pvt->umc[i].sdp_ctrl & UMC_SDP_INIT) {
(pvt->umc[i].ecc_ctrl & BIT(7))) { if (pvt->umc[i].ecc_ctrl & BIT(9)) {
pvt->ecc_sym_sz = 16;
return;
} else if (pvt->umc[i].ecc_ctrl & BIT(7)) {
pvt->ecc_sym_sz = 8; pvt->ecc_sym_sz = 8;
break; return;
} }
} }
return;
} }
} else if (pvt->fam >= 0x10) {
if (pvt->fam >= 0x10) {
u32 tmp; u32 tmp;
amd64_read_pci_cfg(pvt->F3, EXT_NB_MCA_CFG, &tmp); amd64_read_pci_cfg(pvt->F3, EXT_NB_MCA_CFG, &tmp);
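Reviewer note: the new symbol size selection for UMC-based systems can be sketched in isolation. This is a simplified userspace model under stated assumptions: struct umc_regs and ecc_symbol_size() are hypothetical names, the bit names ECC_CTRL_X16 and ECC_CTRL_X8 are made up for readability (the hunk only uses BIT(9) and BIT(7)), and the fallback of 4 assumes the function's default, which is not visible in this hunk.

```c
#include <stddef.h>
#include <stdint.h>

#define UMC_SDP_INIT	(UINT32_C(1) << 31)	/* channel is in use */
#define ECC_CTRL_X16	(UINT32_C(1) << 9)	/* hypothetical name for BIT(9) */
#define ECC_CTRL_X8	(UINT32_C(1) << 7)	/* hypothetical name for BIT(7) */

/* Hypothetical reduced view of the per-UMC registers used here. */
struct umc_regs {
	uint32_t sdp_ctrl;
	uint32_t ecc_ctrl;
};

/*
 * Walk the UMCs in order and let the first enabled one that reports
 * x16 or x8 decide. x16 is checked before x8, as in the hunk above.
 */
static unsigned int ecc_symbol_size(const struct umc_regs *umc, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (!(umc[i].sdp_ctrl & UMC_SDP_INIT))
			continue;	/* skip unused channels */

		if (umc[i].ecc_ctrl & ECC_CTRL_X16)
			return 16;
		if (umc[i].ecc_ctrl & ECC_CTRL_X8)
			return 8;
	}

	return 4;	/* assumed default when no enabled UMC reports x8 or x16 */
}
```

Note that a disabled UMC never contributes, even if its ecc_ctrl bits are set.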
...@@ -2639,7 +2672,7 @@ static void __read_mc_regs_df(struct amd64_pvt *pvt) ...@@ -2639,7 +2672,7 @@ static void __read_mc_regs_df(struct amd64_pvt *pvt)
u32 i, umc_base; u32 i, umc_base;
/* Read registers from each UMC */ /* Read registers from each UMC */
for (i = 0; i < NUM_UMCS; i++) { for_each_umc(i) {
umc_base = get_umc_base(i); umc_base = get_umc_base(i);
umc = &pvt->umc[i]; umc = &pvt->umc[i];
...@@ -3052,7 +3085,7 @@ static bool ecc_enabled(struct pci_dev *F3, u16 nid) ...@@ -3052,7 +3085,7 @@ static bool ecc_enabled(struct pci_dev *F3, u16 nid)
if (boot_cpu_data.x86 >= 0x17) { if (boot_cpu_data.x86 >= 0x17) {
u8 umc_en_mask = 0, ecc_en_mask = 0; u8 umc_en_mask = 0, ecc_en_mask = 0;
for (i = 0; i < NUM_UMCS; i++) { for_each_umc(i) {
u32 base = get_umc_base(i); u32 base = get_umc_base(i);
/* Only check enabled UMCs. */ /* Only check enabled UMCs. */
...@@ -3105,7 +3138,7 @@ f17h_determine_edac_ctl_cap(struct mem_ctl_info *mci, struct amd64_pvt *pvt) ...@@ -3105,7 +3138,7 @@ f17h_determine_edac_ctl_cap(struct mem_ctl_info *mci, struct amd64_pvt *pvt)
{ {
u8 i, ecc_en = 1, cpk_en = 1; u8 i, ecc_en = 1, cpk_en = 1;
for (i = 0; i < NUM_UMCS; i++) { for_each_umc(i) {
if (pvt->umc[i].sdp_ctrl & UMC_SDP_INIT) { if (pvt->umc[i].sdp_ctrl & UMC_SDP_INIT) {
ecc_en &= !!(pvt->umc[i].umc_cap_hi & UMC_ECC_ENABLED); ecc_en &= !!(pvt->umc[i].umc_cap_hi & UMC_ECC_ENABLED);
cpk_en &= !!(pvt->umc[i].umc_cap_hi & UMC_ECC_CHIPKILL_CAP); cpk_en &= !!(pvt->umc[i].umc_cap_hi & UMC_ECC_CHIPKILL_CAP);
...@@ -3203,6 +3236,10 @@ static struct amd64_family_type *per_family_init(struct amd64_pvt *pvt) ...@@ -3203,6 +3236,10 @@ static struct amd64_family_type *per_family_init(struct amd64_pvt *pvt)
fam_type = &family_types[F17_M10H_CPUS]; fam_type = &family_types[F17_M10H_CPUS];
pvt->ops = &family_types[F17_M10H_CPUS].ops; pvt->ops = &family_types[F17_M10H_CPUS].ops;
break; break;
} else if (pvt->model >= 0x30 && pvt->model <= 0x3f) {
fam_type = &family_types[F17_M30H_CPUS];
pvt->ops = &family_types[F17_M30H_CPUS].ops;
break;
} }
/* fall through */ /* fall through */
case 0x18: case 0x18:
...@@ -3236,6 +3273,22 @@ static const struct attribute_group *amd64_edac_attr_groups[] = { ...@@ -3236,6 +3273,22 @@ static const struct attribute_group *amd64_edac_attr_groups[] = {
NULL NULL
}; };
/* Set the number of Unified Memory Controllers in the system. */
static void compute_num_umcs(void)
{
u8 model = boot_cpu_data.x86_model;
if (boot_cpu_data.x86 < 0x17)
return;
if (model >= 0x30 && model <= 0x3f)
num_umcs = 8;
else
num_umcs = 2;
edac_dbg(1, "Number of UMCs: %x", num_umcs);
}
static int init_one_instance(unsigned int nid) static int init_one_instance(unsigned int nid)
{ {
struct pci_dev *F3 = node_to_amd_nb(nid)->misc; struct pci_dev *F3 = node_to_amd_nb(nid)->misc;
...@@ -3260,7 +3313,7 @@ static int init_one_instance(unsigned int nid) ...@@ -3260,7 +3313,7 @@ static int init_one_instance(unsigned int nid)
goto err_free; goto err_free;
if (pvt->fam >= 0x17) { if (pvt->fam >= 0x17) {
pvt->umc = kcalloc(NUM_UMCS, sizeof(struct amd64_umc), GFP_KERNEL); pvt->umc = kcalloc(num_umcs, sizeof(struct amd64_umc), GFP_KERNEL);
if (!pvt->umc) { if (!pvt->umc) {
ret = -ENOMEM; ret = -ENOMEM;
goto err_free; goto err_free;
...@@ -3299,7 +3352,13 @@ static int init_one_instance(unsigned int nid) ...@@ -3299,7 +3352,13 @@ static int init_one_instance(unsigned int nid)
* Always allocate two channels since we can have setups with DIMMs on * Always allocate two channels since we can have setups with DIMMs on
* only one channel. Also, this simplifies handling later for the price * only one channel. Also, this simplifies handling later for the price
* of a couple of KBs tops. * of a couple of KBs tops.
*
* On Fam17h+, the number of controllers may be greater than two. So set
* the size equal to the maximum number of UMCs.
*/ */
if (pvt->fam >= 0x17)
layers[1].size = num_umcs;
else
layers[1].size = 2; layers[1].size = 2;
layers[1].is_virt_csrow = false; layers[1].is_virt_csrow = false;
...@@ -3481,6 +3540,8 @@ static int __init amd64_edac_init(void) ...@@ -3481,6 +3540,8 @@ static int __init amd64_edac_init(void)
if (!msrs) if (!msrs)
goto err_free; goto err_free;
compute_num_umcs();
for (i = 0; i < amd_nb_num(); i++) { for (i = 0; i < amd_nb_num(); i++) {
err = probe_one_instance(i); err = probe_one_instance(i);
if (err) { if (err) {
......
...@@ -117,6 +117,8 @@ ...@@ -117,6 +117,8 @@
#define PCI_DEVICE_ID_AMD_17H_DF_F6 0x1466 #define PCI_DEVICE_ID_AMD_17H_DF_F6 0x1466
#define PCI_DEVICE_ID_AMD_17H_M10H_DF_F0 0x15e8 #define PCI_DEVICE_ID_AMD_17H_M10H_DF_F0 0x15e8
#define PCI_DEVICE_ID_AMD_17H_M10H_DF_F6 0x15ee #define PCI_DEVICE_ID_AMD_17H_M10H_DF_F6 0x15ee
#define PCI_DEVICE_ID_AMD_17H_M30H_DF_F0 0x1490
#define PCI_DEVICE_ID_AMD_17H_M30H_DF_F6 0x1496
/* /*
* Function 1 - Address Map * Function 1 - Address Map
...@@ -272,8 +274,6 @@ ...@@ -272,8 +274,6 @@
#define UMC_SDP_INIT BIT(31) #define UMC_SDP_INIT BIT(31)
#define NUM_UMCS 2
enum amd_families { enum amd_families {
K8_CPUS = 0, K8_CPUS = 0,
F10_CPUS, F10_CPUS,
...@@ -284,6 +284,7 @@ enum amd_families { ...@@ -284,6 +284,7 @@ enum amd_families {
F16_M30H_CPUS, F16_M30H_CPUS,
F17_CPUS, F17_CPUS,
F17_M10H_CPUS, F17_M10H_CPUS,
F17_M30H_CPUS,
NUM_FAMILIES, NUM_FAMILIES,
}; };
...@@ -363,7 +364,7 @@ struct amd64_pvt { ...@@ -363,7 +364,7 @@ struct amd64_pvt {
u32 dct_sel_hi; /* DRAM Controller Select High */ u32 dct_sel_hi; /* DRAM Controller Select High */
u32 online_spare; /* On-Line spare Reg */ u32 online_spare; /* On-Line spare Reg */
/* x4 or x8 syndromes in use */ /* x4, x8, or x16 syndromes in use */
u8 ecc_sym_sz; u8 ecc_sym_sz;
/* place to store error injection parameters prior to issue */ /* place to store error injection parameters prior to issue */
...@@ -396,8 +397,8 @@ struct err_info { ...@@ -396,8 +397,8 @@ struct err_info {
static inline u32 get_umc_base(u8 channel) static inline u32 get_umc_base(u8 channel)
{ {
/* ch0: 0x50000, ch1: 0x150000 */ /* chY: 0xY50000 */
return 0x50000 + (!!channel << 20); return 0x50000 + (channel << 20);
} }
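Reviewer note: the new get_umc_base() and find_umc_channel() are two views of the same 0xY50000 layout, where Y is the channel number. A small userspace sketch of both directions follows; the function names umc_base() and umc_channel_from_ipid() are hypothetical, and the sample IPID values in the usage note are illustrative only.

```c
#include <stdint.h>

/* chY: 0xY50000, so each channel is 1 MiB (1 << 20) above the previous one. */
static uint32_t umc_base(uint8_t channel)
{
	return UINT32_C(0x50000) + ((uint32_t)channel << 20);
}

/*
 * The instance_id is the lower 32 bits of the IPID. Shifting it right
 * by 20 discards the 0x50f00 part and leaves the channel number.
 */
static unsigned int umc_channel_from_ipid(uint64_t ipid)
{
	uint32_t instance_id = (uint32_t)(ipid & UINT64_C(0xFFFFFFFF));

	return instance_id >> 20;
}
```

With the two instance IDs the old lookup table hard-coded, 0x50f00 maps to channel 0 and 0x150f00 maps to channel 1, and the same shift extends to the higher channels without a table.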
static inline u64 get_dram_base(struct amd64_pvt *pvt, u8 i) static inline u64 get_dram_base(struct amd64_pvt *pvt, u8 i)
......
...@@ -181,6 +181,54 @@ static struct notifier_block i10nm_mce_dec = { ...@@ -181,6 +181,54 @@ static struct notifier_block i10nm_mce_dec = {
.priority = MCE_PRIO_EDAC, .priority = MCE_PRIO_EDAC,
}; };
#ifdef CONFIG_EDAC_DEBUG
/*
* Debug feature.
* Exercise the address decode logic by writing an address to
* /sys/kernel/debug/edac/i10nm_test/addr.
*/
static struct dentry *i10nm_test;
static int debugfs_u64_set(void *data, u64 val)
{
struct mce m;
pr_warn_once("Fake error to 0x%llx injected via debugfs\n", val);
memset(&m, 0, sizeof(m));
/* ADDRV + MemRd + Unknown channel */
m.status = MCI_STATUS_ADDRV + 0x90;
/* One corrected error */
m.status |= BIT_ULL(MCI_STATUS_CEC_SHIFT);
m.addr = val;
skx_mce_check_error(NULL, 0, &m);
return 0;
}
DEFINE_SIMPLE_ATTRIBUTE(fops_u64_wo, NULL, debugfs_u64_set, "%llu\n");
static void setup_i10nm_debug(void)
{
i10nm_test = edac_debugfs_create_dir("i10nm_test");
if (!i10nm_test)
return;
if (!edac_debugfs_create_file("addr", 0200, i10nm_test,
NULL, &fops_u64_wo)) {
debugfs_remove(i10nm_test);
i10nm_test = NULL;
}
}
static void teardown_i10nm_debug(void)
{
debugfs_remove_recursive(i10nm_test);
}
#else
static inline void setup_i10nm_debug(void) {}
static inline void teardown_i10nm_debug(void) {}
#endif /*CONFIG_EDAC_DEBUG*/
static int __init i10nm_init(void) static int __init i10nm_init(void)
{ {
u8 mc = 0, src_id = 0, node_id = 0; u8 mc = 0, src_id = 0, node_id = 0;
...@@ -249,7 +297,7 @@ static int __init i10nm_init(void) ...@@ -249,7 +297,7 @@ static int __init i10nm_init(void)
opstate_init(); opstate_init();
mce_register_decode_chain(&i10nm_mce_dec); mce_register_decode_chain(&i10nm_mce_dec);
setup_skx_debug("i10nm_test"); setup_i10nm_debug();
i10nm_printk(KERN_INFO, "%s\n", I10NM_REVISION); i10nm_printk(KERN_INFO, "%s\n", I10NM_REVISION);
...@@ -262,7 +310,7 @@ static int __init i10nm_init(void) ...@@ -262,7 +310,7 @@ static int __init i10nm_init(void)
static void __exit i10nm_exit(void) static void __exit i10nm_exit(void)
{ {
edac_dbg(2, "\n"); edac_dbg(2, "\n");
teardown_skx_debug(); teardown_i10nm_debug();
mce_unregister_decode_chain(&i10nm_mce_dec); mce_unregister_decode_chain(&i10nm_mce_dec);
skx_adxl_put(); skx_adxl_put();
skx_remove(); skx_remove();
......
...@@ -540,6 +540,54 @@ static struct notifier_block skx_mce_dec = { ...@@ -540,6 +540,54 @@ static struct notifier_block skx_mce_dec = {
.priority = MCE_PRIO_EDAC, .priority = MCE_PRIO_EDAC,
}; };
#ifdef CONFIG_EDAC_DEBUG
/*
* Debug feature.
* Exercise the address decode logic by writing an address to
* /sys/kernel/debug/edac/skx_test/addr.
*/
static struct dentry *skx_test;
static int debugfs_u64_set(void *data, u64 val)
{
struct mce m;
pr_warn_once("Fake error to 0x%llx injected via debugfs\n", val);
memset(&m, 0, sizeof(m));
/* ADDRV + MemRd + Unknown channel */
m.status = MCI_STATUS_ADDRV + 0x90;
/* One corrected error */
m.status |= BIT_ULL(MCI_STATUS_CEC_SHIFT);
m.addr = val;
skx_mce_check_error(NULL, 0, &m);
return 0;
}
DEFINE_SIMPLE_ATTRIBUTE(fops_u64_wo, NULL, debugfs_u64_set, "%llu\n");
static void setup_skx_debug(void)
{
skx_test = edac_debugfs_create_dir("skx_test");
if (!skx_test)
return;
if (!edac_debugfs_create_file("addr", 0200, skx_test,
NULL, &fops_u64_wo)) {
debugfs_remove(skx_test);
skx_test = NULL;
}
}
static void teardown_skx_debug(void)
{
debugfs_remove_recursive(skx_test);
}
#else
static inline void setup_skx_debug(void) {}
static inline void teardown_skx_debug(void) {}
#endif /*CONFIG_EDAC_DEBUG*/
/* /*
* skx_init: * skx_init:
* make sure we are running on the correct cpu model * make sure we are running on the correct cpu model
...@@ -619,7 +667,7 @@ static int __init skx_init(void) ...@@ -619,7 +667,7 @@ static int __init skx_init(void)
/* Ensure that the OPSTATE is set correctly for POLL or NMI */ /* Ensure that the OPSTATE is set correctly for POLL or NMI */
opstate_init(); opstate_init();
setup_skx_debug("skx_test"); setup_skx_debug();
mce_register_decode_chain(&skx_mce_dec); mce_register_decode_chain(&skx_mce_dec);
......
// SPDX-License-Identifier: GPL-2.0 // SPDX-License-Identifier: GPL-2.0
/* /*
* Common codes for both the skx_edac driver and Intel 10nm server EDAC driver. *
* Originally split out from the skx_edac driver. * Shared code by both skx_edac and i10nm_edac. Originally split out
* from the skx_edac driver.
*
* This file is linked into both skx_edac and i10nm_edac drivers. In
* order to avoid link errors, this file must be like a pure library
* without including symbols and defines which would otherwise conflict,
* when linked once into a module and into a built-in object, at the
* same time. For example, __this_module symbol references when that
* file is being linked into a built-in object.
* *
* Copyright (c) 2018, Intel Corporation. * Copyright (c) 2018, Intel Corporation.
*/ */
...@@ -644,48 +652,3 @@ void skx_remove(void) ...@@ -644,48 +652,3 @@ void skx_remove(void)
kfree(d); kfree(d);
} }
} }
#ifdef CONFIG_EDAC_DEBUG
/*
* Debug feature.
* Exercise the address decode logic by writing an address to
* /sys/kernel/debug/edac/dirname/addr.
*/
static struct dentry *skx_test;
static int debugfs_u64_set(void *data, u64 val)
{
struct mce m;
pr_warn_once("Fake error to 0x%llx injected via debugfs\n", val);
memset(&m, 0, sizeof(m));
/* ADDRV + MemRd + Unknown channel */
m.status = MCI_STATUS_ADDRV + 0x90;
/* One corrected error */
m.status |= BIT_ULL(MCI_STATUS_CEC_SHIFT);
m.addr = val;
skx_mce_check_error(NULL, 0, &m);
return 0;
}
DEFINE_SIMPLE_ATTRIBUTE(fops_u64_wo, NULL, debugfs_u64_set, "%llu\n");
void setup_skx_debug(const char *dirname)
{
skx_test = edac_debugfs_create_dir(dirname);
if (!skx_test)
return;
if (!edac_debugfs_create_file("addr", 0200, skx_test,
NULL, &fops_u64_wo)) {
debugfs_remove(skx_test);
skx_test = NULL;
}
}
void teardown_skx_debug(void)
{
debugfs_remove_recursive(skx_test);
}
#endif /*CONFIG_EDAC_DEBUG*/
...@@ -141,12 +141,4 @@ int skx_mce_check_error(struct notifier_block *nb, unsigned long val, ...@@ -141,12 +141,4 @@ int skx_mce_check_error(struct notifier_block *nb, unsigned long val,
void skx_remove(void); void skx_remove(void);
#ifdef CONFIG_EDAC_DEBUG
void setup_skx_debug(const char *dirname);
void teardown_skx_debug(void);
#else
static inline void setup_skx_debug(const char *dirname) {}
static inline void teardown_skx_debug(void) {}
#endif /*CONFIG_EDAC_DEBUG*/
#endif /* _SKX_COMM_EDAC_H */ #endif /* _SKX_COMM_EDAC_H */
...@@ -309,4 +309,23 @@ INTEL_SIP_SMC_FAST_CALL_VAL(INTEL_SIP_SMC_FUNCID_FPGA_CONFIG_COMPLETED_WRITE) ...@@ -309,4 +309,23 @@ INTEL_SIP_SMC_FAST_CALL_VAL(INTEL_SIP_SMC_FUNCID_FPGA_CONFIG_COMPLETED_WRITE)
#define INTEL_SIP_SMC_FUNCID_RSU_UPDATE 12 #define INTEL_SIP_SMC_FUNCID_RSU_UPDATE 12
#define INTEL_SIP_SMC_RSU_UPDATE \ #define INTEL_SIP_SMC_RSU_UPDATE \
INTEL_SIP_SMC_FAST_CALL_VAL(INTEL_SIP_SMC_FUNCID_RSU_UPDATE) INTEL_SIP_SMC_FAST_CALL_VAL(INTEL_SIP_SMC_FUNCID_RSU_UPDATE)
/*
* Request INTEL_SIP_SMC_ECC_DBE
*
* Sync call used by service driver at EL1 to alert EL3 that a Double
* Bit ECC error has occurred.
*
* Call register usage:
* a0 INTEL_SIP_SMC_ECC_DBE
* a1 SysManager Double Bit Error value
* a2-7 not used
*
* Return status
* a0 INTEL_SIP_SMC_STATUS_OK
*/
#define INTEL_SIP_SMC_FUNCID_ECC_DBE 13
#define INTEL_SIP_SMC_ECC_DBE \
INTEL_SIP_SMC_FAST_CALL_VAL(INTEL_SIP_SMC_FUNCID_ECC_DBE)
#endif #endif
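Reviewer note: the value behind INTEL_SIP_SMC_ECC_DBE can be worked out from the SMC Calling Convention layout of a function identifier (bit 31 fast call, bit 30 64-bit convention, bits 29:24 owner, bits 15:0 function number). The sketch below is a userspace model, not the kernel header: the SMCCC_* constants and both function names are hypothetical stand-ins for the ARM_SMCCC_* macros.

```c
#include <stdint.h>

#define SMCCC_FAST_CALL		UINT32_C(1)	/* call type, bit 31 */
#define SMCCC_SMC_64		UINT32_C(1)	/* calling convention, bit 30 */
#define SMCCC_OWNER_SIP		UINT32_C(2)	/* silicon provider owner, bits 29:24 */

/* Pack the four fields into a 32-bit SMC function identifier. */
static uint32_t smccc_call_val(uint32_t type, uint32_t convention,
			       uint32_t owner, uint32_t func_num)
{
	return (type << 31) | (convention << 30) |
	       ((owner & 0x3F) << 24) | (func_num & 0xFFFF);
}

/* Userspace equivalent of INTEL_SIP_SMC_FAST_CALL_VAL(func_num). */
static uint32_t intel_sip_smc_fast_call_val(uint32_t func_num)
{
	return smccc_call_val(SMCCC_FAST_CALL, SMCCC_SMC_64,
			      SMCCC_OWNER_SIP, func_num);
}
```

Under these assumptions, function number 13 (INTEL_SIP_SMC_FUNCID_ECC_DBE) packs to 0xC200000D, which is the value the caller would place in a0.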