Commit d1e3306b authored by Joerg Roedel's avatar Joerg Roedel

Merge tag 'arm-smmu-updates' of...

Merge tag 'arm-smmu-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/smmu

Arm SMMU updates for 5.12

- Support for MT8192 IOMMU from Mediatek

- Arm v7s io-pgtable extensions for MT8192

- Removal of TLBI_ON_MAP quirk

- New Qualcomm compatible strings

- Allow SVA without hardware broadcast TLB maintenance on SMMUv3

- Virtualization Host Extension support for SMMUv3 (SVA)

- Allow SMMUv3 PMU (perf) driver to be built independently from IOMMU

- Misc cleanups
parents 6ee1d745 7060377c
......@@ -34,9 +34,11 @@ properties:
items:
- enum:
- qcom,sc7180-smmu-500
- qcom,sc8180x-smmu-500
- qcom,sdm845-smmu-500
- qcom,sm8150-smmu-500
- qcom,sm8250-smmu-500
- qcom,sm8350-smmu-500
- const: arm,mmu-500
- description: Qcom Adreno GPUs implementing "arm,smmu-v2"
items:
......
* Mediatek IOMMU Architecture Implementation
Some Mediatek SOCs contain a Multimedia Memory Management Unit (M4U), and
this M4U have two generations of HW architecture. Generation one uses flat
pagetable, and only supports 4K size page mapping. Generation two uses the
ARM Short-Descriptor translation table format for address translation.
About the M4U Hardware Block Diagram, please check below:
EMI (External Memory Interface)
|
m4u (Multimedia Memory Management Unit)
|
+--------+
| |
gals0-rx gals1-rx (Global Async Local Sync rx)
| |
| |
gals0-tx gals1-tx (Global Async Local Sync tx)
| | Some SoCs may have GALS.
+--------+
|
SMI Common(Smart Multimedia Interface Common)
|
+----------------+-------
| |
| gals-rx There may be GALS in some larbs.
| |
| |
| gals-tx
| |
SMI larb0 SMI larb1 ... SoCs have several SMI local arbiter(larb).
(display) (vdec)
| |
| |
+-----+-----+ +----+----+
| | | | | |
| | |... | | | ... There are different ports in each larb.
| | | | | |
OVL0 RDMA0 WDMA0 MC PP VLD
As above, The Multimedia HW will go through SMI and M4U while it
access EMI. SMI is a bridge between m4u and the Multimedia HW. It contain
smi local arbiter and smi common. It will control whether the Multimedia
HW should go though the m4u for translation or bypass it and talk
directly with EMI. And also SMI help control the power domain and clocks for
each local arbiter.
Normally we specify a local arbiter(larb) for each multimedia HW
like display, video decode, and camera. And there are different ports
in each larb. Take a example, There are many ports like MC, PP, VLD in the
video decode local arbiter, all these ports are according to the video HW.
In some SoCs, there may be a GALS(Global Async Local Sync) module between
smi-common and m4u, and additional GALS module between smi-larb and
smi-common. GALS can been seen as a "asynchronous fifo" which could help
synchronize for the modules in different clock frequency.
Required properties:
- compatible : must be one of the following string:
"mediatek,mt2701-m4u" for mt2701 which uses generation one m4u HW.
"mediatek,mt2712-m4u" for mt2712 which uses generation two m4u HW.
"mediatek,mt6779-m4u" for mt6779 which uses generation two m4u HW.
"mediatek,mt7623-m4u", "mediatek,mt2701-m4u" for mt7623 which uses
generation one m4u HW.
"mediatek,mt8167-m4u" for mt8167 which uses generation two m4u HW.
"mediatek,mt8173-m4u" for mt8173 which uses generation two m4u HW.
"mediatek,mt8183-m4u" for mt8183 which uses generation two m4u HW.
- reg : m4u register base and size.
- interrupts : the interrupt of m4u.
- clocks : must contain one entry for each clock-names.
- clock-names : Only 1 optional clock:
- "bclk": the block clock of m4u.
Here is the list which require this "bclk":
- mt2701, mt2712, mt7623 and mt8173.
Note that m4u use the EMI clock which always has been enabled before kernel
if there is no this "bclk".
- mediatek,larbs : List of phandle to the local arbiters in the current Socs.
Refer to bindings/memory-controllers/mediatek,smi-larb.txt. It must sort
according to the local arbiter index, like larb0, larb1, larb2...
- iommu-cells : must be 1. This is the mtk_m4u_id according to the HW.
Specifies the mtk_m4u_id as defined in
dt-binding/memory/mt2701-larb-port.h for mt2701, mt7623
dt-binding/memory/mt2712-larb-port.h for mt2712,
dt-binding/memory/mt6779-larb-port.h for mt6779,
dt-binding/memory/mt8167-larb-port.h for mt8167,
dt-binding/memory/mt8173-larb-port.h for mt8173, and
dt-binding/memory/mt8183-larb-port.h for mt8183.
Example:
iommu: iommu@10205000 {
compatible = "mediatek,mt8173-m4u";
reg = <0 0x10205000 0 0x1000>;
interrupts = <GIC_SPI 139 IRQ_TYPE_LEVEL_LOW>;
clocks = <&infracfg CLK_INFRA_M4U>;
clock-names = "bclk";
mediatek,larbs = <&larb0 &larb1 &larb2 &larb3 &larb4 &larb5>;
#iommu-cells = <1>;
};
Example for a client device:
display {
compatible = "mediatek,mt8173-disp";
iommus = <&iommu M4U_PORT_DISP_OVL0>,
<&iommu M4U_PORT_DISP_RDMA0>;
...
};
# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
%YAML 1.2
---
$id: http://devicetree.org/schemas/iommu/mediatek,iommu.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: MediaTek IOMMU Architecture Implementation
maintainers:
- Yong Wu <yong.wu@mediatek.com>
description: |+
Some MediaTek SOCs contain a Multimedia Memory Management Unit (M4U), and
this M4U have two generations of HW architecture. Generation one uses flat
pagetable, and only supports 4K size page mapping. Generation two uses the
ARM Short-Descriptor translation table format for address translation.
About the M4U Hardware Block Diagram, please check below:
EMI (External Memory Interface)
|
m4u (Multimedia Memory Management Unit)
|
+--------+
| |
gals0-rx gals1-rx (Global Async Local Sync rx)
| |
| |
gals0-tx gals1-tx (Global Async Local Sync tx)
| | Some SoCs may have GALS.
+--------+
|
SMI Common(Smart Multimedia Interface Common)
|
+----------------+-------
| |
| gals-rx There may be GALS in some larbs.
| |
| |
| gals-tx
| |
SMI larb0 SMI larb1 ... SoCs have several SMI local arbiter(larb).
(display) (vdec)
| |
| |
+-----+-----+ +----+----+
| | | | | |
| | |... | | | ... There are different ports in each larb.
| | | | | |
OVL0 RDMA0 WDMA0 MC PP VLD
As above, The Multimedia HW will go through SMI and M4U while it
access EMI. SMI is a bridge between m4u and the Multimedia HW. It contain
smi local arbiter and smi common. It will control whether the Multimedia
HW should go though the m4u for translation or bypass it and talk
directly with EMI. And also SMI help control the power domain and clocks for
each local arbiter.
Normally we specify a local arbiter(larb) for each multimedia HW
like display, video decode, and camera. And there are different ports
in each larb. Take a example, There are many ports like MC, PP, VLD in the
video decode local arbiter, all these ports are according to the video HW.
In some SoCs, there may be a GALS(Global Async Local Sync) module between
smi-common and m4u, and additional GALS module between smi-larb and
smi-common. GALS can been seen as a "asynchronous fifo" which could help
synchronize for the modules in different clock frequency.
properties:
compatible:
oneOf:
- enum:
- mediatek,mt2701-m4u # generation one
- mediatek,mt2712-m4u # generation two
- mediatek,mt6779-m4u # generation two
- mediatek,mt8167-m4u # generation two
- mediatek,mt8173-m4u # generation two
- mediatek,mt8183-m4u # generation two
- mediatek,mt8192-m4u # generation two
- description: mt7623 generation one
items:
- const: mediatek,mt7623-m4u
- const: mediatek,mt2701-m4u
reg:
maxItems: 1
interrupts:
maxItems: 1
clocks:
items:
- description: bclk is the block clock.
clock-names:
items:
- const: bclk
mediatek,larbs:
$ref: /schemas/types.yaml#/definitions/phandle-array
minItems: 1
maxItems: 32
description: |
List of phandle to the local arbiters in the current Socs.
Refer to bindings/memory-controllers/mediatek,smi-larb.yaml. It must sort
according to the local arbiter index, like larb0, larb1, larb2...
'#iommu-cells':
const: 1
description: |
This is the mtk_m4u_id according to the HW. Specifies the mtk_m4u_id as
defined in
dt-binding/memory/mt2701-larb-port.h for mt2701 and mt7623,
dt-binding/memory/mt2712-larb-port.h for mt2712,
dt-binding/memory/mt6779-larb-port.h for mt6779,
dt-binding/memory/mt8167-larb-port.h for mt8167,
dt-binding/memory/mt8173-larb-port.h for mt8173,
dt-binding/memory/mt8183-larb-port.h for mt8183,
dt-binding/memory/mt8192-larb-port.h for mt8192.
power-domains:
maxItems: 1
required:
- compatible
- reg
- interrupts
- mediatek,larbs
- '#iommu-cells'
allOf:
- if:
properties:
compatible:
contains:
enum:
- mediatek,mt2701-m4u
- mediatek,mt2712-m4u
- mediatek,mt8173-m4u
- mediatek,mt8192-m4u
then:
required:
- clocks
- if:
properties:
compatible:
enum:
- mediatek,mt8192-m4u
then:
required:
- power-domains
additionalProperties: false
examples:
- |
#include <dt-bindings/clock/mt8173-clk.h>
#include <dt-bindings/interrupt-controller/arm-gic.h>
iommu: iommu@10205000 {
compatible = "mediatek,mt8173-m4u";
reg = <0x10205000 0x1000>;
interrupts = <GIC_SPI 139 IRQ_TYPE_LEVEL_LOW>;
clocks = <&infracfg CLK_INFRA_M4U>;
clock-names = "bclk";
mediatek,larbs = <&larb0 &larb1 &larb2
&larb3 &larb4 &larb5>;
#iommu-cells = <1>;
};
- |
#include <dt-bindings/memory/mt8173-larb-port.h>
/* Example for a client device */
display {
compatible = "mediatek,mt8173-disp";
iommus = <&iommu M4U_PORT_DISP_OVL0>,
<&iommu M4U_PORT_DISP_RDMA0>;
};
......@@ -11178,6 +11178,15 @@ S: Maintained
F: Documentation/devicetree/bindings/i2c/i2c-mt65xx.txt
F: drivers/i2c/busses/i2c-mt65xx.c
MEDIATEK IOMMU DRIVER
M: Yong Wu <yong.wu@mediatek.com>
L: iommu@lists.linux-foundation.org
L: linux-mediatek@lists.infradead.org (moderated for non-subscribers)
S: Supported
F: Documentation/devicetree/bindings/iommu/mediatek*
F: drivers/iommu/mtk-iommu*
F: include/dt-bindings/memory/mt*-port.h
MEDIATEK JPEG DRIVER
M: Rick Chang <rick.chang@mediatek.com>
M: Bin Liu <bin.liu@mediatek.com>
......
......@@ -182,9 +182,13 @@ static void arm_smmu_mm_invalidate_range(struct mmu_notifier *mn,
unsigned long start, unsigned long end)
{
struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
size_t size = end - start + 1;
arm_smmu_atc_inv_domain(smmu_mn->domain, mm->pasid, start,
end - start + 1);
if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_BTM))
arm_smmu_tlb_inv_range_asid(start, size, smmu_mn->cd->asid,
PAGE_SIZE, false, smmu_domain);
arm_smmu_atc_inv_domain(smmu_domain, mm->pasid, start, size);
}
static void arm_smmu_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
......@@ -391,7 +395,7 @@ bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
unsigned long reg, fld;
unsigned long oas;
unsigned long asid_bits;
u32 feat_mask = ARM_SMMU_FEAT_BTM | ARM_SMMU_FEAT_COHERENCY;
u32 feat_mask = ARM_SMMU_FEAT_COHERENCY;
if (vabits_actual == 52)
feat_mask |= ARM_SMMU_FEAT_VAX;
......
This diff is collapsed.
......@@ -139,15 +139,15 @@
#define ARM_SMMU_CMDQ_CONS 0x9c
#define ARM_SMMU_EVTQ_BASE 0xa0
#define ARM_SMMU_EVTQ_PROD 0x100a8
#define ARM_SMMU_EVTQ_CONS 0x100ac
#define ARM_SMMU_EVTQ_PROD 0xa8
#define ARM_SMMU_EVTQ_CONS 0xac
#define ARM_SMMU_EVTQ_IRQ_CFG0 0xb0
#define ARM_SMMU_EVTQ_IRQ_CFG1 0xb8
#define ARM_SMMU_EVTQ_IRQ_CFG2 0xbc
#define ARM_SMMU_PRIQ_BASE 0xc0
#define ARM_SMMU_PRIQ_PROD 0x100c8
#define ARM_SMMU_PRIQ_CONS 0x100cc
#define ARM_SMMU_PRIQ_PROD 0xc8
#define ARM_SMMU_PRIQ_CONS 0xcc
#define ARM_SMMU_PRIQ_IRQ_CFG0 0xd0
#define ARM_SMMU_PRIQ_IRQ_CFG1 0xd8
#define ARM_SMMU_PRIQ_IRQ_CFG2 0xdc
......@@ -430,6 +430,8 @@ struct arm_smmu_cmdq_ent {
#define CMDQ_OP_TLBI_NH_ASID 0x11
#define CMDQ_OP_TLBI_NH_VA 0x12
#define CMDQ_OP_TLBI_EL2_ALL 0x20
#define CMDQ_OP_TLBI_EL2_ASID 0x21
#define CMDQ_OP_TLBI_EL2_VA 0x22
#define CMDQ_OP_TLBI_S12_VMALL 0x28
#define CMDQ_OP_TLBI_S2_IPA 0x2a
#define CMDQ_OP_TLBI_NSNH_ALL 0x30
......@@ -604,6 +606,7 @@ struct arm_smmu_device {
#define ARM_SMMU_FEAT_RANGE_INV (1 << 15)
#define ARM_SMMU_FEAT_BTM (1 << 16)
#define ARM_SMMU_FEAT_SVA (1 << 17)
#define ARM_SMMU_FEAT_E2H (1 << 18)
u32 features;
#define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0)
......@@ -694,6 +697,9 @@ extern struct arm_smmu_ctx_desc quiet_cd;
int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid,
struct arm_smmu_ctx_desc *cd);
void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid);
void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
size_t granule, bool leaf,
struct arm_smmu_domain *smmu_domain);
bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd);
int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
unsigned long iova, size_t size);
......
......@@ -166,6 +166,7 @@ static const struct of_device_id qcom_smmu_client_of_match[] __maybe_unused = {
{ .compatible = "qcom,mdss" },
{ .compatible = "qcom,sc7180-mdss" },
{ .compatible = "qcom,sc7180-mss-pil" },
{ .compatible = "qcom,sc8180x-mdss" },
{ .compatible = "qcom,sdm845-mdss" },
{ .compatible = "qcom,sdm845-mss-pil" },
{ }
......@@ -206,6 +207,8 @@ static int qcom_smmu_cfg_probe(struct arm_smmu_device *smmu)
smr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_SMR(i));
if (FIELD_GET(ARM_SMMU_SMR_VALID, smr)) {
/* Ignore valid bit for SMR mask extraction. */
smr &= ~ARM_SMMU_SMR_VALID;
smmu->smrs[i].id = FIELD_GET(ARM_SMMU_SMR_ID, smr);
smmu->smrs[i].mask = FIELD_GET(ARM_SMMU_SMR_MASK, smr);
smmu->smrs[i].valid = true;
......@@ -327,10 +330,12 @@ static struct arm_smmu_device *qcom_smmu_create(struct arm_smmu_device *smmu,
static const struct of_device_id __maybe_unused qcom_smmu_impl_of_match[] = {
{ .compatible = "qcom,msm8998-smmu-v2" },
{ .compatible = "qcom,sc7180-smmu-500" },
{ .compatible = "qcom,sc8180x-smmu-500" },
{ .compatible = "qcom,sdm630-smmu-v2" },
{ .compatible = "qcom,sdm845-smmu-500" },
{ .compatible = "qcom,sm8150-smmu-500" },
{ .compatible = "qcom,sm8250-smmu-500" },
{ .compatible = "qcom,sm8350-smmu-500" },
{ }
};
......
......@@ -44,26 +44,25 @@
/*
* We have 32 bits total; 12 bits resolved at level 1, 8 bits at level 2,
* and 12 bits in a page. With some carefully-chosen coefficients we can
* hide the ugly inconsistencies behind these macros and at least let the
* rest of the code pretend to be somewhat sane.
* and 12 bits in a page.
* MediaTek extend 2 bits to reach 34bits, 14 bits at lvl1 and 8 bits at lvl2.
*/
#define ARM_V7S_ADDR_BITS 32
#define _ARM_V7S_LVL_BITS(lvl) (16 - (lvl) * 4)
#define ARM_V7S_LVL_SHIFT(lvl) (ARM_V7S_ADDR_BITS - (4 + 8 * (lvl)))
#define _ARM_V7S_LVL_BITS(lvl, cfg) ((lvl) == 1 ? ((cfg)->ias - 20) : 8)
#define ARM_V7S_LVL_SHIFT(lvl) ((lvl) == 1 ? 20 : 12)
#define ARM_V7S_TABLE_SHIFT 10
#define ARM_V7S_PTES_PER_LVL(lvl) (1 << _ARM_V7S_LVL_BITS(lvl))
#define ARM_V7S_TABLE_SIZE(lvl) \
(ARM_V7S_PTES_PER_LVL(lvl) * sizeof(arm_v7s_iopte))
#define ARM_V7S_PTES_PER_LVL(lvl, cfg) (1 << _ARM_V7S_LVL_BITS(lvl, cfg))
#define ARM_V7S_TABLE_SIZE(lvl, cfg) \
(ARM_V7S_PTES_PER_LVL(lvl, cfg) * sizeof(arm_v7s_iopte))
#define ARM_V7S_BLOCK_SIZE(lvl) (1UL << ARM_V7S_LVL_SHIFT(lvl))
#define ARM_V7S_LVL_MASK(lvl) ((u32)(~0U << ARM_V7S_LVL_SHIFT(lvl)))
#define ARM_V7S_TABLE_MASK ((u32)(~0U << ARM_V7S_TABLE_SHIFT))
#define _ARM_V7S_IDX_MASK(lvl) (ARM_V7S_PTES_PER_LVL(lvl) - 1)
#define ARM_V7S_LVL_IDX(addr, lvl) ({ \
#define _ARM_V7S_IDX_MASK(lvl, cfg) (ARM_V7S_PTES_PER_LVL(lvl, cfg) - 1)
#define ARM_V7S_LVL_IDX(addr, lvl, cfg) ({ \
int _l = lvl; \
((u32)(addr) >> ARM_V7S_LVL_SHIFT(_l)) & _ARM_V7S_IDX_MASK(_l); \
((addr) >> ARM_V7S_LVL_SHIFT(_l)) & _ARM_V7S_IDX_MASK(_l, cfg); \
})
/*
......@@ -112,9 +111,10 @@
#define ARM_V7S_TEX_MASK 0x7
#define ARM_V7S_ATTR_TEX(val) (((val) & ARM_V7S_TEX_MASK) << ARM_V7S_TEX_SHIFT)
/* MediaTek extend the two bits for PA 32bit/33bit */
/* MediaTek extend the bits below for PA 32bit/33bit/34bit */
#define ARM_V7S_ATTR_MTK_PA_BIT32 BIT(9)
#define ARM_V7S_ATTR_MTK_PA_BIT33 BIT(4)
#define ARM_V7S_ATTR_MTK_PA_BIT34 BIT(5)
/* *well, except for TEX on level 2 large pages, of course :( */
#define ARM_V7S_CONT_PAGE_TEX_SHIFT 6
......@@ -194,6 +194,8 @@ static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
if (paddr & BIT_ULL(33))
pte |= ARM_V7S_ATTR_MTK_PA_BIT33;
if (paddr & BIT_ULL(34))
pte |= ARM_V7S_ATTR_MTK_PA_BIT34;
return pte;
}
......@@ -218,6 +220,8 @@ static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
paddr |= BIT_ULL(32);
if (pte & ARM_V7S_ATTR_MTK_PA_BIT33)
paddr |= BIT_ULL(33);
if (pte & ARM_V7S_ATTR_MTK_PA_BIT34)
paddr |= BIT_ULL(34);
return paddr;
}
......@@ -234,7 +238,7 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
struct device *dev = cfg->iommu_dev;
phys_addr_t phys;
dma_addr_t dma;
size_t size = ARM_V7S_TABLE_SIZE(lvl);
size_t size = ARM_V7S_TABLE_SIZE(lvl, cfg);
void *table = NULL;
if (lvl == 1)
......@@ -280,7 +284,7 @@ static void __arm_v7s_free_table(void *table, int lvl,
{
struct io_pgtable_cfg *cfg = &data->iop.cfg;
struct device *dev = cfg->iommu_dev;
size_t size = ARM_V7S_TABLE_SIZE(lvl);
size_t size = ARM_V7S_TABLE_SIZE(lvl, cfg);
if (!cfg->coherent_walk)
dma_unmap_single(dev, __arm_v7s_dma_addr(table), size,
......@@ -424,7 +428,7 @@ static int arm_v7s_init_pte(struct arm_v7s_io_pgtable *data,
arm_v7s_iopte *tblp;
size_t sz = ARM_V7S_BLOCK_SIZE(lvl);
tblp = ptep - ARM_V7S_LVL_IDX(iova, lvl);
tblp = ptep - ARM_V7S_LVL_IDX(iova, lvl, cfg);
if (WARN_ON(__arm_v7s_unmap(data, NULL, iova + i * sz,
sz, lvl, tblp) != sz))
return -EINVAL;
......@@ -477,7 +481,7 @@ static int __arm_v7s_map(struct arm_v7s_io_pgtable *data, unsigned long iova,
int num_entries = size >> ARM_V7S_LVL_SHIFT(lvl);
/* Find our entry at the current level */
ptep += ARM_V7S_LVL_IDX(iova, lvl);
ptep += ARM_V7S_LVL_IDX(iova, lvl, cfg);
/* If we can install a leaf entry at this level, then do so */
if (num_entries)
......@@ -519,7 +523,6 @@ static int arm_v7s_map(struct io_pgtable_ops *ops, unsigned long iova,
phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
{
struct arm_v7s_io_pgtable *data = io_pgtable_ops_to_data(ops);
struct io_pgtable *iop = &data->iop;
int ret;
if (WARN_ON(iova >= (1ULL << data->iop.cfg.ias) ||
......@@ -535,12 +538,7 @@ static int arm_v7s_map(struct io_pgtable_ops *ops, unsigned long iova,
* Synchronise all PTE updates for the new mapping before there's
* a chance for anything to kick off a table walk for the new iova.
*/
if (iop->cfg.quirks & IO_PGTABLE_QUIRK_TLBI_ON_MAP) {
io_pgtable_tlb_flush_walk(iop, iova, size,
ARM_V7S_BLOCK_SIZE(2));
} else {
wmb();
}
wmb();
return ret;
}
......@@ -550,7 +548,7 @@ static void arm_v7s_free_pgtable(struct io_pgtable *iop)
struct arm_v7s_io_pgtable *data = io_pgtable_to_data(iop);
int i;
for (i = 0; i < ARM_V7S_PTES_PER_LVL(1); i++) {
for (i = 0; i < ARM_V7S_PTES_PER_LVL(1, &data->iop.cfg); i++) {
arm_v7s_iopte pte = data->pgd[i];
if (ARM_V7S_PTE_IS_TABLE(pte, 1))
......@@ -602,9 +600,9 @@ static size_t arm_v7s_split_blk_unmap(struct arm_v7s_io_pgtable *data,
if (!tablep)
return 0; /* Bytes unmapped */
num_ptes = ARM_V7S_PTES_PER_LVL(2);
num_ptes = ARM_V7S_PTES_PER_LVL(2, cfg);
num_entries = size >> ARM_V7S_LVL_SHIFT(2);
unmap_idx = ARM_V7S_LVL_IDX(iova, 2);
unmap_idx = ARM_V7S_LVL_IDX(iova, 2, cfg);
pte = arm_v7s_prot_to_pte(arm_v7s_pte_to_prot(blk_pte, 1), 2, cfg);
if (num_entries > 1)
......@@ -646,7 +644,7 @@ static size_t __arm_v7s_unmap(struct arm_v7s_io_pgtable *data,
if (WARN_ON(lvl > 2))
return 0;
idx = ARM_V7S_LVL_IDX(iova, lvl);
idx = ARM_V7S_LVL_IDX(iova, lvl, &iop->cfg);
ptep += idx;
do {
pte[i] = READ_ONCE(ptep[i]);
......@@ -717,7 +715,7 @@ static size_t arm_v7s_unmap(struct io_pgtable_ops *ops, unsigned long iova,
{
struct arm_v7s_io_pgtable *data = io_pgtable_ops_to_data(ops);
if (WARN_ON(upper_32_bits(iova)))
if (WARN_ON(iova >= (1ULL << data->iop.cfg.ias)))
return 0;
return __arm_v7s_unmap(data, gather, iova, size, 1, data->pgd);
......@@ -732,7 +730,7 @@ static phys_addr_t arm_v7s_iova_to_phys(struct io_pgtable_ops *ops,
u32 mask;
do {
ptep += ARM_V7S_LVL_IDX(iova, ++lvl);
ptep += ARM_V7S_LVL_IDX(iova, ++lvl, &data->iop.cfg);
pte = READ_ONCE(*ptep);
ptep = iopte_deref(pte, lvl, data);
} while (ARM_V7S_PTE_IS_TABLE(pte, lvl));
......@@ -751,15 +749,14 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
{
struct arm_v7s_io_pgtable *data;
if (cfg->ias > ARM_V7S_ADDR_BITS)
if (cfg->ias > (arm_v7s_is_mtk_enabled(cfg) ? 34 : ARM_V7S_ADDR_BITS))
return NULL;
if (cfg->oas > (arm_v7s_is_mtk_enabled(cfg) ? 34 : ARM_V7S_ADDR_BITS))
if (cfg->oas > (arm_v7s_is_mtk_enabled(cfg) ? 35 : ARM_V7S_ADDR_BITS))
return NULL;
if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
IO_PGTABLE_QUIRK_NO_PERMS |
IO_PGTABLE_QUIRK_TLBI_ON_MAP |
IO_PGTABLE_QUIRK_ARM_MTK_EXT |
IO_PGTABLE_QUIRK_NON_STRICT))
return NULL;
......@@ -775,8 +772,8 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
spin_lock_init(&data->split_lock);
data->l2_tables = kmem_cache_create("io-pgtable_armv7s_l2",
ARM_V7S_TABLE_SIZE(2),
ARM_V7S_TABLE_SIZE(2),
ARM_V7S_TABLE_SIZE(2, cfg),
ARM_V7S_TABLE_SIZE(2, cfg),
ARM_V7S_TABLE_SLAB_FLAGS, NULL);
if (!data->l2_tables)
goto out_free_data;
......
......@@ -2426,9 +2426,6 @@ static int __iommu_map(struct iommu_domain *domain, unsigned long iova,
size -= pgsize;
}
if (ops->iotlb_sync_map)
ops->iotlb_sync_map(domain);
/* unroll mapping in case something went wrong */
if (ret)
iommu_unmap(domain, orig_iova, orig_size - size);
......@@ -2438,18 +2435,31 @@ static int __iommu_map(struct iommu_domain *domain, unsigned long iova,
return ret;
}
static int _iommu_map(struct iommu_domain *domain, unsigned long iova,
phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
{
const struct iommu_ops *ops = domain->ops;
int ret;
ret = __iommu_map(domain, iova, paddr, size, prot, GFP_KERNEL);
if (ret == 0 && ops->iotlb_sync_map)
ops->iotlb_sync_map(domain, iova, size);
return ret;
}
int iommu_map(struct iommu_domain *domain, unsigned long iova,
phys_addr_t paddr, size_t size, int prot)
{
might_sleep();
return __iommu_map(domain, iova, paddr, size, prot, GFP_KERNEL);
return _iommu_map(domain, iova, paddr, size, prot, GFP_KERNEL);
}
EXPORT_SYMBOL_GPL(iommu_map);
int iommu_map_atomic(struct iommu_domain *domain, unsigned long iova,
phys_addr_t paddr, size_t size, int prot)
{
return __iommu_map(domain, iova, paddr, size, prot, GFP_ATOMIC);
return _iommu_map(domain, iova, paddr, size, prot, GFP_ATOMIC);
}
EXPORT_SYMBOL_GPL(iommu_map_atomic);
......@@ -2533,6 +2543,7 @@ static size_t __iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
struct scatterlist *sg, unsigned int nents, int prot,
gfp_t gfp)
{
const struct iommu_ops *ops = domain->ops;
size_t len = 0, mapped = 0;
phys_addr_t start;
unsigned int i = 0;
......@@ -2563,6 +2574,8 @@ static size_t __iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
sg = sg_next(sg);
}
if (ops->iotlb_sync_map)
ops->iotlb_sync_map(domain, iova, mapped);
return mapped;
out_err:
......
......@@ -343,7 +343,6 @@ static int msm_iommu_domain_config(struct msm_priv *priv)
spin_lock_init(&priv->pgtlock);
priv->cfg = (struct io_pgtable_cfg) {
.quirks = IO_PGTABLE_QUIRK_TLBI_ON_MAP,
.pgsize_bitmap = msm_iommu_ops.pgsize_bitmap,
.ias = 32,
.oas = 32,
......@@ -490,6 +489,14 @@ static int msm_iommu_map(struct iommu_domain *domain, unsigned long iova,
return ret;
}
static void msm_iommu_sync_map(struct iommu_domain *domain, unsigned long iova,
size_t size)
{
struct msm_priv *priv = to_msm_priv(domain);
__flush_iotlb_range(iova, size, SZ_4K, false, priv);
}
static size_t msm_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
size_t len, struct iommu_iotlb_gather *gather)
{
......@@ -680,6 +687,7 @@ static struct iommu_ops msm_iommu_ops = {
* kick starting the other master.
*/
.iotlb_sync = NULL,
.iotlb_sync_map = msm_iommu_sync_map,
.iova_to_phys = msm_iommu_iova_to_phys,
.probe_device = msm_iommu_probe_device,
.release_device = msm_iommu_release_device,
......
This diff is collapsed.
......@@ -17,10 +17,13 @@
#include <linux/spinlock.h>
#include <linux/dma-mapping.h>
#include <soc/mediatek/smi.h>
#include <dt-bindings/memory/mtk-memory-port.h>
#define MTK_LARB_COM_MAX 8
#define MTK_LARB_SUBCOM_MAX 4
#define MTK_IOMMU_GROUP_MAX 8
struct mtk_iommu_suspend_reg {
union {
u32 standard_axi_mode;/* v1 */
......@@ -42,12 +45,18 @@ enum mtk_iommu_plat {
M4U_MT8167,
M4U_MT8173,
M4U_MT8183,
M4U_MT8192,
};
struct mtk_iommu_iova_region;
struct mtk_iommu_plat_data {
enum mtk_iommu_plat m4u_plat;
u32 flags;
u32 inv_sel_reg;
unsigned int iova_region_nr;
const struct mtk_iommu_iova_region *iova_region;
unsigned char larbid_remap[MTK_LARB_COM_MAX][MTK_LARB_SUBCOM_MAX];
};
......@@ -61,12 +70,13 @@ struct mtk_iommu_data {
phys_addr_t protect_base; /* protect memory base */
struct mtk_iommu_suspend_reg reg;
struct mtk_iommu_domain *m4u_dom;
struct iommu_group *m4u_group;
struct iommu_group *m4u_group[MTK_IOMMU_GROUP_MAX];
bool enable_4GB;
spinlock_t tlb_lock; /* lock for tlb range flush */
struct iommu_device iommu;
const struct mtk_iommu_plat_data *plat_data;
struct device *smicomm_dev;
struct dma_iommu_mapping *mapping; /* For mtk_iommu_v1.c */
......
......@@ -261,7 +261,8 @@ static int gart_iommu_of_xlate(struct device *dev,
return 0;
}
static void gart_iommu_sync_map(struct iommu_domain *domain)
static void gart_iommu_sync_map(struct iommu_domain *domain, unsigned long iova,
size_t size)
{
FLUSH_GART_REGS(gart_handle);
}
......@@ -269,7 +270,9 @@ static void gart_iommu_sync_map(struct iommu_domain *domain)
static void gart_iommu_sync(struct iommu_domain *domain,
struct iommu_iotlb_gather *gather)
{
gart_iommu_sync_map(domain);
size_t length = gather->end - gather->start + 1;
gart_iommu_sync_map(domain, gather->start, length);
}
static const struct iommu_ops gart_iommu_ops = {
......
......@@ -15,6 +15,7 @@
#include <linux/pm_runtime.h>
#include <soc/mediatek/smi.h>
#include <dt-bindings/memory/mt2701-larb-port.h>
#include <dt-bindings/memory/mtk-memory-port.h>
/* mt8173 */
#define SMI_LARB_MMU_EN 0xf00
......@@ -43,6 +44,10 @@
/* mt2712 */
#define SMI_LARB_NONSEC_CON(id) (0x380 + ((id) * 4))
#define F_MMU_EN BIT(0)
#define BANK_SEL(id) ({ \
u32 _id = (id) & 0x3; \
(_id << 8 | _id << 10 | _id << 12 | _id << 14); \
})
/* SMI COMMON */
#define SMI_BUS_SEL 0x220
......@@ -87,6 +92,7 @@ struct mtk_smi_larb { /* larb: local arbiter */
const struct mtk_smi_larb_gen *larb_gen;
int larbid;
u32 *mmu;
unsigned char *bank;
};
static int mtk_smi_clk_enable(const struct mtk_smi *smi)
......@@ -153,6 +159,7 @@ mtk_smi_larb_bind(struct device *dev, struct device *master, void *data)
if (dev == larb_mmu[i].dev) {
larb->larbid = i;
larb->mmu = &larb_mmu[i].mmu;
larb->bank = larb_mmu[i].bank;
return 0;
}
}
......@@ -171,6 +178,7 @@ static void mtk_smi_larb_config_port_gen2_general(struct device *dev)
for_each_set_bit(i, (unsigned long *)larb->mmu, 32) {
reg = readl_relaxed(larb->base + SMI_LARB_NONSEC_CON(i));
reg |= F_MMU_EN;
reg |= BANK_SEL(larb->bank[i]);
writel(reg, larb->base + SMI_LARB_NONSEC_CON(i));
}
}
......
......@@ -62,7 +62,7 @@ config ARM_PMU_ACPI
config ARM_SMMU_V3_PMU
tristate "ARM SMMUv3 Performance Monitors Extension"
depends on ARM64 && ACPI && ARM_SMMU_V3
depends on ARM64 && ACPI
help
Provides support for the ARM SMMUv3 Performance Monitor Counter
Groups (PMCG), which provide monitoring of transactions passing
......
......@@ -4,8 +4,8 @@
* Author: Honghui Zhang <honghui.zhang@mediatek.com>
*/
#ifndef _MT2701_LARB_PORT_H_
#define _MT2701_LARB_PORT_H_
#ifndef _DT_BINDINGS_MEMORY_MT2701_LARB_PORT_H_
#define _DT_BINDINGS_MEMORY_MT2701_LARB_PORT_H_
/*
* Mediatek m4u generation 1 such as mt2701 has flat m4u port numbers,
......
......@@ -3,10 +3,10 @@
* Copyright (c) 2017 MediaTek Inc.
* Author: Yong Wu <yong.wu@mediatek.com>
*/
#ifndef __DTS_IOMMU_PORT_MT2712_H
#define __DTS_IOMMU_PORT_MT2712_H
#ifndef _DT_BINDINGS_MEMORY_MT2712_LARB_PORT_H_
#define _DT_BINDINGS_MEMORY_MT2712_LARB_PORT_H_
#define MTK_M4U_ID(larb, port) (((larb) << 5) | (port))
#include <dt-bindings/memory/mtk-memory-port.h>
#define M4U_LARB0_ID 0
#define M4U_LARB1_ID 1
......
......@@ -4,10 +4,10 @@
* Author: Chao Hao <chao.hao@mediatek.com>
*/
#ifndef _DTS_IOMMU_PORT_MT6779_H_
#define _DTS_IOMMU_PORT_MT6779_H_
#ifndef _DT_BINDINGS_MEMORY_MT6779_LARB_PORT_H_
#define _DT_BINDINGS_MEMORY_MT6779_LARB_PORT_H_
#define MTK_M4U_ID(larb, port) (((larb) << 5) | (port))
#include <dt-bindings/memory/mtk-memory-port.h>
#define M4U_LARB0_ID 0
#define M4U_LARB1_ID 1
......
......@@ -5,10 +5,10 @@
* Author: Honghui Zhang <honghui.zhang@mediatek.com>
* Author: Fabien Parent <fparent@baylibre.com>
*/
#ifndef __DTS_IOMMU_PORT_MT8167_H
#define __DTS_IOMMU_PORT_MT8167_H
#ifndef _DT_BINDINGS_MEMORY_MT8167_LARB_PORT_H_
#define _DT_BINDINGS_MEMORY_MT8167_LARB_PORT_H_
#define MTK_M4U_ID(larb, port) (((larb) << 5) | (port))
#include <dt-bindings/memory/mtk-memory-port.h>
#define M4U_LARB0_ID 0
#define M4U_LARB1_ID 1
......
......@@ -3,10 +3,10 @@
* Copyright (c) 2015-2016 MediaTek Inc.
* Author: Yong Wu <yong.wu@mediatek.com>
*/
#ifndef __DTS_IOMMU_PORT_MT8173_H
#define __DTS_IOMMU_PORT_MT8173_H
#ifndef _DT_BINDINGS_MEMORY_MT8173_LARB_PORT_H_
#define _DT_BINDINGS_MEMORY_MT8173_LARB_PORT_H_
#define MTK_M4U_ID(larb, port) (((larb) << 5) | (port))
#include <dt-bindings/memory/mtk-memory-port.h>
#define M4U_LARB0_ID 0
#define M4U_LARB1_ID 1
......
......@@ -3,10 +3,10 @@
* Copyright (c) 2018 MediaTek Inc.
* Author: Yong Wu <yong.wu@mediatek.com>
*/
#ifndef __DTS_IOMMU_PORT_MT8183_H
#define __DTS_IOMMU_PORT_MT8183_H
#ifndef _DT_BINDINGS_MEMORY_MT8183_LARB_PORT_H_
#define _DT_BINDINGS_MEMORY_MT8183_LARB_PORT_H_
#define MTK_M4U_ID(larb, port) (((larb) << 5) | (port))
#include <dt-bindings/memory/mtk-memory-port.h>
#define M4U_LARB0_ID 0
#define M4U_LARB1_ID 1
......
This diff is collapsed.
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Copyright (c) 2020 MediaTek Inc.
* Author: Yong Wu <yong.wu@mediatek.com>
*/
#ifndef __DT_BINDINGS_MEMORY_MTK_MEMORY_PORT_H_
#define __DT_BINDINGS_MEMORY_MTK_MEMORY_PORT_H_
#define MTK_LARB_NR_MAX 32
#define MTK_M4U_ID(larb, port) (((larb) << 5) | (port))
#define MTK_M4U_TO_LARB(id) (((id) >> 5) & 0x1f)
#define MTK_M4U_TO_PORT(id) ((id) & 0x1f)
#endif
......@@ -68,13 +68,9 @@ struct io_pgtable_cfg {
* hardware which does not implement the permissions of a given
* format, and/or requires some format-specific default value.
*
* IO_PGTABLE_QUIRK_TLBI_ON_MAP: If the format forbids caching invalid
* (unmapped) entries but the hardware might do so anyway, perform
* TLB maintenance when mapping as well as when unmapping.
*
* IO_PGTABLE_QUIRK_ARM_MTK_EXT: (ARM v7s format) MediaTek IOMMUs extend
* to support up to 34 bits PA where the bit32 and bit33 are
* encoded in the bit9 and bit4 of the PTE respectively.
* to support up to 35 bits PA where the bit32, bit33 and bit34 are
* encoded in the bit9, bit4 and bit5 of the PTE respectively.
*
* IO_PGTABLE_QUIRK_NON_STRICT: Skip issuing synchronous leaf TLBIs
* on unmap, for DMA domains using the flush queue mechanism for
......@@ -88,7 +84,6 @@ struct io_pgtable_cfg {
*/
#define IO_PGTABLE_QUIRK_ARM_NS BIT(0)
#define IO_PGTABLE_QUIRK_NO_PERMS BIT(1)
#define IO_PGTABLE_QUIRK_TLBI_ON_MAP BIT(2)
#define IO_PGTABLE_QUIRK_ARM_MTK_EXT BIT(3)
#define IO_PGTABLE_QUIRK_NON_STRICT BIT(4)
#define IO_PGTABLE_QUIRK_ARM_TTBR1 BIT(5)
......@@ -214,14 +209,16 @@ struct io_pgtable_domain_attr {
static inline void io_pgtable_tlb_flush_all(struct io_pgtable *iop)
{
iop->cfg.tlb->tlb_flush_all(iop->cookie);
if (iop->cfg.tlb && iop->cfg.tlb->tlb_flush_all)
iop->cfg.tlb->tlb_flush_all(iop->cookie);
}
static inline void
io_pgtable_tlb_flush_walk(struct io_pgtable *iop, unsigned long iova,
size_t size, size_t granule)
{
iop->cfg.tlb->tlb_flush_walk(iova, size, granule, iop->cookie);
if (iop->cfg.tlb && iop->cfg.tlb->tlb_flush_walk)
iop->cfg.tlb->tlb_flush_walk(iova, size, granule, iop->cookie);
}
static inline void
......@@ -229,7 +226,7 @@ io_pgtable_tlb_add_page(struct io_pgtable *iop,
struct iommu_iotlb_gather * gather, unsigned long iova,
size_t granule)
{
if (iop->cfg.tlb->tlb_add_page)
if (iop->cfg.tlb && iop->cfg.tlb->tlb_add_page)
iop->cfg.tlb->tlb_add_page(gather, iova, granule, iop->cookie);
}
......
......@@ -170,7 +170,7 @@ enum iommu_dev_features {
* struct iommu_iotlb_gather - Range information for a pending IOTLB flush
*
* @start: IOVA representing the start of the range to be flushed
* @end: IOVA representing the end of the range to be flushed (exclusive)
* @end: IOVA representing the end of the range to be flushed (inclusive)
* @pgsize: The interval at which to perform the flush
*
* This structure is intended to be updated by multiple calls to the
......@@ -246,7 +246,8 @@ struct iommu_ops {
size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
size_t size, struct iommu_iotlb_gather *iotlb_gather);
void (*flush_iotlb_all)(struct iommu_domain *domain);
void (*iotlb_sync_map)(struct iommu_domain *domain);
void (*iotlb_sync_map)(struct iommu_domain *domain, unsigned long iova,
size_t size);
void (*iotlb_sync)(struct iommu_domain *domain,
struct iommu_iotlb_gather *iotlb_gather);
phys_addr_t (*iova_to_phys)(struct iommu_domain *domain, dma_addr_t iova);
......@@ -538,7 +539,7 @@ static inline void iommu_iotlb_gather_add_page(struct iommu_domain *domain,
struct iommu_iotlb_gather *gather,
unsigned long iova, size_t size)
{
unsigned long start = iova, end = start + size;
unsigned long start = iova, end = start + size - 1;
/*
* If the new page is disjoint from the current range or is mapped at
......
......@@ -11,13 +11,12 @@
#ifdef CONFIG_MTK_SMI
#define MTK_LARB_NR_MAX 16
#define MTK_SMI_MMU_EN(port) BIT(port)
struct mtk_smi_larb_iommu {
struct device *dev;
unsigned int mmu;
unsigned char bank[32];
};
/*
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment