Commit db9bde2f authored by Olof Johansson's avatar Olof Johansson

Merge branch 'VExpress_DCSCB' of git://git.linaro.org/people/nico/linux into next/soc

From Nicolas Pitre:

This is the first MCPM backend submission for VExpress running on RTSM
aka Fast Models implementing the big.LITTLE system architecture.  This
enables SMP secondary boot as well as CPU hotplug on this platform.

A big prerequisite for this support is the CCI driver from Lorenzo
included in this pull request.

Also included is Rob Herring's set_auxcr/get_auxcr allowing nicer code.
Signed-off-by: default avatarOlof Johansson <olof@lixom.net>

* 'VExpress_DCSCB' of git://git.linaro.org/people/nico/linux:
  ARM: vexpress: Select multi-cluster SMP operation if required
  ARM: vexpress/dcscb: handle platform coherency exit/setup and CCI
  ARM: vexpress/dcscb: do not hardcode number of CPUs per cluster
  ARM: vexpress/dcscb: add CPU use counts to the power up/down API implementation
  ARM: vexpress: introduce DCSCB support
  ARM: introduce common set_auxcr/get_auxcr functions
  drivers/bus: arm-cci: function to enable CCI ports from early boot code
  drivers: bus: add ARM CCI support
parents 6678e389 033a899c
=======================================================
ARM CCI cache coherent interconnect binding description
=======================================================
ARM multi-cluster systems maintain intra-cluster coherency through a
cache coherent interconnect (CCI) that is capable of monitoring bus
transactions and manage coherency, TLB invalidations and memory barriers.
It allows snooping and distributed virtual memory message broadcast across
clusters, through memory mapped interface, with a global control register
space and multiple sets of interface control registers, one per slave
interface.
Bindings for the CCI node follow the ePAPR standard, available from:
www.power.org/documentation/epapr-version-1-1/
with the addition of the bindings described in this document which are
specific to ARM.
* CCI interconnect node
Description: Describes a CCI cache coherent Interconnect component
Node name must be "cci".
Node's parent must be the root node /, and the address space visible
through the CCI interconnect is the same as the one seen from the
root node (ie from CPUs perspective as per DT standard).
Every CCI node has to define the following properties:
- compatible
Usage: required
Value type: <string>
Definition: must be set to
"arm,cci-400"
- reg
Usage: required
Value type: <prop-encoded-array>
Definition: A standard property. Specifies base physical
address of CCI control registers common to all
interfaces.
- ranges:
Usage: required
Value type: <prop-encoded-array>
Definition: A standard property. Follow rules in the ePAPR for
hierarchical bus addressing. CCI interfaces
addresses refer to the parent node addressing
scheme to declare their register bases.
CCI interconnect node can define the following child nodes:
- CCI control interface nodes
Node name must be "slave-if".
Parent node must be CCI interconnect node.
A CCI control interface node must contain the following
properties:
- compatible
Usage: required
Value type: <string>
Definition: must be set to
"arm,cci-400-ctrl-if"
- interface-type:
Usage: required
Value type: <string>
Definition: must be set to one of {"ace", "ace-lite"}
depending on the interface type the node
represents.
- reg:
Usage: required
Value type: <prop-encoded-array>
Definition: the base address and size of the
corresponding interface programming
registers.
* CCI interconnect bus masters
Description: masters in the device tree connected to a CCI port
(inclusive of CPUs and their cpu nodes).
A CCI interconnect bus master node must contain the following
properties:
- cci-control-port:
Usage: required
Value type: <phandle>
Definition: a phandle containing the CCI control interface node
the master is connected to.
Example:
cpus {
#size-cells = <0>;
#address-cells = <1>;
CPU0: cpu@0 {
device_type = "cpu";
compatible = "arm,cortex-a15";
cci-control-port = <&cci_control1>;
reg = <0x0>;
};
CPU1: cpu@1 {
device_type = "cpu";
compatible = "arm,cortex-a15";
cci-control-port = <&cci_control1>;
reg = <0x1>;
};
CPU2: cpu@100 {
device_type = "cpu";
compatible = "arm,cortex-a7";
cci-control-port = <&cci_control2>;
reg = <0x100>;
};
CPU3: cpu@101 {
device_type = "cpu";
compatible = "arm,cortex-a7";
cci-control-port = <&cci_control2>;
reg = <0x101>;
};
};
dma0: dma@3000000 {
compatible = "arm,pl330", "arm,primecell";
cci-control-port = <&cci_control0>;
reg = <0x0 0x3000000 0x0 0x1000>;
interrupts = <10>;
#dma-cells = <1>;
#dma-channels = <8>;
#dma-requests = <32>;
};
cci@2c090000 {
compatible = "arm,cci-400";
#address-cells = <1>;
#size-cells = <1>;
reg = <0x0 0x2c090000 0 0x1000>;
ranges = <0x0 0x0 0x2c090000 0x6000>;
cci_control0: slave-if@1000 {
compatible = "arm,cci-400-ctrl-if";
interface-type = "ace-lite";
reg = <0x1000 0x1000>;
};
cci_control1: slave-if@4000 {
compatible = "arm,cci-400-ctrl-if";
interface-type = "ace";
reg = <0x4000 0x1000>;
};
cci_control2: slave-if@5000 {
compatible = "arm,cci-400-ctrl-if";
interface-type = "ace";
reg = <0x5000 0x1000>;
};
};
This CCI node corresponds to a CCI component whose control registers sits
at address 0x000000002c090000.
CCI slave interface @0x000000002c091000 is connected to dma controller dma0.
CCI slave interface @0x000000002c094000 is connected to CPUs {CPU0, CPU1};
CCI slave interface @0x000000002c095000 is connected to CPUs {CPU2, CPU3};
ARM Dual Cluster System Configuration Block
-------------------------------------------
The Dual Cluster System Configuration Block (DCSCB) provides basic
functionality for controlling clocks, resets and configuration pins in
the Dual Cluster System implemented by the Real-Time System Model (RTSM).
Required properties:
- compatible : should be "arm,rtsm,dcscb"
- reg : physical base address and the size of the registers window
Example:
dcscb@60000000 {
compatible = "arm,rtsm,dcscb";
reg = <0x60000000 0x1000>;
};
......@@ -61,6 +61,20 @@ static inline void set_cr(unsigned int val)
isb();
}
static inline unsigned int get_auxcr(void)
{
unsigned int val;
asm("mrc p15, 0, %0, c1, c0, 1 @ get AUXCR" : "=r" (val));
return val;
}
static inline void set_auxcr(unsigned int val)
{
asm volatile("mcr p15, 0, %0, c1, c0, 1 @ set AUXCR"
: : "r" (val));
isb();
}
#ifndef CONFIG_SMP
extern void adjust_cr(unsigned long mask, unsigned long set);
#endif
......
......@@ -57,4 +57,13 @@ config ARCH_VEXPRESS_CORTEX_A5_A9_ERRATA
config ARCH_VEXPRESS_CA9X4
bool "Versatile Express Cortex-A9x4 tile"
config ARCH_VEXPRESS_DCSCB
bool "Dual Cluster System Control Block (DCSCB) support"
depends on MCPM
select ARM_CCI
help
Support for the Dual Cluster System Configuration Block (DCSCB).
This is needed to provide CPU and cluster power management
on RTSM implementing big.LITTLE.
endmenu
......@@ -6,5 +6,6 @@ ccflags-$(CONFIG_ARCH_MULTIPLATFORM) := -I$(srctree)/$(src)/include \
obj-y := v2m.o
obj-$(CONFIG_ARCH_VEXPRESS_CA9X4) += ct-ca9x4.o
obj-$(CONFIG_ARCH_VEXPRESS_DCSCB) += dcscb.o dcscb_setup.o
obj-$(CONFIG_SMP) += platsmp.o
obj-$(CONFIG_HOTPLUG_CPU) += hotplug.o
......@@ -6,6 +6,8 @@
void vexpress_dt_smp_map_io(void);
bool vexpress_smp_init_ops(void);
extern struct smp_operations vexpress_smp_ops;
extern void vexpress_cpu_die(unsigned int cpu);
/*
* arch/arm/mach-vexpress/dcscb.c - Dual Cluster System Configuration Block
*
* Created by: Nicolas Pitre, May 2012
* Copyright: (C) 2012-2013 Linaro Limited
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/io.h>
#include <linux/spinlock.h>
#include <linux/errno.h>
#include <linux/of_address.h>
#include <linux/vexpress.h>
#include <linux/arm-cci.h>
#include <asm/mcpm.h>
#include <asm/proc-fns.h>
#include <asm/cacheflush.h>
#include <asm/cputype.h>
#include <asm/cp15.h>
#define RST_HOLD0 0x0
#define RST_HOLD1 0x4
#define SYS_SWRESET 0x8
#define RST_STAT0 0xc
#define RST_STAT1 0x10
#define EAG_CFG_R 0x20
#define EAG_CFG_W 0x24
#define KFC_CFG_R 0x28
#define KFC_CFG_W 0x2c
#define DCS_CFG_R 0x30
/*
* We can't use regular spinlocks. In the switcher case, it is possible
* for an outbound CPU to call power_down() while its inbound counterpart
* is already live using the same logical CPU number which trips lockdep
* debugging.
*/
static arch_spinlock_t dcscb_lock = __ARCH_SPIN_LOCK_UNLOCKED;
static void __iomem *dcscb_base;
static int dcscb_use_count[4][2];
static int dcscb_allcpus_mask[2];
static int dcscb_power_up(unsigned int cpu, unsigned int cluster)
{
unsigned int rst_hold, cpumask = (1 << cpu);
unsigned int all_mask = dcscb_allcpus_mask[cluster];
pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
if (cpu >= 4 || cluster >= 2)
return -EINVAL;
/*
* Since this is called with IRQs enabled, and no arch_spin_lock_irq
* variant exists, we need to disable IRQs manually here.
*/
local_irq_disable();
arch_spin_lock(&dcscb_lock);
dcscb_use_count[cpu][cluster]++;
if (dcscb_use_count[cpu][cluster] == 1) {
rst_hold = readl_relaxed(dcscb_base + RST_HOLD0 + cluster * 4);
if (rst_hold & (1 << 8)) {
/* remove cluster reset and add individual CPU's reset */
rst_hold &= ~(1 << 8);
rst_hold |= all_mask;
}
rst_hold &= ~(cpumask | (cpumask << 4));
writel_relaxed(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4);
} else if (dcscb_use_count[cpu][cluster] != 2) {
/*
* The only possible values are:
* 0 = CPU down
* 1 = CPU (still) up
* 2 = CPU requested to be up before it had a chance
* to actually make itself down.
* Any other value is a bug.
*/
BUG();
}
arch_spin_unlock(&dcscb_lock);
local_irq_enable();
return 0;
}
static void dcscb_power_down(void)
{
unsigned int mpidr, cpu, cluster, rst_hold, cpumask, all_mask;
bool last_man = false, skip_wfi = false;
mpidr = read_cpuid_mpidr();
cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
cpumask = (1 << cpu);
all_mask = dcscb_allcpus_mask[cluster];
pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
BUG_ON(cpu >= 4 || cluster >= 2);
__mcpm_cpu_going_down(cpu, cluster);
arch_spin_lock(&dcscb_lock);
BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
dcscb_use_count[cpu][cluster]--;
if (dcscb_use_count[cpu][cluster] == 0) {
rst_hold = readl_relaxed(dcscb_base + RST_HOLD0 + cluster * 4);
rst_hold |= cpumask;
if (((rst_hold | (rst_hold >> 4)) & all_mask) == all_mask) {
rst_hold |= (1 << 8);
last_man = true;
}
writel_relaxed(rst_hold, dcscb_base + RST_HOLD0 + cluster * 4);
} else if (dcscb_use_count[cpu][cluster] == 1) {
/*
* A power_up request went ahead of us.
* Even if we do not want to shut this CPU down,
* the caller expects a certain state as if the WFI
* was aborted. So let's continue with cache cleaning.
*/
skip_wfi = true;
} else
BUG();
if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
arch_spin_unlock(&dcscb_lock);
/*
* Flush all cache levels for this cluster.
*
* A15/A7 can hit in the cache with SCTLR.C=0, so we don't need
* a preliminary flush here for those CPUs. At least, that's
* the theory -- without the extra flush, Linux explodes on
* RTSM (to be investigated).
*/
flush_cache_all();
set_cr(get_cr() & ~CR_C);
flush_cache_all();
/*
* This is a harmless no-op. On platforms with a real
* outer cache this might either be needed or not,
* depending on where the outer cache sits.
*/
outer_flush_all();
/* Disable local coherency by clearing the ACTLR "SMP" bit: */
set_auxcr(get_auxcr() & ~(1 << 6));
/*
* Disable cluster-level coherency by masking
* incoming snoops and DVM messages:
*/
cci_disable_port_by_cpu(mpidr);
__mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
} else {
arch_spin_unlock(&dcscb_lock);
/*
* Flush the local CPU cache.
*
* A15/A7 can hit in the cache with SCTLR.C=0, so we don't need
* a preliminary flush here for those CPUs. At least, that's
* the theory -- without the extra flush, Linux explodes on
* RTSM (to be investigated).
*/
flush_cache_louis();
set_cr(get_cr() & ~CR_C);
flush_cache_louis();
/* Disable local coherency by clearing the ACTLR "SMP" bit: */
set_auxcr(get_auxcr() & ~(1 << 6));
}
__mcpm_cpu_down(cpu, cluster);
/* Now we are prepared for power-down, do it: */
dsb();
if (!skip_wfi)
wfi();
/* Not dead at this point? Let our caller cope. */
}
static const struct mcpm_platform_ops dcscb_power_ops = {
.power_up = dcscb_power_up,
.power_down = dcscb_power_down,
};
static void __init dcscb_usage_count_init(void)
{
unsigned int mpidr, cpu, cluster;
mpidr = read_cpuid_mpidr();
cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
BUG_ON(cpu >= 4 || cluster >= 2);
dcscb_use_count[cpu][cluster] = 1;
}
extern void dcscb_power_up_setup(unsigned int affinity_level);
static int __init dcscb_init(void)
{
struct device_node *node;
unsigned int cfg;
int ret;
if (!cci_probed())
return -ENODEV;
node = of_find_compatible_node(NULL, NULL, "arm,rtsm,dcscb");
if (!node)
return -ENODEV;
dcscb_base = of_iomap(node, 0);
if (!dcscb_base)
return -EADDRNOTAVAIL;
cfg = readl_relaxed(dcscb_base + DCS_CFG_R);
dcscb_allcpus_mask[0] = (1 << (((cfg >> 16) >> (0 << 2)) & 0xf)) - 1;
dcscb_allcpus_mask[1] = (1 << (((cfg >> 16) >> (1 << 2)) & 0xf)) - 1;
dcscb_usage_count_init();
ret = mcpm_platform_register(&dcscb_power_ops);
if (!ret)
ret = mcpm_sync_init(dcscb_power_up_setup);
if (ret) {
iounmap(dcscb_base);
return ret;
}
pr_info("VExpress DCSCB support installed\n");
/*
* Future entries into the kernel can now go
* through the cluster entry vectors.
*/
vexpress_flags_set(virt_to_phys(mcpm_entry_point));
return 0;
}
early_initcall(dcscb_init);
/*
* arch/arm/include/asm/dcscb_setup.S
*
* Created by: Dave Martin, 2012-06-22
* Copyright: (C) 2012-2013 Linaro Limited
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <linux/linkage.h>
ENTRY(dcscb_power_up_setup)
cmp r0, #0 @ check affinity level
beq 2f
/*
* Enable cluster-level coherency, in preparation for turning on the MMU.
* The ACTLR SMP bit does not need to be set here, because cpu_resume()
* already restores that.
*
* A15/A7 may not require explicit L2 invalidation on reset, dependent
* on hardware integration decisions.
* For now, this code assumes that L2 is either already invalidated,
* or invalidation is not required.
*/
b cci_enable_port_for_self
2: @ Implementation-specific local CPU setup operations should go here,
@ if any. In this case, there is nothing to do.
bx lr
ENDPROC(dcscb_power_up_setup)
......@@ -12,9 +12,11 @@
#include <linux/errno.h>
#include <linux/smp.h>
#include <linux/io.h>
#include <linux/of.h>
#include <linux/of_fdt.h>
#include <linux/vexpress.h>
#include <asm/mcpm.h>
#include <asm/smp_scu.h>
#include <asm/mach/map.h>
......@@ -203,3 +205,21 @@ struct smp_operations __initdata vexpress_smp_ops = {
.cpu_die = vexpress_cpu_die,
#endif
};
bool __init vexpress_smp_init_ops(void)
{
#ifdef CONFIG_MCPM
/*
* The best way to detect a multi-cluster configuration at the moment
* is to look for the presence of a CCI in the system.
* Override the default vexpress_smp_ops if so.
*/
struct device_node *node;
node = of_find_compatible_node(NULL, NULL, "arm,cci-400");
if (node && of_device_is_available(node)) {
mcpm_smp_set_ops();
return true;
}
#endif
return false;
}
......@@ -456,6 +456,7 @@ static const char * const v2m_dt_match[] __initconst = {
DT_MACHINE_START(VEXPRESS_DT, "ARM-Versatile Express")
.dt_compat = v2m_dt_match,
.smp = smp_ops(vexpress_smp_ops),
.smp_init = smp_init_ops(vexpress_smp_init_ops),
.map_io = v2m_dt_map_io,
.init_early = v2m_dt_init_early,
.init_irq = irqchip_init,
......
......@@ -26,4 +26,11 @@ config OMAP_INTERCONNECT
help
Driver to enable OMAP interconnect error handling driver.
config ARM_CCI
bool "ARM CCI driver support"
depends on ARM
help
Driver supporting the CCI cache coherent interconnect for ARM
platforms.
endmenu
......@@ -7,3 +7,5 @@ obj-$(CONFIG_OMAP_OCP2SCP) += omap-ocp2scp.o
# Interconnect bus driver for OMAP SoCs.
obj-$(CONFIG_OMAP_INTERCONNECT) += omap_l3_smx.o omap_l3_noc.o
# CCI cache coherent interconnect for ARM platforms
obj-$(CONFIG_ARM_CCI) += arm-cci.o
This diff is collapsed.
......@@ -37,20 +37,6 @@
extern void highbank_set_cpu_jump(int cpu, void *jump_addr);
extern void *scu_base_addr;
static inline unsigned int get_auxcr(void)
{
unsigned int val;
asm("mrc p15, 0, %0, c1, c0, 1 @ get AUXCR" : "=r" (val) : : "cc");
return val;
}
static inline void set_auxcr(unsigned int val)
{
asm volatile("mcr p15, 0, %0, c1, c0, 1 @ set AUXCR"
: : "r" (val) : "cc");
isb();
}
static noinline void calxeda_idle_restore(void)
{
set_cr(get_cr() | CR_C);
......
/*
* CCI cache coherent interconnect support
*
* Copyright (C) 2013 ARM Ltd.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#ifndef __LINUX_ARM_CCI_H
#define __LINUX_ARM_CCI_H
#include <linux/errno.h>
#include <linux/types.h>
struct device_node;
#ifdef CONFIG_ARM_CCI
extern bool cci_probed(void);
extern int cci_ace_get_port(struct device_node *dn);
extern int cci_disable_port_by_cpu(u64 mpidr);
extern int __cci_control_port_by_device(struct device_node *dn, bool enable);
extern int __cci_control_port_by_index(u32 port, bool enable);
#else
static inline bool cci_probed(void) { return false; }
static inline int cci_ace_get_port(struct device_node *dn)
{
return -ENODEV;
}
static inline int cci_disable_port_by_cpu(u64 mpidr) { return -ENODEV; }
static inline int __cci_control_port_by_device(struct device_node *dn,
bool enable)
{
return -ENODEV;
}
static inline int __cci_control_port_by_index(u32 port, bool enable)
{
return -ENODEV;
}
#endif
#define cci_disable_port_by_device(dev) \
__cci_control_port_by_device(dev, false)
#define cci_enable_port_by_device(dev) \
__cci_control_port_by_device(dev, true)
#define cci_disable_port_by_index(dev) \
__cci_control_port_by_index(dev, false)
#define cci_enable_port_by_index(dev) \
__cci_control_port_by_index(dev, true)
#endif
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment