1. 30 Oct, 2017 1 commit
    • arm64: Implement arch-specific pte_access_permitted() · 6218f96c
      Catalin Marinas authored
      The generic pte_access_permitted() implementation only checks for
      pte_present() (together with the write permission where applicable).
      However, for both kernel ptes and PROT_NONE mappings pte_present() also
      returns true on arm64 even though such mappings are not user accessible.
      Additionally, arm64 now supports execute-only user permission
      (PROT_EXEC) which is implemented by clearing the PTE_USER bit.
      
      With this patch the arm64 implementation of pte_access_permitted()
      checks for the PTE_VALID and PTE_USER bits together with writable access
      if applicable.
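The check described above can be sketched in user-space C as follows. This is a minimal illustration, not the kernel code: the SK_* names and their bit positions are placeholders invented for the example, and the real definitions live in the arm64 page table headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions for the sketch (assumption, not kernel values). */
#define SK_PTE_VALID (UINT64_C(1) << 0)
#define SK_PTE_USER  (UINT64_C(1) << 1)
#define SK_PTE_WRITE (UINT64_C(1) << 2)

/*
 * A pte is user accessible only if it is both valid and marked as a user
 * mapping. A write access additionally needs the writable bit.
 */
static bool sk_pte_access_permitted(uint64_t pte, bool write)
{
	const uint64_t need = SK_PTE_VALID | SK_PTE_USER;

	if ((pte & need) != need)
		return false;	/* kernel, PROT_NONE or execute-only mapping */

	return !write || (pte & SK_PTE_WRITE) != 0;
}
```

Under this model a valid pte without the user bit (a kernel or execute-only mapping) is rejected even for reads, which is the behaviour the generic pte_present()-based check does not provide.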
      
      Cc: <stable@vger.kernel.org>
      Reported-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  2. 27 Oct, 2017 4 commits
  3. 25 Oct, 2017 3 commits
  4. 24 Oct, 2017 4 commits
  5. 19 Oct, 2017 8 commits
  6. 18 Oct, 2017 7 commits
    • drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension · d5d9696b
      Will Deacon authored
      The ARMv8.2 architecture introduces the optional Statistical Profiling
      Extension (SPE).
      
      SPE can be used to profile a population of operations in the CPU pipeline
      after instruction decode. These are either architected instructions (i.e.
      a dynamic instruction trace) or CPU-specific uops and the choice is fixed
      statically in the hardware and advertised to userspace via caps/. Sampling
      is controlled using a sampling interval, similar to a regular PMU counter,
      but also with an optional random perturbation to avoid falling into patterns
      where you continuously profile the same instruction in a hot loop.
      
      After each operation is decoded, the interval counter is decremented. When
      it hits zero, an operation is chosen for profiling and tracked within the
      pipeline until it retires. Along the way, information such as TLB lookups,
      cache misses, time spent to issue, etc. is captured in the form of a sample.
      The sample is then filtered according to certain criteria (e.g. load
      latency) that can be specified in the event config (described under
      format/) and, if the sample satisfies the filter, it is written out to
      memory as a record, otherwise it is discarded. Only one operation can
      be sampled at a time.
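As a rough model of the selection and filtering steps above, the following sketch counts how many operations would be sampled and how many records would be written for a given interval and minimum-latency filter. All names are hypothetical. The random perturbation and the tracking of the in-flight operation are deliberately left out, and the single latency value per operation stands in for whatever the hardware filter actually inspects.

```c
#include <stddef.h>
#include <stdint.h>

struct sk_spe_stats {
	size_t sampled;		/* operations chosen when the counter hit zero */
	size_t written;		/* samples that satisfied the filter */
	size_t discarded;	/* samples rejected by the filter */
};

/*
 * Walk n_ops decoded operations. The interval counter is decremented once
 * per operation; when it reaches zero the operation is sampled and the
 * counter is reloaded. A sample becomes a record only if its latency is at
 * least min_latency.
 */
static struct sk_spe_stats sk_spe_run(const uint32_t *latency, size_t n_ops,
				      uint32_t interval, uint32_t min_latency)
{
	struct sk_spe_stats st = { 0, 0, 0 };
	uint32_t counter = interval;
	size_t i;

	if (interval == 0)
		return st;	/* sampling disabled in this model */

	for (i = 0; i < n_ops; i++) {
		if (--counter != 0)
			continue;

		counter = interval;
		st.sampled++;

		if (latency[i] >= min_latency)
			st.written++;
		else
			st.discarded++;
	}

	return st;
}
```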
      
      The in-memory buffer is linear and virtually addressed, raising an
      interrupt when it fills up. The PMU driver handles these interrupts to
      give the appearance of a ring buffer, as expected by the AUX code.
      
      The in-memory trace-like format is self-describing (though not parseable
      in reverse) and written as a series of records, with each record
      corresponding to a sample and consisting of a sequence of packets. These
      packets are defined by the architecture, although some have CPU-specific
      fields for recording information specific to the microarchitecture.
      
      As a simple example, a record generated for a branch instruction may
      consist of the following packets:
      
        0 (Address) : Virtual PC of the branch instruction
        1 (Type)    : Conditional direct branch
        2 (Counter) : Number of cycles taken from Dispatch to Issue
        3 (Address) : Virtual branch target + condition flags
        4 (Counter) : Number of cycles taken from Dispatch to Complete
        5 (Events)  : Mispredicted as not-taken
        6 (END)     : End of record
      
      It is also possible to toggle properties such as timestamp packets in
      each record.
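To illustrate the record structure, the sketch below walks a record forwards until it reaches the END packet. The enum and struct are a hypothetical decoded representation made up for this example; they do not reflect the architectural byte encoding of SPE packets.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded packet kinds, mirroring the example listing. */
enum sk_pkt_type {
	SK_PKT_ADDRESS,
	SK_PKT_TYPE,
	SK_PKT_COUNTER,
	SK_PKT_EVENTS,
	SK_PKT_END
};

struct sk_packet {
	enum sk_pkt_type type;
	uint64_t payload;
};

/*
 * Return the number of packets in the record starting at pkts[0],
 * including the END packet, or 0 if no END packet is found within n
 * packets. The scan only ever moves forwards, matching the note that the
 * format is not parseable in reverse.
 */
static size_t sk_record_len(const struct sk_packet *pkts, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (pkts[i].type == SK_PKT_END)
			return i + 1;

	return 0;
}
```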
      
      This patch adds support for SPE in the form of a new perf driver.
      
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Reviewed-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • dt-bindings: Document devicetree binding for ARM SPE · 4b8b77a4
      Will Deacon authored
      This patch documents the devicetree binding in use for ARM SPE.
      
      Cc: Rob Herring <robh@kernel.org>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: head: Init PMSCR_EL2.{PA,PCT} when entered at EL2 without VHE · b0c57e10
      Will Deacon authored
      When booting at EL2, ensure that we permit the EL1 host to sample
      physical addresses and physical counter values using SPE.
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: sysreg: Move SPE registers and PSB into common header files · a173c390
      Will Deacon authored
      SPE is part of the v8.2 architecture, so move its system register and
      field definitions into sysreg.h and the new PSB barrier into barrier.h.
      
      Finally, move KVM over to using the generic definitions so that it
      doesn't have to open-code its own versions.
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • perf/core: Add PERF_AUX_FLAG_COLLISION to report colliding samples · 085b3062
      Will Deacon authored
      The ARM SPE architecture permits an implementation to ignore a sample
      if the sample is due to be taken whilst another sample is already being
      produced. In this case, it is desirable to report the collision to
      userspace, as they may want to lower the sample period.
      
      This patch adds a PERF_AUX_FLAG_COLLISION flag, so that such events can
      be relayed to userspace.
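A userspace consumer might test for the flag along the following lines. This is only a sketch: the SK_* constants are local stand-ins whose values are believed to match include/uapi/linux/perf_event.h, so real code should use the header's PERF_AUX_FLAG_* definitions instead, and the helper name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins (assumption): prefer the PERF_AUX_FLAG_* header values. */
#define SK_PERF_AUX_FLAG_TRUNCATED 0x01u
#define SK_PERF_AUX_FLAG_OVERWRITE 0x02u
#define SK_PERF_AUX_FLAG_PARTIAL   0x04u
#define SK_PERF_AUX_FLAG_COLLISION 0x08u

/*
 * Count how many AUX records in a batch reported at least one colliding
 * sample. Each element is the 64-bit flags word of one AUX record.
 */
static size_t sk_count_collisions(const uint64_t *flags, size_t n)
{
	size_t i, hits = 0;

	for (i = 0; i < n; i++)
		if (flags[i] & SK_PERF_AUX_FLAG_COLLISION)
			hits++;

	return hits;
}
```

A tool could use such a count to decide whether to adjust the sample period; the adjustment policy itself is left to the consumer.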
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • perf/core: Export AUX buffer helpers to modules · bc1d2020
      Will Deacon authored
      Perf PMU drivers using AUX buffers cannot be built as modules unless
      the AUX helpers are exported.
      
      This patch exports perf_aux_output_{begin,end,skip} and perf_get_aux to
      modules.
      
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • genirq: export irq_get_percpu_devid_partition to modules · 5ffeb050
      Will Deacon authored
      Any modular driver using cluster-affine PPIs needs to be able to call
      irq_get_percpu_devid_partition so that it can enable the IRQ on the
      correct subset of CPUs.
      
      This patch exports the symbol so that it can be called from within a
      module.
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  7. 17 Oct, 2017 1 commit
  8. 16 Oct, 2017 8 commits
    • ACPI/IORT: Enable SMMUv3/PMCG IORT MSI domain set-up · 65637901
      Lorenzo Pieralisi authored
      ITS specific mappings for SMMUv3/PMCG components can be retrieved
      through special index mapping entries introduced in IORT revision C.
      
      Introduce a new API iort_set_device_domain() to set the MSI domain for
      SMMUv3/PMCG nodes (extendable to any future IORT node requiring special
      index ITS mapping entries) that represent MSI through special index
      mappings in order to enable MSI support for the devices their nodes
      represent.
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
    • ACPI/IORT: Add SMMUv3 specific special index mapping handling · 86456a3f
      Hanjun Guo authored
      IORT revision C introduced a mapping entry binding to describe ITS
      device ID mapping for SMMUv3 MSI interrupts.
      
      Enable the single mapping flag (i.e. the one used by the SMMUv3
      component for its special index mappings) for the SMMUv3 node in the
      IORT mapping API and add IORT code to handle the special index mapping
      entry for SMMUv3 IORT nodes to enable their MSI interrupts. In case the
      ACPICA for SMMUv3 device ID mapping is not ready, use the ACPICA version
      as a guard for function iort_get_id_mapping_index().
      Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
      [lorenzo.pieralisi@arm.com: patch split, typos fixing, rewrote the log]
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
    • ACPI/IORT: Enable special index ITS group mappings for IORT nodes · 8c8df8dc
      Hanjun Guo authored
      IORT revision C introduced SMMUv3 and PMCG MSI support by adding
      specific mapping entries in the SMMUv3/PMCG subtables to retrieve
      the device ID and the ITS group it maps to for a given SMMUv3/PMCG
      IORT node.
      
      Introduce a mapping function (i.e. iort_get_id_mapping_index()) that,
      for a given IORT node, looks up whether an ITS specific ID mapping entry
      exists and, if so, retrieves the corresponding mapping index in the IORT
      node mapping array.
      
      Since an ITS specific index mapping can be present for an IORT
      node that is not a leaf node (e.g. SMMUv3, to describe its own
      ITS device ID), special handling is required for two-step mapping
      cases such as PCI/NamedComponent--->SMMUv3--->ITS because the SMMUv3
      ITS specific index mapping entry should be skipped to prevent the
      IORT API from considering the mapping entry as a regular mapping one.
      
      If we take the following IORT topology example:
      
      |----------------------|
      |  Root Complex Node   |
      |----------------------|
      |    map entry[x]      |
      |----------------------|
      |       id value       |
      | output_reference     |
      |---|------------------|
          |
          |   |----------------------|
          |-->|        SMMUv3        |
              |----------------------|
              |     SMMUv3 dev ID    |
              |     mapping index 0  |
              |----------------------|
              |      map entry[0]    |
              |----------------------|
              |       id value       |
              | output_reference-----------> ITS 1 (SMMU MSI domain)
              |----------------------|
              |      map entry[1]    |
              |----------------------|
              |       id value       |
              | output_reference-----------> ITS 2 (PCI MSI domain)
              |----------------------|
      
      where mapping index 0 is the SMMUv3 ITS specific index mapping entry
      (describing the SMMUv3's own ITS device ID), we need to skip that
      mapping entry while carrying out the Root Complex Node regular mappings
      to prevent erroneous translations.
      
      Reuse the iort_get_id_mapping_index() function to detect the ITS
      specific mapping index for a specific IORT node and skip it in the IORT
      mapping API (i.e. iort_node_map_id()) loop to prevent considering it a
      normal PCI/Named Component ID mapping entry.
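The skip logic can be pictured with the simplified loop below. It is a sketch under stated assumptions rather than the kernel's iort_node_map_id(): the sk_* structures, the half-open ID range check and the integer output reference are all invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified ID mapping entry. */
struct sk_id_map {
	uint32_t input_base;	/* first input ID covered by the entry */
	uint32_t id_count;	/* number of IDs covered */
	uint32_t output_base;	/* first output ID */
	int output_ref;		/* stand-in for the referenced node (e.g. ITS 1) */
};

struct sk_node {
	const struct sk_id_map *maps;
	size_t n_maps;
	int its_index;		/* ITS specific entry index, -1 if none */
};

/*
 * Translate id_in through the node's regular mapping entries. The ITS
 * specific entry describes the node's own device ID, so it is skipped.
 * Returns the output reference, or -1 if no regular entry matches.
 */
static int sk_node_map_id(const struct sk_node *node, uint32_t id_in,
			  uint32_t *id_out)
{
	size_t i;

	for (i = 0; i < node->n_maps; i++) {
		const struct sk_id_map *m = &node->maps[i];

		if (node->its_index >= 0 && i == (size_t)node->its_index)
			continue;	/* not a regular mapping */

		if (id_in < m->input_base ||
		    id_in - m->input_base >= m->id_count)
			continue;

		*id_out = m->output_base + (id_in - m->input_base);
		return m->output_ref;
	}

	return -1;
}
```

With the topology from the diagram, skipping entry 0 sends a Root Complex ID to ITS 2; without the skip, the same ID could wrongly match the SMMUv3's own entry and resolve to ITS 1.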
      Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
      [lorenzo.pieralisi@arm.com: split patch/rewrote commit log]
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
    • ACPI/IORT: Look up IORT node through struct fwnode_handle pointer · 0a71d8b9
      Hanjun Guo authored
      Current IORT code provides a function (i.e. iort_get_fwnode())
      which looks up a struct fwnode_handle pointer through a
      struct acpi_iort_node pointer for SMMU components but it
      lacks a function that implements the reverse look-up, namely
      struct fwnode_handle* -> struct acpi_iort_node*.
      
      Devices that are not IORT named components cannot be retrieved through
      their associated IORT named component scan interface because they just
      are not represented in the ACPI namespace; the reverse look-up is
      therefore required for all platform devices that represent IORT nodes
      (e.g. SMMUs) so that the struct acpi_iort_node* can be retrieved from the
      struct device->fwnode pointer.
      Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
      [lorenzo.pieralisi@arm.com: re-indented/rewrote the commit log]
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
    • ACPI/IORT: Make platform devices initialization code SMMU agnostic · 896dd2c3
      Lorenzo Pieralisi authored
      The way the current IORT code initializes platform devices for SMMU
      nodes is tied, mostly through naming conventions, to the SMMU nodes
      themselves. It need not be: the logic is completely generic and can
      easily be made so by renaming structures and reshuffling code.
      
      Rework IORT platform devices initialization code to make the functions
      and data structures SMMU agnostic.
      
      No functional changes intended.
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Acked-by: Hanjun Guo <hanjun.guo@linaro.org>
      Cc: Hanjun Guo <hanjun.guo@linaro.org>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
    • ACPI/IORT: Improve functions return type/storage class specifier indentation · e3d49392
      Lorenzo Pieralisi authored
      Some function definitions use a style that is frowned upon, with the
      return value type/storage class specifier on a separate line.
      
      Reindent the function definitions to fix them.
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Acked-by: Hanjun Guo <hanjun.guo@linaro.org>
      Cc: Hanjun Guo <hanjun.guo@linaro.org>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
    • ACPI/IORT: Remove leftover ACPI_IORT_SMMU_V3_PXM_VALID guard · 75808131
      Lorenzo Pieralisi authored
      The conditional ACPI_IORT_SMMU_V3_PXM_VALID guard around
      arm_smmu_v3_set_proximity() was added to manage a cross tree
      ACPICA merge dependency; with ACPICA changes merged in:
      
      commit c9442300 ("ACPICA: iasl: Update to IORT SMMUv3
      disassembling")
      
      the guard has become useless. Remove it.
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Acked-by: Hanjun Guo <hanjun.guo@linaro.org>
      Cc: Hanjun Guo <hanjun.guo@linaro.org>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
    • acpi/arm64: pr_err() strings should end with newlines · ee10b9c9
      Arvind Yadav authored
      pr_err() messages should be terminated with a newline to avoid
      other messages being concatenated onto the end.
      Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Acked-by: Hanjun Guo <hanjun.guo@linaro.org>
  9. 13 Oct, 2017 2 commits
    • arm64: use WFE for long delays · 7b77452e
      Julien Thierry authored
      The current delay implementation uses the yield instruction, which is a
      hint that it is beneficial to schedule another thread. As this is a hint,
      it may be implemented as a NOP, causing all delays to be busy loops. This
      is the case for many existing CPUs.
      
      Taking advantage of the generic timer sending periodic events to all
      cores, we can use WFE during delays to reduce power consumption. This is
      beneficial only for delays longer than the period of the timer event
      stream.
      
      If the timer event stream is not enabled, delays will behave as yield/busy
      loops.
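The two-phase delay can be modelled in plain C as shown below. This is a simulation sketch, not the arm64 implementation: the counter is a local variable, each WFE wait is assumed to last exactly one event stream period (the worst case), each spin iteration is assumed to take one cycle, and every name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

struct sk_delay_trace {
	uint64_t wfe_waits;	/* times the CPU would have waited in WFE */
	uint64_t spin_iters;	/* iterations of the yield/busy loop */
};

static struct sk_delay_trace sk_delay(uint64_t cycles, uint64_t evt_period,
				      bool evtstrm_available)
{
	struct sk_delay_trace t = { 0, 0 };
	uint64_t now = 0;	/* simulated cycle counter */

	if (evtstrm_available && evt_period > 0) {
		/*
		 * Wait in WFE only while a full event period still fits in
		 * the remaining time, so the delay cannot overshoot.
		 */
		while (now + evt_period < cycles) {
			now += evt_period;
			t.wfe_waits++;
		}
	}

	/* Finish the remainder, or the whole delay, by spinning. */
	while (now < cycles) {
		now++;
		t.spin_iters++;
	}

	return t;
}
```

For a delay shorter than the event stream period the first loop never runs, which matches the observation that WFE only helps for delays longer than that period.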
      Signed-off-by: Julien Thierry <julien.thierry@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm_arch_timer: Expose event stream status · ec5c8e42
      Julien Thierry authored
      The arch timer configuration for a CPU might get reset after suspending
      said CPU.
      
      In order to reliably use the event stream in the kernel (e.g. for delays),
      we keep track of the state where we can safely consider the event stream as
      properly configured. After writing to cntkctl, we issue an ISB to ensure
      that subsequent delay loops can rely on the event stream being enabled.
      Signed-off-by: Julien Thierry <julien.thierry@arm.com>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  10. 11 Oct, 2017 2 commits
    • arm64: docs: describe ELF hwcaps · 611a7bc7
      Mark Rutland authored
      We don't document our ELF hwcaps, leaving developers to interpret them
      according to hearsay, guesswork, or (in exceptional cases) inspection of
      the current kernel code.
      
      This is less than optimal, and it would be far better if we had some
      definitive description of each of the ELF hwcaps that developers could
      refer to.
      
      This patch adds a document describing the (native) arm64 ELF hwcaps.
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Martin <Dave.Martin@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      [ Updated new hwcap entries in the document ]
      Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: Expose support for optional ARMv8-A features · f5e035f8
      Suzuki K Poulose authored
      ARMv8-A adds a few optional features for ARMv8.2 and ARMv8.3.
      Expose them to the userspace via HWCAPs and mrs emulation.
      
      SHA2-512  - Instruction support for SHA512 Hash algorithm (e.g. SHA512H,
      	    SHA512H2, SHA512U0, SHA512SU1)
      SHA3 	  - SHA3 crypto instructions (EOR3, RAX1, XAR, BCAX).
      SM3	  - Instruction support for Chinese cryptography algorithm SM3
      SM4 	  - Instruction support for Chinese cryptography algorithm SM4
      DP	  - Dot Product instructions (UDOT, SDOT).
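Userspace could test for these features by checking bits in the hwcap mask, roughly as sketched here. The bit positions and lowercase names in the table are assumptions believed to match the arm64 uapi hwcap header and /proc/cpuinfo, so they should be verified against asm/hwcap.h; the sk_* identifiers are hypothetical.

```c
#include <stddef.h>
#include <string.h>

struct sk_hwcap_entry {
	const char *name;
	unsigned long bit;
};

/* Assumed bit assignments; check them against the kernel's asm/hwcap.h. */
static const struct sk_hwcap_entry sk_hwcaps[] = {
	{ "sha3",    1UL << 17 },
	{ "sm3",     1UL << 18 },
	{ "sm4",     1UL << 19 },
	{ "asimddp", 1UL << 20 },
	{ "sha512",  1UL << 21 },
};

/* Return 1 if the named feature is set in hwcap, 0 if clear or unknown. */
static int sk_hwcap_has(unsigned long hwcap, const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(sk_hwcaps) / sizeof(sk_hwcaps[0]); i++)
		if (strcmp(sk_hwcaps[i].name, name) == 0)
			return (hwcap & sk_hwcaps[i].bit) != 0;

	return 0;
}
```

On an arm64 Linux system the mask would typically come from getauxval(AT_HWCAP), declared in <sys/auxv.h>; the helper takes the mask as a parameter so that it can be exercised on any host.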
      
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Dave Martin <dave.martin@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>