1. 09 May, 2017 6 commits
    • arm64: ensure extension of smp_store_release value · 994870be
      Mark Rutland authored
      When an inline assembly operand's type is narrower than the register it
      is allocated to, the least significant bits of the register (up to the
      operand type's width) are valid, and any other bits are permitted to
      contain any arbitrary value. This aligns with the AAPCS64 parameter
      passing rules.
      
      Our __smp_store_release() implementation does not account for this, and
      implicitly assumes that operands have been zero-extended to the width of
      the type being stored to. Thus, we may store unknown values to memory
      when the value type is narrower than the pointer type (e.g. when storing
      a char to a long).
      
      This patch fixes the issue by casting the value operand to the same
      width as the pointer operand in all cases, which ensures that the value
      is zero-extended as we expect. We use the same union trickery as
      __smp_load_acquire and {READ,WRITE}_ONCE() to avoid GCC complaining that
      pointers are potentially cast to narrower width integers in unreachable
      paths.
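      
      As a minimal sketch of the fix (illustrative, and simplified to the
      64-bit case rather than the kernel's full switch on sizeof(*p);
      here 'p' is assumed to point to a 64-bit quantity):
      
      #include <stdint.h>
      
      /*
       * The cast to typeof(*p) zero-extends a narrower v at the C level
       * before the register is handed to the asm; the union mirrors the
       * {READ,WRITE}_ONCE() trick so the compiler does not complain
       * about casts between widths.
       */
      #define store_release_64(p, v)					\
      do {								\
      	union { __typeof__(*(p)) __val; char __c[sizeof(*(p))]; } __u	\
      		= { .__val = (__typeof__(*(p)))(v) };			\
      	asm volatile("stlr %1, %0"					\
      		     : "=Q" (*(p))					\
      		     : "r" (*(uint64_t *)__u.__c)			\
      		     : "memory");					\
      } while (0)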
      
      A whitespace issue at the top of __smp_store_release() is also
      corrected.
      
      No changes are necessary for __smp_load_acquire(). Load instructions
      implicitly clear any upper bits of the register, and the compiler will
      only consider the least significant bits of the register as valid
      regardless.
      
      Fixes: 47933ad4 ("arch: Introduce smp_load_acquire(), smp_store_release()")
      Fixes: 878a84d5 ("arm64: add missing data types in smp_load_acquire/smp_store_release")
      Cc: <stable@vger.kernel.org> # 3.14.x-
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Matthias Kaehlcke <mka@chromium.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: xchg: hazard against entire exchange variable · fee960be
      Mark Rutland authored
      The inline assembly in __XCHG_CASE() uses a +Q constraint to hazard
      against other accesses to the memory location being exchanged. However,
      the pointer passed to the constraint is a u8 pointer, and thus the
      hazard only applies to the first byte of the location.
      
      GCC can take advantage of this, assuming that other portions of the
      location are unchanged, as demonstrated with the following test case:
      
      union u {
      	unsigned long l;
      	unsigned int i[2];
      };
      
      unsigned long update_char_hazard(union u *u)
      {
      	unsigned int a, b;
      
      	a = u->i[1];
      	asm ("str %1, %0" : "+Q" (*(char *)&u->l) : "r" (0UL));
      	b = u->i[1];
      
      	return a ^ b;
      }
      
      unsigned long update_long_hazard(union u *u)
      {
      	unsigned int a, b;
      
      	a = u->i[1];
      	asm ("str %1, %0" : "+Q" (*(long *)&u->l) : "r" (0UL));
      	b = u->i[1];
      
      	return a ^ b;
      }
      
      The Linaro 15.08 GCC 5.1.1 toolchain compiles the above as follows when
      using -O2 or above:
      
      0000000000000000 <update_char_hazard>:
         0:	d2800001 	mov	x1, #0x0                   	// #0
         4:	f9000001 	str	x1, [x0]
         8:	d2800000 	mov	x0, #0x0                   	// #0
         c:	d65f03c0 	ret
      
      0000000000000010 <update_long_hazard>:
        10:	b9400401 	ldr	w1, [x0,#4]
        14:	d2800002 	mov	x2, #0x0                   	// #0
        18:	f9000002 	str	x2, [x0]
        1c:	b9400400 	ldr	w0, [x0,#4]
        20:	4a000020 	eor	w0, w1, w0
        24:	d65f03c0 	ret
      
      This patch fixes the issue by passing an unsigned long pointer into the
      +Q constraint, as we do for our cmpxchg code. This may hazard against
      more than is necessary, but this is better than missing a necessary
      hazard.
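      
      As a standalone sketch (not the kernel's __XCHG_CASE macro), the
      one-byte case with the widened hazard looks like:
      
      #include <stdint.h>
      
      static inline uint8_t xchg_byte(volatile uint8_t *ptr, uint8_t val)
      {
      	uint8_t old;
      	uint32_t fail;
      
      	asm volatile(
      	"1:	ldxrb	%w0, %2\n"
      	"	stxrb	%w1, %w3, %2\n"
      	"	cbnz	%w1, 1b"
      	: "=&r" (old), "=&r" (fail),
      	  "+Q" (*(volatile unsigned long *)ptr)	/* full-width hazard */
      	: "r" (val)
      	: "memory");
      
      	return old;
      }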
      
      Fixes: 305d454a ("arm64: atomics: implement native {relaxed, acquire, release} atomics")
      Cc: <stable@vger.kernel.org> # 4.4.x-
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: documentation: document tagged pointer stack constraints · f0e421b1
      Kristina Martsenko authored
      Some kernel features don't currently work if a task puts a non-zero
      address tag in its stack pointer, frame pointer, or frame record entries
      (FP, LR).
      
      For example, with a tagged stack pointer, the kernel can't deliver
      signals to the process, and the task is killed instead. As another
      example, with a tagged frame pointer or frame records, perf fails to
      generate call graphs or resolve symbols.
      
      For now, just document these limitations instead of finding and
      fixing everything that doesn't work, as it's not known whether
      anyone needs to use tags in these places anyway.
      
      In addition, as requested by Dave Martin, generalize the limitations
      into a general kernel address tag policy, and refactor
      tagged-pointers.txt to include it.
      
      Fixes: d50240a5 ("arm64: mm: permit use of tagged pointers at EL0")
      Cc: <stable@vger.kernel.org> # 3.12.x-
      Reviewed-by: Dave Martin <Dave.Martin@arm.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: entry: improve data abort handling of tagged pointers · 276e9327
      Kristina Martsenko authored
      When handling a data abort from EL0, we currently zero the top byte of
      the faulting address, as we assume the address is a TTBR0 address, which
      may contain a non-zero address tag. However, the address may be a TTBR1
      address, in which case we should not zero the top byte. This patch fixes
      that. The effect is that the full TTBR1 address is passed to the task's
      signal handler (or printed out in the kernel log).
      
      When handling a data abort from EL1, we leave the faulting address
      intact, as we assume it's either a TTBR1 address or a TTBR0 address with
      tag 0x00. This is true as far as I'm aware; we don't seem to access a
      tagged TTBR0 address anywhere in the kernel. Regardless, it's easy to
      forget about address tags, and code added in the future may not always
      remember to remove tags from addresses before accessing them. So add tag
      handling to the EL1 data abort handler as well. This also makes it
      consistent with the EL0 data abort handler.
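      
      A C rendering of the idea (the actual change is in assembly in
      entry.S; this sketch assumes that bit 55 distinguishes the TTBR0
      and TTBR1 halves of the address space):
      
      #include <stdint.h>
      
      static inline uint64_t clear_address_tag(uint64_t addr)
      {
      	/* Bit 55 set means a TTBR1 (kernel) address: keep it intact. */
      	if (addr & (1UL << 55))
      		return addr;
      
      	/* TTBR0 (user) address: drop the top-byte tag. */
      	return addr & ~(0xffUL << 56);
      }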
      
      Fixes: d50240a5 ("arm64: mm: permit use of tagged pointers at EL0")
      Cc: <stable@vger.kernel.org> # 3.12.x-
      Reviewed-by: Dave Martin <Dave.Martin@arm.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: hw_breakpoint: fix watchpoint matching for tagged pointers · 7dcd9dd8
      Kristina Martsenko authored
      When we take a watchpoint exception, the address that triggered the
      watchpoint is found in FAR_EL1. We compare it to the address of each
      configured watchpoint to see which one was hit.
      
      The configured watchpoint addresses are untagged, while the address in
      FAR_EL1 will have an address tag if the data access was done using a
      tagged address. The tag needs to be removed to compare the address to
      the watchpoints.
      
      Currently we don't remove it, and as a result can report the wrong
      watchpoint as being hit (specifically, always either the highest TTBR0
      watchpoint or lowest TTBR1 watchpoint). This patch removes the tag.
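      
      One way to strip the tag while keeping both halves of the address
      space comparable is to sign-extend from bit 55 (a sketch; the
      helper name is illustrative):
      
      #include <stdint.h>
      
      /* Replicate bit 55 into bits 63:56: TTBR0 addresses end up with a
       * zero top byte, TTBR1 addresses with 0xff, so both compare equal
       * to the untagged watchpoint addresses. */
      static inline uint64_t untagged_addr(uint64_t addr)
      {
      	return (uint64_t)((int64_t)(addr << 8) >> 8);
      }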
      
      Fixes: d50240a5 ("arm64: mm: permit use of tagged pointers at EL0")
      Cc: <stable@vger.kernel.org> # 3.12.x-
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: traps: fix userspace cache maintenance emulation on a tagged pointer · 81cddd65
      Kristina Martsenko authored
      When we emulate userspace cache maintenance in the kernel, we can
      currently send the task a SIGSEGV even though the maintenance was done
      on a valid address. This happens if the address has a non-zero address
      tag, and happens to not be mapped in.
      
      When we get the address from a user register, we don't currently remove
      the address tag before performing cache maintenance on it. If the
      maintenance faults, we end up either in __do_page_fault, where find_vma
      can't find the VMA if the address has a tag, or in do_translation_fault,
      where the tagged address will appear to be above TASK_SIZE. In both
      cases, the address is not mapped in, and the task is sent a SIGSEGV.
      
      This patch removes the tag from the address before using it. With this
      patch, the fault is handled correctly, the address gets mapped in, and
      the cache maintenance succeeds.
      
      As a second bug, if cache maintenance (correctly) fails on an invalid
      tagged address, the address gets passed into arm64_notify_segfault,
      where find_vma fails to find the VMA due to the tag, and the wrong
      si_code may be sent as part of the siginfo_t of the segfault. With this
      patch, the correct si_code is sent.
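      
      For illustration, a userspace sketch of the first scenario (assumes
      the arm64 tagged-pointer ABI and a core where "dc cvau" traps and
      is emulated by the kernel; the tag value is arbitrary):
      
      #include <stdlib.h>
      
      int main(void)
      {
      	char *p = malloc(4096);	/* valid, but not yet faulted in */
      	char *tagged = (char *)((unsigned long)p | (0x5bUL << 56));
      
      	/* Before the fix this could SIGSEGV; with it, the fault is
      	 * handled, the page is mapped in, and maintenance succeeds. */
      	asm volatile("dc cvau, %0" : : "r" (tagged) : "memory");
      
      	free(p);
      	return 0;
      }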
      
      Fixes: 7dd01aef ("arm64: trap userspace "dc cvau" cache operation on errata-affected core")
      Cc: <stable@vger.kernel.org> # 4.8.x-
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  2. 05 May, 2017 1 commit
  3. 28 Apr, 2017 2 commits
  4. 26 Apr, 2017 1 commit
    • arm64: module: split core and init PLT sections · 24af6c4e
      Ard Biesheuvel authored
      The arm64 module PLT code allocates all PLT entries in a single core
      section, since the overhead of having a separate init PLT section is
      not justified by the small number of PLT entries usually required for
      init code.
      
      However, the core and init module regions are allocated independently,
      and there is a corner case where the core region may be allocated from
      the VMALLOC region if the dedicated module region is exhausted, but the
      init region, being much smaller, can still be allocated from the module
      region. This leads to relocation failures if the distance between those
      regions exceeds 128 MB. (In fact, this corner case is highly unlikely to
      occur on arm64, but the issue has been observed on ARM, whose module
      region is much smaller).
      
      So split the core and init PLT regions, and name the latter ".init.plt"
      so it gets allocated along with (and sufficiently close to) the .init
      sections that it serves. Also, given that init PLT entries may need to
      be emitted for branches that target the core module, modify the logic
      that disregards defined symbols to only disregard symbols that are
      defined in the same section as the relocated branch instruction.
      
      Since there may now be two PLT entries associated with each entry in
      the symbol table, we can no longer hijack the symbol::st_size fields
      to record the addresses of PLT entries as we emit them for zero-addend
      relocations. So instead, perform an explicit comparison to check for
      duplicate entries.
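      
      A sketch of that duplicate check (simplified; the struct layout
      reflects arm64's movn/movk/movk/br PLT sequence, but the names are
      illustrative):
      
      #include <stddef.h>
      #include <stdint.h>
      
      struct plt_entry {
      	uint32_t mov0;	/* movn */
      	uint32_t mov1;	/* movk */
      	uint32_t mov2;	/* movk */
      	uint32_t br;	/* br x16 */
      };
      
      static struct plt_entry *find_plt_entry(struct plt_entry *plt, int count,
      					const struct plt_entry *entry)
      {
      	int i;
      
      	/* Two entries encoding the same target have identical
      	 * immediate-carrying instructions. */
      	for (i = 0; i < count; i++)
      		if (plt[i].mov0 == entry->mov0 &&
      		    plt[i].mov1 == entry->mov1 &&
      		    plt[i].mov2 == entry->mov2)
      			return &plt[i];
      
      	return NULL;
      }
      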
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  5. 25 Apr, 2017 1 commit
  6. 24 Apr, 2017 1 commit
  7. 12 Apr, 2017 2 commits
    • Merge branch 'will/for-next/perf' into for-next/core · 494bc3cd
      Catalin Marinas authored
      * will/for-next/perf:
        arm64: pmuv3: use arm_pmu ACPI framework
        arm64: pmuv3: handle !PMUv3 when probing
        drivers/perf: arm_pmu: add ACPI framework
        arm64: add function to get a cpu's MADT GICC table
        drivers/perf: arm_pmu: split out platform device probe logic
        drivers/perf: arm_pmu: move irq request/free into probe
        drivers/perf: arm_pmu: split cpu-local irq request/free
        drivers/perf: arm_pmu: rename irq request/free functions
        drivers/perf: arm_pmu: handle no platform_device
        drivers/perf: arm_pmu: simplify cpu_pmu_request_irqs()
        drivers/perf: arm_pmu: factor out pmu registration
        drivers/perf: arm_pmu: fold init into alloc
        drivers/perf: arm_pmu: define armpmu_init_fn
        drivers/perf: arm_pmu: remove pointless PMU disabling
        perf: qcom: Add L3 cache PMU driver
        drivers/perf: arm_pmu: split irq request from enable
        drivers/perf: arm_pmu: manage interrupts per-cpu
        drivers/perf: arm_pmu: rework per-cpu allocation
        MAINTAINERS: Add file patterns for perf device tree bindings
      494bc3cd
    • arm64: Silence spurious kbuild warning on menuconfig · d91750f1
      Marc Zyngier authored
      Since bbb56c27 ("arm64: Add detection code for broken .inst support
      in binutils"), running any make target that doesn't involve the cross
      compiler results in a spurious warning:
      
      $ make ARCH=arm64 menuconfig
      arch/arm64/Makefile:43: Detected assembler with broken .inst; disassembly will be unreliable
      
      while
      
      $ make ARCH=arm64 CROSS_COMPILE=aarch64-arm-linux- menuconfig
      
      is silent (assuming your compiler is not affected). That's because
      the code that tests for the workaround always runs, irrespective
      of whether the current configuration is available.
      
      An easy fix is to make the detection conditional on CONFIG_ARM64
      being defined, which is only the case when actually building
      something.
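      
      In Makefile terms the guard looks something like this (a standalone
      sketch, not the actual arch/arm64/Makefile fragment):
      
      # With no .config included, CONFIG_ARM64 is undefined and the
      # detection (and its warning) is skipped entirely.
      ifeq ($(CONFIG_ARM64),y)
      $(warning would run the broken-.inst assembler test here)
      endif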
      
      Fixes: bbb56c27 ("arm64: Add detection code for broken .inst support in binutils")
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  8. 11 Apr, 2017 14 commits
  9. 07 Apr, 2017 6 commits
  10. 06 Apr, 2017 1 commit
    • arm64: print a fault message when attempting to write RO memory · b824b930
      Stephen Boyd authored
      If a page is marked read-only, we should print out that fact
      instead of printing out that there was a page fault. Right now we
      get a cryptic error message that something went wrong with an
      unhandled fault, but we don't evaluate the ESR to figure out that
      it was a read/write permission fault.
      
      Instead of seeing:
      
        Unable to handle kernel paging request at virtual address ffff000008e460d8
        pgd = ffff800003504000
        [ffff000008e460d8] *pgd=0000000083473003, *pud=0000000083503003, *pmd=0000000000000000
        Internal error: Oops: 9600004f [#1] PREEMPT SMP
      
      we'll see:
      
        Unable to handle kernel write to read-only memory at virtual address ffff000008e760d8
        pgd = ffff80003d3de000
        [ffff000008e760d8] *pgd=0000000083472003, *pud=0000000083435003, *pmd=0000000000000000
        Internal error: Oops: 9600004f [#1] PREEMPT SMP
      
      We also add a userspace address check into is_permission_fault()
      so that the function doesn't return true for TTBR0 PAN faults
      when it shouldn't.
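      
      A simplified sketch of the decode (not the kernel's helper; the WnR
      bit and fault status code field follow the ARMv8 ARM, and the
      kernel-address bound is illustrative):
      
      #include <stdbool.h>
      #include <stdint.h>
      
      #define ESR_WNR	(1U << 6)	/* write, not read */
      #define ESR_FSC	0x3fU		/* fault status code field */
      
      static bool is_write_to_read_only(uint32_t esr, uint64_t addr)
      {
      	/* FSC 0x0c..0x0f is the permission-fault range. */
      	bool perm = (esr & ESR_FSC) >= 0x0c && (esr & ESR_FSC) <= 0x0f;
      
      	/* Only consider kernel (TTBR1) addresses: user addresses can
      	 * permission-fault under TTBR0 PAN without the page being RO. */
      	return perm && (esr & ESR_WNR) && addr >= 0xffff000000000000UL;
      }
      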
      Reviewed-by: James Morse <james.morse@arm.com>
      Tested-by: James Morse <james.morse@arm.com>
      Acked-by: Laura Abbott <labbott@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Stephen Boyd <stephen.boyd@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  11. 05 Apr, 2017 5 commits