Commits · c848791f0336914a3081ea3fe029cf177d81de81 · Kirill Smelkov / linux

14 Apr, 2015 12 commits

Merge branches 'misc', 'vdso' and 'fixes' into for-next · c848791f
Russell King authored Apr 14, 2015
```
Conflicts:
	arch/arm/mm/proc-macros.S
```
c848791f

ARM: update errata 430973 documentation to cover Cortex A8 r1p* · 79403cda

Russell King authored Apr 13, 2015

This errata covers all r1 variants of Cortex A8, it's not limited to
just r1p0..r1p2. Update the documentation to reflect this. The code
already applies the workaround to all r1p* A8 CPUs.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

79403cda

ARM: ensure delay timer has sufficient accuracy for delays · 57ca654b

Russell King authored Apr 13, 2015

We have recently had an example of someone wanting to use a 90kHz timer
for the software delay loop.

udelay() needs to have at least microsecond resolution to allow drivers
access to a delay mechanism with a reasonable chance of delaying the
period they requested within at least a 50% marging of error, especially
for small delays.

Discussion about the udelay() accuracy can be found at:
	https://lkml.org/lkml/2011/1/9/37

Reject timers which are unable to supply this level of resolution.
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

57ca654b

ARM: switch to use the generic show_mem() implementation · 37463be8

Russell King authored Mar 26, 2015

Switch ARM to use the generic show_mem() implementation, which displays
the statistics from the mm zone rather than walking the page arrays.
Acked-by: Mel Gorman &lt;mgorman <mgorman@suse.de>
Tested-by: Gregory Fong <gregory.0xf0@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

37463be8

ARM: proc-v7: avoid errata 430973 workaround for non-Cortex A8 CPUs · a6d74678

Russell King authored Apr 07, 2015

Avoid the errata 430973 workaround for non-Cortex A8 CPUs.  Having this
workaround enabled introduces an additional branch target buffer flush
into the context switching path, something we wish to avoid.  To allow
this errata to be enabled in multiplatform kernels while reducing its
impact, rearrange the Cortex-A8 CPU support to avoid impacting on other
Version 7 CPUs.
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

a6d74678

ARM: enable ARM errata 643719 workaround by default · e5a5de44

Russell King authored Apr 02, 2015

The effects of not having ARM errata 643719 enabled on affected CPUs
can be very confusing and hard to debug. Rather than leave this to
chance, enable this workaround by default. Now that we have rearranged
the code, it should have a low impact on the majority of CPUs.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

e5a5de44

ARM: cache-v7: optimise test for Cortex A9 r0pX devices · aaf4b5d9

Russell King authored Apr 03, 2015

Eliminate one unnecessary instruction from this test by pre-shifting
the Cortex A9 ID - we can shift the actual ID in the teq instruction
thereby losing the pX bit of the ID at no cost.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

aaf4b5d9

ARM: cache-v7: optimise branches in v7_flush_cache_louis · d3cd451d

Russell King authored Apr 03, 2015

Optimise the branches such that for the majority of unaffected devices,
we avoid needing to execute the errata work-around code path by
branching to start_flush_levels early.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

d3cd451d

ARM: cache-v7: consolidate initialisation of cache level index · cd8b24d9

Russell King authored Apr 03, 2015

Both v7_flush_cache_louis and v7_flush_dcache_all both begin the
flush_levels loop with r10 initialised to zero. In each case, this
is done immediately prior to entering the loop. Branch to this
instruction in v7_flush_dcache_all from v7_flush_cache_louis and
eliminate the unnecessary initialisation in v7_flush_cache_louis.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

cd8b24d9

ARM: cache-v7: shift CLIDR to extract appropriate field before masking · 47b8484e

Russell King authored Apr 03, 2015

Rather than have code which masks and then shifts, such as:

	mrc     p15, 1, r0, c0, c0, 1
ALT_SMP(ands	r3, r0, #7 << 21)
ALT_UP( ands	r3, r0, #7 << 27)
ALT_SMP(mov	r3, r3, lsr #20)
ALT_UP(	mov	r3, r3, lsr #26)

re-arrange this as a shift and then mask.  The masking is the same for
each field which we want to extract, so this allows the mask to be
shared amongst code paths:

	mrc     p15, 1, r0, c0, c0, 1
ALT_SMP(mov	r3, r0, lsr #20)
ALT_UP(	mov	r3, r0, lsr #26)
	ands	r3, r3, #7 << 1

Use this method for the LoUIS, LoUU and LoC fields.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

47b8484e

ARM: cache-v7: use movw/movt instructions · 5aca3708

Russell King authored Apr 03, 2015

We always build cache-v7.S for ARMv7, so we can use the ARMv7 16-bit
move instructions to load large constants, rather than using constants
in a literal pool.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

5aca3708

ARM: allow 16-bit instructions in ALT_UP() · 89c6bc58

Russell King authored Apr 09, 2015

Allow ALT_UP() to cope with a 16-bit Thumb instruction by automatically
inserting a following nop instruction. This allows us to care less
about getting the assembler to emit a 32-bit thumb instruction.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

89c6bc58

10 Apr, 2015 1 commit

ARM: proc-arm94*.S: fix setup function · 6c5c2a01

Russell King authored Apr 04, 2015

Both ARM946 and ARM940 setup functions were corrupting r1 and r2,
which is not permissible - these are used to carry the machine ID
and boot data into the kernel, and must be preserved.

The code responsible for this was the same in both files: they were
using the registers to generate a protection region register value.

Fix this by turning this process into a macro, and using that macro
in both these files with an alternative register allocation.  r0,
r3 and r7 can be used for temporary values here.
Reported-by: Alex Dumitrache <broscutamaker@gmail.com>
Tested-by: Georg Hofstetter <g3gg0.de@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

6c5c2a01

07 Apr, 2015 1 commit

ARM: vexpress: fix CPU hotplug with CT9x4 tile. · f6ac49ba

Russell King authored Apr 02, 2015

The Cortex A9 tile fails to unplug CPUs if errata 643719 is not enabled.
This leads to random weird behaviours, but ultimately seem to lock the
kernel one way or another when a CPU is hot unplugged.

Symptoms range from a spinlock lockup in the scheduler, the entire
system hanging, to dumping out the kernel printk buffer a few lines at
a time, and other weird behaviours.

This is caused by the outgoing CPU not having its inner caches properly
flushed before it exits coherency - flush_cache_louis() is used to
achieve this, but as a result of the hardware bug, this function ends
up doing nothing without the errata workaround enabled.

As the Versatile Express has an affected CPU, this errata must always
be enabled.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

f6ac49ba

02 Apr, 2015 9 commits

ARM: 8276/1: Make CPU_DCACHE_DISABLE depend on !SMP · e1e2f6e4

Florian Fainelli authored Jan 09, 2015

Enabling CPU_DCACHE_DISABLE on a SMP capable system will prevent the
kernel from booting because of the following ldrex instruction in
arch_spin_lock:

(gdb) x/10i $pc
=> 0xc053cfa8 <_raw_spin_lock+4>:       ldrex   r3, [r0]
   0xc053cfac <_raw_spin_lock+8>:       add     r2, r3, #65536  ; 0x10000

which is taken by the very first printk call:

    at /home/fainelli/work/linux/arch/arm/include/asm/spinlock.h:65
    fmt=0xc0637650 " 01 66Booting Linux on physical CPU 0x%xn", args=<incomplete type>)
    at kernel/printk/printk.c:1525
    fmt=0xc05370f4 <printk+52> " 24320215342 04340235344 20320215342 36377/341 17") at kernel/printk/printk.c:1688

ldrex requires exclusive monitor(s) (local or global) which are no longer
working when the Data cache is disabled in CP15 and will just hang the CPU
there.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

e1e2f6e4

ARM: 8335/1: Documentation: DT bindings: Tegra AHB: document the legacy base address · 38e42f12

Paul Walmsley authored Mar 26, 2015

Documentation: DT bindings: Tegra AHB: require the legacy base address for existing chips

Per Stephen Warren, note in the Tegra AHB DT binding documentation
that we specifically deprecate any attempt to use the IP block's
actual hardware base address, and advocate the use of the legacy
"off-by-four" address in the 'regs' property, for Tegra chips with
existing upstream Linux DT files that include a Tegra AHB node.  This
patch updates the documentation accordingly.

Changing the existing kernel DT data isn't under consideration because
Linux kernel DT data policy is to preserve compatibility between newer
DT data files and older kernels.  However, this additional step of
changing the documentation should discourage others from sending
kernel patches to try to change the legacy kernel DT data.
Furthermore, for out-of-tree software (such as bootloaders or other
operating systems) that may rely on Linux kernel DT binding
documentation as an ABI (but not the Linux kernel DT data itself),
such a change may allow future convergence with the Linux kernel DT
data without additional code changes.
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Cc: Paul Walmsley <pwalmsley@nvidia.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Alexandre Courbot <gnurou@gmail.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: Ian Campbell <ijc+devicetree@hellion.org.uk>
Cc: Kumar Gala <galak@codeaurora.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: devicetree@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

38e42f12

ARM: 8334/1: amba: tegra-ahb: detect and correct bogus base address · ce7a10b0

Paul Walmsley authored Mar 26, 2015

amba: tegra-ahb: detect and correct bogus base address

From a hardware SoC integration point of view, the starting address of
this IP block in the existing Tegra SoC DT files is off by 4 bytes
from the actual base address.  Since we attempt to make old DT files
forward-compatible with newer kernels, we cannot fix the IP block base
address in old DT data. This patch works around the problem by
detecting the four byte base address offset in the driver code, and
correcting it if it's detected.  (In general, IP block base addresses
almost always have a null low byte.)

Future SoC DT data for Tegra AHB should use the correct Tegra AHB base
address, in cases where there is no DT data backward compatibility
requirement.

This patch is a revision of the patch originally titled
"amba: tegra-ahb: use correct base address for future chip support".
This revision implements changes requested by Russell King:

http://marc.info/?l=linux-tegra&m=142658851825062&w=2
http://marc.info/?l=linux-tegra&m=142658873925178&w=2Signed-off-by: Paul Walmsley <paul@pwsan.com>
Cc: Paul Walmsley <pwalmsley@nvidia.com>
Cc: Alexandre Courbot <gnurou@gmail.com>
Cc: Hiroshi DOYU <hdoyu@nvidia.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: linux-kernel@vger.kernel.org
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

ce7a10b0

ARM: 8333/1: amba: tegra-ahb: fix register offsets in the macros · 049e4b3f

Paul Walmsley authored Mar 26, 2015

amba: tegra-ahb: fix register offsets in the macros

From a hardware SoC integration point of view, the offsets of the
Tegra AHB registers that are currently defined in tegra-ahb.c macros
are all off by four bytes.  Similarly, the starting address of this IP
block in our existing DT files is also off by four bytes.  Since we
attempt to make old DT files forward-compatible with newer kernels, we
cannot fix the IP block base address in old DT data.  However, we can
fix the offsets in the driver so that they are correct with respect to
the hardware, which is what this patch does.  And a subsequent patch
will allow the offset to be removed for DT 'compatible' strings used
in future DT files for newer Tegra chips that the kernel does not yet
support.
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Cc: Paul Walmsley <pwalmsley@nvidia.com>
Cc: Alexandre Courbot <gnurou@gmail.com>
Cc: Hiroshi DOYU <hdoyu@nvidia.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: linux-kernel@vger.kernel.org
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

049e4b3f

ARM: 8339/1: Enable CONFIG_GENERIC_IRQ_SHOW_LEVEL · 7c07005e

Geert Uytterhoeven authored Apr 01, 2015

Several interrupt controllers support both edge and level interrupts, so
it's useful to provide that information in /proc/interrupts.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

7c07005e

ARM: 8338/1: kexec: Relax SMP validation to improve DT compatibility · fee3fd4f

Geert Uytterhoeven authored Apr 01, 2015

When trying to kexec into a new kernel on a platform where multiple CPU
cores are present, but no SMP bringup code is available yet, the
kexec_load system call fails with:

    kexec_load failed: Invalid argument

The SMP test added to machine_kexec_prepare() in commit 2103f6cb
("ARM: 7807/1: kexec: validate CPU hotplug support") wants to prohibit
kexec on SMP platforms where it cannot disable secondary CPUs.
However, this test is too strict: if the secondary CPUs couldn't be
enabled in the first place, there's no need to disable them later at
kexec time.  Hence skip the test in the absence of SMP bringup code.

This allows to add all CPU cores to the DTS from the beginning, without
having to implement SMP bringup first, improving DT compatibility.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

fee3fd4f

ARM: 8337/1: mm: Do not invoke OOM for higher order IOMMU DMA allocations · 49f28aa6

Tomasz Figa authored Apr 01, 2015

IOMMU should be able to use single pages as well as bigger blocks, so if
higher order allocations fail, we should not affect state of the system,
with events such as OOM killer, but rather fall back to order 0
allocations.

This patch changes the behavior of ARM IOMMU DMA allocator to use
__GFP_NORETRY, which bypasses OOM invocation, for orders higher than
zero and, only if that fails, fall back to normal order 0 allocation
which might invoke OOM killer.
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Doug Anderson <dianders@chromium.org>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

49f28aa6

ARM: move reboot code to arch/arm/kernel/reboot.c · 045ab94e

Russell King authored Apr 01, 2015

Move shutdown and reboot related code to a separate file, out of
process.c.  This helps to avoid polluting process.c with non-process
related code.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

045ab94e

ARM: fix broken hibernation · 767bf7e7

Russell King authored Apr 01, 2015

Normally, when a CPU wants to clear a cache line to zero in the external
L2 cache, it would generate bus cycles to write each word as it would do
with any other data access.

However, a Cortex A9 connected to a L2C-310 has a specific feature where
the CPU can detect this operation, and signal that it wants to zero an
entire cache line.  This feature, known as Full Line of Zeros (FLZ),
involves a non-standard AXI signalling mechanism which only the L2C-310
can properly interpret.

There are separate enable bits in both the L2C-310 and the Cortex A9 -
the L2C-310 needs to be enabled and have the FLZ enable bit set in the
auxiliary control register before the Cortex A9 has this feature
enabled.

Unfortunately, the suspend code was not respecting this - it's not
obvious from the code:

swsusp_arch_suspend()
 cpu_suspend() /* saves the Cortex A9 auxiliary control register */
  arch_save_image()
  soft_restart() /* turns off FLZ in Cortex A9, and disables L2C */
   cpu_resume() /* restores the Cortex A9 registers, inc auxcr */

At this point, we end up with the L2C disabled, but the Cortex A9 with
FLZ enabled - which means any memset() or zeroing of a full cache line
will fail to take effect.

A similar issue exists in the resume path, but it's slightly more
complex:

swsusp_arch_suspend()
 cpu_suspend() /* saves the Cortex A9 auxiliary control register */
  arch_save_image() /* image with A9 auxcr saved */
...
swsusp_arch_resume()
 call_with_stack()
  arch_restore_image() /* restores image with A9 auxcr saved above */
  soft_restart() /* turns off FLZ in Cortex A9, and disables L2C */
   cpu_resume() /* restores the Cortex A9 registers, inc auxcr */

Again, here we end up with the L2C disabled, but Cortex A9 FLZ enabled.

There's no need to turn off the L2C in either of these two paths; there
are benefits from not doing so - for example, the page copies will be
faster with the L2C enabled.

Hence, fix this by providing a variant of soft_restart() which can be
used without turning the L2 cache controller off, and use it in both
of these paths to keep the L2C enabled across the respective resume
transitions.

Fixes: 8ef418c7 ("ARM: l2c: trial at enabling some Cortex-A9 optimisations")
Reported-by: Sean Cross <xobs@kosagi.com>
Tested-by: Sean Cross <xobs@kosagi.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

767bf7e7

29 Mar, 2015 7 commits

ARM: 8326/1: s5pv210: move resume code to .text section · 00ee68ec

Ard Biesheuvel authored Mar 25, 2015

This code calls cpu_resume() using a straight branch (b), so
now that we have moved cpu_resume() back to .text, this should
be moved there as well.
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

00ee68ec

ARM: 8325/1: exynos: move resume code to .text section · 12833bac

Ard Biesheuvel authored Mar 25, 2015

This code calls cpu_resume() using a straight branch (b), so
now that we have moved cpu_resume() back to .text, this should
be moved there as well. Any direct references to symbols that will
remain in the .data section are replaced with explicit PC-relative
references.
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

12833bac

ARM: 8324/1: move cpu_resume() to .text section · d0776aff

Ard Biesheuvel authored Mar 25, 2015

Move cpu_resume() to the .text section where it belongs. Change
the adr reference to sleep_save_sp to an explicit PC relative
reference so sleep_save_sp itself can remain in .data.

This helps prevent linker failure on large kernels, as the code
in the .data section may be too far away to be in range for normal
b/bl instructions.
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

d0776aff

ARM: 8323/1: force linker to use PIC veneers · 02e541db

Ard Biesheuvel authored Mar 25, 2015

When building a very large kernel, it is up to the linker to decide
when and where to insert stubs to allow calls to functions that are
out of range for the ordinary b/bl instructions.

However, since the kernel is built as a position dependent binary,
these stubs (aka veneers) may contain absolute addresses, which will
break far calls performed with the MMU off.

For instance, the call from __enable_mmu() in the .head.text section
to __turn_mmu_on() in the .idmap.text section may be turned into
something like this:

c0008168 <__enable_mmu>:
c0008168:       f020 0002       bic.w   r0, r0, #2
c000816c:       f420 5080       bic.w   r0, r0, #4096
c0008170:       f000 b846       b.w     c0008200 <____turn_mmu_on_veneer>
[...]
c0008200 <____turn_mmu_on_veneer>:
c0008200:       4778            bx      pc
c0008202:       46c0            nop
c0008204:       e59fc000        ldr     ip, [pc]
c0008208:       e12fff1c        bx      ip
c000820c:       c13dfae1        teqgt   sp, r1, ror #21
[...]
c13dfae0 <__turn_mmu_on>:
c13dfae0:       4600            mov     r0, r0
[...]

After adding --pic-veneer to the LDFLAGS, the veneer is emitted like
this instead:

c0008200 <____turn_mmu_on_veneer>:
c0008200:       4778            bx      pc
c0008202:       46c0            nop
c0008204:       e59fc004        ldr     ip, [pc, #4]
c0008208:       e08fc00c        add     ip, pc, ip
c000820c:       e12fff1c        bx      ip
c0008210:       013d7d31        teqeq   sp, r1, lsr sp
c0008214:       00000000        andeq   r0, r0, r0

Note that this particular example is best addressed by moving
.head.text and .idmap.text closer together, but this issue could
potentially affect any code that needs to execute with the
MMU off.
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

02e541db

ARM: 8322/1: keep .text and .fixup regions closer together · c4a84ae3

Ard Biesheuvel authored Mar 24, 2015

This moves all fixup snippets to the .text.fixup section, which is
a special section that gets emitted along with the .text section
for each input object file, i.e., the snippets are kept much closer
to the code they refer to, which helps prevent linker failure on
large kernels.
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

c4a84ae3

ARM: 8321/1: asm-generic: introduce .text.fixup input section · 779c88c9

Ard Biesheuvel authored Mar 24, 2015

This introduces a new .text.fixup input section that gets emitted
together with the .text section for each input object file.

Note that

  *(.text)
  *(.text.fixup)

is not the same as

  *(.text .text.fixup)

and we are looking for the latter, to ensure that fixup snippets that
are assembled into a separate section in the object file do not end
up out of range for the relative branch instructions it contains if
the .text section itself grows very large.

This helps prevent linker failures on large ARM kernels.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

779c88c9

ARM: 8307/1: psci: move psci firmware calls out of line · c0978773

Mark Rutland authored Mar 06, 2015

arm64 builds with GCC 5 have caused the __asmeq assertions in the PSCI
calling code to fire, so move the ARM PSCI calls out of line into their
own assembly file for consistency and to safeguard against the same
issue occuring with the 32-bit toolchain.

[will: brought into line with arm64 implementation]
Reported-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

c0978773

28 Mar, 2015 7 commits

ARM: 8328/1: remove empty preprocessor #else branch · 15955e70

Uwe Kleine-König authored Mar 25, 2015

When the patch for e16343c4 (ARM: 8160/1: drop warning about
return_address not using unwind tables) was created there was still more
code in said branch. Probably this simplification was just missed during
conflict resolution when the patch was applied.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

15955e70

ARM: 8327/1: zImage: add support for ARMv7-M · c20611df

Joachim Eastwood authored Mar 25, 2015

This patch makes it possible to enter zImage in Thumb mode for ARMv7-M
(Cortex-M) CPUs that do not support ARM mode. The kernel entry is also
made in Thumb mode.

[ukl: fix spelling in commit log, return early in call_cache_fn]
Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Tested-by: Stefan Agner <stefan@agner.ch>
Tested-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Tested-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

c20611df

ARM: 8320/1: fix integer overflow in ELF_ET_DYN_BASE · 8defb336

Andrey Ryabinin authored Mar 20, 2015

Usually ELF_ET_DYN_BASE is 2/3 of TASK_SIZE. With 3G/1G user/kernel
split this is not so, because 2*TASK_SIZE overflows 32 bits,
so the actual value of ELF_ET_DYN_BASE is:
	(2 * TASK_SIZE / 3) = 0x2a000000

When ASLR is disabled PIE binaries will load at ELF_ET_DYN_BASE address.
On 32bit platforms AddressSanitzer uses addresses [0x20000000 - 0x40000000]
for shadow memory [1]. So ASan doesn't work for PIE binaries when ASLR disabled
as it fails to map shadow memory.
Also after Kees's 'split ET_DYN ASLR from mmap ASLR' patchset PIE binaries
has a high chance of loading somewhere in between [0x2a000000 - 0x40000000]
even if ASLR enabled. This makes ASan with PIE absolutely incompatible.

Fix overflow by dividing TASK_SIZE prior to multiplying.
After this patch ELF_ET_DYN_BASE equals to (for CONFIG_VMSPLIT_3G=y):
	(TASK_SIZE / 3 * 2) = 0x7f555554

[1] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm#MappingSigned-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Reported-by: Maria Guseva <m.guseva@samsung.com>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

8defb336

ARM: 8319/1: advertise availability of v8 Crypto instructions · a092aedb

Ard Biesheuvel authored Mar 19, 2015

When running the 32-bit ARM kernel on ARMv8 capable bare metal (e.g.,
32-bit Android userland and kernel on a Cortex-A53), or as a KVM guest
on a 64-bit host, we should advertise the availability of the Crypto
instructions, so that userland libraries such as OpenSSL may use them.
(Support for the v8 Crypto instructions in the 32-bit build was added
to OpenSSL more than six months ago)

This adds the ID feature bit detection, and sets elf_hwcap2 accordingly.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

a092aedb

ARM: 8318/1: treat CPU feature register fields as signed quantities · b8c9592b

Ard Biesheuvel authored Mar 19, 2015

The various CPU feature registers consist of 4-bit blocks that
represent signed quantities, whose positive values represent
incremental features, and whose negative values are reserved.

To improve forward compatibility, update the feature detection
code to take possible future higher values into account, but
ignore negative values.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

b8c9592b

ARM: 8317/1: move the .idmap.text section closer to .head.text · eb765c1c

Ard Biesheuvel authored Mar 18, 2015

This moves the .idmap.text section closer to .head.text, so that
relative branches are less likely to go out of range if the kernel
text gets bigger.
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

eb765c1c

ARM: 8314/1: replace PROCINFO embedded branch with relative offset · bf35706f

Ard Biesheuvel authored Mar 18, 2015

This patch replaces the 'branch to setup()' instructions embedded
in the PROCINFO structs with the offset to that setup function
relative to the base of the struct. This preserves the position
independent nature of that field, but uses a data item rather
than an instruction.

This is mainly done to prevent linker failures on large kernels,
where the setup function is out of reach for the branch.
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

bf35706f

27 Mar, 2015 3 commits

ARM: add documentation for finding start of physical memory · 0a6a78b8

Russell King authored Mar 26, 2015

Occasionally, there's a question about the method we use to find the
start of physical memory.  Add some documentation so we don't have to
keep repeating outselves on the mailing list.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

0a6a78b8

ARM: 8332/1: add CONFIG_VDSO Kconfig and Makefile bits · e5b61deb

Nathan Lynch authored Mar 25, 2015

Allow users to enable the vdso in Kconfig; include the vdso in the
build if CONFIG_VDSO is enabled.  Add 'vdso_install' target.
Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

e5b61deb

ARM: 8331/1: VDSO initialization, mapping, and synchronization · ecf99a43

Nathan Lynch authored Mar 25, 2015

Initialize the VDSO page list at boot, install the VDSO mapping at
exec time, and update the data page during timer ticks.  This code is
not built if CONFIG_VDSO is not enabled.

Account for the VDSO length when randomizing the offset from the
stack.  The [vdso] and [vvar] pages are placed immediately following
the sigpage with separate _install_special_mapping calls.

We want to "penalize" systems lacking the arch timer as little
as possible.  Previous versions of this code installed the VDSO
unconditionally and unmodified, making it a measurably slower way for
glibc to invoke the real syscalls on such systems.  E.g. calling
gettimeofday via glibc goes from ~560ns to ~630ns on i.MX6Q.

If we can indicate to glibc that the time-related APIs in the VDSO are
not accelerated, glibc can continue to invoke the syscalls directly
instead of dispatching through the VDSO only to fall back to the slow
path.

Thus, if the architected timer is unusable for whatever reason, patch
the VDSO at boot time so that symbol lookups for gettimeofday and
clock_gettime return NULL.  (This is similar to what powerpc does and
borrows code from there.)  This allows glibc to perform the syscall
directly instead of passing control to the VDSO, which minimizes the
penalty.  In my measurements the time taken for a gettimeofday call
via glibc goes from ~560ns to ~580ns (again on i.MX6Q), and this is
solely due to adding a test and branch to glibc's gettimeofday syscall
wrapper.

An alternative to patching the VDSO at boot would be to not install
the VDSO at all when the arch timer isn't usable.  Another alternative
is to include a separate "dummy" vdso.so without gettimeofday and
clock_gettime, which would be selected at boot time.  Either of these
would get cumbersome if the VDSO were to gain support for an API such
as getcpu which is unrelated to arch timer support.
Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

ecf99a43