Commits · 2f51c8204ab3ea211ac92f3b7b88a38595ed6412 · Kirill Smelkov / linux

13 Mar, 2016 2 commits

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 2f51c820

Linus Torvalds authored Mar 12, 2016

Pull x86 fixes from Ingo Molnar:
 "This fixes 3 FPU handling related bugs, an EFI boot crash and a
  runtime warning.

  The EFI fix arrived late but I didn't want to delay it to after v4.5
  because the effects are pretty bad for the systems that are affected
  by it"

[ Actually, I don't think the EFI fix really matters yet, because we
  haven't switched to the separate EFI page tables in mainline yet ]

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/efi: Fix boot crash by always mapping boot service regions into new EFI page tables
  x86/fpu: Fix eager-FPU handling on legacy FPU machines
  x86/delay: Avoid preemptible context checks in delay_mwaitx()
  x86/fpu: Revert ("x86/fpu: Disable AVX when eagerfpu is off")
  x86/fpu: Fix 'no387' regression

2f51c820

Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending · fda604a4

Linus Torvalds authored Mar 12, 2016

Pull SCSI target bug fix from Nicholas Bellinger:
 "Here is an outstanding target-core bug-fix for v4.5 code."

  This patch addresses a recent Task Management (TMR) regression related
  to larger set of multi-port LUN_RESET bug-fixes in v4.5-rc5.

  It drops a left-over target_put_sess_cmd() of se_cmd->cmd_kref within
  ABORT_TASK failure path, once a se_cmd descriptor has already
  completed posting response to fabric driver logic, and must be skipped
  during normal ABORT_TASK se_cmd->tag lookup"

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  target: Drop incorrect ABORT_TASK put for completed commands

fda604a4

12 Mar, 2016 5 commits

x86/efi: Fix boot crash by always mapping boot service regions into new EFI page tables · 452308de

Matt Fleming authored Mar 11, 2016

Some machines have EFI regions in page zero (physical address
0x00000000) and historically that region has been added to the e820
map via trim_bios_range(), and ultimately mapped into the kernel page
tables. It was not mapped via efi_map_regions() as one would expect.

Alexis reports that with the new separate EFI page tables some boot
services regions, such as page zero, are not mapped. This triggers an
oops during the SetVirtualAddressMap() runtime call.

For the EFI boot services quirk on x86 we need to memblock_reserve()
boot services regions until after SetVirtualAddressMap(). Doing that
while respecting the ownership of regions that may have already been
reserved by the kernel was the motivation behind this commit:

  7d68dc3f ("x86, efi: Do not reserve boot services regions within reserved areas")

That patch was merged at a time when the EFI runtime virtual mappings
were inserted into the kernel page tables as described above, and the
trick of setting ->numpages (and hence the region size) to zero to
track regions that should not be freed in efi_free_boot_services()
meant that we never mapped those regions in efi_map_regions(). Instead
we were relying solely on the existing kernel mappings.

Now that we have separate page tables we need to make sure the EFI
boot services regions are mapped correctly, even if someone else has
already called memblock_reserve(). Instead of stashing a tag in
->numpages, set the EFI_MEMORY_RUNTIME bit of ->attribute. Since it
generally makes no sense to mark a boot services region as required at
runtime, it's pretty much guaranteed the firmware will not have
already set this bit.

For the record, the specific circumstances under which Alexis
triggered this bug was that an EFI runtime driver on his machine was
responding to the EVT_SIGNAL_VIRTUAL_ADDRESS_CHANGE event during
SetVirtualAddressMap().

The event handler for this driver looks like this,

  sub rsp,0x28
  lea rdx,[rip+0x2445] # 0xaa948720
  mov ecx,0x4
  call func_aa9447c0  ; call to ConvertPointer(4, & 0xaa948720)
  mov r11,QWORD PTR [rip+0x2434] # 0xaa948720
  xor eax,eax
  mov BYTE PTR [r11+0x1],0x1
  add rsp,0x28
  ret

Which is pretty typical code for an EVT_SIGNAL_VIRTUAL_ADDRESS_CHANGE
handler. The "mov r11, QWORD PTR [rip+0x2424]" was the faulting
instruction because ConvertPointer() was being called to convert the
address 0x0000000000000000, which when converted is left unchanged and
remains 0x0000000000000000.

The output of the oops trace gave the impression of a standard NULL
pointer dereference bug, but because we're accessing physical
addresses during ConvertPointer(), it wasn't. EFI boot services code
is stored at that address on Alexis' machine.
Reported-by: Alexis Murzeau <amurzeau@gmail.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raphael Hertzog <hertzog@debian.org>
Cc: Roger Shimizu <rogershimizu@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1457695163-29632-2-git-send-email-matt@codeblueprint.co.uk
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=815125Signed-off-by: Ingo Molnar <mingo@kernel.org>

452308de

x86/fpu: Fix eager-FPU handling on legacy FPU machines · 6e686709

Borislav Petkov authored Mar 11, 2016

i486 derived cores like Intel Quark support only the very old,
legacy x87 FPU (FSAVE/FRSTOR, CPUID bit FXSR is not set), and
our FPU code wasn't handling the saving and restoring there
properly in the 'eagerfpu' case.

So after we made eagerfpu the default for all CPU types:

  58122bf1 x86/fpu: Default eagerfpu=on on all CPUs

these old FPU designs broke. First, Andy Shevchenko reported a splat:

  WARNING: CPU: 0 PID: 823 at arch/x86/include/asm/fpu/internal.h:163 fpu__clear+0x8c/0x160

which was us trying to execute FXRSTOR on those machines even though
they don't support it.

After taking care of that, Bryan O'Donoghue reported that a simple FPU
test still failed because we weren't initializing the FPU state properly
on those machines.

Take care of all that.
Reported-and-tested-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Reported-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yu-cheng <yu-cheng.yu@intel.com>
Link: http://lkml.kernel.org/r/20160311113206.GD4312@pd.tnicSigned-off-by: Ingo Molnar <mingo@kernel.org>

6e686709

Merge tag 'for-linus-20160311' of git://git.infradead.org/linux-mtd · 03c668a9

Linus Torvalds authored Mar 11, 2016

Pull MTD fixes from Brian Norris:
 "Late MTD fix for v4.5:

   - A simple error code handling fix for the NAND ECC test; this was a
     regression in v4.5-rc1

   - A MAINTAINERS update, which might as well go in ASAP"

* tag 'for-linus-20160311' of git://git.infradead.org/linux-mtd:
  MAINTAINERS: add a maintainer for the NAND subsystem
  mtd: nand: tests: fix regression introduced in mtd_nandectest

03c668a9

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 3ab0a0f9

Linus Torvalds authored Mar 11, 2016

Pull drm/i915 fixes from Dave Airlie:
 "Just two i915 regression fixes, that should be it from me"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/i915: Actually retry with bit-banging after GMBUS timeout
  drm/i915: Fix bogus dig_port_map[] assignment for pre-HSW

3ab0a0f9

mm/mempool: avoid KASAN marking mempool poison checks as use-after-free · 76401310

Matthew Dawson authored Mar 11, 2016

When removing an element from the mempool, mark it as unpoisoned in KASAN
before verifying its contents for SLUB/SLAB debugging.  Otherwise KASAN
will flag the reads checking the element use-after-free writes as
use-after-free reads.
Signed-off-by: Matthew Dawson <matthew@mjdsystems.ca>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

76401310

11 Mar, 2016 9 commits

Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 2a4fb270

Linus Torvalds authored Mar 11, 2016

Pull ARM SoC fixes from Olof Johansson:
 "Two more fixes for 4.5:

   - One is a fix for OMAP that is urgently needed to avoid DRA7xx chips
     from premature aging, by always keeping the Ethernet clock enabled.

   - The other solves a I/O memory layout issue on Armada, where SROM
     and PCI memory windows were conflicting in some configurations"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: mvebu: fix overlap of Crypto SRAM with PCIe memory window
  ARM: dts: dra7: do not gate cpsw clock due to errata i877
  ARM: OMAP2+: hwmod: Introduce ti,no-idle dt property

2a4fb270

Merge tag 'media/v4.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 95f41fb2

Linus Torvalds authored Mar 11, 2016

Pull media fix from Mauro Carvalho Chehab:
 "One last time fix: It adds a code that prevents some media tools like
  media-ctl to hide some entities that have their IDs out of the range
  expected by those apps"

* tag 'media/v4.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] media-device: map new functions into old types for legacy API

95f41fb2

ARM: mvebu: fix overlap of Crypto SRAM with PCIe memory window · d7d5a43c

Thomas Petazzoni authored Mar 08, 2016

When the Crypto SRAM mappings were added to the Device Tree files
describing the Armada XP boards in commit c466d997 ("ARM: mvebu:
define crypto SRAM ranges for all armada-xp boards"), the fact that
those mappings were overlaping with the PCIe memory aperture was
overlooked. Due to this, we currently have for all Armada XP platforms
a situation that looks like this:

Memory mapping on Armada XP boards with internal registers at
0xf1000000:

 - 0x00000000 -> 0xf0000000	3.75G 	RAM
 - 0xf0000000 -> 0xf1000000	16M	NOR flashes (AXP GP / AXP DB)
 - 0xf1000000 -> 0xf1100000	1M	internal registers
 - 0xf8000000 -> 0xffe0000	126M	PCIe memory aperture
 - 0xf8100000 -> 0xf8110000	64KB	Crypto SRAM #0	=> OVERLAPS WITH PCIE !
 - 0xf8110000 -> 0xf8120000	64KB	Crypto SRAM #1	=> OVERLAPS WITH PCIE !
 - 0xffe00000 -> 0xfff00000	1M	PCIe I/O aperture
 - 0xfff0000  -> 0xffffffff	1M	BootROM

The overlap means that when PCIe devices are added, depending on their
memory window needs, they might or might not be mapped into the
physical address space. Indeed, they will not be mapped if the area
allocated in the PCIe memory aperture by the PCI core overlaps with
one of the Crypto SRAM. Typically, a Intel IGB PCIe NIC that needs 8MB
of PCIe memory will see its PCIe memory window allocated from
0xf80000000 for 8MB, which overlaps with the Crypto SRAM windows. Due
to this, the PCIe window is not created, and any attempt to access the
PCIe window makes the kernel explode:

[    3.302213] igb: Copyright (c) 2007-2014 Intel Corporation.
[    3.307841] pci 0000:00:09.0: enabling device (0140 -> 0143)
[    3.313539] mvebu_mbus: cannot add window '4:f8', conflicts with another window
[    3.320870] mvebu-pcie soc:pcie-controller: Could not create MBus window at [mem 0xf8000000-0xf87fffff]: -22
[    3.330811] Unhandled fault: external abort on non-linefetch (0x1008) at 0xf08c0018

This problem does not occur on Armada 370 boards, because we use the
following memory mapping (for boards that have internal registers at
0xf1000000):

 - 0x00000000 -> 0xf0000000	3.75G 	RAM
 - 0xf0000000 -> 0xf1000000	16M	NOR flashes (AXP GP / AXP DB)
 - 0xf1000000 -> 0xf1100000	1M	internal registers
 - 0xf1100000 -> 0xf1110000	64KB	Crypto SRAM #0 => OK !
 - 0xf8000000 -> 0xffe0000	126M	PCIe memory
 - 0xffe00000 -> 0xfff00000	1M	PCIe I/O
 - 0xfff0000  -> 0xffffffff	1M	BootROM

Obviously, the solution is to align the location of the Crypto SRAM
mappings of Armada XP to be similar with the ones on Armada 370, i.e
have them between the "internal registers" area and the beginning of
the PCIe aperture.

However, we have a special case with the OpenBlocks AX3-4 platform,
which has a 128 MB NOR flash. Currently, this NOR flash is mapped from
0xf0000000 to 0xf8000000. This is possible because on OpenBlocks
AX3-4, the internal registers are not at 0xf1000000. And this explains
why the Crypto SRAM mappings were not configured at the same place on
Armada XP.

Hence, the solution is two-fold:

 (1) Move the NOR flash mapping on Armada XP OpenBlocks AX3-4 from
     0xe8000000 to 0xf0000000. This frees the 0xf0000000 ->
     0xf80000000 space.

 (2) Move the Crypto SRAM mappings on Armada XP to be similar to
     Armada 370 (except of course that Armada XP has two Crypto SRAM
     and not one).

After this patch, the memory mapping on Armada XP boards with
registers at 0xf1 is:

 - 0x00000000 -> 0xf0000000	3.75G 	RAM
 - 0xf0000000 -> 0xf1000000	16M	NOR flashes (AXP GP / AXP DB)
 - 0xf1000000 -> 0xf1100000	1M	internal registers
 - 0xf1100000 -> 0xf1110000	64KB	Crypto SRAM #0
 - 0xf1110000 -> 0xf1120000	64KB	Crypto SRAM #1
 - 0xf8000000 -> 0xffe0000	126M	PCIe memory
 - 0xffe00000 -> 0xfff00000	1M	PCIe I/O
 - 0xfff0000  -> 0xffffffff	1M	BootROM

And the memory mapping for the special case of the OpenBlocks AX3-4
(internal registers at 0xd0000000, NOR of 128 MB):

 - 0x00000000 -> 0xc0000000	3G 	RAM
 - 0xd0000000 -> 0xd1000000	1M	internal registers
 - 0xe800000  -> 0xf0000000	128M	NOR flash
 - 0xf1100000 -> 0xf1110000	64KB	Crypto SRAM #0
 - 0xf1110000 -> 0xf1120000	64KB	Crypto SRAM #1
 - 0xf8000000 -> 0xffe0000	126M	PCIe memory
 - 0xffe00000 -> 0xfff00000	1M	PCIe I/O
 - 0xfff0000  -> 0xffffffff	1M	BootROM

Fixes: c466d997 ("ARM: mvebu: define crypto SRAM ranges for all armada-xp boards")
Reported-by: Phil Sutter <phil@nwl.cc>
Cc: Phil Sutter <phil@nwl.cc>
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Olof Johansson <olof@lixom.net>

d7d5a43c

Merge tag 'dmaengine-fix-4.5' of git://git.infradead.org/users/vkoul/slave-dma · 20698c92

Linus Torvalds authored Mar 11, 2016

Pull dmaengine fixes from Vinod Koul:
 "Two fixes showed up in last few days, and they should be included in
  4.5.  Summary:

  Two more late fixes to drivers, nothing major here:

   - A memory leak fix in fsdma unmap the dma descriptors on freeup

   - A fix in xdmac driver for residue calculation of dma descriptor"

* tag 'dmaengine-fix-4.5' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: at_xdmac: fix residue computation
  dmaengine: fsldma: fix memory leak

20698c92

Merge tag 'pm+acpi-4.5-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 7ae9c768

Linus Torvalds authored Mar 11, 2016

Pull power management and ACPI fixes from Rafael Wysocki:
 "Two more fixes for issues introduced recently, one in the generic
  device properties framework and one in ACPICA.

  Specifics:

   - Revert a recent ACPICA commit that has been reverted upstream,
     because it caused problems to happen on user systems and the
     problem it attempted to address will not be relevant any more after
     upcoming ACPI specification changes (Bob Moore).

   - Fix crash in the generic device properties framework introduced by
     a recent change that forgot to check pointers against error values
     in addition to checking them against NULL (Heikki Krogerus)"

* tag 'pm+acpi-4.5-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  device property: fwnode->secondary may contain ERR_PTR(-ENODEV)
  ACPICA: Revert "Parser: Fix for SuperName method invocation"

7ae9c768

Merge tag 'xfs-for-linus-4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs · 2a62ec0a

Linus Torvalds authored Mar 11, 2016

Pull xfs fixes from Dave Chinner:
 "This is a fix for a regression introduced in 4.5-rc1 by the new torn
  log write detection code.  The regression only affects people moving a
  clean filesystem between machines/kernels of different architecture
  (such as changing between 32 bit and 64 bit kernels), but this is the
  recommended (and only!) safe way to migrate a filesystem between
  architectures so we really need to ensure it works.

  The changes are larger than I'd prefer right at the end of the release
  cycle, but the majority of the change is just factoring code to enable
  the detection of a clean log at the correct time to avoid this issue.

  Changes:

   - Only perform torn log write detection on dirty logs.  This prevents
     failures being detected due to a clean filesystem being moved
     between machines or kernels of different architectures (e.g.  32 ->
     64 bit, BE -> LE, etc).  This fixes a regression introduced by the
     torn log write detection in 4.5-rc1"

* tag 'xfs-for-linus-4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
  xfs: only run torn log write detection on dirty logs
  xfs: refactor in-core log state update to helper
  xfs: refactor unmount record detection into helper
  xfs: separate log head record discovery from verification

2a62ec0a

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 63cf207e

Linus Torvalds authored Mar 11, 2016

Pull vfs fixes from Al Viro:
 "A couple of fixes: Fix for my dumb braino in ncpfs and a long-standing
  breakage on recovery from failed rename() in jffs2"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  jffs2: reduce the breakage on recovery from halfway failed rename()
  ncpfs: fix a braino in OOM handling in ncp_fill_cache()

63cf207e

Merge branches 'device-properties-fixes' and 'acpica-fixes' · 5b3e7e05

Rafael J. Wysocki authored Mar 11, 2016

* device-properties-fixes:
  device property: fwnode->secondary may contain ERR_PTR(-ENODEV)

* acpica-fixes:
  ACPICA: Revert "Parser: Fix for SuperName method invocation"

5b3e7e05

drm/i915: Actually retry with bit-banging after GMBUS timeout · 0bbca274

Ville Syrjälä authored Mar 07, 2016

After the GMBUS transfer times out, we set force_bit=1 and
return -EAGAIN expecting the i2c core to call the .master_xfer
hook again so that we will retry the same transfer via bit-banging.
This is in case the gmbus hardware is somehow faulty.

Unfortunately we left adapter->retries to 0, meaning the i2c core
didn't actually do the retry. Let's tell the core we want one retry
when we return -EAGAIN.

Note that i2c-algo-bit also uses this retry count for some internal
retries, so we'll end up increasing those a bit as well.

Cc: Jani Nikula <jani.nikula@intel.com>
Cc: drm-intel-fixes@lists.freedesktop.org
Fixes: bffce907 ("drm/i915: abstract i2c bit banging fallback in gmbus xfer")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1457366220-29409-2-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit 8b1f165a)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

0bbca274

10 Mar, 2016 19 commits

MAINTAINERS: add a maintainer for the NAND subsystem · 9df4f913

Boris BREZILLON authored Mar 09, 2016

Add myself as the maintainer of the NAND subsystem.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Acked-by: Richard Weinberger <richard@nod.at>
Acked-by: Brian Norris <computersforpeace@gmail.com>
Acked-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>

9df4f913

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · f2c12421

Linus Torvalds authored Mar 10, 2016

Pull KVM fixes from Paolo Bonzini:
 "A few simple fixes for ARM, x86, PPC and generic code.

  The x86 MMU fix is a bit larger because the surrounding code needed a
  cleanup, but nothing worrisome"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: MMU: fix reserved bit check for ept=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0
  KVM: MMU: fix ept=0/pte.u=1/pte.w=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 combo
  kvm: cap halt polling at exactly halt_poll_ns
  KVM: s390: correct fprs on SIGP (STOP AND) STORE STATUS
  KVM: VMX: disable PEBS before a guest entry
  KVM: PPC: Book3S HV: Sanitize special-purpose register values on guest exit

f2c12421

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · c32c2cb2

Linus Torvalds authored Mar 10, 2016

Pull arm64 fixes from Will Deacon:
 "I thought we were done for 4.5, but then the 64k-page chaps came
  crawling out of the woodwork.  *sigh*

  The vmemmap fix I sent for -rc7 caused a regression with 64k pages and
  sparsemem and at some point during the release cycle the new hugetlb
  code using contiguous ptes started failing the libhugetlbfs tests with
  64k pages enabled.

  So here are a couple of patches that fix the vmemmap alignment and
  disable the new hugetlb page sizes whilst a proper fix is being
  developed:

   - Temporarily disable huge pages built using contiguous ptes

   - Ensure vmemmap region is sufficiently aligned for sparsemem
     sections"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: hugetlb: partial revert of 66b3923a
  arm64: account for sparsemem section alignment when choosing vmemmap offset

c32c2cb2

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 2da33f9f

Linus Torvalds authored Mar 10, 2016

Pull s390 fixes from Martin Schwidefsky:
 "Three bug fixes:
   - The fix for the page table corruption (CVE-2016-2143)
   - The diagnose statistics introduced a regression for the dasd diag
     driver
   - Boot crash on systems without the set-program-parameters facility"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/mm: four page table levels vs. fork
  s390/cpumf: Fix lpp detection
  s390/dasd: fix diag 0x250 inline assembly

2da33f9f

[media] media-device: map new functions into old types for legacy API · b2cd2744

Mauro Carvalho Chehab authored Mar 05, 2016

The legacy media controller userspace API exposes entity types that
carry both type and function information. The new API replaces the type
with a function. It preserves backward compatibility by defining legacy
functions for the existing types and using them in drivers.

This works fine, as long as newer entity functions won't be added.

Unfortunately, some tools, like media-ctl with --print-dot argument
rely on the now legacy MEDIA_ENT_T_V4L2_SUBDEV and MEDIA_ENT_T_DEVNODE
numeric ranges to identify what entities will be shown.

Also, if the entity doesn't match those ranges, it will ignore the
major/minor information on devnodes, and won't be getting the devnode
name via udev or sysfs.

As we're now adding devices outside the old range, the legacy ioctl
needs to map the new entity functions into a type at the old range,
or otherwise we'll have a regression.

Detected on all released media-ctl versions (e. g. versions <= 1.10).

Fix this by deriving the type from the function to emulate the legacy
API if the function isn't in the legacy functions range.
Reported-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>

b2cd2744

dmaengine: at_xdmac: fix residue computation · 25c5e962

Ludovic Desroches authored Mar 10, 2016

When computing the residue we need two pieces of information: the current
descriptor and the remaining data of the current descriptor. To get
that information, we need to read consecutively two registers but we
can't do it in an atomic way. For that reason, we have to check manually
that current descriptor has not changed.
Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Suggested-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Reported-by: David Engraf <david.engraf@sysgo.com>
Tested-by: David Engraf <david.engraf@sysgo.com>
Fixes: e1f7c9ee ("dmaengine: at_xdmac: creation of the atmel
eXtended DMA Controller driver")
Cc: stable@vger.kernel.org #4.1 and later
Signed-off-by: Vinod Koul <vinod.koul@intel.com>

25c5e962

x86/delay: Avoid preemptible context checks in delay_mwaitx() · 84477336

Borislav Petkov authored Mar 09, 2016

We do use this_cpu_ptr(&cpu_tss) as a cacheline-aligned, seldomly
accessed per-cpu var as the MONITORX target in delay_mwaitx(). However,
when called in preemptible context, this_cpu_ptr -> smp_processor_id() ->
debug_smp_processor_id() fires:

  BUG: using smp_processor_id() in preemptible [00000000] code: udevd/312
  caller is delay_mwaitx+0x40/0xa0

But we don't care about that check - we only need cpu_tss as a MONITORX
target and it doesn't really matter which CPU's var we're touching as
we're going idle anyway. Fix that.
Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: spg_linux_kernel@amd.com
Link: http://lkml.kernel.org/r/20160309205622.GG6564@pd.tnicSigned-off-by: Ingo Molnar <mingo@kernel.org>

84477336

KVM: MMU: fix reserved bit check for ept=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 · 5f0b8199

Paolo Bonzini authored Mar 09, 2016

KVM has special logic to handle pages with pte.u=1 and pte.w=0 when
CR0.WP=1. These pages' SPTEs flip continuously between two states:
U=1/W=0 (user and supervisor reads allowed, supervisor writes not allowed)
and U=0/W=1 (supervisor reads and writes allowed, user writes not allowed).

When SMEP is in effect, however, U=0 will enable kernel execution of
this page. To avoid this, KVM also sets NX=1 in the shadow PTE together
with U=0, making the two states U=1/W=0/NX=gpte.NX and U=0/W=1/NX=1.
When guest EFER has the NX bit cleared, the reserved bit check thinks
that the latter state is invalid; teach it that the smep_andnot_wp case
will also use the NX bit of SPTEs.

Cc: stable@vger.kernel.org
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.inel.com>
Fixes: c258b62bSigned-off-by: Paolo Bonzini <pbonzini@redhat.com>

5f0b8199

KVM: MMU: fix ept=0/pte.u=1/pte.w=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 combo · 844a5fe2

Paolo Bonzini authored Mar 08, 2016

Yes, all of these are needed. :) This is admittedly a bit odd, but
kvm-unit-tests access.flat tests this if you run it with "-cpu host"
and of course ept=0.

KVM runs the guest with CR0.WP=1, so it must handle supervisor writes
specially when pte.u=1/pte.w=0/CR0.WP=0. Such writes cause a fault
when U=1 and W=0 in the SPTE, but they must succeed because CR0.WP=0.
When KVM gets the fault, it sets U=0 and W=1 in the shadow PTE and
restarts execution. This will still cause a user write to fault, while
supervisor writes will succeed. User reads will fault spuriously now,
and KVM will then flip U and W again in the SPTE (U=1, W=0). User reads
will be enabled and supervisor writes disabled, going back to the
originary situation where supervisor writes fault spuriously.

When SMEP is in effect, however, U=0 will enable kernel execution of
this page. To avoid this, KVM also sets NX=1 in the shadow PTE together
with U=0. If the guest has not enabled NX, the result is a continuous
stream of page faults due to the NX bit being reserved.

The fix is to force EFER.NX=1 even if the CPU is taking care of the EFER
switch. (All machines with SMEP have the CPU_LOAD_IA32_EFER vm-entry
control, so they do not use user-return notifiers for EFER---if they did,
EFER.NX would be forced to the same value as the host).

There is another bug in the reserved bit check, which I've split to a
separate patch for easier application to stable kernels.

Cc: stable@vger.kernel.org
Cc: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Fixes: f6577a5fSigned-off-by: Paolo Bonzini <pbonzini@redhat.com>

844a5fe2

x86/fpu: Revert ("x86/fpu: Disable AVX when eagerfpu is off") · a65050c6

Yu-cheng Yu authored Mar 09, 2016

Leonid Shatz noticed that the SDM interpretation of the following
recent commit:

  394db20c ("x86/fpu: Disable AVX when eagerfpu is off")

... is incorrect and that the original behavior of the FPU code was correct.

Because AVX is not stated in CR0 TS bit description, it was mistakenly
believed to be not supported for lazy context switch. This turns out
to be false:

  Intel Software Developer's Manual Vol. 3A, Sec. 2.5 Control Registers:

   'TS Task Switched bit (bit 3 of CR0) -- Allows the saving of the x87 FPU/
    MMX/SSE/SSE2/SSE3/SSSE3/SSE4 context on a task switch to be delayed until
    an x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 instruction is actually executed
    by the new task.'

  Intel Software Developer's Manual Vol. 2A, Sec. 2.4 Instruction Exception
  Specification:

   'AVX instructions refer to exceptions by classes that include #NM
    "Device Not Available" exception for lazy context switch.'

So revert the commit.
Reported-by: Leonid Shatz <leonid.shatz@ravellosystems.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1457569734-3785-1-git-send-email-yu-cheng.yu@intel.comSigned-off-by: Ingo Molnar <mingo@kernel.org>

a65050c6

s390/mm: four page table levels vs. fork · 3446c13b

Martin Schwidefsky authored Feb 15, 2016

The fork of a process with four page table levels is broken since
git commit 6252d702 "[S390] dynamic page tables."

All new mm contexts are created with three page table levels and
an asce limit of 4TB. If the parent has four levels dup_mmap will
add vmas to the new context which are outside of the asce limit.
The subsequent call to copy_page_range will walk the three level
page table structure of the new process with non-zero pgd and pud
indexes. This leads to memory clobbers as the pgd_index *and* the
pud_index is added to the mm->pgd pointer without a pgd_deref
in between.

The init_new_context() function is selecting the number of page
table levels for a new context. The function is used by mm_init()
which in turn is called by dup_mm() and mm_alloc(). These two are
used by fork() and exec(). The init_new_context() function can
distinguish the two cases by looking at mm->context.asce_limit,
for fork() the mm struct has been copied and the number of page
table levels may not change. For exec() the mm_alloc() function
set the new mm structure to zero, in this case a three-level page
table is created as the temporary stack space is located at
STACK_TOP_MAX = 4TB.

This fixes CVE-2016-2143.
Reported-by: Marcin Kościelnicki <koriakin@0x04.net>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

3446c13b

Merge tag 'spi-fix-v4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · 8e0f93cd

Linus Torvalds authored Mar 09, 2016

Pull spi fixes from Mark Brown:
 "A few driver specific fixes for the Rockchip and i.MX SPI controllers,
  especially for the i.MX they're annoying bugs if you run into them"

* tag 'spi-fix-v4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: imx: fix spi resource leak with dma transfer
  spi: imx: allow only WML aligned transfers to use DMA
  spi: rockchip: add missing spi_master_put
  spi: rockchip: disable runtime pm when in err case

8e0f93cd

Merge remote-tracking branch 'spi/fix/rockchip' into spi-linus · 3ee20abb
Mark Brown authored Mar 10, 2016

3ee20abb
Merge remote-tracking branch 'spi/fix/imx' into spi-linus · c23663ac
Mark Brown authored Mar 10, 2016

c23663ac

Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 718e47a5

Linus Torvalds authored Mar 09, 2016

Pull ext4 fix from Ted Ts'o:
 "This fixes a regression which crept in v4.5-rc5"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: iterate over buffer heads correctly in move_extent_per_page()

718e47a5

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · a6e434e9

Linus Torvalds authored Mar 09, 2016

Pull drm fixes from Dave Airlie:
 "A few imx fixes I missed from a couple of weeks ago, they still aren't
  that big and fix some regression and a fail to boot problem.

  Other than that, a couple of regression fixes for radeon/amdgpu, one
  regression fix for vmwgfx and one regression fix for tda998x"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  Revert "drm/radeon/pm: adjust display configuration after powerstate"
  drm/amdgpu/dp: add back special handling for NUTMEG
  drm/radeon/dp: add back special handling for NUTMEG
  drm/i2c: tda998x: Choose between atomic or non atomic dpms helper
  drm/vmwgfx: Add back ->detect() and ->fill_modes()
  drm/radeon: Fix error handling in radeon_flip_work_func.
  drm/amdgpu: Fix error handling in amdgpu_flip_work_func.
  drm/imx: Add missing DRM_FORMAT_RGB565 to ipu_plane_formats
  drm/imx: notify DRM core about CRTC vblank state
  gpu: ipu-v3: Reset IPU before activating IRQ
  gpu: ipu-v3: Do not bail out on missing optional port nodes

a6e434e9

Merge tag 'trace-fixes-v4.5-rc7' of... · 8205ff1d

Linus Torvalds authored Mar 09, 2016

Merge tag 'trace-fixes-v4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
 "I previously sent a fix that prevents all trace events from being
  called if the current cpu is offline.

  But I forgot that in 3.18, we added lockdep checks to test RCU usage
  even when the event is disabled.  Although there cannot be any bug
  when a cpu is going offline, we now get false warnings triggered by
  the added checks of the event being disabled.

  I removed the check from the tracepoint code itself, and added it to
  the condition section (which is "1" for 'no condition').  This way the
  online cpu check will get checked in all the right locations"

* tag 'trace-fixes-v4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix check for cpu online when event is disabled

8205ff1d

ext4: iterate over buffer heads correctly in move_extent_per_page() · 6ffe77ba

Eryu Guan authored Feb 21, 2016

In commit bcff2488 ("ext4: don't read blocks from disk after extents
being swapped") bh is not updated correctly in the for loop and wrong
data has been written to disk. generic/324 catches this on sub-page
block size ext4.

Fixes: bcff2488 ("ext4: don't read blocks from disk after extentsbeing swapped")
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

6ffe77ba

Merge branch 'akpm' (patches from Andrew) · 380173ff

Linus Torvalds authored Mar 09, 2016

Merge fixes from Andrew Morton:
 "13 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  dma-mapping: avoid oops when parameter cpu_addr is null
  mm/hugetlb: use EOPNOTSUPP in hugetlb sysctl handlers
  memremap: check pfn validity before passing to pfn_to_page()
  mm, thp: fix migration of PTE-mapped transparent huge pages
  dax: check return value of dax_radix_entry()
  ocfs2: fix return value from ocfs2_page_mkwrite()
  arm64: kasan: clear stale stack poison
  sched/kasan: remove stale KASAN poison after hotplug
  kasan: add functions to clear stack poison
  mm: fix mixed zone detection in devm_memremap_pages
  list: kill list_force_poison()
  mm: __delete_from_page_cache show Bad page if mapped
  mm/hugetlb: hugetlb_no_page: rate-limit warning message

380173ff

09 Mar, 2016 5 commits

dma-mapping: avoid oops when parameter cpu_addr is null · d6b7eaeb

Zhen Lei authored Mar 09, 2016

To keep consistent with kfree, which tolerate ptr is NULL.  We do this
because sometimes we may use goto statement, so that success and failure
case can share parts of the code.  But unfortunately, dma_free_coherent
called with parameter cpu_addr is null will cause oops, such as showed
below:

  Unable to handle kernel paging request at virtual address ffffffc020d3b2b8
  pgd = ffffffc083a61000
  [ffffffc020d3b2b8] *pgd=0000000000000000, *pud=0000000000000000
  CPU: 4 PID: 1489 Comm: malloc_dma_1 Tainted: G           O    4.1.12 #1
  Hardware name: ARM64 (DT)
  PC is at __dma_free_coherent.isra.10+0x74/0xc8
  LR is at __dma_free+0x9c/0xb0
  Process malloc_dma_1 (pid: 1489, stack limit = 0xffffffc0837fc020)
  [...]
  Call trace:
    __dma_free_coherent.isra.10+0x74/0xc8
    __dma_free+0x9c/0xb0
    malloc_dma+0x104/0x158 [dma_alloc_coherent_mtmalloc]
    kthread+0xec/0xfc
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d6b7eaeb

mm/hugetlb: use EOPNOTSUPP in hugetlb sysctl handlers · 86613628

Jan Stancek authored Mar 09, 2016

Replace ENOTSUPP with EOPNOTSUPP.  If hugepages are not supported, this
value is propagated to userspace.  EOPNOTSUPP is part of uapi and is
widely supported by libc libraries.

It gives nicer message to user, rather than:

  # cat /proc/sys/vm/nr_hugepages
  cat: /proc/sys/vm/nr_hugepages: Unknown error 524

And also LTP's proc01 test was failing because this ret code (524)
was unexpected:

  proc01      1  TFAIL  :  proc01.c:396: read failed: /proc/sys/vm/nr_hugepages: errno=???(524): Unknown error 524
  proc01      2  TFAIL  :  proc01.c:396: read failed: /proc/sys/vm/nr_hugepages_mempolicy: errno=???(524): Unknown error 524
  proc01      3  TFAIL  :  proc01.c:396: read failed: /proc/sys/vm/nr_overcommit_hugepages: errno=???(524): Unknown error 524
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

86613628

memremap: check pfn validity before passing to pfn_to_page() · ac343e88

Ard Biesheuvel authored Mar 09, 2016

In memremap's helper function try_ram_remap(), we dereference a struct
page pointer that was derived from a PFN that is known to be covered by
a 'System RAM' iomem region, and is thus assumed to be a 'valid' PFN,
i.e., a PFN that has a struct page associated with it and is covered by
the kernel direct mapping.

However, the assumption that there is a 1:1 relation between the System
RAM iomem region and the kernel direct mapping is not universally valid
on all architectures, and on ARM and arm64, 'System RAM' may include
regions for which pfn_valid() returns false.

Generally speaking, both __va() and pfn_to_page() should only ever be
called on PFNs/physical addresses for which pfn_valid() returns true, so
add that check to try_ram_remap().
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ac343e88

mm, thp: fix migration of PTE-mapped transparent huge pages · 0a2e280b

Kirill A. Shutemov authored Mar 09, 2016

We don't have native support of THP migration, so we have to split huge
page into small pages in order to migrate it to different node.  This
includes PTE-mapped huge pages.

I made mistake in refcounting patchset: we don't actually split
PTE-mapped huge page in queue_pages_pte_range(), if we step on head
page.

The result is that the head page is queued for migration, but none of
tail pages: putting head page on queue takes pin on the page and any
subsequent attempts of split_huge_pages() would fail and we skip queuing
tail pages.

unmap_and_move_huge_page() will eventually split the huge pages, but
only one of 512 pages would get migrated.

Let's fix the situation.

Fixes: 248db92d ("migrate_pages: try to split pages on queuing")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

0a2e280b

dax: check return value of dax_radix_entry() · 30f471fd

Ross Zwisler authored Mar 09, 2016

dax_pfn_mkwrite() previously wasn't checking the return value of the
call to dax_radix_entry(), which was a mistake.

Instead, capture this return value and return the appropriate VM_FAULT_
value.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

30f471fd