Commits · 257a1aac7a9ae82db9c84a263a883c6f17789fea · nexedi / linux

20 Mar, 2018 1 commit

sparc: Make auxiliary vectors for ADI available on 32-bit as well · 257a1aac

Khalid Aziz authored Mar 20, 2018

Commit c6202ca7 ("sparc64: Add auxiliary vectors to report
platform ADI properties") adds auxiliary vectors to report ADI
capabilities on sparc64 platform only. This needs to be uniform
across 64-bit and 32-bit. This patch makes the same vectors
available on 32-bit as well.
Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Khalid Aziz <khalid@gonehiking.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

257a1aac

18 Mar, 2018 12 commits

Merge branch 'sparc64-ADI' · 88fe3529

David S. Miller authored Mar 18, 2018

Khalid Aziz says:

====================
Application Data Integrity feature introduced by SPARC M7

V12 changes:
This series is same as v10 and v11 and was simply rebased on 4.16-rc2
kernel and patch 11 was added to update signal delivery code to use the
new helper functions added by Eric Biederman. Can mm maintainers please
review patches 2, 7, 8 and 9 which are arch independent, and
include/linux/mm.h and mm/ksm.c changes in patch 10 and ack these if
everything looks good?

SPARC M7 processor adds additional metadata for memory address space
that can be used to secure access to regions of memory. This additional
metadata is implemented as a 4-bit tag attached to each cacheline size
block of memory. A task can set a tag on any number of such blocks.
Access to such block is granted only if the virtual address used to
access that block of memory has the tag encoded in the uppermost 4 bits
of VA. Since sparc processor does not implement all 64 bits of VA, top 4
bits are available for ADI tags. Any mismatch between tag encoded in VA
and tag set on the memory block results in a trap. Tags are verified in
the VA presented to the MMU and tags are associated with the physical
page VA maps on to. If a memory page is swapped out and page frame gets
reused for another task, the tags are lost and hence must be saved when
swapping or migrating the page.

A userspace task enables ADI through mprotect(). This patch series adds
a page protection bit PROT_ADI and a corresponding VMA flag
VM_SPARC_ADI. VM_SPARC_ADI is used to trigger setting TTE.mcd bit in the
sparc pte that enables ADI checking on the corresponding page. MMU
validates the tag embedded in VA for every page that has TTE.mcd bit set
in its pte. After enabling ADI on a memory range, the userspace task can
set ADI version tags using stxa instruction with ASI_MCD_PRIMARY or
ASI_MCD_ST_BLKINIT_PRIMARY ASI.

Once userspace task calls mprotect() with PROT_ADI, kernel takes
following overall steps:

1. Find the VMAs covering the address range passed in to mprotect and
set VM_SPARC_ADI flag. If address range covers a subset of a VMA, the
VMA will be split.

2. When a page is allocated for a VA and the VMA covering this VA has
VM_SPARC_ADI flag set, set the TTE.mcd bit so MMU will check the
vwersion tag.

3. Userspace can now set version tags on the memory it has enabled ADI
on. Userspace accesses ADI enabled memory using a virtual address that
has the version tag embedded in the high bits. MMU validates this
version tag against the actual tag set on the memory. If tag matches,
MMU performs the VA->PA translation and access is granted. If there is a
mismatch, hypervisor sends a data access exception or precise memory
corruption detected exception depending upon whether precise exceptions
are enabled or not (controlled by MCDPERR register). Kernel sends
SIGSEGV to the task with appropriate si_code.

4. If a page is being swapped out or migrated, kernel must save any ADI
tags set on the page. Kernel maintains a page worth of tag storage
descriptors. Each descriptors pointsto a tag storage space and the
address range it covers. If the page being swapped out or migrated has
ADI enabled on it, kernel finds a tag storage descriptor that covers the
address range for the page or allocates a new descriptor if none of the
existing descriptors cover the address range. Kernel saves tags from the
page into the tag storage space descriptor points to.

5. When the page is swapped back in or reinstantiated after migration,
kernel restores the version tags on the new physical page by retrieving
the original tag from tag storage pointed to by a tag storage descriptor
for the virtual address range for new page.

User task can disable ADI by calling mprotect() again on the memory
range with PROT_ADI bit unset. Kernel clears the VM_SPARC_ADI flag in
VMAs, merges adjacent VMAs if necessary, and clears TTE.mcd bit in the
corresponding ptes.

IOMMU does not support ADI checking. Any version tags embedded in the
top bits of VA meant for IOMMU, are cleared and replaced with sign
extension of the first non-version tag bit (bit 59 for SPARC M7) for
IOMMU addresses.

This patch series adds support for this feature in 11 patches:

Patch 1/11
  Tag mismatch on access by a task results in a trap from hypervisor as
  data access exception or a precide memory corruption detected
  exception. As part of handling these exceptions, kernel sends a
  SIGSEGV to user process with special si_code to indicate which fault
  occurred. This patch adds three new si_codes to differentiate between
  various mismatch errors.

Patch 2/11
  When a page is swapped or migrated, metadata associated with the page
  must be saved so it can be restored later. This patch adds a new
  function that saves/restores this metadata when updating pte upon a
  swap/migration.

Patch 3/11
  SPARC M7 processor adds new fields to control registers to support ADI
  feature. It also adds a new exception for precise traps on tag
  mismatch. This patch adds definitions for the new control register
  fields, new ASIs for ADI and an exception handler for the precise trap
  on tag mismatch.

Patch 4/11
  New hypervisor fault types were added by sparc M7 processor to support
  ADI feature. This patch adds code to handle these fault types for data
  access exception handler.

Patch 5/11
  When ADI is in use for a page and a tag mismatch occurs, processor
  raises "Memory corruption Detected" trap. This patch adds a handler
  for this trap.

Patch 6/11
  ADI usage is governed by ADI properties on a platform. These
  properties are provided to kernel by firmware. Thsi patch adds new
  auxiliary vectors that provide these values to userpsace.

Patch 7/11
  arch_validate_prot() is used to validate the new protection bits asked
  for by the userspace app. Validating protection bits may need the
  context of address space the bits are being applied to. One such
  example is PROT_ADI bit on sparc processor that enables ADI protection
  on an address range. ADI protection applies only to addresses covered
  by physical RAM and not other PFN mapped addresses or device
  addresses. This patch adds "address" to the parameters being passed to
  arch_validate_prot() to provide that context.

Patch 8/11
  When protection bits are changed on a page, kernel carries forward all
  protection bits except for read/write/exec. Additional code was added
  to allow kernel to clear PKEY bits on x86 but this requirement to
  clear other bits is not unique to x86. This patch extends the existing
  code to allow other architectures to clear any other protection bits
  as well on protection bit change.

Patch 9/11
  When a processor supports additional metadata on memory pages, that
  additional metadata needs to be copied to new memory pages when those
  pages are moved. This patch allows architecture specific code to
  replace the default copy_highpage() routine with arch specific
  version that copies the metadata as well besides the data on the page.

Patch 10/11
  This patch adds support for a user space task to enable ADI and enable
  tag checking for subsets of its address space. As part of enabling
  this feature, this patch adds to support manipulation of precise
  exception for memory corruption detection, adds code to save and
  restore tags on page swap and migration, and adds code to handle ADI
  tagged addresses for DMA.

Patch 11/11
  Update signal delivery code in arch/sparc/kernel/traps_64.c to use
  the new helper function force_sig_fault() added by commit
  f8ec6601 ("signal: Add send_sig_fault and force_sig_fault").

Changelog v12:

	  - Rebased to 4.16-rc2
	  - Added patch 11 to update signal delivery functions

Changelog v11:

	  - Rebased to 4.15

Changelog v10:

	  - Patch 1/10: Updated si_codes definitions for SEGV to match 4.14
	  - Patch 2/10: No changes
	  - Patch 3/10: Updated copyright
	  - Patch 4/10: No changes
	  - Patch 5/10: No changes
	  - Patch 6/10: Updated copyright
	  - Patch 7/10: No changes
	  - Patch 8/10: No changes
	  - Patch 9/10: No changes
	  - Patch 10/10: Added code to return from kernel path to set
	    PSTATE.mcde if kernel continues execution in another thread
	      (Suggested by Anthony)

Changelog v9:

	  - Patch 1/10: No changes
	  - Patch 2/10: No changes
	  - Patch 3/10: No changes
	  - Patch 4/10: No changes
	  - Patch 5/10: No changes
	  - Patch 6/10: No changes
	  - Patch 7/10: No changes
	  - Patch 8/10: No changes
	  - Patch 9/10: New patch
	  - Patch 10/10: Patch 9 from v8. Added code to copy ADI tags when
	    pages are migrated. Updated code to detect overflow and underflow
	      of addresses when allocating tag storage.

Changelog v8:

	  - Patch 1/9: No changes
	  - Patch 2/9: Fixed and erroneous "}"
	  - Patch 3/9: Minor print formatting change
	  - Patch 4/9: No changes
	  - Patch 5/9: No changes
	  - Patch 6/9: Added AT_ADI_UEONADI back
	  - Patch 7/9: Added addr parameter to powerpc arch_validate_prot()
	  - Patch 8/9: No changes
	  - Patch 9/9:
	    - Documentation updates
	      - Added an IPI on mprotect(...PROT_ADI...) call and
	      	  restore of TSTATE.MCDE on context switch
		  	  - Removed restriction on enabling ADI on read-only
			      memory
				- Changed kzalloc() for tag storage to use GFP_NOWAIT
				  - Added code to handle overflow and underflow when
				      allocating tag storage
				      		 - Replaced sun_m7_patch_1insn_range() with
						     sun4v_patch_1insn_range()
							- Added membar after restoring ADI tags in
							    copy_user_highpage()

Changelog v7:

	  - Patch 1/9: No changes
	  - Patch 2/9: Updated parameters to arch specific swap in/out
	    handlers
	    - Patch 3/9: No changes
	    - Patch 4/9: new patch split off from patch 4/4 in v6
	    - Patch 5/9: new patch split off from patch 4/4 in v6
	    - Patch 6/9: new patch split off from patch 4/4 in v6
	    - Patch 7/9: new patch
	    - Patch 8/9: new patch
	    - Patch 9/9:
	      - Enhanced arch_validate_prot() to enable ADI only on
	      	  writable addresses backed by physical RAM
		  	   - Added support for saving/restoring ADI tags for each
			       ADI block size address range on a page on swap in/out
			       	   - copy ADI tags on COW
				     - Updated values for auxiliary vectors to not conflict
				         with values on other architectures to avoid conflict
					        in glibc
						   - Disable same page merging on ADI enabled pages
						     - Enable ADI only on writable addresses backed by
						         physical RAM
							 	  - Split parts of patch off into separate patches

Changelog v6:
	  - Patch 1/4: No changes
	  - Patch 2/4: No changes
	  - Patch 3/4: Added missing nop in the delay slot in
	    sun4v_mcd_detect_precise
	    - Patch 4/4: Eliminated instructions to read and write PSTATE
	      as well as MCDPER and PMCDPER on every access to userspace
	        addresses by setting PSTATE and PMCDPER correctly upon entry
		  into kernel

Changelog v5:
	  - Patch 1/4: No changes
	  - Patch 2/4: Replaced set_swp_pte_at() with new architecture
	    functions arch_do_swap_page() and arch_unmap_one() that
	      suppoprt architecture specific actions to be taken on page
	        swap and migration
		- Patch 3/4: Fixed indentation issues in assembly code
		- Patch 4/4:
		  - Fixed indentation issues and instrcuctions in assembly
		      code
			- Removed CONFIG_SPARC64 from mdesc.c
			  - Changed to maintain state of MCDPER register in thread
			      info flags as opposed to in mm context. MCDPER is a
			      	     per-thread state and belongs in thread info flag as
				     		  opposed to mm context which is shared across threads.
						  	    Added comments to clarify this is a lazily maintained
							    	    state and must be updated on context switch and
								    	    copy_process()
									    		   - Updated code to use the new arch_do_swap_page() and
											       arch_unmap_one() functions

Testing:

- All functionality was tested with 8K normal pages as well as hugepages
  using malloc, mmap and shm.
- Multiple long duration stress tests were run using hugepages over 2+
  months. Normal pages were tested with shorter duration stress tests.
- Tested swapping with malloc and shm by reducing max memory and
  allocating three times the available system memory by active processes
  using ADI on allocated memory. Ran through multiple hours long runs of
  this test.
- Tested page migration with malloc and shm by migrating data pages of
  active ADI test process using migratepages, back and forth between two
  nodes every few seconds over an hour long run. Verified page migration
  through /proc/<pid>/numa_maps.
- Tested COW support using test that forks children that read from
  ADI enabled pages shared with parent and other children and write to
  them as well forcing COW.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

88fe3529

sparc64: Update signal delivery to use new helper functions · b9fa0365

Khalid Aziz authored Feb 21, 2018

Commit f8ec6601 ("signal: Add send_sig_fault and force_sig_fault")
added new helper functions to streamline signal delivery. This patch
updates signal delivery for new/updated handlers for ADI related
exceptions to use the helper function.
Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Khalid Aziz <khalid@gonehiking.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b9fa0365

sparc64: Add support for ADI (Application Data Integrity) · 74a04967

Khalid Aziz authored Feb 23, 2018

ADI is a new feature supported on SPARC M7 and newer processors to allow
hardware to catch rogue accesses to memory. ADI is supported for data
fetches only and not instruction fetches. An app can enable ADI on its
data pages, set version tags on them and use versioned addresses to
access the data pages. Upper bits of the address contain the version
tag. On M7 processors, upper four bits (bits 63-60) contain the version
tag. If a rogue app attempts to access ADI enabled data pages, its
access is blocked and processor generates an exception. Please see
Documentation/sparc/adi.txt for further details.

This patch extends mprotect to enable ADI (TSTATE.mcde), enable/disable
MCD (Memory Corruption Detection) on selected memory ranges, enable
TTE.mcd in PTEs, return ADI parameters to userspace and save/restore ADI
version tags on page swap out/in or migration. ADI is not enabled by
default for any task. A task must explicitly enable ADI on a memory
range and set version tag for ADI to be effective for the task.
Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Khalid Aziz <khalid@gonehiking.org>
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

74a04967

mm: Allow arch code to override copy_highpage() · a4602b62

Khalid Aziz authored Feb 21, 2018

Some architectures can support metadata for memory pages and when a
page is copied, its metadata must also be copied. Sparc processors
from M7 onwards support metadata for memory pages. This metadata
provides tag based protection for access to memory pages. To maintain
this protection, the tag data must be copied to the new page when a
page is migrated across NUMA nodes. This patch allows arch specific
code to override default copy_highpage() and copy metadata along
with page data upon migration.
Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Khalid Aziz <khalid@gonehiking.org>
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

a4602b62

mm: Clear arch specific VM flags on protection change · 2c2d57b5

Khalid Aziz authored Feb 21, 2018

When protection bits are changed on a VMA, some of the architecture
specific flags should be cleared as well. An examples of this are the
PKEY flags on x86. This patch expands the current code that clears
PKEY flags for x86, to support similar functionality for other
architectures as well.
Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Khalid Aziz <khalid@gonehiking.org>
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

2c2d57b5

mm: Add address parameter to arch_validate_prot() · 9035cf9a

Khalid Aziz authored Feb 21, 2018

A protection flag may not be valid across entire address space and
hence arch_validate_prot() might need the address a protection bit is
being set on to ensure it is a valid protection flag. For example, sparc
processors support memory corruption detection (as part of ADI feature)
flag on memory addresses mapped on to physical RAM but not on PFN mapped
pages or addresses mapped on to devices. This patch adds address to the
parameters being passed to arch_validate_prot() so protection bits can
be validated in the relevant context.
Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Khalid Aziz <khalid@gonehiking.org>
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

9035cf9a

sparc64: Add auxiliary vectors to report platform ADI properties · c6202ca7

Khalid Aziz authored Feb 21, 2018

ADI feature on M7 and newer processors has three important properties
relevant to userspace apps using ADI capabilities - (1) Size of block of
memory an ADI version tag applies to, (2) Number of uppermost bits in
virtual address used to encode ADI tag, and (3) The value M7 processor
will force the ADI tags to if it detects uncorrectable error in an ADI
tagged cacheline. Kernel can retrieve these properties for a platform
through machine description provided by the firmware. This patch adds
code to retrieve these properties and report them to userspace through
auxiliary vectors.
Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Khalid Aziz <khalid@gonehiking.org>
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c6202ca7

sparc64: Add handler for "Memory Corruption Detected" trap · 94addb35

Khalid Aziz authored Feb 21, 2018

M7 and newer processors add a "Memory corruption Detected" trap with
the addition of ADI feature. This trap is vectored into kernel by HV
through resumable error trap with error attribute for the resumable
error set to 0x00000800.
Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Khalid Aziz <khalid@gonehiking.org>
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

94addb35

sparc64: Add HV fault type handlers for ADI related faults · 52df948d

Khalid Aziz authored Feb 21, 2018

ADI (Application Data Integrity) feature on M7 and newer processors
adds new fault types for hypervisor - Invalid ASI and MCD disabled.
This patch expands data access exception handler to handle these
faults.
Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Khalid Aziz <khalid@gonehiking.org>
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

52df948d

sparc64: Add support for ADI register fields, ASIs and traps · 75037500

Khalid Aziz authored Feb 21, 2018

SPARC M7 processor adds new control register fields, ASIs and a new
trap to support the ADI (Application Data Integrity) feature. This
patch adds definitions for these register fields, ASIs and a handler
for the new precise memory corruption detected trap.
Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Khalid Aziz <khalid@gonehiking.org>
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

75037500

mm, swap: Add infrastructure for saving page metadata on swap · ca827d55

Khalid Aziz authored Feb 21, 2018

If a processor supports special metadata for a page, for example ADI
version tags on SPARC M7, this metadata must be saved when the page is
swapped out. The same metadata must be restored when the page is swapped
back in. This patch adds two new architecture specific functions -
arch_do_swap_page() to be called when a page is swapped in, and
arch_unmap_one() to be called when a page is being unmapped for swap
out. These architecture hooks allow page metadata to be saved if the
architecture supports it.
Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Khalid Aziz <khalid@gonehiking.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ca827d55

signals, sparc: Add signal codes for ADI violations · d84bb709

Khalid Aziz authored Feb 21, 2018

SPARC M7 processor introduces a new feature - Application Data
Integrity (ADI). ADI allows MMU to  catch rogue accesses to memory.
When a rogue access occurs, MMU blocks the access and raises an
exception. In response to the exception, kernel sends the offending
task a SIGSEGV with si_code that indicates the nature of exception.
This patch adds three new signal codes specific to ADI feature:

1. ADI is not enabled for the address and task attempted to access
   memory using ADI
2. Task attempted to access memory using wrong ADI tag and caused
   a deferred exception.
3. Task attempted to access memory using wrong ADI tag and caused
   a precise exception.
Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Khalid Aziz <khalid@gonehiking.org>
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d84bb709

16 Mar, 2018 13 commits

Merge tag 'for-4.16-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 8f5fd927

Linus Torvalds authored Mar 16, 2018

Pull btrfs fixes from David Sterba:
 "There's an important revert in this pull request that needs to go to
  stable as it causes a corruption on big endian machines.

  The other fix is for FIEMAP incorrectly reporting shared extents
  before a sync and one fix for a crash in raid56.

  So far we got only one report about the BE corruption, the stable
  kernels were out for like a week, so hopefully the scope of the damage
  is low"

* tag 'for-4.16-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  Revert "btrfs: use proper endianness accessors for super_copy"
  btrfs: add missing initialization in btrfs_check_shared
  btrfs: Fix NULL pointer exception in find_bio_stripe

8f5fd927

Merge tag 'microblaze-4.16-rc6' of git://git.monstr.eu/linux-2.6-microblaze · 8757ae23

Linus Torvalds authored Mar 16, 2018

Pull microblaze fixes from Michal Simek:

 - Use NO_BOOTMEM to fix boot issue

 - Fix opt lib endian dependencies

* tag 'microblaze-4.16-rc6' of git://git.monstr.eu/linux-2.6-microblaze:
  microblaze: switch to NO_BOOTMEM
  microblaze: remove unused alloc_maybe_bootmem
  microblaze: Setup dependencies for ASM optimized lib functions

8757ae23

Merge tag 'drm-fixes-for-v4.16-rc6' of git://people.freedesktop.org/~airlied/linux · 1660a76a

Linus Torvalds authored Mar 16, 2018

Pull drm fixes from Dave Airlie:
 "i915, amd and nouveau fixes.

  i915:
   - backlight fix for some panels
   - pm fix
   - fencing fix
   - some GVT fixes

  amdgpu:
   - backlight fix across suspend/resume
   - object destruction ordering issue fix
   - displayport fix

  nouveau:
   - two backlight fixes
   - fix for some lockups

  Pretty quiet week, seems like everyone was fixing backlights"

* tag 'drm-fixes-for-v4.16-rc6' of git://people.freedesktop.org/~airlied/linux:
  drm/nouveau/bl: fix backlight regression
  drm/nouveau/bl: Fix oops on driver unbind
  drm/nouveau/mmu: ALIGN_DOWN correct variable
  drm/i915/gvt: fix user copy warning by whitelist workload rb_tail field
  drm/i915/gvt: Correct the privilege shadow batch buffer address
  drm/amdgpu/dce: Don't turn off DP sink when disconnected
  drm/amdgpu: save/restore backlight level in legacy dce code
  drm/radeon: fix prime teardown order
  drm/amdgpu: fix prime teardown order
  drm/i915: Kick the rps worker when changing the boost frequency
  drm/i915: Only prune fences after wait-for-all
  drm/i915: Enable VBT based BL control for DP
  drm/i915/gvt: keep oa config in shadow ctx
  drm/i915/gvt: Add runtime_pm_get/put into gvt_switch_mmio

1660a76a

Revert "btrfs: use proper endianness accessors for super_copy" · 093e037c

David Sterba authored Mar 16, 2018

This reverts commit 3c181c12.

The offending patch was merged in 4.16-rc4 and was promptly applied to
stable kernels 4.14.25 and 4.15.8.

The patch causes a corruption in several superblock items on big-endian
machines because of messed up endianity conversions. The damage is
manually repairable. A filesystem cannot be mounted again after it has
been unmounted once.

We do a full revert and not a fixup so stable can pick that patch ASAP.

Fixes: 3c181c12 ("btrfs: use proper endianness accessors for super_copy")
Link: https://lkml.kernel.org/r/1521139304@msgid.manchmal.in-ulm.de
CC: stable@vger.kernel.org # 4.14+
Reported-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
Signed-off-by: David Sterba <dsterba@suse.com>

093e037c

microblaze: switch to NO_BOOTMEM · 101646a2

Rob Herring authored Mar 09, 2018

Microblaze doesn't set CONFIG_NO_BOOTMEM and so memblock_virt_alloc()
doesn't work for CONFIG_HAVE_MEMBLOCK && !CONFIG_NO_BOOTMEM.

Similar change was already done by others architectures
"ARM: mm: Remove bootmem code and switch to NO_BOOTMEM"
(sha1: 84f452b1)
or
"openrisc: Consolidate setup to use memblock instead of bootmem"
(sha1: 266c7fad)
or
"parisc: Drop bootmem and switch to memblock"
(sha1: 4fe9e1d9)
or
"powerpc: Remove bootmem allocator"
(sha1: 10239733)
or
"s390/mm: Convert bootmem to memblock"
(sha1: 50be6345)
or
"sparc64: Convert over to NO_BOOTMEM."
(sha1: 625d693e)
or
"xtensa: drop sysmem and switch to memblock"
(sha1: 0e46c111)

Issue was introduced by:
"of/fdt: use memblock_virt_alloc for early alloc"
(sha1: 0fa1c579)
Signed-off-by: Rob Herring <robh@kernel.org>
Tested-by: Alvaro Gamez Machado <alvaro.gamez@hazent.com>
Tested-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>

101646a2

microblaze: remove unused alloc_maybe_bootmem · cd4dfee6

Rob Herring authored Mar 09, 2018

alloc_maybe_bootmem is unused, so remove it.
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>

cd4dfee6

microblaze: Setup dependencies for ASM optimized lib functions · 18ffc0cc

Michal Simek authored Feb 22, 2018

The patch:
"microblaze: Setup proper dependency for optimized lib functions"
(sha1: 7b6ce52b)
didn't setup all dependencies properly.
Optimized lib functions in C are also present for little endian
and optimized library functions in assembler are implemented only for
big endian version.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>

18ffc0cc

Merge tag 'drm-intel-fixes-2018-03-15' of... · 3a1b5de3

Dave Airlie authored Mar 16, 2018

Merge tag 'drm-intel-fixes-2018-03-15' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

Only GVT fixes:
- Two warnings fix for runtime pm and usr copy (Xiong, Zhenyu)
- OA context fix for vGPU profiling (Min)
- privilege batch buffer reloc fix (Fred)

* tag 'drm-intel-fixes-2018-03-15' of git://anongit.freedesktop.org/drm/drm-intel:
  drm/i915/gvt: fix user copy warning by whitelist workload rb_tail field
  drm/i915/gvt: Correct the privilege shadow batch buffer address
  drm/i915/gvt: keep oa config in shadow ctx
  drm/i915/gvt: Add runtime_pm_get/put into gvt_switch_mmio

3a1b5de3

Merge branch 'linux-4.16' of git://github.com/skeggsb/linux into drm-fixes · d4487b57

Dave Airlie authored Mar 16, 2018

nouveau regression fixes.

* 'linux-4.16' of git://github.com/skeggsb/linux:
  drm/nouveau/bl: fix backlight regression
  drm/nouveau/bl: Fix oops on driver unbind
  drm/nouveau/mmu: ALIGN_DOWN correct variable

d4487b57

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · df09348f

Linus Torvalds authored Mar 15, 2018

Pull vfs fixes from Al Viro:

 - backport-friendly part of lock_parent() race fix

 - a fix for an assumption in the heurisic used by path_connected() that
   is not true on NFS

 - livelock fixes for d_alloc_parallel()

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs: Teach path_connected to handle nfs filesystems with multiple roots.
  fs: dcache: Use READ_ONCE when accessing i_dir_seq
  fs: dcache: Avoid livelock between d_alloc_parallel and __d_add
  lock_parent() needs to recheck if dentry got __dentry_kill'ed under it

df09348f

drm/nouveau/bl: fix backlight regression · 9e75dc61

Karol Herbst authored Feb 19, 2018

Fixes: 3c66c87d ("drm/nouveau/disp: remove hw-specific customisation
of output paths")
Suggested-by: Ben Skeggs <skeggsb@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

9e75dc61

drm/nouveau/bl: Fix oops on driver unbind · 76f2e2bc

Lukas Wunner authored Feb 17, 2018

Unbinding nouveau on a dual GPU MacBook Pro oopses because we iterate
over the bl_connectors list in nouveau_backlight_exit() but skipped
initializing it in nouveau_backlight_init().  Stacktrace for posterity:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
    IP: nouveau_backlight_exit+0x2b/0x70 [nouveau]
    nouveau_display_destroy+0x29/0x80 [nouveau]
    nouveau_drm_unload+0x65/0xe0 [nouveau]
    drm_dev_unregister+0x3c/0xe0 [drm]
    drm_put_dev+0x2e/0x60 [drm]
    nouveau_drm_device_remove+0x47/0x70 [nouveau]
    pci_device_remove+0x36/0xb0
    device_release_driver_internal+0x157/0x220
    driver_detach+0x39/0x70
    bus_remove_driver+0x51/0xd0
    pci_unregister_driver+0x2a/0xa0
    nouveau_drm_exit+0x15/0xfb0 [nouveau]
    SyS_delete_module+0x18c/0x290
    system_call_fast_compare_end+0xc/0x6f

Fixes: b53ac1ee ("drm/nouveau/bl: Do not register interface if Apple GMUX detected")
Cc: stable@vger.kernel.org # v4.10+
Cc: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

76f2e2bc

drm/nouveau/mmu: ALIGN_DOWN correct variable · da5e45e6

Māris Nartišs authored Mar 16, 2018

Commit 7110c89bb8852ff8b0f88ce05b332b3fe22bd11e ("mmu: swap out round
for ALIGN") replaced two calls to round/rounddown with ALIGN/ALIGN_DOWN,
but erroneously applied ALIGN_DOWN to a different variable (addr) and left
intended variable (tail) not rounded/ALIGNed.

As a result screen corruption, X lockups are observable. An example of kernel
log of affected system with NV98 card where it was bisected:

nouveau 0000:01:00.0: gr: TRAP_M2MF 00000002 [IN]
nouveau 0000:01:00.0: gr: TRAP_M2MF 00320951 400007c0 00000000 04000000
nouveau 0000:01:00.0: gr: 00200000 [] ch 1 [000fbbe000 DRM] subc 4 class 5039
mthd 0100 data 00000000
nouveau 0000:01:00.0: fb: trapped read at 0040000000 on channel 1
[0fbbe000 DRM]
engine 00 [PGRAPH] client 03 [DISPATCH] subclient 04 [M2M_IN] reason 00000006
[NULL_DMAOBJ]

Fixes bug 105173 ("[MCP79][Regression] Unhandled NULL pointer dereference in
nvkm_object_unmap since kernel 4.15")
https://bugs.freedesktop.org/show_bug.cgi?id=105173

Fixes: 7110c89bb885 ("mmu: swap out round for ALIGN ")
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Maris Nartiss <maris.nartiss@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org # v4.15+

da5e45e6

15 Mar, 2018 8 commits

fs: Teach path_connected to handle nfs filesystems with multiple roots. · 95dd7758

Eric W. Biederman authored Mar 14, 2018

On nfsv2 and nfsv3 the nfs server can export subsets of the same
filesystem and report the same filesystem identifier, so that the nfs
client can know they are the same filesystem.  The subsets can be from
disjoint directory trees.  The nfsv2 and nfsv3 filesystems provides no
way to find the common root of all directory trees exported form the
server with the same filesystem identifier.

The practical result is that in struct super s_root for nfs s_root is
not necessarily the root of the filesystem.  The nfs mount code sets
s_root to the root of the first subset of the nfs filesystem that the
kernel mounts.

This effects the dcache invalidation code in generic_shutdown_super
currently called shrunk_dcache_for_umount and that code for years
has gone through an additional list of dentries that might be dentry
trees that need to be freed to accomodate nfs.

When I wrote path_connected I did not realize nfs was so special, and
it's hueristic for avoiding calling is_subdir can fail.

The practical case where this fails is when there is a move of a
directory from the subtree exposed by one nfs mount to the subtree
exposed by another nfs mount.  This move can happen either locally or
remotely.  With the remote case requiring that the move directory be cached
before the move and that after the move someone walks the path
to where the move directory now exists and in so doing causes the
already cached directory to be moved in the dcache through the magic
of d_splice_alias.

If someone whose working directory is in the move directory or a
subdirectory and now starts calling .. from the initial mount of nfs
(where s_root == mnt_root), then path_connected as a heuristic will
not bother with the is_subdir check.  As s_root really is not the root
of the nfs filesystem this heuristic is wrong, and the path may
actually not be connected and path_connected can fail.

The is_subdir function might be cheap enough that we can call it
unconditionally.  Verifying that will take some benchmarking and
the result may not be the same on all kernels this fix needs
to be backported to.  So I am avoiding that for now.

Filesystems with snapshots such as nilfs and btrfs do something
similar.  But as the directory tree of the snapshots are disjoint
from one another and from the main directory tree rename won't move
things between them and this problem will not occur.

Cc: stable@vger.kernel.org
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Fixes: 397d425d ("vfs: Test for and handle paths that are unreachable from their mnt_root")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

95dd7758

Merge tag 'gvt-fixes-2018-03-15' of https://github.com/intel/gvt-linux into drm-intel-fixes · 05b429a8

Rodrigo Vivi authored Mar 15, 2018

gvt-fixes-2018-03-15

- Two warnings fix for runtime pm and usr copy (Xiong, Zhenyu)
- OA context fix for vGPU profiling (Min)
- privilege batch buffer reloc fix (Fred)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180315100023.5n5a74afky6qinoh@zhen-hp.sh.intel.com

05b429a8

sparc64: Fix regression in pmdp_invalidate(). · cfb61b5e

David S. Miller authored Mar 15, 2018

pmdp_invalidate() was changed to update the pmd atomically
(to not lose dirty/access bits) and return the original pmd
value.

However, in doing so, we lost a lot of the essential work that
set_pmd_at() does, namely to update hugepage mapping counts and
queuing up the batched TLB flush entry.

Thus we were not flushing entries out of the TLB when making
such PMD changes.

Fix this by abstracting the accounting work of set_pmd_at() out into a
separate function, and call it from pmdp_establish().

Fixes: a8e654f0 ("sparc64: update pmdp_invalidate() to return old pmd value")
Signed-off-by: David S. Miller <davem@davemloft.net>

cfb61b5e

Merge tag 'sound-4.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · e2c15aff

Linus Torvalds authored Mar 15, 2018

Pull sound fixes from Takashi Iwai:
 "A series of small fixes in ASoC, HD-audio and core stuff:

   - a UAF fix in ALSA PCM core

   - yet more hardening for ALSA sequencer

   - a regression fix for the previous HD-audio power_save option change

   - various ASoC codec fixes (sgtl5000, rt5651, hdmi-codec, wm_adsp)

   - minor ASoC platform fixes (AMD ACP, sun4i)"

* tag 'sound-4.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Revert power_save option default value
  ALSA: pcm: Fix UAF in snd_pcm_oss_get_formats()
  ALSA: seq: Clear client entry before deleting else at closing
  ALSA: seq: Fix possible UAF in snd_seq_check_queue()
  ASoC: amd: 16bit resolution support for i2s sp instance
  ASoC: wm_adsp: For TLV controls only register TLV get/set
  ASoC: sun4i-i2s: Fix RX slot number of SUN8I
  ASoC: hdmi-codec: Fix module unloading caused kernel crash
  ASoC: rt5651: Fix regcache sync errors on resume
  ASoC: sgtl5000: Fix suspend/resume
  MAINTAINERS: Add myself as sgtl5000 maintainer
  ASoC: samsung: Add the DT binding files entry to MAINTAINERS
  sgtl5000: change digital_mute policy

e2c15aff

Merge tag 'for-4.16/dm-fixes-3' of... · 667058ae

Linus Torvalds authored Mar 15, 2018

Merge tag 'for-4.16/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

 - a stable DM multipath fix to restore ability to pass integrity data

 - two DM multipath fixes for a fix that was merged into 4.16-rc5

* tag 'for-4.16/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm mpath: fix passing integrity data
  dm mpath: eliminate need to use scsi_device_from_queue
  dm mpath: fix uninitialized 'pg_init_wait' waitqueue_head NULL pointer

667058ae

drm/i915/gvt: fix user copy warning by whitelist workload rb_tail field · 850555d1

Zhenyu Wang authored Feb 14, 2018

This is to fix warning got as:

[ 6730.476938] ------------[ cut here ]------------
[ 6730.476979] Bad or missing usercopy whitelist? Kernel memory exposure attempt detected from SLAB object 'gvt-g_vgpu_workload' (offset 120, size 4)!
[ 6730.477021] WARNING: CPU: 2 PID: 441 at mm/usercopy.c:81 usercopy_warn+0x7e/0xa0
[ 6730.477042] Modules linked in: tun(E) bridge(E) stp(E) llc(E) kvmgt(E) x86_pkg_temp_thermal(E) vfio_mdev(E) intel_powerclamp(E) mdev(E) coretemp(E) vfio_iommu_type1(E) vfio(E) kvm_intel(E) kvm(E) hid_generic(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) usbhid(E) i915(E) crc32c_intel(E) hid(E) ghash_clmulni_intel(E) pcbc(E) aesni_intel(E) aes_x86_64(E) crypto_simd(E) cryptd(E) glue_helper(E) intel_cstate(E) idma64(E) evdev(E) virt_dma(E) iTCO_wdt(E) intel_uncore(E) intel_rapl_perf(E) intel_lpss_pci(E) sg(E) shpchp(E) mei_me(E) pcspkr(E) iTCO_vendor_support(E) intel_lpss(E) intel_pch_thermal(E) prime_numbers(E) mei(E) mfd_core(E) video(E) acpi_pad(E) button(E) binfmt_misc(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) fscrypto(E) sd_mod(E) e1000e(E) xhci_pci(E) sdhci_pci(E)
[ 6730.477244]  ptp(E) cqhci(E) xhci_hcd(E) pps_core(E) sdhci(E) mmc_core(E) i2c_i801(E) usbcore(E) thermal(E) fan(E)
[ 6730.477276] CPU: 2 PID: 441 Comm: gvt workload 0 Tainted: G            E    4.16.0-rc1-gvt-staging-0213+ #127
[ 6730.477303] Hardware name:  /NUC6i5SYB, BIOS SYSKLi35.86A.0039.2016.0316.1747 03/16/2016
[ 6730.477326] RIP: 0010:usercopy_warn+0x7e/0xa0
[ 6730.477340] RSP: 0018:ffffba6301223d18 EFLAGS: 00010286
[ 6730.477355] RAX: 0000000000000000 RBX: ffff8f41caae9838 RCX: 0000000000000006
[ 6730.477375] RDX: 0000000000000007 RSI: 0000000000000082 RDI: ffff8f41dad166f0
[ 6730.477395] RBP: 0000000000000004 R08: 0000000000000576 R09: 0000000000000000
[ 6730.477415] R10: ffffffffb1293fb2 R11: 00000000ffffffff R12: 0000000000000001
[ 6730.477447] R13: ffff8f41caae983c R14: ffff8f41caae9838 R15: 00007f183ca2b000
[ 6730.477467] FS:  0000000000000000(0000) GS:ffff8f41dad00000(0000) knlGS:0000000000000000
[ 6730.477489] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6730.477506] CR2: 0000559462817291 CR3: 000000028b46c006 CR4: 00000000003626e0
[ 6730.477526] Call Trace:
[ 6730.477537]  __check_object_size+0x9c/0x1a0
[ 6730.477562]  __kvm_write_guest_page+0x45/0x90 [kvm]
[ 6730.477585]  kvm_write_guest+0x46/0x80 [kvm]
[ 6730.477599]  kvmgt_rw_gpa+0x9b/0xf0 [kvmgt]
[ 6730.477642]  workload_thread+0xa38/0x1040 [i915]
[ 6730.477659]  ? do_wait_intr_irq+0xc0/0xc0
[ 6730.477673]  ? finish_wait+0x80/0x80
[ 6730.477707]  ? clean_workloads+0x120/0x120 [i915]
[ 6730.477722]  kthread+0x111/0x130
[ 6730.477733]  ? _kthread_create_worker_on_cpu+0x60/0x60
[ 6730.477750]  ? exit_to_usermode_loop+0x6f/0xb0
[ 6730.477766]  ret_from_fork+0x35/0x40
[ 6730.477777] Code: 48 c7 c0 20 e3 25 b1 48 0f 44 c2 41 50 51 41 51 48 89 f9 49 89 f1 4d 89 d8 4c 89 d2 48 89 c6 48 c7 c7 78 e3 25 b1 e8 b2 bc e4 ff <0f> ff 48 83 c4 18 c3 48 c7 c6 09 d0 26 b1 49 89 f1 49 89 f3 eb
[ 6730.477849] ---[ end trace cae869c1c323e45a ]---

By whitelist guest page write from workload struct allocated from kmem cache.
Reviewed-by: Hang Yuan <hang.yuan@linux.intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
(cherry picked from commit 5627705406874df57fdfad3b4e0c9aedd3b007df)

850555d1

drm/i915/gvt: Correct the privilege shadow batch buffer address · ef75c685

fred gao authored Mar 15, 2018

Once the ring buffer is copied to ring_scan_buffer and scanned,
the shadow batch buffer start address is only updated into
ring_scan_buffer, not the real ring address allocated through
intel_ring_begin in later copy_workload_to_ring_buffer.

This patch is only to set the right shadow batch buffer address
from Ring buffer, not include the shadow_wa_ctx.

v2:
- refine some comments. (Zhenyu)
v3:
- fix typo in title. (Zhenyu)
v4:
- remove the unnecessary comments. (Zhenyu)
- add comments in bb_start_cmd_va update. (Zhenyu)

Fixes: 0a53bc07 ("drm/i915/gvt: Separate cmd scan from request allocation")
Cc: stable@vger.kernel.org  # v4.15
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Yulei Zhang <yulei.zhang@intel.com>
Signed-off-by: fred gao <fred.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

ef75c685

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 0aa3fdb8

Linus Torvalds authored Mar 14, 2018

Pull SCSI fixes from James Bottomley:
 "This is four patches, consisting of one regression from the merge
  window (qla2xxx), one long-standing memory leak (sd_zbc), one event
  queue mislabelling which we want to eliminate to discourage the
  pattern (mpt3sas), and one behaviour change because re-reading the
  partition table shouldn't clear the ro flag"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: sd: Keep disk read-only when re-reading partition
  scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failure
  scsi: sd_zbc: Fix potential memory leak
  scsi: mpt3sas: Do not mark fw_event workqueue as WQ_MEM_RECLAIM

0aa3fdb8

14 Mar, 2018 6 commits

btree: avoid variable-length allocations · 8df3aaaf

Joern Engel authored Mar 13, 2018

geo->keylen cannot be larger than 4.  So we might as well make
fixed-size allocations.

Given the one remaining user, geo->keylen cannot even be larger than 1.
Logfs used to have 64bit and 128bit keys, tcm_qla2xxx only has 32bit
keys.  But let's not break the code if we don't have to.
Signed-off-by: Joern Engel <joern@purestorage.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8df3aaaf

Merge branch 'percpu_ref-rcu-audit-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc · fed8f509

Linus Torvalds authored Mar 14, 2018

Pull percpu_ref rcu fixes from Tejun Heo:
 "Jann Horn found that aio was depending on the internal RCU grace
  periods of percpu-ref and that it's broken because aio uses regular
  RCU while percpu_ref uses sched-RCU.

  Depending on percpu_ref's internal grace periods isn't a good idea
  because

   - The RCU type might not match.

   - percpu_ref's grace periods are used to switch to atomic mode. They
     aren't between the last put and the invocation of the last release.
     This is easy to get confused about and can lead to subtle bugs.

   - percpu_ref might not have grace periods at all depending on its
     current operation mode.

  This patchset audits and fixes percpu_ref users for their RCU usages"

[ There's a continuation of this series that clarifies percpu_ref
  documentation that the internal grace periods must not be depended
  upon, and introduces rcu_work to simplify bouncing to a workqueue
  after an RCU grace period.

  That will go in for 4.17 - this is just the minimal set with the fixes
  that are tagged for -stable ]

* 'percpu_ref-rcu-audit-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc:
  RDMAVT: Fix synchronization around percpu_ref
  fs/aio: Use RCU accessors for kioctx_table->table[]
  fs/aio: Add explicit RCU grace period when freeing kioctx

fed8f509

Revert "mm/page_alloc: fix memmap_init_zone pageblock alignment" · 3e04040d

Ard Biesheuvel authored Mar 14, 2018

This reverts commit 864b75f9.

Commit 864b75f9 ("mm/page_alloc: fix memmap_init_zone pageblock
alignment") modified the logic in memmap_init_zone() to initialize
struct pages associated with invalid PFNs, to appease a VM_BUG_ON()
in move_freepages(), which is redundant by its own admission, and
dereferences struct page fields to obtain the zone without checking
whether the struct pages in question are valid to begin with.

Commit 864b75f9 only makes it worse, since the rounding it does
may cause pfn assume the same value it had in a prior iteration of
the loop, resulting in an infinite loop and a hang very early in the
boot. Also, since it doesn't perform the same rounding on start_pfn
itself but only on intermediate values following an invalid PFN, we
may still hit the same VM_BUG_ON() as before.

So instead, let's fix this at the core, and ensure that the BUG
check doesn't dereference struct page fields of invalid pages.

Fixes: 864b75f9 ("mm/page_alloc: fix memmap_init_zone pageblock alignment")
Tested-by: Jan Glauber <jglauber@cavium.com>
Tested-by: Shanker Donthineni <shankerd@codeaurora.org>
Cc: Daniel Vacek <neelx@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3e04040d

Merge tag 'drm-intel-fixes-2018-03-14' of... · 67f19766

Dave Airlie authored Mar 15, 2018

Merge tag 'drm-intel-fixes-2018-03-14' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- 1 display fix for bxt
- 1 gem fix for fences
- 1 gem/pm fix for rps freq

* tag 'drm-intel-fixes-2018-03-14' of git://anongit.freedesktop.org/drm/drm-intel:
  drm/i915: Kick the rps worker when changing the boost frequency
  drm/i915: Only prune fences after wait-for-all
  drm/i915: Enable VBT based BL control for DP

67f19766

Merge branch 'drm-fixes-4.16' of git://people.freedesktop.org/~agd5f/linux into drm-fixes · 4cdc8f12

Dave Airlie authored Mar 15, 2018

A few fixes for 4.16:
- Fix a backlight S/R regression on amdgpu
- Fix prime teardown on radeon and amdgpu
- DP fix for amdgpu

* 'drm-fixes-4.16' of git://people.freedesktop.org/~agd5f/linux:
  drm/amdgpu/dce: Don't turn off DP sink when disconnected
  drm/amdgpu: save/restore backlight level in legacy dce code
  drm/radeon: fix prime teardown order
  drm/amdgpu: fix prime teardown order

4cdc8f12

btrfs: add missing initialization in btrfs_check_shared · 18bf591b

Edmund Nadolski authored Mar 14, 2018

This patch addresses an issue that causes fiemap to falsely
report a shared extent.  The test case is as follows:

xfs_io -f -d -c "pwrite -b 16k 0 64k" -c "fiemap -v" /media/scratch/file5
sync
xfs_io  -c "fiemap -v" /media/scratch/file5

which gives the resulting output:

wrote 65536/65536 bytes at offset 0
64 KiB, 4 ops; 0.0000 sec (121.359 MiB/sec and 7766.9903 ops/sec)
/media/scratch/file5:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..127]:        24576..24703       128 0x2001
/media/scratch/file5:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..127]:        24576..24703       128   0x1

This is because btrfs_check_shared calls find_parent_nodes
repeatedly in a loop, passing a share_check struct to report
the count of shared extent. But btrfs_check_shared does not
re-initialize the count value to zero for subsequent calls
from the loop, resulting in a false share count value. This
is a regressive behavior from 4.13.

With proper re-initialization the test result is as follows:

wrote 65536/65536 bytes at offset 0
64 KiB, 4 ops; 0.0000 sec (110.035 MiB/sec and 7042.2535 ops/sec)
/media/scratch/file5:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..127]:        24576..24703       128   0x1
/media/scratch/file5:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..127]:        24576..24703       128   0x1

which corrects the regression.

Fixes: 3ec4d323 ("btrfs: allow backref search checks for shared extents")
Signed-off-by: Edmund Nadolski <enadolski@suse.com>
[ add text from cover letter to changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>

18bf591b