Commits · baef9a35d24626ebdc5a9074455e63eea6c7f6af · Kirill Smelkov / linux

19 Dec, 2016 14 commits

mailbox: mailbox-test: add support for fasync/poll · baef9a35

Sudeep Holla authored Nov 29, 2016

Currently the read operation on the message debug file returns error if
there's no data ready to be read. It expects the userspace to retry if
it fails. Since the mailbox response could be asynchronous, it would be
good to add support to block the read until the data is available.

We can also implement poll file operations so that the userspace can
wait to become ready to perform any I/O.

This patch implements the poll and fasync file operation callback for
the test mailbox device.

Cc: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>

baef9a35

mailbox: bcm-pdc: Remove unnecessary void* casts · cf175813

Rob Rice authored Nov 14, 2016

Remove unnecessary void* casts in register writes. Fix two other
minor formatting issues.
Signed-off-by: Rob Rice <rob.rice@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>

cf175813

mailbox: bcm-pdc: Simplify interrupt handler logic · 30d1ef62

Rob Rice authored Nov 14, 2016

Earlier versions of the PDC driver registered for both
transmit and receive interrupts. The hard IRQ handler had to
communicate to the soft handler which interrupt(s) had occurred.
The PDC driver no longer registers for tx interrupts. So there is
no reason to save the intstatus. So remove the intstatus member
of the PDC state.
Signed-off-by: Rob Rice <rob.rice@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>

30d1ef62

mailbox: bcm-pdc: Performance improvements · 63bb50bd

Rob Rice authored Nov 14, 2016

Three changes to improve performance in the PDC driver:
- disable and reenable interrupts while the interrupt handler is
running
- update rxin and txin descriptor indexes more efficiently
- group receive descriptor context into a structure and keep
context in a single array rather than five to improve locality
of reference
Signed-off-by: Rob Rice <rob.rice@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>

63bb50bd

mailbox: bcm-pdc: Don't use iowrite32 to write DMA descriptors · 38ed49ed

Rob Rice authored Nov 14, 2016

In PDC driver, it is not necessary to use iowrite32()
when writing DMA descriptors to the transmit and receive rings.
The ring memory is in host memory. So convert to normal
assignment statements.
Signed-off-by: Rob Rice <rob.rice@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>

38ed49ed

mailbox: bcm-pdc: Convert from threaded IRQ to tasklet · 8aef00f0

Rob Rice authored Nov 14, 2016

Previously used threaded IRQs in the PDC driver to defer
processing the rx DMA ring after getting an rx done interrupt.
Instead, use a tasklet at normal priority for deferred processing.
Signed-off-by: Rob Rice <rob.rice@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>

8aef00f0

mailbox: bcm-pdc: Try to improve branch prediction · 7493cde3

Rob Rice authored Nov 14, 2016

Use likely/unlikely directives to improve branch prediction.
Signed-off-by: Rob Rice <rob.rice@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>

7493cde3

mailbox: bcm-pdc: streamline rx code · e004c7e7

Rob Rice authored Nov 14, 2016

Remove the unnecessary rmb() from the receive path.

If the rx ring has multiple messages ready, avoid reading
last_rx_curr multiple times from the register.
Signed-off-by: Rob Rice <rob.rice@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>

e004c7e7

mailbox: bcm-pdc: Convert from interrupts to poll for tx done · ab8d1b2d

Rob Rice authored Nov 14, 2016

The PDC driver is a mailbox controller. A mailbox controller
can report that a mailbox message has been "transmitted" either when
a tx interrupt fires or by having the mailbox framework poll. This
commit converts the PDC driver to the poll method. We found that the
tx interrupt happens when the descriptors are read by the SPU hw. Thus,
the interrupt method does not allow more than one tx message in the PDC
tx DMA ring at a time. To keep the SPU hw busy, we would like to keep
the tx ring full under heavy load.

With the poll method, the PDC driver responds that the previous message
has been transmitted if the tx ring has space for another message.
SPU request messages take a variable number of descriptors. If 15
descriptors are available, there is a good chance another message will
fit. Also increased the ring size from 128 to 512 descriptors.

With this change, I found the PDC driver hangs on its spinlock under
heavy load. The PDC spinlock is not required; so I removed it. Calls
to pdc_send_data() are already synchronized because of the channel
spinlock in the mailbox framework. Other references to ring indexes
should not require locking because they only written on either the
tx or rx side.
Signed-off-by: Rob Rice <rob.rice@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>

ab8d1b2d

mailbox: bcm-pdc: PDC driver leaves debugfs files after removal · 9310f1de

Steve Lin authored Nov 14, 2016

Minor fix to ensure that debugfs stats pseudo-files are
removed when driver module is unloaded.  Previously, the call to
debugfs_remove_recursive() was never being called since the
directory was not empty, and a seg fault would occur if another
process tried to access these leftover files.
Signed-off-by: Steve Lin <steven.lin1@broadcom.com>
Signed-off-by: Rob Rice <rob.rice@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>

9310f1de

mailbox: bcm-pdc: Changes so mbox client can be removed / re-inserted · 9fb0f9ac

Steve Lin authored Nov 14, 2016

Ensure that DMA is disabled, and pointers reset, when changing
DMA base addresses in pdc_ring_init().  This allows a mailbox client
to be re-inserted after being removed.  Otherwise, the DMA doesn't
restart so the client hangs while being reinserted.
Signed-off-by: Steve Lin <steven.lin1@broadcom.com>
Signed-off-by: Rob Rice <rob.rice@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>

9fb0f9ac

mailbox: bcm-pdc: Use octal permissions rather than symbolic · 9b1b2b3a

Rob Rice authored Nov 14, 2016

When creating the debugfs files for the PDC driver, use
octal file permissions rather than symbolic file permissions.
Signed-off-by: Rob Rice <rob.rice@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>

9b1b2b3a

mailbox: sti: Fix module autoload for OF registration · 2f50497d

Javier Martinez Canillas authored Oct 20, 2016

If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.

Export the module alias information using the MODULE_DEVICE_TABLE() macro.

Before this patch:

$ modinfo drivers/mailbox/mailbox-sti.ko | grep alias
alias:          platform:mailbox-sti

After this patch:

$ modinfo drivers/mailbox/mailbox-sti.ko | grep alias
alias:          platform:mailbox-sti
alias:          of:N*T*Cst,stih407-mailboxC*
alias:          of:N*T*Cst,stih407-mailbox
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>

2f50497d

mailbox: mailbox-test: Fix module autoload · f42cce3c

Javier Martinez Canillas authored Oct 20, 2016

If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.

Export the module alias information using the MODULE_DEVICE_TABLE() macro.

Before this patch:

$ modinfo drivers/mailbox/mailbox-test.ko | grep alias
$

After this patch:

$ modinfo drivers/mailbox/mailbox-test.ko | grep alias
alias:          of:N*T*Cmailbox-testC*
alias:          of:N*T*Cmailbox-test
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>

f42cce3c

24 Nov, 2016 4 commits

Merge tag 'mmc-v4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · 16ae16c6

Linus Torvalds authored Nov 24, 2016

Pull MMC fixes from Ulf Hansson:
 "MMC host:

   - sdhci-of-esdhc: Fix card detection
   - dw_mmc: Fix DMA error path"

* tag 'mmc-v4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: dw_mmc: fix the error handling for dma operation
  mmc: sdhci-of-esdhc: fixup PRESENT_STATE read

16ae16c6

Merge tag 'usb-4.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · bae73e80

Linus Torvalds authored Nov 24, 2016

Pull USB fixes from Greg KH:
 "Here are a few small USB fixes and new device ids for 4.9-rc7.

  The majority of these fixes are in the musb driver, fixing a number of
  regressions that have been reported but took a while to resolve. The
  other fixes are all small ones, to resolve other reported minor
  issues.

  All have been in linux-next for a while with no reported issues"

* tag 'usb-4.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: gadget: f_fs: fix wrong parenthesis in ffs_func_req_match()
  phy: twl4030-usb: Fix for musb session bit based PM
  usb: musb: Drop pointless PM runtime code for dsps glue
  usb: musb: Add missing pm_runtime_disable and drop 2430 PM timeout
  usb: musb: Fix PM for hub disconnect
  usb: musb: Fix sleeping function called from invalid context for hdrc glue
  usb: musb: Fix broken use of static variable for multiple instances
  USB: serial: cp210x: add ID for the Zone DPMX
  usb: chipidea: move the lock initialization to core file
  Fix USB CB/CBI storage devices with CONFIG_VMAP_STACK=y
  USB: serial: ftdi_sio: add support for TI CC3200 LaunchPad

bae73e80

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid · e2b6535d

Linus Torvalds authored Nov 24, 2016

Pull HID fixes from Jiri Kosina:

 - DMA-on-stack fixes for a couple drivers, from Benjamin Tissoires

 - small memory sanitization fix for sensor-hub driver, from Song
   Hongyan

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: hid-sensor-hub: clear memory to avoid random data
  HID: rmi: make transfer buffers DMA capable
  HID: magicmouse: make transfer buffers DMA capable
  HID: lg: make transfer buffers DMA capable
  HID: cp2112: make transfer buffers DMA capable

e2b6535d

init: use pr_cont() when displaying rotator during ramdisk loading. · 18594e9b

Nicolas Schichan authored Nov 24, 2016

Otherwise each individual rotator char would be printed in a new line:

(...)
[    0.642350] -
[    0.644374] |
[    0.646367] -
(...)
Signed-off-by: Nicolas Schichan <nicolas.schichan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

18594e9b

23 Nov, 2016 10 commits

Merge tag 'nfs-for-4.9-4' of git://git.linux-nfs.org/projects/anna/linux-nfs · 10b9dd56

Linus Torvalds authored Nov 23, 2016

Pull NFS client bugfixes from Anna Schumaker:
 "Most of these fix regressions or races, but there is one patch for
  stable that Arnd sent me

  Stable bugfix:
   - Hide array-bounds warning

  Bugfixes:
   - Keep a reference on lock states while checking
   - Handle NFS4ERR_OLD_STATEID in nfs4_reclaim_open_state
   - Don't call close if the open stateid has already been cleared
   - Fix CLOSE rases with OPEN
   - Fix a regression in DELEGRETURN"

* tag 'nfs-for-4.9-4' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  NFSv4.x: hide array-bounds warning
  NFSv4.1: Keep a reference on lock states while checking
  NFSv4.1: Handle NFS4ERR_OLD_STATEID in nfs4_reclaim_open_state
  NFSv4: Don't call close if the open stateid has already been cleared
  NFSv4: Fix CLOSE races with OPEN
  NFSv4.1: Fix a regression in DELEGRETURN

10b9dd56

Merge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile · 4d92c8d0

Linus Torvalds authored Nov 23, 2016

Pull arch/tile bugfix from Chris Metcalf:
 "This fixes a bug that causes reboots after 208 days of uptime :-)"

* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tile: avoid using clocksource_cyc2ns with absolute cycle count

4d92c8d0

tile: avoid using clocksource_cyc2ns with absolute cycle count · e658a6f1

Chris Metcalf authored Nov 16, 2016

For large values of "mult" and long uptimes, the intermediate
result of "cycles * mult" can overflow 64 bits.  For example,
the tile platform calls clocksource_cyc2ns with a 1.2 GHz clock;
we have mult = 853, and after 208.5 days, we overflow 64 bits.

Since clocksource_cyc2ns() is intended to be used for relative
cycle counts, not absolute cycle counts, performance is more
importance than accepting a wider range of cycle values.  So,
just use mult_frac() directly in tile's sched_clock().

Commit 4cecf6d4 ("sched, x86: Avoid unnecessary overflow
in sched_clock") by Salman Qazi results in essentially the same
generated code for x86 as this change does for tile.  In fact,
a follow-on change by Salman introduced mult_frac() and switched
to using it, so the C code was largely identical at that point too.

Peter Zijlstra then added mul_u64_u32_shr() and switched x86
to use it.  This is, in principle, better; by optimizing the
64x64->64 multiplies to be 32x32->64 multiplies we can potentially
save some time.  However, the compiler piplines the 64x64->64
multiplies pretty well, and the conditional branch in the generic
mul_u64_u32_shr() causes some bubbles in execution, with the
result that it's pretty much a wash.  If tilegx provided its own
implementation of mul_u64_u32_shr() without the conditional branch,
we could potentially save 3 cycles, but that seems like small gain
for a fair amount of additional build scaffolding; no other platform
currently provides a mul_u64_u32_shr() override, and tile doesn't
currently have an <asm/div64.h> header to put the override in.

Additionally, gcc currently has an optimization bug that prevents
it from recognizing the opportunity to use a 32x32->64 multiply,
and so the result would be no better than the existing mult_frac()
until such time as the compiler is fixed.

For now, just using mult_frac() seems like the right answer.

Cc: stable@kernel.org [v3.4+]
Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>

e658a6f1

HID: hid-sensor-hub: clear memory to avoid random data · d443a0aa

Song Hongyan authored Nov 15, 2016

When user tried to read some fields like hysteresis from IIO sysfs on some
systems, it fails. The reason is that this field is a byte field and caller
of sensor_hub_get_feature() passes a buffer of 4 bytes. Here the function
sensor_hub_get_feature() copies the single byte from the report to the
caller buffer and returns "1" as the number of bytes copied. So caller
can use the return value.

But this is done by multiple callers, so if we just change the
sensor_hub_get_feature so that caller buffer is initialized with 0s
then we don't to change all functions.
Signed-off-by: Song Hongyan <hongyan.song@intel.com>
Acked-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

d443a0aa

HID: rmi: make transfer buffers DMA capable · 6dab07df

Benjamin Tissoires authored Nov 21, 2016

Kernel v4.9 strictly enforces DMA capable buffers, so we need to remove
buffers allocated on the stack.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

6dab07df

HID: magicmouse: make transfer buffers DMA capable · b7a87ad6

Benjamin Tissoires authored Nov 21, 2016

Kernel v4.9 strictly enforces DMA capable buffers, so we need to remove
buffers allocated on the stack.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

b7a87ad6

HID: lg: make transfer buffers DMA capable · 061232f0

Benjamin Tissoires authored Nov 21, 2016

Kernel v4.9 strictly enforces DMA capable buffers, so we need to remove
buffers allocated on the stack.

[jkosina@suse.cz: fix up second usage of hid_hw_raw_request(), spotted by
 0day build bot]
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

061232f0

HID: cp2112: make transfer buffers DMA capable · 1ffb3c40

Benjamin Tissoires authored Nov 21, 2016

Kernel v4.9 strictly enforces DMA capable buffers, so we need to remove
buffers allocated on the stack.

Use a spinlock to prevent concurrent accesses to the buffer.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

1ffb3c40

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ded9b5dd

Linus Torvalds authored Nov 23, 2016

Pull perf fixes from Ingo Molnar:
 "Six fixes for bugs that were found via fuzzing, and a trivial
  hw-enablement patch for AMD Family-17h CPU PMUs"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/uncore: Allow only a single PMU/box within an events group
  perf/x86/intel: Cure bogus unwind from PEBS entries
  perf/x86: Restore TASK_SIZE check on frame pointer
  perf/core: Fix address filter parser
  perf/x86: Add perf support for AMD family-17h processors
  perf/x86/uncore: Fix crash by removing bogus event_list[] handling for SNB client uncore IMC
  perf/core: Do not set cpuctx->cgrp for unscheduled cgroups

ded9b5dd

Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 23aabe73

Linus Torvalds authored Nov 23, 2016

Pull crypto fixes from Herbert Xu:
 "The last push broke algif_hash for all shash implementations, so this
  is a follow-up to fix that.

  This also fixes a problem in the crypto scatterwalk that triggers a
  BUG_ON with certain debugging options due to the new vmalloced-stack
  code"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: scatterwalk - Remove unnecessary aliasing check in map_and_copy
  crypto: algif_hash - Fix result clobbering in recvmsg

23aabe73

22 Nov, 2016 12 commits

Merge branch 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux · 23400ac9

Linus Torvalds authored Nov 22, 2016

Pull thermal management fix from Zhang Rui:
 "We only have one urgent fix this time.

  Commit 3105f234 ("thermal/powerclamp: correct cpu support check"),
  which is shipped in 4.9-rc3, fixed a problem introduced by commit
  b721ca0d ("thermal/powerclamp: remove cpu whitelist").

  But unfortunately, it broke intel_powerclamp driver module auto-
  loading at the same time. Thus we need this change to add back module
  auto-loading for 4.9"

* 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
  thermal/powerclamp: add back module device table

23400ac9

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · b66c08ba

Linus Torvalds authored Nov 22, 2016

Pull SCSI fixes from James Bottomley:
 "Two small fixes.

  One prevents timeouts on mpt3sas when trying to use the secure erase
  protocol which causes the erase protocol to be aborted. The second is
  a regression in a prior fix which causes all commands to abort during
  PCI extended error recovery, which is incorrect because PCI EEH is
  independent from what's happening on the FC transport"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: qla2xxx: do not abort all commands in the adapter during EEH recovery
  scsi: mpt3sas: Fix secure erase premature termination

b66c08ba

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 57527ed1

Linus Torvalds authored Nov 22, 2016

Pull clk fixes from Stephen Boyd:
 "A handful of driver fixes.

  The sunxi fixes are for an incorrect clk tree configuration and a bad
  frequency calculation. The other two are fixes for passing the wrong
  pointer in drivers recently converted to clk_hw style registration"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: efm32gg: Pass correct type to hw provider registration
  clk: berlin: Pass correct type to hw provider registration
  clk: sunxi: Fix M factor computation for APB1
  clk: sunxi-ng: sun6i-a31: Force AHB1 clock to use PLL6 as parent

57527ed1

NFSv4.x: hide array-bounds warning · d55b352b

Arnd Bergmann authored Nov 22, 2016

A correct bugfix introduced a harmless warning that shows up with gcc-7:

fs/nfs/callback.c: In function 'nfs_callback_up':
fs/nfs/callback.c:214:14: error: array subscript is outside array bounds [-Werror=array-bounds]

What happens here is that the 'minorversion == 0' check tells the
compiler that we assume minorversion can be something other than 0,
but when CONFIG_NFS_V4_1 is disabled that would be invalid and
result in an out-of-bounds access.

The added check for IS_ENABLED(CONFIG_NFS_V4_1) tells gcc that this
really can't happen, which makes the code slightly smaller and also
avoids the warning.

The bugfix that introduced the warning is marked for stable backports,
we want this one backported to the same releases.

Fixes: 98b0f80c ("NFSv4.x: Fix a refcount leak in nfs_callback_up_net")
Cc: stable@vger.kernel.org # v3.7+
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

d55b352b

Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 000b8949

Linus Torvalds authored Nov 22, 2016

Pull scheduler fixes from Ingo Molnar:
 "Two fixes for autogroup scheduling, for races when turning the feature
  on/off via /proc/sys/kernel/sched_autogroup_enabled"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/autogroup: Do not use autogroup->tg in zombie threads
  sched/autogroup: Fix autogroup_move_group() to never skip sched_move_task()

000b8949

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7cfc4317

Linus Torvalds authored Nov 22, 2016

Pull x86 fixes from Ingo Molnar:
 "Misc fixes:
   - two fixes to make (very) old Intel CPUs boot reliably
   - fix the intel-mid driver and rename it
   - two KASAN false positive fixes
   - an FPU fix
   - two sysfb fixes
   - two build fixes related to new toolchain versions"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/platform/intel-mid: Rename platform_wdt to platform_mrfld_wdt
  x86/build: Build compressed x86 kernels as PIE when !CONFIG_RELOCATABLE as well
  x86/platform/intel-mid: Register watchdog device after SCU
  x86/fpu: Fix invalid FPU ptrace state after execve()
  x86/boot: Fail the boot if !M486 and CPUID is missing
  x86/traps: Ignore high word of regs->cs in early_fixup_exception()
  x86/dumpstack: Prevent KASAN false positive warnings
  x86/unwind: Prevent KASAN false positive warnings in guess unwinder
  x86/boot: Avoid warning for zero-filling .bss
  x86/sysfb: Fix lfb_size calculation
  x86/sysfb: Add support for 64bit EFI lfb_base

7cfc4317

perf/x86/intel/uncore: Allow only a single PMU/box within an events group · 033ac60c

Peter Zijlstra authored Nov 18, 2016

Group validation expects all events to be of the same PMU; however
is_uncore_pmu() is too wide, it matches _all_ uncore events, even
across PMUs.

This triggers failure when we group different events from different
uncore PMUs, like:

  perf stat -vv -e '{uncore_cbox_0/config=0x0334/,uncore_qpi_0/event=1/}' -a sleep 1

Fix is_uncore_pmu() by only matching events to the box at hand.

Note that generic code; ran after this step; will disallow this
mixture of PMU events.
Reported-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vince@deater.net>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20161118125354.GQ3117@twins.programming.kicks-ass.netSigned-off-by: Ingo Molnar <mingo@kernel.org>

033ac60c

perf/x86/intel: Cure bogus unwind from PEBS entries · b8000586

Peter Zijlstra authored Nov 17, 2016

Vince Weaver reported that perf_fuzzer + KASAN detects that PEBS event
unwinds sometimes do 'weird' things. In particular, we seemed to be
ending up unwinding from random places on the NMI stack.

While it was somewhat expected that the event record BP,SP would not
match the interrupt BP,SP in that the interrupt is strictly later than
the record event, it was overlooked that it could be on an already
overwritten stack.

Therefore, don't copy the recorded BP,SP over the interrupted BP,SP
when we need stack unwinds.

Note that its still possible the unwind doesn't full match the actual
event, as its entirely possible to have done an (I)RET between record
and interrupt, but on average it should still point in the general
direction of where the event came from. Also, it's the best we can do,
considering.

The particular scenario that triggered the bogus NMI stack unwind was
a PEBS event with very short period, upon enabling the event at the
tail of the PMI handler (FREEZE_ON_PMI is not used), it instantly
triggers a record (while still on the NMI stack) which in turn
triggers the next PMI. This then causes back-to-back NMIs and we'll
try and unwind the stack-frame from the last NMI, which obviously is
now overwritten by our own.
Analyzed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: davej@codemonkey.org.uk <davej@codemonkey.org.uk>
Cc: dvyukov@google.com <dvyukov@google.com>
Cc: stable@vger.kernel.org
Fixes: ca037701 ("perf, x86: Add PEBS infrastructure")
Link: http://lkml.kernel.org/r/20161117171731.GV3157@twins.programming.kicks-ass.netSigned-off-by: Ingo Molnar <mingo@kernel.org>

b8000586

perf/x86: Restore TASK_SIZE check on frame pointer · ae31fe51

Johannes Weiner authored Nov 22, 2016

The following commit:

  75925e1a ("perf/x86: Optimize stack walk user accesses")

... switched from copy_from_user_nmi() to __copy_from_user_nmi() with a manual
access_ok() check.

Unfortunately, copy_from_user_nmi() does an explicit check against TASK_SIZE,
whereas the access_ok() uses whatever the current address limit of the task is.

We are getting NMIs when __probe_kernel_read() has switched to KERNEL_DS, and
then see vmalloc faults when we access what looks like pointers into vmalloc
space:

  [] WARNING: CPU: 3 PID: 3685731 at arch/x86/mm/fault.c:435 vmalloc_fault+0x289/0x290
  [] CPU: 3 PID: 3685731 Comm: sh Tainted: G        W       4.6.0-5_fbk1_223_gdbf0f40 #1
  [] Call Trace:
  []  <NMI>  [<ffffffff814717d1>] dump_stack+0x4d/0x6c
  []  [<ffffffff81076e43>] __warn+0xd3/0xf0
  []  [<ffffffff81076f2d>] warn_slowpath_null+0x1d/0x20
  []  [<ffffffff8104a899>] vmalloc_fault+0x289/0x290
  []  [<ffffffff8104b5a0>] __do_page_fault+0x330/0x490
  []  [<ffffffff8104b70c>] do_page_fault+0xc/0x10
  []  [<ffffffff81794e82>] page_fault+0x22/0x30
  []  [<ffffffff81006280>] ? perf_callchain_user+0x100/0x2a0
  []  [<ffffffff8115124f>] get_perf_callchain+0x17f/0x190
  []  [<ffffffff811512c7>] perf_callchain+0x67/0x80
  []  [<ffffffff8114e750>] perf_prepare_sample+0x2a0/0x370
  []  [<ffffffff8114e840>] perf_event_output+0x20/0x60
  []  [<ffffffff8114aee7>] ? perf_event_update_userpage+0xc7/0x130
  []  [<ffffffff8114ea01>] __perf_event_overflow+0x181/0x1d0
  []  [<ffffffff8114f484>] perf_event_overflow+0x14/0x20
  []  [<ffffffff8100a6e3>] intel_pmu_handle_irq+0x1d3/0x490
  []  [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10
  []  [<ffffffff81197191>] ? vunmap_page_range+0x1a1/0x2f0
  []  [<ffffffff811972f1>] ? unmap_kernel_range_noflush+0x11/0x20
  []  [<ffffffff814f2056>] ? ghes_copy_tofrom_phys+0x116/0x1f0
  []  [<ffffffff81040d1d>] ? x2apic_send_IPI_self+0x1d/0x20
  []  [<ffffffff8100411d>] perf_event_nmi_handler+0x2d/0x50
  []  [<ffffffff8101ea31>] nmi_handle+0x61/0x110
  []  [<ffffffff8101ef94>] default_do_nmi+0x44/0x110
  []  [<ffffffff8101f13b>] do_nmi+0xdb/0x150
  []  [<ffffffff81795187>] end_repeat_nmi+0x1a/0x1e
  []  [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10
  []  [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10
  []  [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10
  []  <<EOE>>  <IRQ>  [<ffffffff8115d05e>] ? __probe_kernel_read+0x3e/0xa0

Fix this by moving the valid_user_frame() check to before the uaccess
that loads the return address and the pointer to the next frame.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Fixes: 75925e1a ("perf/x86: Optimize stack walk user accesses")
Signed-off-by: Ingo Molnar <mingo@kernel.org>

ae31fe51

sched/autogroup: Do not use autogroup->tg in zombie threads · 8e5bfa8c

Oleg Nesterov authored Nov 14, 2016

Exactly because for_each_thread() in autogroup_move_group() can't see it
and update its ->sched_task_group before _put() and possibly free().

So the exiting task needs another sched_move_task() before exit_notify()
and we need to re-introduce the PF_EXITING (or similar) check removed by
the previous change for another reason.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: hartsjc@redhat.com
Cc: vbendel@redhat.com
Cc: vlovejoy@redhat.com
Link: http://lkml.kernel.org/r/20161114184612.GA15968@redhat.comSigned-off-by: Ingo Molnar <mingo@kernel.org>

8e5bfa8c

sched/autogroup: Fix autogroup_move_group() to never skip sched_move_task() · 18f649ef

Oleg Nesterov authored Nov 14, 2016

The PF_EXITING check in task_wants_autogroup() is no longer needed. Remove
it, but see the next patch.

However the comment is correct in that autogroup_move_group() must always
change task_group() for every thread so the sysctl_ check is very wrong;
we can race with cgroups and even sys_setsid() is not safe because a task
running with task_group() == ag->tg must participate in refcounting:

	int main(void)
	{
		int sctl = open("/proc/sys/kernel/sched_autogroup_enabled", O_WRONLY);

		assert(sctl > 0);
		if (fork()) {
			wait(NULL); // destroy the child's ag/tg
			pause();
		}

		assert(pwrite(sctl, "1\n", 2, 0) == 2);
		assert(setsid() > 0);
		if (fork())
			pause();

		kill(getppid(), SIGKILL);
		sleep(1);

		// The child has gone, the grandchild runs with kref == 1
		assert(pwrite(sctl, "0\n", 2, 0) == 2);
		assert(setsid() > 0);

		// runs with the freed ag/tg
		for (;;)
			sleep(1);

		return 0;
	}

crashes the kernel. It doesn't really need sleep(1), it doesn't matter if
autogroup_move_group() actually frees the task_group or this happens later.
Reported-by: Vern Lovejoy <vlovejoy@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: hartsjc@redhat.com
Cc: vbendel@redhat.com
Link: http://lkml.kernel.org/r/20161114184609.GA15965@redhat.comSigned-off-by: Ingo Molnar <mingo@kernel.org>

18f649ef

crypto: scatterwalk - Remove unnecessary aliasing check in map_and_copy · c8467f7a

Herbert Xu authored Nov 21, 2016

The aliasing check in map_and_copy is no longer necessary because
the IPsec ESP code no longer provides an IV that points into the
actual request data.  As this check is now triggering BUG checks
due to the vmalloced stack code, I'm removing it.
Reported-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

c8467f7a