Commits · be9015460ed58cd5ec2f944e29d09cdf32f0345c · Kirill Smelkov / linux

10 Nov, 2016 40 commits

Miklos Szeredi authored Oct 31, 2016

commit b93d4a0e upstream.

tmpfs doesn't have ->get_acl() because it only uses cached acls.

This fixes the acl tests in pjdfstest when tmpfs is used as the upper layer
of the overlay.
Reported-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 39a25b2b ("ovl: define ->get_acl() for overlay inodes")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

be901546

MIPS: KASLR: Fix handling of NULL FDT · 2b632307

Matt Redfearn authored Oct 17, 2016

commit 47366979 upstream.

If platform code returns a NULL pointer to the FDT, initial_boot_params
will not get set to a valid pointer and attempting to find the /chosen
node in it will cause a NULL pointer dereference and the kernel to crash
immediately on startup - with no output to the console.

Fix this by checking that initial_boot_params is valid before using it.

Fixes: 405bc8fd ("MIPS: Kernel: Implement KASLR using CONFIG_RELOCATABLE")
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14414/Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

2b632307

nfsd: Fix general protection fault in release_lock_stateid() · 1734afcc

Chuck Lever authored Oct 29, 2016

commit f46c445b upstream.

When I push NFSv4.1 / RDMA hard, (xfstests generic/089, for example),
I get this crash on the server:

Oct 28 22:04:30 klimt kernel: general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
Oct 28 22:04:30 klimt kernel: Modules linked in: cts rpcsec_gss_krb5 iTCO_wdt iTCO_vendor_support sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm btrfs irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd xor pcspkr raid6_pq i2c_i801 i2c_smbus lpc_ich mfd_core sg mei_me mei ioatdma shpchp wmi ipmi_si ipmi_msghandler rpcrdma ib_ipoib rdma_ucm acpi_power_meter acpi_pad ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c mlx4_ib mlx4_en ib_core sr_mod cdrom sd_mod ast drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel igb ahci libahci ptp mlx4_core pps_core dca libata i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod
Oct 28 22:04:30 klimt kernel: CPU: 7 PID: 1558 Comm: nfsd Not tainted 4.9.0-rc2-00005-g82cd754 #8
Oct 28 22:04:30 klimt kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
Oct 28 22:04:30 klimt kernel: task: ffff880835c3a100 task.stack: ffff8808420d8000
Oct 28 22:04:30 klimt kernel: RIP: 0010:[<ffffffffa05a759f>]  [<ffffffffa05a759f>] release_lock_stateid+0x1f/0x60 [nfsd]
Oct 28 22:04:30 klimt kernel: RSP: 0018:ffff8808420dbce0  EFLAGS: 00010246
Oct 28 22:04:30 klimt kernel: RAX: ffff88084e6660f0 RBX: ffff88084e667020 RCX: 0000000000000000
Oct 28 22:04:30 klimt kernel: RDX: 0000000000000007 RSI: 0000000000000000 RDI: ffff88084e667020
Oct 28 22:04:30 klimt kernel: RBP: ffff8808420dbcf8 R08: 0000000000000001 R09: 0000000000000000
Oct 28 22:04:30 klimt kernel: R10: ffff880835c3a100 R11: ffff880835c3aca8 R12: 6b6b6b6b6b6b6b6b
Oct 28 22:04:30 klimt kernel: R13: ffff88084e6670d8 R14: ffff880835f546f0 R15: ffff880835f1c548
Oct 28 22:04:30 klimt kernel: FS:  0000000000000000(0000) GS:ffff88087bdc0000(0000) knlGS:0000000000000000
Oct 28 22:04:30 klimt kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 28 22:04:30 klimt kernel: CR2: 00007ff020389000 CR3: 0000000001c06000 CR4: 00000000001406e0
Oct 28 22:04:30 klimt kernel: Stack:
Oct 28 22:04:30 klimt kernel: ffff88084e667020 0000000000000000 ffff88084e6670d8 ffff8808420dbd20
Oct 28 22:04:30 klimt kernel: ffffffffa05ac80d ffff880835f54548 ffff88084e640008 ffff880835f545b0
Oct 28 22:04:30 klimt kernel: ffff8808420dbd70 ffffffffa059803d ffff880835f1c768 0000000000000870
Oct 28 22:04:30 klimt kernel: Call Trace:
Oct 28 22:04:30 klimt kernel: [<ffffffffa05ac80d>] nfsd4_free_stateid+0xfd/0x1b0 [nfsd]
Oct 28 22:04:30 klimt kernel: [<ffffffffa059803d>] nfsd4_proc_compound+0x40d/0x690 [nfsd]
Oct 28 22:04:30 klimt kernel: [<ffffffffa0583114>] nfsd_dispatch+0xd4/0x1d0 [nfsd]
Oct 28 22:04:30 klimt kernel: [<ffffffffa047bbf9>] svc_process_common+0x3d9/0x700 [sunrpc]
Oct 28 22:04:30 klimt kernel: [<ffffffffa047ca64>] svc_process+0xf4/0x330 [sunrpc]
Oct 28 22:04:30 klimt kernel: [<ffffffffa05827ca>] nfsd+0xfa/0x160 [nfsd]
Oct 28 22:04:30 klimt kernel: [<ffffffffa05826d0>] ? nfsd_destroy+0x170/0x170 [nfsd]
Oct 28 22:04:30 klimt kernel: [<ffffffff810b367b>] kthread+0x10b/0x120
Oct 28 22:04:30 klimt kernel: [<ffffffff810b3570>] ? kthread_stop+0x280/0x280
Oct 28 22:04:30 klimt kernel: [<ffffffff8174e8ba>] ret_from_fork+0x2a/0x40
Oct 28 22:04:30 klimt kernel: Code: c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 48 8b 87 b0 00 00 00 48 89 fb 4c 8b a0 98 00 00 00 <49> 8b 44 24 20 48 8d b8 80 03 00 00 e8 10 66 1a e1 48 89 df e8
Oct 28 22:04:30 klimt kernel: RIP  [<ffffffffa05a759f>] release_lock_stateid+0x1f/0x60 [nfsd]
Oct 28 22:04:30 klimt kernel: RSP <ffff8808420dbce0>
Oct 28 22:04:30 klimt kernel: ---[ end trace cf5d0b371973e167 ]---

Jeff Layton says:
> Hm...now that I look though, this is a little suspicious:
>
>    struct nfs4_openowner *oo = openowner(stp->st_openstp->st_stateowner);
>
> I wonder if it's possible for the openstateid to have already been
> destroyed at this point.
>
> We might be better off doing something like this to get the client pointer:
>
>    stp->st_stid.sc_client;
>
> ...which should be more direct and less dependent on other stateids
> staying valid.

With the suggested change, I am no longer able to reproduce the above oops.

v2: Fix unhash_lock_stateid() as well
Fix-suggested-by: Jeff Layton <jlayton@redhat.com>
Fixes: 42691398 ('nfsd: Fix race between FREE_STATEID and LOCK')
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

1734afcc

ARM: dts: fix the SD card on the Snowball · 202c6676

Linus Walleij authored Oct 07, 2016

commit 1b283eea upstream.

This fixes a very annoying regression on the Snowball SD card
that has been around for a while. It turns out that the device
tree does not configure the direction pins properly, nor sets
up the pins for the voltage converter properly at boot. Unless
all things are correctly set up, the feedback clock will not
work, and makes the driver spew messages in the console (but
it works, very slowly):

root@Ux500:/ mount /dev/mmcblk0p2 /mnt/
[    9.953460] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
[    9.960296] mmcblk0: error -110 sending status command, retrying
[    9.966461] mmcblk0: error -110 sending status command, retrying
[    9.972534] mmcblk0: error -110 sending status command, aborting

Fix this by rectifying the device tree to correspond to that of
the Ux500 HREF boards plus the DAT31DIR setting that is unique for
the Snowball, and things start working smoothly. Add in the SDR12
and SDR25 modes which this host can do without any problems.

I don't know if this has ever been correct, sadly. It works after
this patch.
Reported-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

202c6676

ARM: mvebu: Select corediv clk for all mvebu v7 SoC · db20b510

Gregory CLEMENT authored Sep 19, 2016

commit 33c45ef8 upstream.

Since the commit bd3677ff ("clk: mvebu: Remove corediv clock from
Armada XP"), the corediv clk is no more selected for Armada XP, however
this clock is used for Armada XP using the compatible
armada-370-corediv-clock.

While since commit 1594d568 ("clk: mvebu: Move corediv config to
mvebu config") Armada 38x and Armada 375 got corediv support again, not
only Armada XP was missed but also Armada 39x.

Actually all the SoC selecting MVEBU_V7 config need this clock:
git grep "\-corediv-clock" arch/arm/boot/dts
arch/arm/boot/dts/armada-370-xp.dtsi: compatible = "marvell,armada-370-corediv-clock";
arch/arm/boot/dts/armada-375.dtsi:    compatible = "marvell,armada-375-corediv-clock";
arch/arm/boot/dts/armada-38x.dtsi:    compatible = "marvell,armada-380-corediv-clock";
arch/arm/boot/dts/armada-39x.dtsi:    compatible = "marvell,armada-390-corediv-clock"

This commit now fixes this behavior by letting MVEBU_V7 select
MVEBU_CLK_COREDIV.

Fixes: bd3677ff ("clk: mvebu: Remove corediv clock from Armada XP")
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

db20b510

KVM: MIPS: Precalculate MMIO load resume PC · c627b2e7

James Hogan authored Oct 25, 2016

commit e1e575f6 upstream.

The advancing of the PC when completing an MMIO load is done before
re-entering the guest, i.e. before restoring the guest ASID. However if
the load is in a branch delay slot it may need to access guest code to
read the prior branch instruction. This isn't safe in TLB mapped code at
the moment, nor in the future when we'll access unmapped guest segments
using direct user accessors too, as it could read the branch from host
user memory instead.

Therefore calculate the resume PC in advance while we're still in the
right context and save it in the new vcpu->arch.io_pc (replacing the no
longer needed vcpu->arch.pending_load_cause), and restore it on MMIO
completion.

Fixes: e685c689 ("KVM/MIPS32: Privileged instruction/target branch emulation.")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

c627b2e7

KVM: MIPS: Make ERET handle ERL before EXL · f3a0c969

James Hogan authored Oct 25, 2016

commit ede5f3e7 upstream.

The ERET instruction to return from exception is used for returning from
exception level (Status.EXL) and error level (Status.ERL). If both bits
are set however we should be returning from ERL first, as ERL can
interrupt EXL, for example when an NMI is taken. KVM however checks EXL
first.

Fix the order of the checks to match the pseudocode in the instruction
set manual.

Fixes: e685c689 ("KVM/MIPS32: Privileged instruction/target branch emulation.")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

f3a0c969

KVM: s390: Fix STHYI buffer alignment for diag224 · 961cf133

Janosch Frank authored Oct 26, 2016

commit 45c7ee43 upstream.

Diag224 requires a page-aligned 4k buffer to store the name table
into. kmalloc does not guarantee page alignment, hence we replace it
with __get_free_page for the buffer allocation.
Reported-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

961cf133

KVM: x86: fix wbinvd_dirty_mask use-after-free · 88aca01f

Ido Yariv authored Oct 21, 2016

commit bd768e14 upstream.

vcpu->arch.wbinvd_dirty_mask may still be used after freeing it,
corrupting memory. For example, the following call trace may set a bit
in an already freed cpu mask:
    kvm_arch_vcpu_load
    vcpu_load
    vmx_free_vcpu_nested
    vmx_free_vcpu
    kvm_arch_vcpu_free

Fix this by deferring freeing of wbinvd_dirty_mask.
Signed-off-by: Ido Yariv <ido@wizery.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

88aca01f

dm: free io_barrier after blk_cleanup_queue call · ea261d17

Tahsin Erdogan authored Oct 10, 2016

commit d09960b0 upstream.

dm_old_request_fn() has paths that access md->io_barrier.  The party
destroying io_barrier should ensure that no future execution of
dm_old_request_fn() is possible.  Move io_barrier destruction to below
blk_cleanup_queue() to ensure this and avoid a NULL pointer crash during
request-based DM device shutdown.
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ea261d17

Staging: wilc1000: Fix kernel Oops on opening the device · 377a2a27

Aditya Shankar authored Oct 07, 2016

commit 1d4f1d53 upstream.

Commit 2518ac59 ("staging: wilc1000: Replace kthread with workqueue
for host interface") adds an unconditional destroy_workqueue() on the
wilc's "hif_workqueue" soon after its creation thereby rendering
it unusable. It then further attempts to queue work onto this
non-existing hif_worqueue and results in:

Unable to handle kernel NULL pointer dereference at virtual address 00000010
pgd = de478000
[00000010] *pgd=3eec0831, *pte=00000000, *ppte=00000000
Internal error: Oops: 17 [#1] ARM
Modules linked in: wilc1000_sdio(C) wilc1000(C)
CPU: 0 PID: 825 Comm: ifconfig Tainted: G         C      4.8.0-rc8+ #37
Hardware name: Atmel SAMA5
task: df56f800 task.stack: deeb0000
PC is at __queue_work+0x90/0x284
LR is at __queue_work+0x58/0x284
pc : [<c0126bb0>]    lr : [<c0126b78>]    psr: 600f0093
sp : deeb1aa0  ip : def22d78  fp : deea6000
r10: 00000000  r9 : c0a08150  r8 : c0a2f058
r7 : 00000001  r6 : dee9b600  r5 : def22d74  r4 : 00000000
r3 : 00000000  r2 : def22d74  r1 : 07ffffff  r0 : 00000000
Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment none
...
[<c0127060>] (__queue_work) from [<c0127298>] (queue_work_on+0x34/0x40)
[<c0127298>] (queue_work_on) from [<bf0076b4>] (wilc_enqueue_cmd+0x54/0x64 [wilc1000])
[<bf0076b4>] (wilc_enqueue_cmd [wilc1000]) from [<bf0082b4>] (wilc_set_wfi_drv_handler+0x48/0x70 [wilc1000])
[<bf0082b4>] (wilc_set_wfi_drv_handler [wilc1000]) from [<bf00509c>] (wilc_mac_open+0x214/0x250 [wilc1000])
[<bf00509c>] (wilc_mac_open [wilc1000]) from [<c04fde98>] (__dev_open+0xb8/0x11c)
[<c04fde98>] (__dev_open) from [<c04fe128>] (__dev_change_flags+0x94/0x158)
[<c04fe128>] (__dev_change_flags) from [<c04fe204>] (dev_change_flags+0x18/0x48)
[<c04fe204>] (dev_change_flags) from [<c0557d5c>] (devinet_ioctl+0x6b4/0x788)
[<c0557d5c>] (devinet_ioctl) from [<c04e40a0>] (sock_ioctl+0x154/0x2cc)
[<c04e40a0>] (sock_ioctl) from [<c01b16e0>] (do_vfs_ioctl+0x9c/0x878)
[<c01b16e0>] (do_vfs_ioctl) from [<c01b1ef0>] (SyS_ioctl+0x34/0x5c)
[<c01b1ef0>] (SyS_ioctl) from [<c0107520>] (ret_fast_syscall+0x0/0x3c)
Code: e5932004 e1520006 01a04003 0affffff (e5943010)
---[ end trace b612328adaa6bf20 ]---

This fix removes the unnecessary call to destroy_workqueue() while opening
the device to avoid the above kernel panic. The deinit routine already
does a good job of terminating the workqueue when no longer needed.
Reported-by: Nicolas Ferre <Nicolas.Ferre@microchip.com>
Fixes: 2518ac59 ("staging: wilc1000: Replace kthread with workqueue for host interface")
Signed-off-by: Aditya Shankar <Aditya.Shankar@microchip.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

377a2a27

iio:chemical:atlas-ph-sensor: Fix use of 32 bit int to hold 16 bit big endian value · 0c4ffbf9

Sandhya Bankar authored Sep 25, 2016

commit d1fe85ec upstream.

This will result in a random value being reported on big endian architectures.
(thanks to Lars-Peter Clausen for pointing out the effects of this bug)

Only effects a value printed to the log, but as this reports the settings of
the probe in question it may be of direct interest to users.

Also, fixes the following sparse endianness warnings:

drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16
drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16
drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16
drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16
drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16
drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16
drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16
drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16
Signed-off-by: Sandhya Bankar <bankarsandhya512@gmail.com>
Fixes: e8dd92bf ("iio: chemical: atlas-ph-sensor: add EC feature")
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

0c4ffbf9

arm64: dts: marvell: fix clocksource for CP110 master SPI0 · 52a1e76f

Marcin Wojtas authored Sep 06, 2016

commit 51227bf5 upstream.

I2C and SPI interfaces share common clock trees within the CP110 HW block.
It occurred that SPI0 interface has wrong clock assignment in the device
tree, which is fixed in this commit to a proper value.

Fixes: 728dacc7 ("arm64: dts: marvell: initial DT description of ...")
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

52a1e76f

tty: limit terminal size to 4M chars · 0dff3c63

Dmitry Vyukov authored Oct 14, 2016

commit 32b2921e upstream.

Size of kmalloc() in vc_do_resize() is controlled by user.
Too large kmalloc() size triggers WARNING message on console.
Put a reasonable upper bound on terminal size to prevent WARNINGs.
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
CC: David Rientjes <rientjes@google.com>
Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: linux-kernel@vger.kernel.org
Cc: syzkaller@googlegroups.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

0dff3c63

xhci: workaround for hosts missing CAS bit · 44f0722d

Mathias Nyman authored Oct 20, 2016

commit 346e9973 upstream.

If a device is unplugged and replugged during Sx system suspend
some  Intel xHC hosts will overwrite the CAS (Cold attach status) flag
and no device connection is noticed in resume.

A device in this state can be identified in resume if its link state
is in polling or compliance mode, and the current connect status is 0.
A device in this state needs to be warm reset.

Intel 100/c230 series PCH specification update Doc #332692-006 Errata #8

Observed on Cherryview and Apollolake as they go into compliance mode
if LFPS times out during polling, and re-plugged devices are not
discovered at resume.
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

44f0722d

xhci: add restart quirk for Intel Wildcatpoint PCH · 0894224a

Mathias Nyman authored Oct 20, 2016

commit 4c39135a upstream.

xHC in Wildcatpoint-LP PCH is similar to LynxPoint-LP and need the
same quirks to prevent machines from spurious restart while
shutting them down.
Reported-by: Hasan Mahmood <hasan.mahm@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

0894224a

hv: do not lose pending heartbeat vmbus packets · b2d28d93

Long Li authored Oct 05, 2016

commit 407a3aee upstream.

The host keeps sending heartbeat packets independent of the
guest responding to them.  Even though we respond to the heartbeat messages at
interrupt level, we can have situations where there maybe multiple heartbeat
messages pending that have not been responded to. For instance this occurs when the
VM is paused and the host continues to send the heartbeat messages.
Address this issue by draining and responding to all
the heartbeat messages that maybe pending.
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

b2d28d93

vt: clear selection before resizing · eeae0a12

Scot Doyle authored Oct 13, 2016

commit 009e39ae upstream.

When resizing a vt its selection may exceed the new size, resulting in
an invalid memory access [1]. Clear the selection before resizing.

[1] http://lkml.kernel.org/r/CACT4Y+acDTwy4umEvf5ROBGiRJNrxHN4Cn5szCXE5Jw-d1B=Xw@mail.gmail.comReported-and-tested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Scot Doyle <lkml14@scotdoyle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

eeae0a12

x86/smpboot: Init apic mapping before usage · 9710f5b1

Thomas Gleixner authored Oct 29, 2016

commit 1e90a13d upstream.

The recent changes, which forced the registration of the boot cpu on UP
systems, which do not have ACPI tables, have been fixed for systems w/o
local APIC, but left a wreckage for systems which have neither ACPI nor
mptables, but the CPU has an APIC, e.g. virtualbox.

The boot process crashes in prefill_possible_map() as it wants to register
the boot cpu, which needs to access the local apic, but the local APIC is
not yet mapped.

There is no reason why init_apic_mapping() can't be invoked before
prefill_possible_map(). So instead of playing another silly early mapping
game, as the ACPI/mptables code does, we just move init_apic_mapping()
before the call to prefill_possible_map().

In hindsight, I should have noticed that combination earlier.

Sorry for the churn (also in stable)!

Fixes: ff856051 ("x86/boot/smp: Don't try to poke disabled/non-existent APIC")
Reported-and-debugged-by: Michal Necasek <michal.necasek@oracle.com>
Reported-and-tested-by: Wolfgang Bauer <wbauer@tmo.at>
Cc: prarit@redhat.com
Cc: ville.syrjala@linux.intel.com
Cc: michael.thayer@oracle.com
Cc: knut.osmundsen@oracle.com
Cc: frank.mehnert@oracle.com
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610282114380.5053@nanosSigned-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

9710f5b1

GenWQE: Fix bad page access during abort of resource allocation · 58b0a7f1

Gerald Schaefer authored Oct 19, 2016

commit a7a7aeef upstream.

When interrupting an application which was allocating DMAable
memory, it was possible, that the DMA memory was deallocated
twice, leading to the error symptoms below.

Thanks to Gerald, who analyzed the problem and provided this
patch.

I agree with his analysis of the problem: ddcb_cmd_fixups() ->
genwqe_alloc_sync_sgl() (fails in f/lpage, but sgl->sgl != NULL
and f/lpage maybe also != NULL) -> ddcb_cmd_cleanup() ->
genwqe_free_sync_sgl() (double free, because sgl->sgl != NULL and
f/lpage maybe also != NULL)

In this scenario we would have exactly the kind of double free that
would explain the WARNING / Bad page state, and as expected it is
caused by broken error handling (cleanup).

Using the Ubuntu git source, tag Ubuntu-4.4.0-33.52, he was able to reproduce
the "Bad page state" issue, and with the patch on top he could not reproduce
it any more.

------------[ cut here ]------------
WARNING: at /build/linux-o03cxz/linux-4.4.0/arch/s390/include/asm/pci_dma.h:141
Modules linked in: qeth_l2 ghash_s390 prng aes_s390 des_s390 des_generic sha512_s390 sha256_s390 sha1_s390 sha_common genwqe_card qeth crc_itu_t qdio ccwgroup vmur dm_multipath dasd_eckd_mod dasd_mod
CPU: 2 PID: 3293 Comm: genwqe_gunzip Not tainted 4.4.0-33-generic #52-Ubuntu
task: 0000000032c7e270 ti: 00000000324e4000 task.ti: 00000000324e4000
Krnl PSW : 0404c00180000000 0000000000156346 (dma_update_cpu_trans+0x9e/0xa8)
           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3
Krnl GPRS: 00000000324e7bcd 0000000000c3c34a 0000000027628298 000000003215b400
           0000000000000400 0000000000001fff 0000000000000400 0000000116853000
           07000000324e7b1e 0000000000000001 0000000000000001 0000000000000001
           0000000000001000 0000000116854000 0000000000156402 00000000324e7a38
Krnl Code: 000000000015633a: 95001000           cli     0(%r1),0
           000000000015633e: a774ffc3           brc     7,1562c4
          #0000000000156342: a7f40001           brc     15,156344
          >0000000000156346: 92011000           mvi     0(%r1),1
           000000000015634a: a7f4ffbd           brc     15,1562c4
           000000000015634e: 0707               bcr     0,%r7
           0000000000156350: c00400000000       brcl    0,156350
           0000000000156356: eb7ff0500024       stmg    %r7,%r15,80(%r15)
Call Trace:
([<00000000001563e0>] dma_update_trans+0x90/0x228)
 [<00000000001565dc>] s390_dma_unmap_pages+0x64/0x160
 [<00000000001567c2>] s390_dma_free+0x62/0x98
 [<000003ff801310ce>] __genwqe_free_consistent+0x56/0x70 [genwqe_card]
 [<000003ff801316d0>] genwqe_free_sync_sgl+0xf8/0x160 [genwqe_card]
 [<000003ff8012bd6e>] ddcb_cmd_cleanup+0x86/0xa8 [genwqe_card]
 [<000003ff8012c1c0>] do_execute_ddcb+0x110/0x348 [genwqe_card]
 [<000003ff8012c914>] genwqe_ioctl+0x51c/0xc20 [genwqe_card]
 [<000000000032513a>] do_vfs_ioctl+0x3b2/0x518
 [<0000000000325344>] SyS_ioctl+0xa4/0xb8
 [<00000000007b86c6>] system_call+0xd6/0x264
 [<000003ff9e8e520a>] 0x3ff9e8e520a
Last Breaking-Event-Address:
 [<0000000000156342>] dma_update_cpu_trans+0x9a/0xa8
---[ end trace 35996336235145c8 ]---
BUG: Bad page state in process jbd2/dasdb1-8  pfn:3215b
page:000003d100c856c0 count:-1 mapcount:0 mapping:          (null) index:0x0
flags: 0x3fffc0000000000()
page dumped because: nonzero _count
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Frank Haverkamp <haver@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

58b0a7f1

usb: increase ohci watchdog delay to 275 msec · b9aa0a72

Bryan Paluch authored Oct 17, 2016

commit ed6d6f8f upstream.

Increase ohci watchout delay to 275 ms. Previous delay was 250 ms
with 20 ms of slack, after removing slack time some ohci controllers don't
respond in time. Logs from systems with controllers that have the
issue would show "HcDoneHead not written back; disabled"
Signed-off-by: Bryan Paluch <bryanpaluch@gmail.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

b9aa0a72

usb: renesas_usbhs: add wait after initialization for R-Car Gen3 · 241208e7

Yoshihiro Shimoda authored Oct 20, 2016

commit b7603239 upstream.

Since the controller on R-Car Gen3 doesn't have any status registers
to detect initialization (LPSTS.SUSPM = 1) and the initialization needs
up to 45 usec, this patch adds wait after the initialization. Otherwise,
writing other registers (e.g. INTENB0) will fail.

Fixes: de18757e ("usb: renesas_usbhs: add R-Car Gen3 power control")
Cc: <balbi@kernel.org>
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

241208e7

xhci: use default USB_RESUME_TIMEOUT when resuming ports. · 00dbeb06

Mathias Nyman authored Oct 20, 2016

commit 7d3b016a upstream.

USB2 host inititated resume, and system suspend bus resume
need to use the same USB_RESUME_TIMEOUT as elsewhere.

This resolves a device disconnect issue at system resume seen
on Intel Braswell and Apollolake, but is in no way limited to
those platforms.
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

00dbeb06

USB: serial: ftdi_sio: add support for Infineon TriBoard TC2X7 · 1e306cd3

Stefan Tauner authored Oct 06, 2016

commit ca006f78 upstream.

This adds support to ftdi_sio for the Infineon TriBoard TC2X7
engineering board for first-generation Aurix SoCs with Tricore CPUs.
Mere addition of the device IDs does the job.
Signed-off-by: Stefan Tauner <stefan.tauner@technikum-wien.at>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

1e306cd3

USB: serial: cp210x: fix tiocmget error handling · d082fd10

Johan Hovold authored Oct 19, 2016

commit de24e0a1 upstream.

The current tiocmget implementation would fail to report errors up the
stack and instead leaked a few bits from the stack as a mask of
modem-status flags.

Fixes: 39a66b8d ("[PATCH] USB: CP2101 Add support for flow control")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

d082fd10

USB: serial: fix potential NULL-dereference at probe · e8bf7267

Johan Hovold authored Oct 21, 2016

commit 126d26f6 upstream.

Make sure we have at least one port before attempting to register a
console.

Currently, at least one driver binds to a "dummy" interface and requests
zero ports for it. Should such an interface also lack endpoints, we get
a NULL-deref during probe.

Fixes: e5b1e206 ("USB: serial: make minor allocation dynamic")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

e8bf7267

usb: gadget: function: u_ether: don't starve tx request queue · 23124735

Felipe Balbi authored Oct 04, 2016

commit 6c83f772 upstream.

If we don't guarantee that we will always get an
interrupt at least when we're queueing our very last
request, we could fall into situation where we queue
every request with 'no_interrupt' set. This will
cause the link to get stuck.

The behavior above has been triggered with g_ether
and dwc3.
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

23124735

usb: gadget: udc: atmel: fix endpoint name · fe4af125

Alexandre Belloni authored Sep 15, 2016

commit bbe097f0 upstream.

Since commit c32b5bcf ("ARM: dts: at91: Fix USB endpoint nodes"),
atmel_usba_udc fails with:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at include/linux/usb/gadget.h:405
ecm_do_notify+0x188/0x1a0
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 4.7.0+ #15
Hardware name: Atmel SAMA5
[<c010ccfc>] (unwind_backtrace) from [<c010a7ec>] (show_stack+0x10/0x14)
[<c010a7ec>] (show_stack) from [<c0115c10>] (__warn+0xe4/0xfc)
[<c0115c10>] (__warn) from [<c0115cd8>] (warn_slowpath_null+0x20/0x28)
[<c0115cd8>] (warn_slowpath_null) from [<c04377ac>] (ecm_do_notify+0x188/0x1a0)
[<c04377ac>] (ecm_do_notify) from [<c04379a4>] (ecm_set_alt+0x74/0x1ac)
[<c04379a4>] (ecm_set_alt) from [<c042f74c>] (composite_setup+0xfc0/0x19f8)
[<c042f74c>] (composite_setup) from [<c04356e8>] (usba_udc_irq+0x8f4/0xd9c)
[<c04356e8>] (usba_udc_irq) from [<c013ec9c>] (handle_irq_event_percpu+0x9c/0x158)
[<c013ec9c>] (handle_irq_event_percpu) from [<c013ed80>] (handle_irq_event+0x28/0x3c)
[<c013ed80>] (handle_irq_event) from [<c01416d4>] (handle_fasteoi_irq+0xa0/0x168)
[<c01416d4>] (handle_fasteoi_irq) from [<c013e3f8>] (generic_handle_irq+0x24/0x34)
[<c013e3f8>] (generic_handle_irq) from [<c013e640>] (__handle_domain_irq+0x54/0xa8)
[<c013e640>] (__handle_domain_irq) from [<c010b214>] (__irq_svc+0x54/0x70)
[<c010b214>] (__irq_svc) from [<c0107eb0>] (arch_cpu_idle+0x38/0x3c)
[<c0107eb0>] (arch_cpu_idle) from [<c0137300>] (cpu_startup_entry+0x9c/0xdc)
[<c0137300>] (cpu_startup_entry) from [<c0900c40>] (start_kernel+0x354/0x360)
[<c0900c40>] (start_kernel) from [<20008078>] (0x20008078)
---[ end trace e7cf9dcebf4815a6 ]---

Fixes: c32b5bcf ("ARM: dts: at91: Fix USB endpoint nodes")
Reported-by: Richard Genoud <richard.genoud@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

fe4af125

mei: txe: don't clean an unprocessed interrupt cause. · 420d1689

Alexander Usyskin authored Oct 19, 2016

commit 43605e29 upstream.

SEC registers are not accessible when the TXE device is in low power
state, hence the SEC interrupt cannot be processed if device is not
awake.

In some rare cases entrance to low power state (aliveness off) and input
ready bits can be signaled at the same time, resulting in communication
stall as input ready won't be signaled again after waking up. To resolve
this IPC_HHIER_SEC bit in HHISR_REG should not be cleaned if the
interrupt is not processed.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

420d1689

ubifs: Fix regression in ubifs_readdir() · 5d30e8f6

Richard Weinberger authored Oct 28, 2016

commit a00052a2 upstream.

Commit c83ed4c9 ("ubifs: Abort readdir upon error") broke
overlayfs support because the fix exposed an internal error
code to VFS.
Reported-by: Peter Rosin <peda@axentia.se>
Tested-by: Peter Rosin <peda@axentia.se>
Reported-by: Ralph Sennhauser <ralph.sennhauser@gmail.com>
Tested-by: Ralph Sennhauser <ralph.sennhauser@gmail.com>
Fixes: c83ed4c9 ("ubifs: Abort readdir upon error")
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

5d30e8f6

ubifs: Abort readdir upon error · b8176cc5

Richard Weinberger authored Oct 19, 2016

commit c83ed4c9 upstream.

If UBIFS is facing an error while walking a directory, it reports this
error and ubifs_readdir() returns the error code. But the VFS readdir
logic does not make the getdents system call fail in all cases. When the
readdir cursor indicates that more entries are present, the system call
will just return and the libc wrapper will try again since it also
knows that more entries are present.
This causes the libc wrapper to busy loop for ever when a directory is
corrupted on UBIFS.
A common approach do deal with corrupted directory entries is
skipping them by setting the cursor to the next entry. On UBIFS this
approach is not possible since we cannot compute the next directory
entry cursor position without reading the current entry. So all we can
do is setting the cursor to the "no more entries" position and make
getdents exit.
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

b8176cc5

timers: Lock base for same bucket optimization · 1755f43e

Thomas Gleixner authored Oct 24, 2016

commit 4da9152a upstream.

Linus stumbled over the unlocked modification of the timer expiry value in
mod_timer() which is an optimization for timers which stay in the same
bucket - due to the bucket granularity - despite their expiry time getting
updated.

The optimization itself still makes sense even if we take the lock, because
in case that the bucket stays the same, we avoid the pointless
queue/enqueue dance.

Make the check and the modification of timer->expires protected by the base
lock and shuffle the remaining code around so we can keep the lock held
when we actually have to requeue the timer to a different bucket.

Fixes: f00c0afd ("timers: Implement optimization for same expiry time in mod_timer()")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610241711220.4983@nanos
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

1755f43e

timers: Plug locking race vs. timer migration · e18ed431

Thomas Gleixner authored Oct 24, 2016

commit b831275a upstream.

Linus noticed that lock_timer_base() lacks a READ_ONCE() for accessing the
timer flags. As a consequence the compiler is allowed to reload the flags
between the initial check for TIMER_MIGRATION and the following timer base
computation and the spin lock of the base.

While this has not been observed (yet), we need to make sure that it never
happens.

Fixes: 0eeda71b ("timer: Replace timer base by a cpu index")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610241711220.4983@nanos
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

e18ed431

timers: Prevent base clock corruption when forwarding · b5e3a038

Thomas Gleixner authored Oct 22, 2016

commit 6bad6bcc upstream.

When a timer is enqueued we try to forward the timer base clock. This
mechanism has two issues:

1) Forwarding a remote base unlocked

The forwarding function is called from get_target_base() with the current
timer base lock held. But if the new target base is a different base than
the current base (can happen with NOHZ, sigh!) then the forwarding is done
on an unlocked base. This can lead to corruption of base->clk.

Solution is simple: Invoke the forwarding after the target base is locked.

2) Possible corruption due to jiffies advancing

This is similar to the issue in get_net_timer_interrupt() which was fixed
in the previous patch. jiffies can advance between check and assignement
and therefore advancing base->clk beyond the next expiry value.

So we need to read jiffies into a local variable once and do the checks and
assignment with the local copy.

Fixes: a683f390("timers: Forward the wheel clock whenever possible")
Reported-by: Ashton Holmes <scoopta@gmail.com>
Reported-by: Michael Thayer <michael.thayer@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Michal Necasek <michal.necasek@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: knut.osmundsen@oracle.com
Cc: stern@rowland.harvard.edu
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20161022110552.253640125@linutronix.deSigned-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

b5e3a038

timers: Prevent base clock rewind when forwarding clock · 665f7bf3

Thomas Gleixner authored Oct 22, 2016

commit 041ad7bc upstream.

Ashton and Michael reported, that kernel versions 4.8 and later suffer from
USB timeouts which are caused by the timer wheel rework.

This is caused by a bug in the base clock forwarding mechanism, which leads
to timers expiring early. The scenario which leads to this is:

run_timers()
  while (jiffies >= base->clk) {
    collect_expired_timers();
    base->clk++;
    expire_timers();
  }

So base->clk = jiffies + 1. Now the cpu goes idle:

idle()
  get_next_timer_interrupt()
    nextevt = __next_time_interrupt();
    if (time_after(nextevt, base->clk))
       	base->clk = jiffies;

jiffies has not advanced since run_timers(), so this assignment effectively
decrements base->clk by one.

base->clk is the index into the timer wheel arrays. So let's assume the
following state after the base->clk increment in run_timers():

 jiffies = 0
 base->clk = 1

A timer gets enqueued with an expiry delta of 63 ticks (which is the case
with the USB timeout and HZ=250) so the resulting bucket index is:

  base->clk + delta = 1 + 63 = 64

The timer goes into the first wheel level. The array size is 64 so it ends
up in bucket 0, which is correct as it takes 63 ticks to advance base->clk
to index into bucket 0 again.

If the cpu goes idle before jiffies advance, then the bug in the forwarding
mechanism sets base->clk back to 0, so the next invocation of run_timers()
at the next tick will index into bucket 0 and therefore expire the timer 62
ticks too early.

Instead of blindly setting base->clk to jiffies we must make the forwarding
conditional on jiffies > base->clk, but we cannot use jiffies for this as
we might run into the following issue:

  if (time_after(jiffies, base->clk) {
    if (time_after(nextevt, base->clk))
       base->clk = jiffies;

jiffies can increment between the check and the assigment far enough to
advance beyond nextevt. So we need to use a stable value for checking.

get_next_timer_interrupt() has the basej argument which is the jiffies
value snapshot taken in the calling code. So we can just that.

Thanks to Ashton for bisecting and providing trace data!

Fixes: a683f390 ("timers: Forward the wheel clock whenever possible")
Reported-by: Ashton Holmes <scoopta@gmail.com>
Reported-by: Michael Thayer <michael.thayer@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Michal Necasek <michal.necasek@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: knut.osmundsen@oracle.com
Cc: stern@rowland.harvard.edu
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20161022110552.175308322@linutronix.deSigned-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

665f7bf3

x86/microcode/AMD: Fix more fallout from CONFIG_RANDOMIZE_MEMORY=y · 0d621c57

Borislav Petkov authored Oct 27, 2016

commit 1c27f646 upstream.

We needed the physical address of the container in order to compute the
offset within the relocated ramdisk. And we did this by doing __pa() on
the virtual address.

However, __pa() does checks whether the physical address is within
PAGE_OFFSET and __START_KERNEL_map - see __phys_addr() - which fail
if we have CONFIG_RANDOMIZE_MEMORY enabled: we feed a virtual address
which *doesn't* have the randomization offset into a function which uses
PAGE_OFFSET which *does* have that offset.

This makes this check fire:

	VIRTUAL_BUG_ON((x > y) || !phys_addr_valid(x));
			^^^^^^

due to the randomization offset.

The fix is as simple as using __pa_nodebug() because we do that
randomization offset accounting later in that function ourselves.
Reported-by: Bob Peterson <rpeterso@redhat.com>
Tested-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm <linux-mm@kvack.org>
Link: http://lkml.kernel.org/r/20161027123623.j2jri5bandimboff@pd.tnicSigned-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

0d621c57

powerpc/64: Fix race condition in setting lock bit in idle/wakeup code · e599203f

Paul Mackerras authored Oct 21, 2016

commit 09b7e37b upstream.

This fixes a race condition where one thread that is entering or
leaving a power-saving state can inadvertently ignore the lock bit
that was set by another thread, and potentially also clear it.
The core_idle_lock_held function is called when the lock bit is
seen to be set.  It polls the lock bit until it is clear, then
does a lwarx to load the word containing the lock bit and thread
idle bits so it can be updated.  However, it is possible that the
value loaded with the lwarx has the lock bit set, even though an
immediately preceding lwz loaded a value with the lock bit clear.
If this happens then we go ahead and update the word despite the
lock bit being set, and when called from pnv_enter_arch207_idle_mode,
we will subsequently clear the lock bit.

No identifiable misbehaviour has been attributed to this race.

This fixes it by checking the lock bit in the value loaded by the
lwarx.  If it is set then we just go back and keep on polling.

Fixes: b32aadc1 ("powerpc/powernv: Fix race in updating core_idle_state")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

e599203f

powerpc/64: Re-fix race condition between going idle and entering guest · 51d784b5

Paul Mackerras authored Oct 21, 2016

commit 56c46222 upstream.

Commit 8117ac6a ("powerpc/powernv: Switch off MMU before entering
nap/sleep/rvwinkle mode", 2014-12-10) fixed a race condition where one
thread entering a KVM guest could switch the MMU context to the guest
while another thread was still in host kernel context with the MMU on.
That commit moved the point where a thread entering a power-saving
mode set its kvm_hstate.hwthread_state field in its PACA to
KVM_HWTHREAD_IN_IDLE from a point where the MMU was on to after the
MMU had been switched off.  That commit also added a comment
explaining that we have to switch to real mode before setting
hwthread_state to avoid this race.

Nevertheless, commit 4eae2c9a ("powerpc/powernv: Make
pnv_powersave_common more generic", 2016-07-08) subsequently moved
the setting of hwthread_state back to a point where the MMU is on,
thus reintroducing the race, despite the comment saying that this
should not be done being included in full in the context lines of
the patch that did it.

This fixes the race again and adds a bigger and shoutier comment
explaining the potential race condition.

Fixes: 4eae2c9a ("powerpc/powernv: Make pnv_powersave_common more generic")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Shreyas B. Prabhu <shreyasbp@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

51d784b5

powerpc/mm/radix: Use tlbiel only if we ever ran on the current cpu · 2c7ff0e5

Aneesh Kumar K.V authored Oct 24, 2016

commit bd77c449 upstream.

Before this patch, we used tlbiel, if we ever ran only on this core.
That was mostly derived from the nohash usage of the same. But is
incorrect, the ISA 3.0 clarifies tlbiel such that:

"All TLB entries that have all of the following properties are made
invalid on the thread executing the tlbiel instruction"

ie. tlbiel only invalidates TLB entries on the current thread. So if the
mm has been used on any other thread (aka. cpu) then we must broadcast
the invalidate.

This bug could lead to invalid TLB entries if a program runs on multiple
threads of a core.

Hence use tlbiel, if we only ever ran on only the current cpu.

Fixes: 1a472c9d ("powerpc/mm/radix: Add tlbflush routines")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

2c7ff0e5

powerpc: Convert cmp to cmpd in idle enter sequence · ae150de2

Segher Boessenkool authored Oct 06, 2016

commit 80f23935 upstream.

PowerPC's "cmp" instruction has four operands. Normally people write
"cmpw" or "cmpd" for the second cmp operand 0 or 1. But, frequently
people forget, and write "cmp" with just three operands.

With older binutils this is silently accepted as if this was "cmpw",
while often "cmpd" is wanted. With newer binutils GAS will complain
about this for 64-bit code. For 32-bit code it still silently assumes
"cmpw" is what is meant.

In this instance the code comes directly from ISA v2.07, including the
cmp, but cmpd is correct. Backport to stable so that new toolchains can
build old kernels.

Fixes: 948cf67c ("powerpc: Add NAP mode support on Power7 in HV mode")
Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ae150de2