1. 10 Nov, 2016 40 commits
    • MIPS: KASLR: Fix handling of NULL FDT · 2b632307
      Matt Redfearn authored
      commit 47366979 upstream.
      
      If platform code returns a NULL pointer to the FDT, initial_boot_params
      will not get set to a valid pointer and attempting to find the /chosen
      node in it will cause a NULL pointer dereference and the kernel to crash
      immediately on startup - with no output to the console.
      
      Fix this by checking that initial_boot_params is valid before using it.
      
      Fixes: 405bc8fd ("MIPS: Kernel: Implement KASLR using CONFIG_RELOCATABLE")
      Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/14414/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2b632307
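The guard described in this commit can be sketched in plain C. This is a minimal user-space illustration under stated assumptions, not the kernel patch: `struct fake_fdt`, `boot_params`, `set_boot_params()` and `fdt_kaslr_seed()` are hypothetical stand-ins for `initial_boot_params` and the /chosen lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a parsed device tree blob. */
struct fake_fdt {
    unsigned long kaslr_seed;   /* value that would live under /chosen */
};

/* Stand-in for initial_boot_params; NULL means the platform gave no FDT. */
static const struct fake_fdt *boot_params;

void set_boot_params(const struct fake_fdt *fdt)
{
    boot_params = fdt;
}

/* Return the seed from the blob, or 0 when there is no blob to read. */
unsigned long fdt_kaslr_seed(void)
{
    if (boot_params == NULL)
        return 0;               /* nothing to look up, so do not dereference */

    return boot_params->kaslr_seed;
}
```

With no blob registered the lookup simply reports "no seed" instead of faulting, which is the behaviour the fix is after.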
    • nfsd: Fix general protection fault in release_lock_stateid() · 1734afcc
      Chuck Lever authored
      commit f46c445b upstream.
      
      When I push NFSv4.1 / RDMA hard, (xfstests generic/089, for example),
      I get this crash on the server:
      
      Oct 28 22:04:30 klimt kernel: general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
      Oct 28 22:04:30 klimt kernel: Modules linked in: cts rpcsec_gss_krb5 iTCO_wdt iTCO_vendor_support sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm btrfs irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd xor pcspkr raid6_pq i2c_i801 i2c_smbus lpc_ich mfd_core sg mei_me mei ioatdma shpchp wmi ipmi_si ipmi_msghandler rpcrdma ib_ipoib rdma_ucm acpi_power_meter acpi_pad ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c mlx4_ib mlx4_en ib_core sr_mod cdrom sd_mod ast drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel igb ahci libahci ptp mlx4_core pps_core dca libata i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod
      Oct 28 22:04:30 klimt kernel: CPU: 7 PID: 1558 Comm: nfsd Not tainted 4.9.0-rc2-00005-g82cd754 #8
      Oct 28 22:04:30 klimt kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
      Oct 28 22:04:30 klimt kernel: task: ffff880835c3a100 task.stack: ffff8808420d8000
      Oct 28 22:04:30 klimt kernel: RIP: 0010:[<ffffffffa05a759f>]  [<ffffffffa05a759f>] release_lock_stateid+0x1f/0x60 [nfsd]
      Oct 28 22:04:30 klimt kernel: RSP: 0018:ffff8808420dbce0  EFLAGS: 00010246
      Oct 28 22:04:30 klimt kernel: RAX: ffff88084e6660f0 RBX: ffff88084e667020 RCX: 0000000000000000
      Oct 28 22:04:30 klimt kernel: RDX: 0000000000000007 RSI: 0000000000000000 RDI: ffff88084e667020
      Oct 28 22:04:30 klimt kernel: RBP: ffff8808420dbcf8 R08: 0000000000000001 R09: 0000000000000000
      Oct 28 22:04:30 klimt kernel: R10: ffff880835c3a100 R11: ffff880835c3aca8 R12: 6b6b6b6b6b6b6b6b
      Oct 28 22:04:30 klimt kernel: R13: ffff88084e6670d8 R14: ffff880835f546f0 R15: ffff880835f1c548
      Oct 28 22:04:30 klimt kernel: FS:  0000000000000000(0000) GS:ffff88087bdc0000(0000) knlGS:0000000000000000
      Oct 28 22:04:30 klimt kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      Oct 28 22:04:30 klimt kernel: CR2: 00007ff020389000 CR3: 0000000001c06000 CR4: 00000000001406e0
      Oct 28 22:04:30 klimt kernel: Stack:
      Oct 28 22:04:30 klimt kernel: ffff88084e667020 0000000000000000 ffff88084e6670d8 ffff8808420dbd20
      Oct 28 22:04:30 klimt kernel: ffffffffa05ac80d ffff880835f54548 ffff88084e640008 ffff880835f545b0
      Oct 28 22:04:30 klimt kernel: ffff8808420dbd70 ffffffffa059803d ffff880835f1c768 0000000000000870
      Oct 28 22:04:30 klimt kernel: Call Trace:
      Oct 28 22:04:30 klimt kernel: [<ffffffffa05ac80d>] nfsd4_free_stateid+0xfd/0x1b0 [nfsd]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa059803d>] nfsd4_proc_compound+0x40d/0x690 [nfsd]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa0583114>] nfsd_dispatch+0xd4/0x1d0 [nfsd]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa047bbf9>] svc_process_common+0x3d9/0x700 [sunrpc]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa047ca64>] svc_process+0xf4/0x330 [sunrpc]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa05827ca>] nfsd+0xfa/0x160 [nfsd]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa05826d0>] ? nfsd_destroy+0x170/0x170 [nfsd]
      Oct 28 22:04:30 klimt kernel: [<ffffffff810b367b>] kthread+0x10b/0x120
      Oct 28 22:04:30 klimt kernel: [<ffffffff810b3570>] ? kthread_stop+0x280/0x280
      Oct 28 22:04:30 klimt kernel: [<ffffffff8174e8ba>] ret_from_fork+0x2a/0x40
      Oct 28 22:04:30 klimt kernel: Code: c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 48 8b 87 b0 00 00 00 48 89 fb 4c 8b a0 98 00 00 00 <49> 8b 44 24 20 48 8d b8 80 03 00 00 e8 10 66 1a e1 48 89 df e8
      Oct 28 22:04:30 klimt kernel: RIP  [<ffffffffa05a759f>] release_lock_stateid+0x1f/0x60 [nfsd]
      Oct 28 22:04:30 klimt kernel: RSP <ffff8808420dbce0>
      Oct 28 22:04:30 klimt kernel: ---[ end trace cf5d0b371973e167 ]---
      
      Jeff Layton says:
      > Hm...now that I look though, this is a little suspicious:
      >
      >    struct nfs4_openowner *oo = openowner(stp->st_openstp->st_stateowner);
      >
      > I wonder if it's possible for the openstateid to have already been
      > destroyed at this point.
      >
      > We might be better off doing something like this to get the client pointer:
      >
      >    stp->st_stid.sc_client;
      >
      > ...which should be more direct and less dependent on other stateids
      > staying valid.
      
      With the suggested change, I am no longer able to reproduce the above oops.
      
      v2: Fix unhash_lock_stateid() as well

      Fix-suggested-by: Jeff Layton <jlayton@redhat.com>
      Fixes: 42691398 ('nfsd: Fix race between FREE_STATEID and LOCK')
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1734afcc
    • ARM: dts: fix the SD card on the Snowball · 202c6676
      Linus Walleij authored
      commit 1b283eea upstream.
      
      This fixes a very annoying regression on the Snowball SD card
      that has been around for a while. It turns out that the device
      tree does not configure the direction pins properly, nor sets
      up the pins for the voltage converter properly at boot. Unless
      all things are correctly set up, the feedback clock will not
      work, and makes the driver spew messages in the console (but
      it works, very slowly):
      
      root@Ux500:/ mount /dev/mmcblk0p2 /mnt/
      [    9.953460] mmci-pl18x 80126000.sdi0_per1: error during DMA transfer!
      [    9.960296] mmcblk0: error -110 sending status command, retrying
      [    9.966461] mmcblk0: error -110 sending status command, retrying
      [    9.972534] mmcblk0: error -110 sending status command, aborting
      
      Fix this by rectifying the device tree to correspond to that of
      the Ux500 HREF boards plus the DAT31DIR setting that is unique for
      the Snowball, and things start working smoothly. Add in the SDR12
      and SDR25 modes which this host can do without any problems.
      
      I don't know if this has ever been correct, sadly. It works after
      this patch.

      Reported-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: Olof Johansson <olof@lixom.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      202c6676
    • ARM: mvebu: Select corediv clk for all mvebu v7 SoC · db20b510
      Gregory CLEMENT authored
      commit 33c45ef8 upstream.
      
      Since commit bd3677ff ("clk: mvebu: Remove corediv clock from
      Armada XP"), the corediv clk is no longer selected for Armada XP.
      However, this clock is used on Armada XP through the compatible
      armada-370-corediv-clock.
      
      While commit 1594d568 ("clk: mvebu: Move corediv config to
      mvebu config") restored corediv support for Armada 38x and Armada 375,
      it missed not only Armada XP but also Armada 39x.
      
      In fact, all the SoCs selecting the MVEBU_V7 config need this clock:
      git grep "\-corediv-clock" arch/arm/boot/dts
      arch/arm/boot/dts/armada-370-xp.dtsi: compatible = "marvell,armada-370-corediv-clock";
      arch/arm/boot/dts/armada-375.dtsi:    compatible = "marvell,armada-375-corediv-clock";
      arch/arm/boot/dts/armada-38x.dtsi:    compatible = "marvell,armada-380-corediv-clock";
      arch/arm/boot/dts/armada-39x.dtsi:    compatible = "marvell,armada-390-corediv-clock"
      
      This commit now fixes this behavior by letting MVEBU_V7 select
      MVEBU_CLK_COREDIV.
      
      Fixes: bd3677ff ("clk: mvebu: Remove corediv clock from Armada XP")
      Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      db20b510
    • KVM: MIPS: Precalculate MMIO load resume PC · c627b2e7
      James Hogan authored
      commit e1e575f6 upstream.
      
      The advancing of the PC when completing an MMIO load is done before
      re-entering the guest, i.e. before restoring the guest ASID. However if
      the load is in a branch delay slot it may need to access guest code to
      read the prior branch instruction. This isn't safe in TLB mapped code at
      the moment, nor in the future when we'll access unmapped guest segments
      using direct user accessors too, as it could read the branch from host
      user memory instead.
      
      Therefore calculate the resume PC in advance while we're still in the
      right context and save it in the new vcpu->arch.io_pc (replacing the no
      longer needed vcpu->arch.pending_load_cause), and restore it on MMIO
      completion.
      
      Fixes: e685c689 ("KVM/MIPS32: Privileged instruction/target branch emulation.")
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c627b2e7
    • KVM: MIPS: Make ERET handle ERL before EXL · f3a0c969
      James Hogan authored
      commit ede5f3e7 upstream.
      
      The ERET instruction to return from exception is used for returning from
      exception level (Status.EXL) and error level (Status.ERL). If both bits
      are set however we should be returning from ERL first, as ERL can
      interrupt EXL, for example when an NMI is taken. KVM however checks EXL
      first.
      
      Fix the order of the checks to match the pseudocode in the instruction
      set manual.
      
      Fixes: e685c689 ("KVM/MIPS32: Privileged instruction/target branch emulation.")
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f3a0c969
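The check order described in this commit can be illustrated with a small stand-alone C sketch. It is not the KVM code: `struct guest_cp0` and `emulate_eret()` are hypothetical names, and only the Status.EXL (bit 1) and Status.ERL (bit 2) positions are taken from the MIPS architecture.

```c
#include <assert.h>
#include <stdint.h>

#define ST0_EXL (1u << 1)   /* Status.EXL: exception level */
#define ST0_ERL (1u << 2)   /* Status.ERL: error level */

/* Hypothetical, reduced view of the guest's CP0 state. */
struct guest_cp0 {
    uint32_t status;
    uint32_t epc;        /* return address for exception level */
    uint32_t error_epc;  /* return address for error level */
};

/*
 * Emulate ERET and return the PC to resume at.
 * ERL is tested first because an error-level event (such as an NMI)
 * can interrupt code that is already running at exception level.
 */
uint32_t emulate_eret(struct guest_cp0 *cp0)
{
    if (cp0->status & ST0_ERL) {
        cp0->status &= ~ST0_ERL;
        return cp0->error_epc;
    }

    cp0->status &= ~ST0_EXL;
    return cp0->epc;
}
```

When both bits are set, the first ERET leaves EXL untouched and resumes at ErrorEPC; a second ERET then unwinds the exception level.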
    • KVM: s390: Fix STHYI buffer alignment for diag224 · 961cf133
      Janosch Frank authored
      commit 45c7ee43 upstream.
      
      Diag224 requires a page-aligned 4k buffer to store the name table
      into. kmalloc does not guarantee page alignment, hence we replace it
      with __get_free_page for the buffer allocation.

      Reported-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
      Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      961cf133
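As a user-space analogy for the alignment requirement in this commit, the sketch below uses C11 `aligned_alloc()` where the kernel patch uses `__get_free_page`. The 4096-byte page size and the names `alloc_name_table()` and `is_page_aligned()` are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NAME_TABLE_SIZE 4096u   /* assumed page size for this sketch */

/* True when the pointer sits on a 4k boundary. */
int is_page_aligned(const void *p)
{
    return ((uintptr_t)p % NAME_TABLE_SIZE) == 0;
}

/*
 * Allocate a buffer that is both one page long and page aligned.
 * A plain malloc() here would only promise the platform's default
 * alignment, which is the user-space counterpart of the kmalloc issue.
 */
void *alloc_name_table(void)
{
    void *buf = aligned_alloc(NAME_TABLE_SIZE, NAME_TABLE_SIZE);

    assert(buf == NULL || is_page_aligned(buf));
    return buf;
}
```

The caller releases the buffer with `free()`.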
    • KVM: x86: fix wbinvd_dirty_mask use-after-free · 88aca01f
      Ido Yariv authored
      commit bd768e14 upstream.
      
      vcpu->arch.wbinvd_dirty_mask may still be used after freeing it,
      corrupting memory. For example, the following call trace may set a bit
      in an already freed cpu mask:
          kvm_arch_vcpu_load
          vcpu_load
          vmx_free_vcpu_nested
          vmx_free_vcpu
          kvm_arch_vcpu_free
      
      Fix this by deferring freeing of wbinvd_dirty_mask.

      Signed-off-by: Ido Yariv <ido@wizery.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      88aca01f
    • dm: free io_barrier after blk_cleanup_queue call · ea261d17
      Tahsin Erdogan authored
      commit d09960b0 upstream.
      
      dm_old_request_fn() has paths that access md->io_barrier.  The party
      destroying io_barrier should ensure that no future execution of
      dm_old_request_fn() is possible.  Move io_barrier destruction to below
      blk_cleanup_queue() to ensure this and avoid a NULL pointer crash during
      request-based DM device shutdown.

      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ea261d17
    • Staging: wilc1000: Fix kernel Oops on opening the device · 377a2a27
      Aditya Shankar authored
      commit 1d4f1d53 upstream.
      
      Commit 2518ac59 ("staging: wilc1000: Replace kthread with workqueue
      for host interface") adds an unconditional destroy_workqueue() on the
      wilc's "hif_workqueue" soon after its creation thereby rendering
      it unusable. It then further attempts to queue work onto this
      non-existing hif_workqueue and results in:
      
      Unable to handle kernel NULL pointer dereference at virtual address 00000010
      pgd = de478000
      [00000010] *pgd=3eec0831, *pte=00000000, *ppte=00000000
      Internal error: Oops: 17 [#1] ARM
      Modules linked in: wilc1000_sdio(C) wilc1000(C)
      CPU: 0 PID: 825 Comm: ifconfig Tainted: G         C      4.8.0-rc8+ #37
      Hardware name: Atmel SAMA5
      task: df56f800 task.stack: deeb0000
      PC is at __queue_work+0x90/0x284
      LR is at __queue_work+0x58/0x284
      pc : [<c0126bb0>]    lr : [<c0126b78>]    psr: 600f0093
      sp : deeb1aa0  ip : def22d78  fp : deea6000
      r10: 00000000  r9 : c0a08150  r8 : c0a2f058
      r7 : 00000001  r6 : dee9b600  r5 : def22d74  r4 : 00000000
      r3 : 00000000  r2 : def22d74  r1 : 07ffffff  r0 : 00000000
      Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment none
      ...
      [<c0127060>] (__queue_work) from [<c0127298>] (queue_work_on+0x34/0x40)
      [<c0127298>] (queue_work_on) from [<bf0076b4>] (wilc_enqueue_cmd+0x54/0x64 [wilc1000])
      [<bf0076b4>] (wilc_enqueue_cmd [wilc1000]) from [<bf0082b4>] (wilc_set_wfi_drv_handler+0x48/0x70 [wilc1000])
      [<bf0082b4>] (wilc_set_wfi_drv_handler [wilc1000]) from [<bf00509c>] (wilc_mac_open+0x214/0x250 [wilc1000])
      [<bf00509c>] (wilc_mac_open [wilc1000]) from [<c04fde98>] (__dev_open+0xb8/0x11c)
      [<c04fde98>] (__dev_open) from [<c04fe128>] (__dev_change_flags+0x94/0x158)
      [<c04fe128>] (__dev_change_flags) from [<c04fe204>] (dev_change_flags+0x18/0x48)
      [<c04fe204>] (dev_change_flags) from [<c0557d5c>] (devinet_ioctl+0x6b4/0x788)
      [<c0557d5c>] (devinet_ioctl) from [<c04e40a0>] (sock_ioctl+0x154/0x2cc)
      [<c04e40a0>] (sock_ioctl) from [<c01b16e0>] (do_vfs_ioctl+0x9c/0x878)
      [<c01b16e0>] (do_vfs_ioctl) from [<c01b1ef0>] (SyS_ioctl+0x34/0x5c)
      [<c01b1ef0>] (SyS_ioctl) from [<c0107520>] (ret_fast_syscall+0x0/0x3c)
      Code: e5932004 e1520006 01a04003 0affffff (e5943010)
      ---[ end trace b612328adaa6bf20 ]---
      
      This fix removes the unnecessary call to destroy_workqueue() while opening
      the device to avoid the above kernel panic. The deinit routine already
      does a good job of terminating the workqueue when no longer needed.

      Reported-by: Nicolas Ferre <Nicolas.Ferre@microchip.com>
      Fixes: 2518ac59 ("staging: wilc1000: Replace kthread with workqueue for host interface")
      Signed-off-by: Aditya Shankar <Aditya.Shankar@microchip.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      377a2a27
    • iio:chemical:atlas-ph-sensor: Fix use of 32 bit int to hold 16 bit big endian value · 0c4ffbf9
      Sandhya Bankar authored
      commit d1fe85ec upstream.
      
      This will result in a random value being reported on big endian architectures.
      (thanks to Lars-Peter Clausen for pointing out the effects of this bug)
      
      Only affects a value printed to the log, but as this reports the settings of
      the probe in question it may be of direct interest to users.
      
      Also, fixes the following sparse endianness warnings:
      
      drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16
      drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16
      drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16
      drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16
      drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16
      drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16
      drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16
      drivers/iio/chemical/atlas-ph-sensor.c:215:9: warning: cast to restricted __be16

      Signed-off-by: Sandhya Bankar <bankarsandhya512@gmail.com>
      Fixes: e8dd92bf ("iio: chemical: atlas-ph-sensor: add EC feature")
      Signed-off-by: Jonathan Cameron <jic23@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0c4ffbf9
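The endianness problem in this commit comes from holding a two-byte wire value in a wider integer before converting it. A minimal portable sketch of decoding a 16-bit big-endian value follows; `be16_decode()` is a hypothetical helper written for this illustration, not the driver's code (the kernel would use a `__be16` variable with `be16_to_cpu()`).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Decode a 16-bit big-endian value from exactly two wire bytes.
 * Working byte by byte gives the same result on little-endian and
 * big-endian hosts, because no wider integer is ever overlaid on the
 * buffer.
 */
uint16_t be16_decode(const uint8_t wire[2])
{
    return (uint16_t)(((unsigned int)wire[0] << 8) | wire[1]);
}
```

The key point is that the storage is exactly as wide as the value on the wire, so there are no extra bytes whose position depends on host byte order.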
    • arm64: dts: marvell: fix clocksource for CP110 master SPI0 · 52a1e76f
      Marcin Wojtas authored
      commit 51227bf5 upstream.
      
      I2C and SPI interfaces share common clock trees within the CP110 HW block.
      It turned out that the SPI0 interface has the wrong clock assignment in
      the device tree; this commit fixes it to the proper value.
      
      Fixes: 728dacc7 ("arm64: dts: marvell: initial DT description of ...")
      Signed-off-by: Marcin Wojtas <mw@semihalf.com>
      Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      52a1e76f
    • tty: limit terminal size to 4M chars · 0dff3c63
      Dmitry Vyukov authored
      commit 32b2921e upstream.
      
      Size of kmalloc() in vc_do_resize() is controlled by user.
      Too large kmalloc() size triggers WARNING message on console.
      Put a reasonable upper bound on terminal size to prevent WARNINGs.

      Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
      CC: David Rientjes <rientjes@google.com>
      Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jiri Slaby <jslaby@suse.com>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: linux-kernel@vger.kernel.org
      Cc: syzkaller@googlegroups.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0dff3c63
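The idea of bounding a user-controlled allocation size can be sketched in user-space C as follows. The function name `screen_alloc()`, the 16-bit cell type, and the way the 4M limit is applied (to the number of character cells) are assumptions for illustration; the exact check in vc_do_resize() is not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_SCREEN_CHARS (4ull << 20)   /* assumed cap: 4M character cells */

/*
 * Allocate a cols x rows screen buffer, rejecting sizes above the cap
 * before any allocation is attempted. The product is computed in a
 * 64-bit type so it cannot wrap around on the way to the comparison.
 */
int screen_alloc(unsigned int cols, unsigned int rows, uint16_t **out)
{
    unsigned long long cells = (unsigned long long)cols * rows;

    if (cells == 0 || cells > MAX_SCREEN_CHARS)
        return -EINVAL;

    *out = calloc((size_t)cells, sizeof(**out));
    return *out ? 0 : -ENOMEM;
}
```

Rejecting the request up front means an oversized resize fails with an ordinary error code rather than reaching the allocator.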
    • xhci: workaround for hosts missing CAS bit · 44f0722d
      Mathias Nyman authored
      commit 346e9973 upstream.
      
      If a device is unplugged and replugged during Sx system suspend,
      some Intel xHC hosts will overwrite the CAS (Cold Attach Status) flag
      and no device connection is noticed on resume.
      
      A device in this state can be identified in resume if its link state
      is in polling or compliance mode, and the current connect status is 0.
      A device in this state needs to be warm reset.
      
      Intel 100/c230 series PCH specification update Doc #332692-006 Errata #8
      
      Observed on Cherryview and Apollolake as they go into compliance mode
      if LFPS times out during polling, and re-plugged devices are not
      discovered at resume.

      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      44f0722d
    • xhci: add restart quirk for Intel Wildcatpoint PCH · 0894224a
      Mathias Nyman authored
      commit 4c39135a upstream.
      
      The xHC in the Wildcatpoint-LP PCH is similar to LynxPoint-LP and needs
      the same quirks to prevent machines from spuriously restarting while
      shutting down.

      Reported-by: Hasan Mahmood <hasan.mahm@gmail.com>
      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0894224a
    • hv: do not lose pending heartbeat vmbus packets · b2d28d93
      Long Li authored
      commit 407a3aee upstream.
      
      The host keeps sending heartbeat packets regardless of whether the
      guest responds to them. Even though we respond to the heartbeat
      messages at interrupt level, there can be situations where multiple
      heartbeat messages are pending that have not been responded to. For
      instance, this occurs when the VM is paused and the host continues to
      send heartbeat messages. Address this issue by draining and responding
      to all the heartbeat messages that may be pending.

      Signed-off-by: Long Li <longli@microsoft.com>
      Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2d28d93
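The "drain everything that is pending" pattern from this commit can be sketched with a toy channel in C. `struct hb_channel`, `hb_recv()` and `hb_on_interrupt()` are hypothetical names; the real vmbus receive API is not modelled here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy channel: counts queued heartbeats and replies sent. */
struct hb_channel {
    size_t pending;
    size_t replied;
};

/* Take one packet off the channel; returns 0 when it is empty. */
static int hb_recv(struct hb_channel *ch)
{
    if (ch->pending == 0)
        return 0;

    ch->pending--;
    return 1;
}

/*
 * Interrupt-level callback: loop until the channel is empty instead of
 * handling a single packet, so heartbeats that piled up (for example
 * while the VM was paused) all get a response.
 */
void hb_on_interrupt(struct hb_channel *ch)
{
    while (hb_recv(ch))
        ch->replied++;

    assert(ch->pending == 0);
}
```

A single-packet handler would leave `pending - 1` heartbeats unanswered after one interrupt; the loop removes that backlog in one pass.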
    • vt: clear selection before resizing · eeae0a12
      Scot Doyle authored
      commit 009e39ae upstream.
      
      When resizing a vt its selection may exceed the new size, resulting in
      an invalid memory access [1]. Clear the selection before resizing.
      
      [1] http://lkml.kernel.org/r/CACT4Y+acDTwy4umEvf5ROBGiRJNrxHN4Cn5szCXE5Jw-d1B=Xw@mail.gmail.com

      Reported-and-tested-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Scot Doyle <lkml14@scotdoyle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      eeae0a12
    • x86/smpboot: Init apic mapping before usage · 9710f5b1
      Thomas Gleixner authored
      commit 1e90a13d upstream.
      
      The recent changes, which forced the registration of the boot cpu on UP
      systems, which do not have ACPI tables, have been fixed for systems w/o
      local APIC, but left a wreckage for systems which have neither ACPI nor
      mptables, but the CPU has an APIC, e.g. virtualbox.
      
      The boot process crashes in prefill_possible_map() as it wants to register
      the boot cpu, which needs to access the local apic, but the local APIC is
      not yet mapped.
      
      There is no reason why init_apic_mapping() can't be invoked before
      prefill_possible_map(). So instead of playing another silly early mapping
      game, as the ACPI/mptables code does, we just move init_apic_mapping()
      before the call to prefill_possible_map().
      
      In hindsight, I should have noticed that combination earlier.
      
      Sorry for the churn (also in stable)!
      
      Fixes: ff856051 ("x86/boot/smp: Don't try to poke disabled/non-existent APIC")
      Reported-and-debugged-by: Michal Necasek <michal.necasek@oracle.com>
      Reported-and-tested-by: Wolfgang Bauer <wbauer@tmo.at>
      Cc: prarit@redhat.com
      Cc: ville.syrjala@linux.intel.com
      Cc: michael.thayer@oracle.com
      Cc: knut.osmundsen@oracle.com
      Cc: frank.mehnert@oracle.com
      Cc: Borislav Petkov <bp@alien8.de>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610282114380.5053@nanos
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9710f5b1
    • GenWQE: Fix bad page access during abort of resource allocation · 58b0a7f1
      Gerald Schaefer authored
      commit a7a7aeef upstream.
      
      When interrupting an application which was allocating DMAable
      memory, it was possible that the DMA memory was deallocated
      twice, leading to the error symptoms below.
      
      Thanks to Gerald, who analyzed the problem and provided this
      patch.
      
      I agree with his analysis of the problem: ddcb_cmd_fixups() ->
      genwqe_alloc_sync_sgl() (fails in f/lpage, but sgl->sgl != NULL
      and f/lpage maybe also != NULL) -> ddcb_cmd_cleanup() ->
      genwqe_free_sync_sgl() (double free, because sgl->sgl != NULL and
      f/lpage maybe also != NULL)
      
      In this scenario we would have exactly the kind of double free that
      would explain the WARNING / Bad page state, and as expected it is
      caused by broken error handling (cleanup).
      
      Using the Ubuntu git source, tag Ubuntu-4.4.0-33.52, he was able to reproduce
      the "Bad page state" issue, and with the patch on top he could not reproduce
      it any more.
      
      ------------[ cut here ]------------
      WARNING: at /build/linux-o03cxz/linux-4.4.0/arch/s390/include/asm/pci_dma.h:141
      Modules linked in: qeth_l2 ghash_s390 prng aes_s390 des_s390 des_generic sha512_s390 sha256_s390 sha1_s390 sha_common genwqe_card qeth crc_itu_t qdio ccwgroup vmur dm_multipath dasd_eckd_mod dasd_mod
      CPU: 2 PID: 3293 Comm: genwqe_gunzip Not tainted 4.4.0-33-generic #52-Ubuntu
      task: 0000000032c7e270 ti: 00000000324e4000 task.ti: 00000000324e4000
      Krnl PSW : 0404c00180000000 0000000000156346 (dma_update_cpu_trans+0x9e/0xa8)
                 R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3
      Krnl GPRS: 00000000324e7bcd 0000000000c3c34a 0000000027628298 000000003215b400
                 0000000000000400 0000000000001fff 0000000000000400 0000000116853000
                 07000000324e7b1e 0000000000000001 0000000000000001 0000000000000001
                 0000000000001000 0000000116854000 0000000000156402 00000000324e7a38
      Krnl Code: 000000000015633a: 95001000           cli     0(%r1),0
                 000000000015633e: a774ffc3           brc     7,1562c4
                #0000000000156342: a7f40001           brc     15,156344
                >0000000000156346: 92011000           mvi     0(%r1),1
                 000000000015634a: a7f4ffbd           brc     15,1562c4
                 000000000015634e: 0707               bcr     0,%r7
                 0000000000156350: c00400000000       brcl    0,156350
                 0000000000156356: eb7ff0500024       stmg    %r7,%r15,80(%r15)
      Call Trace:
      ([<00000000001563e0>] dma_update_trans+0x90/0x228)
       [<00000000001565dc>] s390_dma_unmap_pages+0x64/0x160
       [<00000000001567c2>] s390_dma_free+0x62/0x98
       [<000003ff801310ce>] __genwqe_free_consistent+0x56/0x70 [genwqe_card]
       [<000003ff801316d0>] genwqe_free_sync_sgl+0xf8/0x160 [genwqe_card]
       [<000003ff8012bd6e>] ddcb_cmd_cleanup+0x86/0xa8 [genwqe_card]
       [<000003ff8012c1c0>] do_execute_ddcb+0x110/0x348 [genwqe_card]
       [<000003ff8012c914>] genwqe_ioctl+0x51c/0xc20 [genwqe_card]
       [<000000000032513a>] do_vfs_ioctl+0x3b2/0x518
       [<0000000000325344>] SyS_ioctl+0xa4/0xb8
       [<00000000007b86c6>] system_call+0xd6/0x264
       [<000003ff9e8e520a>] 0x3ff9e8e520a
      Last Breaking-Event-Address:
       [<0000000000156342>] dma_update_cpu_trans+0x9a/0xa8
      ---[ end trace 35996336235145c8 ]---
      BUG: Bad page state in process jbd2/dasdb1-8  pfn:3215b
      page:000003d100c856c0 count:-1 mapcount:0 mapping:          (null) index:0x0
      flags: 0x3fffc0000000000()
      page dumped because: nonzero _count

      Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: Frank Haverkamp <haver@linux.vnet.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      58b0a7f1
    • usb: increase ohci watchdog delay to 275 msec · b9aa0a72
      Bryan Paluch authored
      commit ed6d6f8f upstream.
      
      Increase the ohci watchdog delay to 275 ms. The previous delay was
      250 ms with 20 ms of slack; after the slack time was removed, some
      ohci controllers don't respond in time. Logs from systems with
      affected controllers show "HcDoneHead not written back; disabled".

      Signed-off-by: Bryan Paluch <bryanpaluch@gmail.com>
      Acked-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b9aa0a72
    • usb: renesas_usbhs: add wait after initialization for R-Car Gen3 · 241208e7
      Yoshihiro Shimoda authored
      commit b7603239 upstream.
      
      Since the controller on R-Car Gen3 doesn't have any status registers
      to detect initialization (LPSTS.SUSPM = 1) and the initialization needs
      up to 45 usec, this patch adds a wait after the initialization.
      Otherwise, writing other registers (e.g. INTENB0) will fail.
      
      Fixes: de18757e ("usb: renesas_usbhs: add R-Car Gen3 power control")
      Cc: <balbi@kernel.org>
      Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      241208e7
    • xhci: use default USB_RESUME_TIMEOUT when resuming ports. · 00dbeb06
      Mathias Nyman authored
      commit 7d3b016a upstream.
      
      USB2 host-initiated resume and system suspend bus resume
      need to use the same USB_RESUME_TIMEOUT as elsewhere.
      
      This resolves a device disconnect issue at system resume seen
      on Intel Braswell and Apollolake, but is in no way limited to
      those platforms.

      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      00dbeb06
    • USB: serial: ftdi_sio: add support for Infineon TriBoard TC2X7 · 1e306cd3
      Stefan Tauner authored
      commit ca006f78 upstream.
      
      This adds support to ftdi_sio for the Infineon TriBoard TC2X7
      engineering board for first-generation Aurix SoCs with Tricore CPUs.
      Mere addition of the device IDs does the job.
      Signed-off-by: Stefan Tauner <stefan.tauner@technikum-wien.at>
      Signed-off-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1e306cd3
    • Johan Hovold's avatar
      USB: serial: cp210x: fix tiocmget error handling · d082fd10
      Johan Hovold authored
      commit de24e0a1 upstream.
      
      The current tiocmget implementation would fail to report errors up the
      stack and instead leaked a few bits from the stack as a mask of
      modem-status flags.
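
      As a rough illustration of the intended pattern (this is not the driver
      code; read_status() and the 0x0f mask are hypothetical stand-ins), the
      error is returned before the status byte is ever looked at, so an
      uninitialized stack value cannot be reported as modem-status flags:

```c
#include <errno.h>

/* Hypothetical stand-in for a device read that may fail. */
static int read_status(int fail, unsigned char *status)
{
	if (fail)
		return -EIO;	/* 'status' is left untouched on failure */
	*status = 0x03;
	return 0;
}

/* Returns the flag bits on success or a negative errno on failure. */
static int get_modem_flags(int fail)
{
	unsigned char status;
	int ret;

	ret = read_status(fail, &status);
	if (ret)
		return ret;	/* propagate the error, never read 'status' */

	return status & 0x0f;	/* hypothetical flag mask */
}
```

      With this shape, get_modem_flags(1) yields -EIO instead of whatever
      happened to be on the stack.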
      
      Fixes: 39a66b8d ("[PATCH] USB: CP2101 Add support for flow control")
      Signed-off-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d082fd10
    • Johan Hovold's avatar
      USB: serial: fix potential NULL-dereference at probe · e8bf7267
      Johan Hovold authored
      commit 126d26f6 upstream.
      
      Make sure we have at least one port before attempting to register a
      console.
      
      Currently, at least one driver binds to a "dummy" interface and requests
      zero ports for it. Should such an interface also lack endpoints, we get
      a NULL-deref during probe.
      
      Fixes: e5b1e206 ("USB: serial: make minor allocation dynamic")
      Signed-off-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e8bf7267
    • Felipe Balbi's avatar
      usb: gadget: function: u_ether: don't starve tx request queue · 23124735
      Felipe Balbi authored
      commit 6c83f772 upstream.
      
      If we don't guarantee that we will always get an
      interrupt at least when we're queueing our very last
      request, we could fall into a situation where we queue
      every request with 'no_interrupt' set. This will
      cause the link to get stuck.
      
      The behavior above has been triggered with g_ether
      and dwc3.
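
      A minimal sketch of the rule described above, assuming a hypothetical
      helper want_no_interrupt() with made-up parameter names (it is not the
      u_ether code): interrupts may be suppressed for most requests, but never
      for the request that uses up the last free slot.

```c
#include <stdbool.h>

/*
 * Decide whether a tx request may be queued without a completion interrupt.
 *
 * queued:    number of requests currently in flight
 * free_reqs: requests still available after taking this one
 * interval:  ask for an interrupt every 'interval' requests (must be > 0)
 */
static bool want_no_interrupt(unsigned int queued, unsigned int free_reqs,
			      unsigned int interval)
{
	if (free_reqs == 0)
		return false;	/* very last request: always interrupt */

	return (queued % interval) != 0;
}
```

      Without the free_reqs check, every request could end up with the
      interrupt suppressed and nothing would ever refill the queue.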
      Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      23124735
    • Alexandre Belloni's avatar
      usb: gadget: udc: atmel: fix endpoint name · fe4af125
      Alexandre Belloni authored
      commit bbe097f0 upstream.
      
      Since commit c32b5bcf ("ARM: dts: at91: Fix USB endpoint nodes"),
      atmel_usba_udc fails with:
      
      ------------[ cut here ]------------
      WARNING: CPU: 0 PID: 0 at include/linux/usb/gadget.h:405
      ecm_do_notify+0x188/0x1a0
      Modules linked in:
      CPU: 0 PID: 0 Comm: swapper Not tainted 4.7.0+ #15
      Hardware name: Atmel SAMA5
      [<c010ccfc>] (unwind_backtrace) from [<c010a7ec>] (show_stack+0x10/0x14)
      [<c010a7ec>] (show_stack) from [<c0115c10>] (__warn+0xe4/0xfc)
      [<c0115c10>] (__warn) from [<c0115cd8>] (warn_slowpath_null+0x20/0x28)
      [<c0115cd8>] (warn_slowpath_null) from [<c04377ac>] (ecm_do_notify+0x188/0x1a0)
      [<c04377ac>] (ecm_do_notify) from [<c04379a4>] (ecm_set_alt+0x74/0x1ac)
      [<c04379a4>] (ecm_set_alt) from [<c042f74c>] (composite_setup+0xfc0/0x19f8)
      [<c042f74c>] (composite_setup) from [<c04356e8>] (usba_udc_irq+0x8f4/0xd9c)
      [<c04356e8>] (usba_udc_irq) from [<c013ec9c>] (handle_irq_event_percpu+0x9c/0x158)
      [<c013ec9c>] (handle_irq_event_percpu) from [<c013ed80>] (handle_irq_event+0x28/0x3c)
      [<c013ed80>] (handle_irq_event) from [<c01416d4>] (handle_fasteoi_irq+0xa0/0x168)
      [<c01416d4>] (handle_fasteoi_irq) from [<c013e3f8>] (generic_handle_irq+0x24/0x34)
      [<c013e3f8>] (generic_handle_irq) from [<c013e640>] (__handle_domain_irq+0x54/0xa8)
      [<c013e640>] (__handle_domain_irq) from [<c010b214>] (__irq_svc+0x54/0x70)
      [<c010b214>] (__irq_svc) from [<c0107eb0>] (arch_cpu_idle+0x38/0x3c)
      [<c0107eb0>] (arch_cpu_idle) from [<c0137300>] (cpu_startup_entry+0x9c/0xdc)
      [<c0137300>] (cpu_startup_entry) from [<c0900c40>] (start_kernel+0x354/0x360)
      [<c0900c40>] (start_kernel) from [<20008078>] (0x20008078)
      ---[ end trace e7cf9dcebf4815a6 ]---
      
      Fixes: c32b5bcf ("ARM: dts: at91: Fix USB endpoint nodes")
      Reported-by: Richard Genoud <richard.genoud@gmail.com>
      Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
      Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
      Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      fe4af125
    • Alexander Usyskin's avatar
      mei: txe: don't clean an unprocessed interrupt cause. · 420d1689
      Alexander Usyskin authored
      commit 43605e29 upstream.
      
      SEC registers are not accessible when the TXE device is in low power
      state, hence the SEC interrupt cannot be processed if device is not
      awake.
      
      In some rare cases entrance to low power state (aliveness off) and input
      ready bits can be signaled at the same time, resulting in a communication
      stall as input ready won't be signaled again after waking up. To resolve
      this, the IPC_HHIER_SEC bit in HHISR_REG should not be cleaned if the
      interrupt is not processed.
      Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
      Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      420d1689
    • Richard Weinberger's avatar
      ubifs: Fix regression in ubifs_readdir() · 5d30e8f6
      Richard Weinberger authored
      commit a00052a2 upstream.
      
      Commit c83ed4c9 ("ubifs: Abort readdir upon error") broke
      overlayfs support because the fix exposed an internal error
      code to VFS.
      Reported-by: Peter Rosin <peda@axentia.se>
      Tested-by: Peter Rosin <peda@axentia.se>
      Reported-by: Ralph Sennhauser <ralph.sennhauser@gmail.com>
      Tested-by: Ralph Sennhauser <ralph.sennhauser@gmail.com>
      Fixes: c83ed4c9 ("ubifs: Abort readdir upon error")
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      5d30e8f6
    • Richard Weinberger's avatar
      ubifs: Abort readdir upon error · b8176cc5
      Richard Weinberger authored
      commit c83ed4c9 upstream.
      
      If UBIFS is facing an error while walking a directory, it reports this
      error and ubifs_readdir() returns the error code. But the VFS readdir
      logic does not make the getdents system call fail in all cases. When the
      readdir cursor indicates that more entries are present, the system call
      will just return and the libc wrapper will try again since it also
      knows that more entries are present.
      This causes the libc wrapper to busy loop forever when a directory is
      corrupted on UBIFS.
      A common approach to deal with corrupted directory entries is
      skipping them by setting the cursor to the next entry. On UBIFS this
      approach is not possible since we cannot compute the next directory
      entry cursor position without reading the current entry. So all we can
      do is set the cursor to the "no more entries" position and make
      getdents exit.
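
      The behaviour can be modelled with a toy directory in plain C. Everything
      here (struct toy_dir, POS_END, toy_readdir(), toy_list()) is hypothetical
      and only mirrors the idea, not the UBIFS implementation: on a corrupted
      entry the cursor jumps to the "no more entries" position, so a caller that
      retries while more entries seem to be present terminates.

```c
#include <stddef.h>

#define POS_END ((size_t)-1)	/* hypothetical "no more entries" cursor */

struct toy_dir {
	const int *entries;	/* a negative value models a corrupted entry */
	size_t count;
	size_t pos;		/* readdir cursor */
};

/* Returns 1 if an entry was emitted, 0 at end of directory, -1 on error. */
static int toy_readdir(struct toy_dir *d, int *out)
{
	if (d->pos == POS_END || d->pos >= d->count)
		return 0;

	if (d->entries[d->pos] < 0) {
		/* The next position cannot be computed, so give up for good. */
		d->pos = POS_END;
		return -1;
	}

	*out = d->entries[d->pos++];
	return 1;
}

/*
 * Caller that keeps retrying until the directory reports its end, similar
 * in spirit to the libc wrapper described above. Returns the number of
 * calls made, capped at max_calls.
 */
static int toy_list(struct toy_dir *d, int max_calls)
{
	int calls = 0;
	int value;

	while (calls < max_calls) {
		calls++;
		if (toy_readdir(d, &value) == 0)
			break;
	}
	return calls;
}
```

      If the error path left the cursor where it was, toy_list() would run
      until max_calls, which is the busy loop described above.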
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b8176cc5
    • Thomas Gleixner's avatar
      timers: Lock base for same bucket optimization · 1755f43e
      Thomas Gleixner authored
      commit 4da9152a upstream.
      
      Linus stumbled over the unlocked modification of the timer expiry value in
      mod_timer() which is an optimization for timers which stay in the same
      bucket - due to the bucket granularity - despite their expiry time getting
      updated.
      
      The optimization itself still makes sense even if we take the lock, because
      in case that the bucket stays the same, we avoid the pointless
      dequeue/enqueue dance.
      
      Make the check and the modification of timer->expires protected by the base
      lock and shuffle the remaining code around so we can keep the lock held
      when we actually have to requeue the timer to a different bucket.
      
      Fixes: f00c0afd ("timers: Implement optimization for same expiry time in mod_timer()")
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610241711220.4983@nanos
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1755f43e
    • Thomas Gleixner's avatar
      timers: Plug locking race vs. timer migration · e18ed431
      Thomas Gleixner authored
      commit b831275a upstream.
      
      Linus noticed that lock_timer_base() lacks a READ_ONCE() for accessing the
      timer flags. As a consequence the compiler is allowed to reload the flags
      between the initial check for TIMER_MIGRATION and the following timer base
      computation and the spin lock of the base.
      
      While this has not been observed (yet), we need to make sure that it never
      happens.
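
      A userspace sketch of the single-load idea, with hypothetical names and
      bit values (MIGRATING, CPU_MASK, LOAD_ONCE() and base_index() are
      illustrative, not the kernel definitions): the flags are read exactly
      once into a local, and both the migration check and the base computation
      use that same snapshot.

```c
#include <stdint.h>

#define MIGRATING 0x00040000u	/* hypothetical "timer is migrating" bit */
#define CPU_MASK  0x0003ffffu	/* hypothetical CPU index bits */

/* Force a single load, similar in spirit to the kernel's READ_ONCE(). */
#define LOAD_ONCE(x) (*(const volatile uint32_t *)&(x))

/*
 * Returns the CPU index whose base should be locked, or -1 if the timer
 * is currently migrating and the caller has to retry.
 */
static int base_index(const uint32_t *flags)
{
	uint32_t tf = LOAD_ONCE(*flags);	/* one snapshot for both uses */

	if (tf & MIGRATING)
		return -1;

	return (int)(tf & CPU_MASK);
}
```

      Without the forced single load, the compiler would be free to read the
      flags again for the index computation and see a different value than the
      one that passed the check.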
      
      Fixes: 0eeda71b ("timer: Replace timer base by a cpu index")
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610241711220.4983@nanos
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e18ed431
    • Thomas Gleixner's avatar
      timers: Prevent base clock corruption when forwarding · b5e3a038
      Thomas Gleixner authored
      commit 6bad6bcc upstream.
      
      When a timer is enqueued we try to forward the timer base clock. This
      mechanism has two issues:
      
      1) Forwarding a remote base unlocked
      
      The forwarding function is called from get_target_base() with the current
      timer base lock held. But if the new target base is a different base than
      the current base (can happen with NOHZ, sigh!) then the forwarding is done
      on an unlocked base. This can lead to corruption of base->clk.
      
      Solution is simple: Invoke the forwarding after the target base is locked.
      
      2) Possible corruption due to jiffies advancing
      
      This is similar to the issue in get_next_timer_interrupt() which was fixed
      in the previous patch. jiffies can advance between check and assignment
      and therefore advance base->clk beyond the next expiry value.
      
      So we need to read jiffies into a local variable once and do the checks and
      assignment with the local copy.
      
      Fixes: a683f390 ("timers: Forward the wheel clock whenever possible")
      Reported-by: Ashton Holmes <scoopta@gmail.com>
      Reported-by: Michael Thayer <michael.thayer@oracle.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Michal Necasek <michal.necasek@oracle.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: knut.osmundsen@oracle.com
      Cc: stern@rowland.harvard.edu
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20161022110552.253640125@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b5e3a038
    • Thomas Gleixner's avatar
      timers: Prevent base clock rewind when forwarding clock · 665f7bf3
      Thomas Gleixner authored
      commit 041ad7bc upstream.
      
      Ashton and Michael reported that kernel versions 4.8 and later suffer from
      USB timeouts which are caused by the timer wheel rework.
      
      This is caused by a bug in the base clock forwarding mechanism, which leads
      to timers expiring early. The scenario which leads to this is:
      
      run_timers()
        while (jiffies >= base->clk) {
          collect_expired_timers();
          base->clk++;
          expire_timers();
        }
      
      So base->clk = jiffies + 1. Now the cpu goes idle:
      
      idle()
        get_next_timer_interrupt()
          nextevt = __next_timer_interrupt();
          if (time_after(nextevt, base->clk))
            base->clk = jiffies;
      
      jiffies has not advanced since run_timers(), so this assignment effectively
      decrements base->clk by one.
      
      base->clk is the index into the timer wheel arrays. So let's assume the
      following state after the base->clk increment in run_timers():
      
       jiffies = 0
       base->clk = 1
      
      A timer gets enqueued with an expiry delta of 63 ticks (which is the case
      with the USB timeout and HZ=250) so the resulting bucket index is:
      
        base->clk + delta = 1 + 63 = 64
      
      The timer goes into the first wheel level. The array size is 64 so it ends
      up in bucket 0, which is correct as it takes 63 ticks to advance base->clk
      to index into bucket 0 again.
      
      If the cpu goes idle before jiffies advance, then the bug in the forwarding
      mechanism sets base->clk back to 0, so the next invocation of run_timers()
      at the next tick will index into bucket 0 and therefore expire the timer 62
      ticks too early.
      
      Instead of blindly setting base->clk to jiffies we must make the forwarding
      conditional on jiffies > base->clk, but we cannot use jiffies for this as
      we might run into the following issue:
      
        if (time_after(jiffies, base->clk)) {
          if (time_after(nextevt, base->clk))
             base->clk = jiffies;
        }
      
      jiffies can increment between the check and the assignment far enough to
      advance beyond nextevt. So we need to use a stable value for checking.
      
      get_next_timer_interrupt() has the basej argument which is the jiffies
      value snapshot taken in the calling code. So we can just use that.
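
      The arithmetic above can be replayed in a small userspace model. This is
      a sketch under assumptions, not the kernel code: after(), bucket_for(),
      forward_buggy() and forward_fixed() are hypothetical helpers, and the
      clamping to nextevt in forward_fixed() is one way to honour the "stable
      value" requirement.

```c
#include <stdbool.h>

#define LVL0_SIZE 64UL	/* buckets in the first wheel level, per the text */

/* Wrap-safe "a is after b", in the spirit of the kernel's time_after(). */
static bool after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Bucket a timer lands in when enqueued 'delta' ticks from 'clk'. */
static unsigned long bucket_for(unsigned long clk, unsigned long delta)
{
	return (clk + delta) % LVL0_SIZE;
}

/* Buggy forwarding: may move clk backwards when basej is behind clk. */
static unsigned long forward_buggy(unsigned long clk, unsigned long basej,
				   unsigned long nextevt)
{
	if (after(nextevt, clk))
		clk = basej;
	return clk;
}

/*
 * Conditional forwarding: only move clk forward, based on the jiffies
 * snapshot basej, and never past the next expiry.
 */
static unsigned long forward_fixed(unsigned long clk, unsigned long basej,
				   unsigned long nextevt)
{
	if (after(basej, clk)) {
		if (after(nextevt, basej))
			clk = basej;
		else if (after(nextevt, clk))
			clk = nextevt;
	}
	return clk;
}
```

      With jiffies = 0, base->clk = 1 and a delta of 63, bucket_for(1, 63) is
      0. forward_buggy(1, 0, 64) rewinds the clock to 0, while
      forward_fixed(1, 0, 64) leaves it at 1.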
      
      Thanks to Ashton for bisecting and providing trace data!
      
      Fixes: a683f390 ("timers: Forward the wheel clock whenever possible")
      Reported-by: Ashton Holmes <scoopta@gmail.com>
      Reported-by: Michael Thayer <michael.thayer@oracle.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Michal Necasek <michal.necasek@oracle.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: knut.osmundsen@oracle.com
      Cc: stern@rowland.harvard.edu
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20161022110552.175308322@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      665f7bf3
    • Borislav Petkov's avatar
      x86/microcode/AMD: Fix more fallout from CONFIG_RANDOMIZE_MEMORY=y · 0d621c57
      Borislav Petkov authored
      commit 1c27f646 upstream.
      
      We needed the physical address of the container in order to compute the
      offset within the relocated ramdisk. And we did this by doing __pa() on
      the virtual address.
      
      However, __pa() checks whether the physical address is within
      PAGE_OFFSET and __START_KERNEL_map - see __phys_addr() - which fail
      if we have CONFIG_RANDOMIZE_MEMORY enabled: we feed a virtual address
      which *doesn't* have the randomization offset into a function which uses
      PAGE_OFFSET which *does* have that offset.
      
      This makes this check fire:
      
      	VIRTUAL_BUG_ON((x > y) || !phys_addr_valid(x));
      			^^^^^^
      
      due to the randomization offset.
      
      The fix is as simple as using __pa_nodebug() because we do that
      randomization offset accounting later in that function ourselves.
      Reported-by: Bob Peterson <rpeterso@redhat.com>
      Tested-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andreas Gruenbacher <agruenba@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm <linux-mm@kvack.org>
      Link: http://lkml.kernel.org/r/20161027123623.j2jri5bandimboff@pd.tnic
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0d621c57
    • Paul Mackerras's avatar
      powerpc/64: Fix race condition in setting lock bit in idle/wakeup code · e599203f
      Paul Mackerras authored
      commit 09b7e37b upstream.
      
      This fixes a race condition where one thread that is entering or
      leaving a power-saving state can inadvertently ignore the lock bit
      that was set by another thread, and potentially also clear it.
      The core_idle_lock_held function is called when the lock bit is
      seen to be set.  It polls the lock bit until it is clear, then
      does a lwarx to load the word containing the lock bit and thread
      idle bits so it can be updated.  However, it is possible that the
      value loaded with the lwarx has the lock bit set, even though an
      immediately preceding lwz loaded a value with the lock bit clear.
      If this happens then we go ahead and update the word despite the
      lock bit being set, and when called from pnv_enter_arch207_idle_mode,
      we will subsequently clear the lock bit.
      
      No identifiable misbehaviour has been attributed to this race.
      
      This fixes it by checking the lock bit in the value loaded by the
      lwarx.  If it is set then we just go back and keep on polling.
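
      The same idea can be sketched in portable C11, with a compare-and-swap
      loop standing in for the lwarx/stwcx. pair. LOCK_BIT and try_update()
      are hypothetical names for illustration; the point is that the lock bit
      is checked in the very value the update is based on, not in an earlier
      load.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define LOCK_BIT 0x100u	/* hypothetical position of the lock bit */

/*
 * Try to OR 'bits' into *state. Returns false without modifying *state if
 * the lock bit is set in the value the update would be based on, in which
 * case the caller is expected to go back to polling.
 */
static bool try_update(_Atomic uint32_t *state, uint32_t bits)
{
	uint32_t old = atomic_load(state);

	do {
		/* Re-checked after every failed exchange: 'old' is refreshed. */
		if (old & LOCK_BIT)
			return false;
	} while (!atomic_compare_exchange_weak(state, &old, old | bits));

	return true;
}
```

      A version that only tested the lock bit before the loop would have the
      weakness described above: a later reload could carry the lock bit and
      still be written back.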
      
      Fixes: b32aadc1 ("powerpc/powernv: Fix race in updating core_idle_state")
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e599203f
    • Paul Mackerras's avatar
      powerpc/64: Re-fix race condition between going idle and entering guest · 51d784b5
      Paul Mackerras authored
      commit 56c46222 upstream.
      
      Commit 8117ac6a ("powerpc/powernv: Switch off MMU before entering
      nap/sleep/rvwinkle mode", 2014-12-10) fixed a race condition where one
      thread entering a KVM guest could switch the MMU context to the guest
      while another thread was still in host kernel context with the MMU on.
      That commit moved the point where a thread entering a power-saving
      mode set its kvm_hstate.hwthread_state field in its PACA to
      KVM_HWTHREAD_IN_IDLE from a point where the MMU was on to after the
      MMU had been switched off.  That commit also added a comment
      explaining that we have to switch to real mode before setting
      hwthread_state to avoid this race.
      
      Nevertheless, commit 4eae2c9a ("powerpc/powernv: Make
      pnv_powersave_common more generic", 2016-07-08) subsequently moved
      the setting of hwthread_state back to a point where the MMU is on,
      thus reintroducing the race, despite the comment saying that this
      should not be done being included in full in the context lines of
      the patch that did it.
      
      This fixes the race again and adds a bigger and shoutier comment
      explaining the potential race condition.
      
      Fixes: 4eae2c9a ("powerpc/powernv: Make pnv_powersave_common more generic")
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Reviewed-by: Shreyas B. Prabhu <shreyasbp@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      51d784b5
    • Aneesh Kumar K.V's avatar
      powerpc/mm/radix: Use tlbiel only if we ever ran on the current cpu · 2c7ff0e5
      Aneesh Kumar K.V authored
      commit bd77c449 upstream.
      
      Before this patch, we used tlbiel if we had only ever run on this core.
      That was mostly derived from the nohash usage of the same, but it is
      incorrect: ISA 3.0 clarifies tlbiel such that:
      
      "All TLB entries that have all of the following properties are made
      invalid on the thread executing the tlbiel instruction"
      
      ie. tlbiel only invalidates TLB entries on the current thread. So if the
      mm has been used on any other thread (aka. cpu) then we must broadcast
      the invalidate.
      
      This bug could lead to invalid TLB entries if a program runs on multiple
      threads of a core.
      
      Hence use tlbiel only if we have only ever run on the current cpu.
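
      Expressed as a predicate over a bitmask of cpus the mm has ever run on
      (a sketch only; can_flush_locally() and the 64-bit mask are hypothetical
      and stand in for the kernel's cpumask handling):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * A local-only invalidate is safe only when the set of cpus the mm has
 * ever run on is exactly { this_cpu }. this_cpu must be below 64 here.
 */
static bool can_flush_locally(uint64_t ever_ran_mask, unsigned int this_cpu)
{
	return ever_ran_mask == (UINT64_C(1) << this_cpu);
}
```

      Any other bit in the mask, including one for a sibling thread of the
      same core, means the invalidate has to be broadcast.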
      
      Fixes: 1a472c9d ("powerpc/mm/radix: Add tlbflush routines")
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2c7ff0e5
    • Segher Boessenkool's avatar
      powerpc: Convert cmp to cmpd in idle enter sequence · ae150de2
      Segher Boessenkool authored
      commit 80f23935 upstream.
      
      PowerPC's "cmp" instruction has four operands. Normally people write
      "cmpw" or "cmpd" for the second cmp operand 0 or 1. But, frequently
      people forget, and write "cmp" with just three operands.
      
      With older binutils this is silently accepted as if this was "cmpw",
      while often "cmpd" is wanted. With newer binutils GAS will complain
      about this for 64-bit code. For 32-bit code it still silently assumes
      "cmpw" is what is meant.
      
      In this instance the code comes directly from ISA v2.07, including the
      cmp, but cmpd is correct. Backport to stable so that new toolchains can
      build old kernels.
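
      Why the width matters can be shown in C, with two hypothetical helpers
      that model a 32-bit (word) and a 64-bit (doubleword) equality compare:

```c
#include <stdbool.h>
#include <stdint.h>

/* Models a word compare: only the low 32 bits take part. */
static bool equal_word(uint64_t a, uint64_t b)
{
	return (uint32_t)a == (uint32_t)b;
}

/* Models a doubleword compare: all 64 bits take part. */
static bool equal_doubleword(uint64_t a, uint64_t b)
{
	return a == b;
}
```

      For 0x100000001 and 0x1 the word compare reports equality while the
      doubleword compare does not, which is why a silently assumed "cmpw" can
      be wrong in 64-bit code.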
      
      Fixes: 948cf67c ("powerpc: Add NAP mode support on Power7 in HV mode")
      Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
      Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ae150de2
    • Chris Mason's avatar
      btrfs: fix races on root_log_ctx lists · 1198fbca
      Chris Mason authored
      commit 570dd450 upstream.
      
      btrfs_remove_all_log_ctxs takes a shortcut where it avoids walking the
      list because it knows all of the waiters are patiently waiting for the
      commit to finish.
      
      But, there's a small race where btrfs_sync_log can remove itself from
      the list if it finds a log commit is already done.  Also, it uses
      list_del_init() to remove itself from the list, but there's no way to
      know if btrfs_remove_all_log_ctxs has already run, so we don't know for
      sure if it is safe to call list_del_init().
      
      This gets rid of all the shortcuts for btrfs_remove_all_log_ctxs(), and
      just calls it with the proper locking.
      
      This is part two of the corruption fixed by cbd60aa7.  I should have
      done this in the first place, but convinced myself the optimizations were
      safe.  A 12 hour run of dbench 2048 will eventually trigger a list debug
      WARN_ON for the list_del_init() in btrfs_sync_log().
      
      Fixes: d1433deb
      Reported-by: Dave Jones <davej@codemonkey.org.uk>
      Signed-off-by: Chris Mason <clm@fb.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1198fbca