1. 08 Nov, 2017 1 commit
    • KVM: PPC: Book3S HV: Fix exclusion between HPT resizing and other HPT updates · 38c53af8
      Paul Mackerras authored
      Commit 5e985969 ("KVM: PPC: Book3S HV: Outline of KVM-HV HPT resizing
      implementation", 2016-12-20) added code that tries to exclude any use
      or update of the hashed page table (HPT) while the HPT resizing code
      is iterating through all the entries in the HPT.  It does this by
      taking the kvm->lock mutex, clearing the kvm->arch.hpte_setup_done
      flag and then sending an IPI to all CPUs in the host.  The idea is
      that any VCPU task that tries to enter the guest will see that the
      hpte_setup_done flag is clear and therefore call kvmppc_hv_setup_htab_rma,
      which also takes the kvm->lock mutex and will therefore block until
      we release kvm->lock.
      
      However, any VCPU that is already in the guest, or is handling a
      hypervisor page fault or hypercall, can re-enter the guest without
      rechecking the hpte_setup_done flag.  The IPI will cause a guest exit
      of any VCPUs that are currently in the guest, but does not prevent
      those VCPU tasks from immediately re-entering the guest.
      
      The result is that after resize_hpt_rehash_hpte() has made a HPTE
      absent, a hypervisor page fault can occur and make that HPTE present
      again.  This includes updating the rmap array for the guest real page,
      meaning that we now have a pointer in the rmap array which connects
      with pointers in the old rev array but not the new rev array.  In
      fact, if the HPT is being reduced in size, the pointer in the rmap
      array could point outside the bounds of the new rev array.  If that
      happens, we can get a host crash later on such as this one:
      
      [91652.628516] Unable to handle kernel paging request for data at address 0xd0000000157fb10c
      [91652.628668] Faulting instruction address: 0xc0000000000e2640
      [91652.628736] Oops: Kernel access of bad area, sig: 11 [#1]
      [91652.628789] LE SMP NR_CPUS=1024 NUMA PowerNV
      [91652.628847] Modules linked in: binfmt_misc vhost_net vhost tap xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables ses enclosure scsi_transport_sas i2c_opal ipmi_powernv ipmi_devintf i2c_core ipmi_msghandler powernv_op_panel nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc kvm_hv kvm_pr kvm scsi_dh_alua dm_service_time dm_multipath tg3 ptp pps_core [last unloaded: stap_552b612747aec2da355051e464fa72a1_14259]
      [91652.629566] CPU: 136 PID: 41315 Comm: CPU 21/KVM Tainted: G           O    4.14.0-1.rc4.dev.gitb27fc5c.el7.centos.ppc64le #1
      [91652.629684] task: c0000007a419e400 task.stack: c0000000028d8000
      [91652.629750] NIP:  c0000000000e2640 LR: d00000000c36e498 CTR: c0000000000e25f0
      [91652.629829] REGS: c0000000028db5d0 TRAP: 0300   Tainted: G           O     (4.14.0-1.rc4.dev.gitb27fc5c.el7.centos.ppc64le)
      [91652.629932] MSR:  900000010280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 44022422  XER: 00000000
      [91652.630034] CFAR: d00000000c373f84 DAR: d0000000157fb10c DSISR: 40000000 SOFTE: 1
      [91652.630034] GPR00: d00000000c36e498 c0000000028db850 c000000001403900 c0000007b7960000
      [91652.630034] GPR04: d0000000117fb100 d000000007ab00d8 000000000033bb10 0000000000000000
      [91652.630034] GPR08: fffffffffffffe7f 801001810073bb10 d00000000e440000 d00000000c373f70
      [91652.630034] GPR12: c0000000000e25f0 c00000000fdb9400 f000000003b24680 0000000000000000
      [91652.630034] GPR16: 00000000000004fb 00007ff7081a0000 00000000000ec91a 000000000033bb10
      [91652.630034] GPR20: 0000000000010000 00000000001b1190 0000000000000001 0000000000010000
      [91652.630034] GPR24: c0000007b7ab8038 d0000000117fb100 0000000ec91a1190 c000001e6a000000
      [91652.630034] GPR28: 00000000033bb100 000000000073bb10 c0000007b7960000 d0000000157fb100
      [91652.630735] NIP [c0000000000e2640] kvmppc_add_revmap_chain+0x50/0x120
      [91652.630806] LR [d00000000c36e498] kvmppc_book3s_hv_page_fault+0xbb8/0xc40 [kvm_hv]
      [91652.630884] Call Trace:
      [91652.630913] [c0000000028db850] [c0000000028db8b0] 0xc0000000028db8b0 (unreliable)
      [91652.630996] [c0000000028db8b0] [d00000000c36e498] kvmppc_book3s_hv_page_fault+0xbb8/0xc40 [kvm_hv]
      [91652.631091] [c0000000028db9e0] [d00000000c36a078] kvmppc_vcpu_run_hv+0xdf8/0x1300 [kvm_hv]
      [91652.631179] [c0000000028dbb30] [d00000000c2248c4] kvmppc_vcpu_run+0x34/0x50 [kvm]
      [91652.631266] [c0000000028dbb50] [d00000000c220d54] kvm_arch_vcpu_ioctl_run+0x114/0x2a0 [kvm]
      [91652.631351] [c0000000028dbbd0] [d00000000c2139d8] kvm_vcpu_ioctl+0x598/0x7a0 [kvm]
      [91652.631433] [c0000000028dbd40] [c0000000003832e0] do_vfs_ioctl+0xd0/0x8c0
      [91652.631501] [c0000000028dbde0] [c000000000383ba4] SyS_ioctl+0xd4/0x130
      [91652.631569] [c0000000028dbe30] [c00000000000b8e0] system_call+0x58/0x6c
      [91652.631635] Instruction dump:
      [91652.631676] fba1ffe8 fbc1fff0 fbe1fff8 f8010010 f821ffa1 2fa70000 793d0020 e9432110
      [91652.631814] 7bbf26e4 7c7e1b78 7feafa14 409e0094 <807f000c> 786326e4 7c6a1a14 93a40008
      [91652.631959] ---[ end trace ac85ba6db72e5b2e ]---
      
      To fix this, we tighten up the way that the hpte_setup_done flag is
      checked to ensure that it does provide the guarantee that the resizing
      code needs.  In kvmppc_run_core(), we check the hpte_setup_done flag
      after disabling interrupts and refuse to enter the guest if it is
      clear (for a HPT guest).  The code that checks hpte_setup_done and
      calls kvmppc_hv_setup_htab_rma() is moved from kvmppc_vcpu_run_hv()
      to a point inside the main loop in kvmppc_run_vcpu(), ensuring that
      we don't just spin endlessly calling kvmppc_run_core() while
      hpte_setup_done is clear, but instead have a chance to block on the
      kvm->lock mutex.
      
      Finally we also check hpte_setup_done inside the region in
      kvmppc_book3s_hv_page_fault() where the HPTE is locked and we are about
      to update the HPTE, and bail out if it is clear.  If another CPU is
      inside kvm_vm_ioctl_resize_hpt_commit() and has cleared hpte_setup_done,
      then we know that either we are looking at a HPTE
      that resize_hpt_rehash_hpte() has not yet processed, which is OK,
      or else we will see hpte_setup_done clear and refuse to update it,
      because of the full barrier formed by the unlock of the HPTE in
      resize_hpt_rehash_hpte() combined with the locking of the HPTE
      in kvmppc_book3s_hv_page_fault().
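      
      The recheck-under-lock idea described above can be illustrated with a
      small user-space sketch. This is not the kernel code: struct toy_hpt and
      the toy_* functions are hypothetical names, a C11 atomic flag stands in
      for the per-HPTE lock bit, and default sequentially consistent atomics
      stand in for the PowerPC barriers the commit relies on.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names come from the kernel. */
struct toy_hpt {
    atomic_bool setup_done;   /* models kvm->arch.hpte_setup_done */
    atomic_flag entry_lock;   /* models the per-HPTE lock bit */
    int entry_present;        /* 1 if the toy HPTE is present */
};

void toy_hpt_init(struct toy_hpt *t)
{
    atomic_init(&t->setup_done, true);
    atomic_flag_clear(&t->entry_lock);
    t->entry_present = 0;
}

/* Page-fault side: lock the entry, then recheck the flag before updating. */
int toy_fault_update(struct toy_hpt *t)
{
    int ret = 0;

    while (atomic_flag_test_and_set(&t->entry_lock))
        ;                     /* spin until we own the entry */

    if (!atomic_load(&t->setup_done))
        ret = -1;             /* resize in progress: refuse to update */
    else
        t->entry_present = 1; /* safe to make the entry present */

    atomic_flag_clear(&t->entry_lock);
    return ret;
}

/* Resize side: clear the flag first, then process the entry under its lock. */
void toy_resize_rehash(struct toy_hpt *t)
{
    atomic_store(&t->setup_done, false);

    while (atomic_flag_test_and_set(&t->entry_lock))
        ;
    t->entry_present = 0;     /* make the entry absent */
    atomic_flag_clear(&t->entry_lock);
}
```

      In this toy model, a fault that takes the entry lock after the resize
      side has released it is guaranteed to observe the cleared flag and
      returns -1 instead of making the entry present again.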
      
      Fixes: 5e985969 ("KVM: PPC: Book3S HV: Outline of KVM-HV HPT resizing implementation")
      Cc: stable@vger.kernel.org # v4.10+
      Reported-by: Satheesh Rajendran <satheera@in.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
  2. 15 Oct, 2017 1 commit
    • KVM: PPC: Book3S HV: Add more barriers in XIVE load/unload code · ad98dd1a
      Benjamin Herrenschmidt authored
      On POWER9 systems, we push the VCPU context onto the XIVE (eXternal
      Interrupt Virtualization Engine) hardware when entering a guest,
      and pull the context off the XIVE when exiting the guest.  The push
      is done with cache-inhibited stores, and the pull with cache-inhibited
      loads.
      
      Testing has revealed that it is possible (though very rare) for
      the stores to get reordered with the loads so that we end up with the
      guest VCPU context still loaded on the XIVE after we have exited the
      guest.  When that happens, it is possible for the same VCPU context
      to then get loaded on another CPU, which causes the machine to
      checkstop.
      
      To fix this, we add I/O barrier instructions (eieio) before and
      after the push and pull operations.  As partial compensation for the
      potential slowdown caused by the extra barriers, we remove the eieio
      instructions between the two stores in the push operation, and between
      the two loads in the pull operation.  (The architecture requires
      loads to cache-inhibited, guarded storage to be kept in order, and
      requires stores to cache-inhibited, guarded storage likewise to be
      kept in order, but allows such loads and stores to be reordered with
      respect to each other.)
      
      Reported-by: Carol L Soto <clsoto@us.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
  3. 14 Oct, 2017 3 commits
    • KVM: PPC: Book3S: Protect kvmppc_gpa_to_ua() with SRCU · 8f6a9f0d
      Alexey Kardashevskiy authored
      kvmppc_gpa_to_ua() accesses the KVM memory slot array via
      srcu_dereference_check(), which produces RCU warnings like the one below.
      
      This extends the existing srcu_read_lock/unlock section to cover the
      kvmppc_gpa_to_ua() call as well.
      
      We did not hit this before because the lock is not needed for the realmode
      handlers, and hash guests use the realmode path all the time;
      radix guests, however, are always redirected to the virtual mode
      handlers, hence the warning.
      
      [   68.253798] ./include/linux/kvm_host.h:575 suspicious rcu_dereference_check() usage!
      [   68.253799]
                     other info that might help us debug this:
      
      [   68.253802]
                     rcu_scheduler_active = 2, debug_locks = 1
      [   68.253804] 1 lock held by qemu-system-ppc/6413:
      [   68.253806]  #0:  (&vcpu->mutex){+.+.}, at: [<c00800000e3c22f4>] vcpu_load+0x3c/0xc0 [kvm]
      [   68.253826]
                     stack backtrace:
      [   68.253830] CPU: 92 PID: 6413 Comm: qemu-system-ppc Tainted: G        W       4.14.0-rc3-00553-g432dcba58e9c-dirty #72
      [   68.253833] Call Trace:
      [   68.253839] [c000000fd3d9f790] [c000000000b7fcc8] dump_stack+0xe8/0x160 (unreliable)
      [   68.253845] [c000000fd3d9f7d0] [c0000000001924c0] lockdep_rcu_suspicious+0x110/0x180
      [   68.253851] [c000000fd3d9f850] [c0000000000e825c] kvmppc_gpa_to_ua+0x26c/0x2b0
      [   68.253858] [c000000fd3d9f8b0] [c00800000e3e1984] kvmppc_h_put_tce+0x12c/0x2a0 [kvm]
      
      Fixes: 121f80ba ("KVM: PPC: VFIO: Add in-kernel acceleration for VFIO")
      Cc: stable@vger.kernel.org # v4.12+
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
    • KVM: PPC: Book3S HV: POWER9 more doorbell fixes · 2cde3716
      Nicholas Piggin authored
      - Add another case where msgsync is required.
      - Required barrier sequence for global doorbells is msgsync ; lwsync
      
      When msgsnd is used for IPIs to other cores, msgsync must be executed by
      the target to order stores performed on the source before its msgsnd
      (provided the source executes the appropriate sync).
      
      Fixes: 1704a81c ("KVM: PPC: Book3S HV: Use msgsnd for IPIs to other cores on POWER9")
      Cc: stable@vger.kernel.org # v4.10+
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
    • KVM: PPC: Fix oops when checking KVM_CAP_PPC_HTM · ac64115a
      Greg Kurz authored
      The following program causes a kernel oops:
      
      #include <sys/types.h>
      #include <sys/stat.h>
      #include <fcntl.h>
      #include <sys/ioctl.h>
      #include <linux/kvm.h>
      
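      int main(void)
      {
          /* Open the global KVM device; no VM is created. */
          int fd = open("/dev/kvm", O_RDWR);
          /* On affected kernels this capability check oopses. */
          ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_PPC_HTM);
          return 0;
      }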
      
      This happens because when using the global KVM fd with
      KVM_CHECK_EXTENSION, kvm_vm_ioctl_check_extension() gets
      called with a NULL kvm argument, which gets dereferenced
      in is_kvmppc_hv_enabled(). Spotted while reading the code.
      
      Let's use the hv_enabled fallback variable, like everywhere
      else in this function.
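      
      As a rough illustration of the NULL-safe fallback pattern described
      above (not the kernel source), the sketch below uses hypothetical names:
      struct toy_kvm, toy_global_hv and the toy_* functions are assumptions
      made for the example. The point is only that the capability check
      consults a value computed with a NULL check rather than dereferencing
      kvm directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_kvm { bool hv; };          /* hypothetical stand-in for struct kvm */

static bool toy_global_hv = true;     /* hypothetical module-wide default */

/* Unsafe on its own: dereferences kvm, so it must never be given NULL. */
bool toy_is_hv_enabled(const struct toy_kvm *kvm)
{
    assert(kvm != NULL);
    return kvm->hv;
}

/* Fallback value that tolerates a NULL kvm (the global /dev/kvm fd case). */
bool toy_hv_enabled(const struct toy_kvm *kvm)
{
    return kvm ? toy_is_hv_enabled(kvm) : toy_global_hv;
}

/* Capability check that only consults the NULL-safe value. */
int toy_check_htm(const struct toy_kvm *kvm, bool cpu_has_htm)
{
    return cpu_has_htm && toy_hv_enabled(kvm);
}
```

      Calling toy_check_htm(NULL, true) models the KVM_CHECK_EXTENSION call on
      the global fd and returns a value instead of dereferencing NULL.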
      
      Fixes: 23528bb2 ("KVM: PPC: Introduce KVM_CAP_PPC_HTM")
      Cc: stable@vger.kernel.org # v4.7+
      Signed-off-by: Greg Kurz <groug@kaod.org>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: Thomas Huth <thuth@redhat.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
  4. 09 Oct, 2017 1 commit
  5. 07 Oct, 2017 4 commits
  6. 06 Oct, 2017 20 commits
    • Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · dbeb1a8f
      Linus Torvalds authored
      Pull clk fixes from Stephen Boyd:
      
       - build fix to export the clk_bulk_prepare() symbol
      
       - suspend fix for Samsung Exynos SoCs where we need to keep clks on
         across suspend
      
       - two critical clk markings for clks that shouldn't ever turn off on
         Rockchip SoCs
      
       - a fix for a copy-paste mistake on Rockchip rk3128 causing some clks
         to touch the same bit and trample over one another
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: samsung: exynos4: Enable VPLL and EPLL clocks for suspend/resume cycle
        clk: Export clk_bulk_prepare()
        clk: rockchip: add sclk_timer5 as critical clock on rk3128
        clk: rockchip: fix up rk3128 pvtm and mipi_24m gate regs error
        clk: rockchip: add pclk_pmu as critical clock on rk3128
    • Merge tag 'arc-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · ed0f72f4
      Linus Torvalds authored
      Pull ARC updates from Vineet Gupta:
      
       - updates for various platforms
      
       - boot log updates for upcoming HS48 family of cores (dual issue)
      
      * tag 'arc-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
        ARC: [plat-hsdk]: Add reset controller node to manage ethernet reset
        ARC: [plat-hsdk]: Temporary fix to set CPU frequency to 1GHz
        ARC: fix allnoconfig build warning
        ARCv2: boot log: identify HS48 cores (dual issue)
        ARC: boot log: decontaminate ARCv2 ISA_CONFIG register
        arc: remove redundant UTS_MACHINE define in arch/arc/Makefile
        ARC: [plat-eznps] Update platform maintainer as Noam left
        ARC: [plat-hsdk] use actual clk driver to manage cpu clk
        ARC: [*defconfig] Reenable soft lock-up detector
        ARC: [plat-axs10x] sdio: Temporary fix of sdio ciu frequency
        ARC: [plat-hsdk] sdio: Temporary fix of sdio ciu frequency
        ARC: [plat-axs103] Add temporary quirk to reset ethernet IP
    • Merge tag 'xfs-4.14-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · eab26ad1
      Linus Torvalds authored
      Pull xfs fixes from Darrick Wong:
      
       - fix a race between overlapping copy on write aio
      
       - fix cow fork swapping when we defragment reflinked files
      
      * tag 'xfs-4.14-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: handle racy AIO in xfs_reflink_end_cow
        xfs: always swap the cow forks when swapping extents
    • Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 17d084c8
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "A collection of fixes for this series. This contains:
      
         - NVMe pull request from Christoph, one uuid attribute fix, and one
           fix for the controller memory buffer address for remapped BARs.
      
         - use-after-free fix for bsg, from Benjamin Block.
      
         - bcache race/use-after-free fix for a list traversal, fixing a
           regression in this merge window. From Coly Li.
      
         - null_blk: change the configfs dependency from a 'depends' to a
           'select'. This is a change from this merge window as well. From me.
      
         - nbd signal fix from Josef, fixing a regression introduced with the
           status code changes.
      
         - nbd MAINTAINERS mailing list entry update.
      
         - blk-throttle stall fix from Joseph Qi.
      
         - blk-mq-debugfs fix from Omar, fixing an issue where we don't
           register the IO scheduler debugfs directory, if the driver is
           loaded with it. Only shows up if you switch through the sysfs
           interface"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        bsg-lib: fix use-after-free under memory-pressure
        nvme-pci: Use PCI bus address for data/queues in CMB
        blk-mq-debugfs: fix device sched directory for default scheduler
        null_blk: change configfs dependency to select
        blk-throttle: fix possible io stall when upgrade to max
        MAINTAINERS: update list for NBD
        nbd: fix -ERESTARTSYS handling
        nvme: fix visibility of "uuid" ns attribute
        bcache: use llist_for_each_entry_safe() in __closure_wake_up()
    • Merge tag 'pci-v4.14-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 80cf1f8c
      Linus Torvalds authored
      Pull PCI fixes from Bjorn Helgaas:
       "Fix legacy IDE probe issues exposed by recent PCI core IRQ mapping
        changes (Bartlomiej Zolnierkiewicz, Lorenzo Pieralisi)"
      
      * tag 'pci-v4.14-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        ide: fix IRQ assignment for PCI bus order probing
        ide: pci: free PCI BARs on initialization failure
        ide: free hwif->portdev on hwif_init() failure
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 27549068
      Linus Torvalds authored
      Pull arm64 fixes from Catalin Marinas:
      
       - Bring initialisation of user space undefined instruction handling
         early (core_initcall) since late_initcall() happens after modprobe in
         initramfs is invoked. Similar fix for fpsimd initialisation
      
       - Increase the kernel stack when KASAN is enabled
      
       - Bring the PCI ACS enabling earlier via the
         iort_init_platform_devices()
      
       - Fix misleading data abort address printing (decimal vs hex)
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: Ensure fpsimd support is ready before userspace is active
        arm64: Ensure the instruction emulation is ready for userspace
        arm64: Use larger stacks when KASAN is selected
        ACPI/IORT: Fix PCI ACS enablement
        arm64: fix misleading data abort decoding
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 8d473320
      Linus Torvalds authored
      Pull KVM fixes from Radim Krčmář:
      
       - fix PPC XIVE interrupt delivery
      
       - fix x86 RCU breakage from asynchronous page faults when built without
         PREEMPT_COUNT
      
       - fix x86 build with -frecord-gcc-switches
      
       - fix x86 build without X86_LOCAL_APIC
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: add X86_LOCAL_APIC dependency
        x86/kvm: Move kvm_fastop_exception to .fixup section
        kvm/x86: Avoid async PF preempting the kernel incorrectly
        KVM: PPC: Book3S: Fix server always zero from kvmppc_xive_get_xive()
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma · d109d83f
      Linus Torvalds authored
      Pull rdma fixes from Doug Ledford:
       "This is a pretty small pull request. Only 6 patches in total. There
        are no outstanding -rc patches on the mailing list after this pull
        request, so only if some new issues are discovered in the remainder of
        the rc cycles will you hear from me again.
      
        Summary:
         - a fix for iwpm netlink usage
         - a fix for error unwinding in mlx5
         - two fixes to vlan handling in qedr
         - a couple small i40iw fixes"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
        i40iw: Fix port number for query QP
        i40iw: Add missing memory barriers
        RDMA/qedr: Parse vlan priority as sl
        RDMA/qedr: Parse VLAN ID correctly and ignore the value of zero
        IB/mlx5: Fix label order in error path handling
        RDMA/iwpm: Properly mark end of NL messages
    • Merge branch 'for-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · bf2db0b9
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "Two more fixes for bugs introduced in 4.13.
      
        The sector_t problem with 32bit architecture and !LBDAF config seems
        serious but the number of affected deployments is hopefully low.
      
        The clashing status bits could lead to a confusing in-memory state of
        the whole-filesystem operations if used with the quota override sysfs
        knob"
      
      * 'for-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        Btrfs: fix overlap of fs_info::flags values
        btrfs: avoid overflow when sector_t is 32 bit
    • Merge tag 'ceph-for-4.14-rc4' of git://github.com/ceph/ceph-client · b77779b9
      Linus Torvalds authored
      Pull ceph fixes from Ilya Dryomov:
       "Two fixups for CephFS snapshot-handling patches in -rc1"
      
      * tag 'ceph-for-4.14-rc4' of git://github.com/ceph/ceph-client:
        ceph: fix __choose_mds() for LSSNAP request
        ceph: properly queue cap snap for newly created snap realm
    • ARC: [plat-hsdk]: Add reset controller node to manage ethernet reset · ab8eb7db
      Eugeniy Paltsev authored
      The DW ethernet controller on HSDK sometimes hangs after a SW reset, so
      add a reset node to make it possible to reset the DW ethernet controller HW.
      
      Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs · 8d4ef4e1
      Linus Torvalds authored
      Pull overlayfs fixes from Miklos Szeredi:
       "Fix a regression in 4.14 and one in 4.13. The latter is a case when
        Docker is doing something it really shouldn't and gets away with it.
        We now print a warning instead of erroring out.
      
        There are also fixes to several error paths"
      
      * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
        ovl: fix regression caused by exclusive upper/work dir protection
        ovl: fix missing unlock_rename() in ovl_do_copy_up()
        ovl: fix dentry leak in ovl_indexdir_cleanup()
        ovl: fix dput() of ERR_PTR in ovl_cleanup_index()
        ovl: fix error value printed in ovl_lookup_index()
        ovl: fix may_write_real() for overlayfs directories
    • Merge tag 'powerpc-4.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 1249b571
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "Nine small fixes, really nothing that stands out.
      
        A work-around for a spurious MCE on Power9. A CXL fault handling fix,
        some fixes to the new XIVE code, and a fix to the new 32-bit
        STRICT_KERNEL_RWX code.
      
        Fixes for old code/stable: a fix to an incorrect TLB flush on boot
        but not on any current machines, a compile error on 4xx and a fix to
        memory hotplug when using radix (Power9).
      
        Thanks to: Anton Blanchard, Cédric Le Goater, Christian Lamparter,
        Christophe Leroy, Christophe Lombard, Guenter Roeck, Jeremy Kerr,
        Michael Neuling, Nicholas Piggin"
      
      * tag 'powerpc-4.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/powernv: Increase memory block size to 1GB on radix
        powerpc/mm: Call flush_tlb_kernel_range with interrupts enabled
        powerpc/xive: Clear XIVE internal structures when a CPU is removed
        powerpc/xive: Fix IPI reset
        powerpc/4xx: Fix compile error with 64K pages on 40x, 44x
        powerpc: Fix action argument for cpufeatures-based TLB flush
        cxl: Fix memory page not handled
        powerpc: Fix workaround for spurious MCE on POWER9
        powerpc: Handle MCE on POWER9 with only DSISR bit 30 set
    • Merge tag 'drm-fixes-for-v4.14-rc4' of git://people.freedesktop.org/~airlied/linux · 9c0c1ada
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Some i915 fixes from the last two weeks (as they were on a strange
        base and I just waited for rc3), also a single sun4i hdmi fix"
      
      * tag 'drm-fixes-for-v4.14-rc4' of git://people.freedesktop.org/~airlied/linux:
        drm/i915/glk: Fix DMC/DC state idleness calculation
        drm/i915/cnl: Reprogram DMC firmware after S3/S4 resume
        drm/i915: Fix DDI PHY init if it was already on
        drm/sun4i: hdmi: Disable clks in bind function error path and unbind function
        drm/i915/bios: ignore HDMI on port A
        drm/i915: remove redundant variable hw_check
        drm/i915: always update ELD connector type after get modes
    • Merge branch 'core-watchdog-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 27efed3e
      Linus Torvalds authored
      Pull watchdog clean-up and fixes from Thomas Gleixner:
       "The watchdog (hard/softlockup detector) code is pretty much broken in
        its current state. The patch series addresses this by removing all
        duct tape and refactoring it into a workable state.
      
        The reasons why I ask for inclusion that late in the cycle are:
      
         1) The code causes lockdep splats vs. hotplug locking which get
            reported over and over. Unfortunately there is no easy fix.
      
         2) The risk of breakage is minimal because it's already broken
      
         3) As 4.14 is a long term stable kernel, I prefer to have working
            watchdog code in that and the lockdep issues resolved. I wouldn't
            ask you to pull if 4.14 wouldn't be a LTS kernel or if the
            solution would be easy to backport.
      
         4) The series was around before the merge window opened, but then got
            delayed due to the UP failure caused by the for_each_cpu()
            surprise which we discussed recently.
      
        Changes vs. V1:
      
         - Addressed your review points
      
         - Addressed the warning in the powerpc code which was discovered late
      
         - Changed two function names which made sense up to a certain point
           in the series. Now they match what they do in the end.
      
         - Fixed an 'unused variable' warning, which was not detected by the
           intel robot. I triggered it when trying all possible related config
           combinations manually. Randconfig testing seems not random enough.
      
        The changes have been tested by and reviewed by Don Zickus and tested
        and acked by Michael Ellerman for powerpc"
      
      * 'core-watchdog-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
        watchdog/core: Put softlockup_threads_initialized under ifdef guard
        watchdog/core: Rename some softlockup_* functions
        powerpc/watchdog: Make use of watchdog_nmi_probe()
        watchdog/core, powerpc: Lock cpus across reconfiguration
        watchdog/core, powerpc: Replace watchdog_nmi_reconfigure()
        watchdog/hardlockup/perf: Fix spelling mistake: "permanetely" -> "permanently"
        watchdog/hardlockup/perf: Cure UP damage
        watchdog/hardlockup: Clean up hotplug locking mess
        watchdog/hardlockup/perf: Simplify deferred event destroy
        watchdog/hardlockup/perf: Use new perf CPU enable mechanism
        watchdog/hardlockup/perf: Implement CPU enable replacement
        watchdog/hardlockup/perf: Implement init time detection of perf
        watchdog/hardlockup/perf: Implement init time perf validation
        watchdog/core: Get rid of the racy update loop
        watchdog/core, powerpc: Make watchdog_nmi_reconfigure() two stage
        watchdog/sysctl: Clean up sysctl variable name space
        watchdog/sysctl: Get rid of the #ifdeffery
        watchdog/core: Clean up header mess
        watchdog/core: Further simplify sysctl handling
        watchdog/core: Get rid of the thread teardown/setup dance
        ...
    • arm64: Ensure fpsimd support is ready before userspace is active · ae2e972d
      Suzuki K Poulose authored
      We register the pm/hotplug callbacks for FPSIMD as late_initcall,
      which happens after the userspace is active (from initramfs via
      populate_rootfs, a rootfs_initcall). Make sure we are ready even
      before the userspace could potentially use it, by promoting to
      a core_initcall.
      
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Dave Martin <dave.martin@arm.com>
      Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: Ensure the instruction emulation is ready for userspace · c0d8832e
      Suzuki K Poulose authored
      We trap and emulate some instructions (e.g., mrs, deprecated instructions)
      for the userspace. However, the handlers for these are registered as
      late_initcalls, and the userspace could be up and running from the initramfs
      by that time (with populate_rootfs, which is a rootfs_initcall()). This
      could cause early applications to fail with errors such as:
      
      [   11.152061] modprobe[93]: undefined instruction: pc=0000ffff8ca48ff4
      
      This patch promotes the specific calls to core_initcalls, which are
      guaranteed to be completed before we hit userspace.
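      
      The ordering argument above can be modelled with a toy sketch. The enum
      and function below are hypothetical and only encode the relative order
      the text relies on (core before rootfs before late); they are not the
      kernel's initcall machinery.

```c
#include <stdbool.h>

/* Toy model: only the relative order matters, the values are illustrative. */
enum toy_initcall_level {
    TOY_CORE_INITCALL,     /* runs early in boot */
    TOY_ROOTFS_INITCALL,   /* populate_rootfs: initramfs userspace may follow */
    TOY_LATE_INITCALL,     /* runs last */
};

/* A handler is ready for early userspace only if it is registered at a
 * level that completes before the rootfs level. */
bool toy_ready_before_userspace(enum toy_initcall_level handler_level)
{
    return handler_level < TOY_ROOTFS_INITCALL;
}
```

      Under this model a handler registered at TOY_LATE_INITCALL is not ready
      when initramfs programs start, while one at TOY_CORE_INITCALL is.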
      
      Cc: stable@vger.kernel.org
      Cc: Dave Martin <dave.martin@arm.com>
      Cc: Matthias Brugger <mbrugger@suse.com>
      Cc: James Morse <james.morse@arm.com>
      Reported-by: Matwey V. Kornilov <matwey.kornilov@gmail.com>
      Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • powerpc/powernv: Increase memory block size to 1GB on radix · 53ecde0b
      Anton Blanchard authored
      Memory hot unplug on PowerNV radix hosts is broken. Our memory block
      size is 256MB but since we map the linear region with very large
      pages, each pte we tear down maps 1GB.
      
      A hot unplug of one 256MB memory block results in 768MB of memory
      getting unintentionally unmapped. At this point we are likely to oops.
      
      Fix this by increasing our memory block size to 1GB on PowerNV radix
      hosts.
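
      The 768MB figure follows from simple arithmetic, sketched below in
      Python. The helper name collateral_unmapped() is hypothetical, and
      the sketch assumes the block is aligned within the mapping and that
      tearing down a pte removes the whole range that pte maps.

```python
MB = 1024 * 1024
GB = 1024 * MB


def collateral_unmapped(block_size, mapping_size):
    """Bytes unmapped beyond the removed block when whole mappings are torn down."""
    # Round the block up to a whole number of mappings (ceiling division).
    mappings = -(-block_size // mapping_size)
    return mappings * mapping_size - block_size


# 256MB memory block, linear region mapped with 1GB pages.
print(collateral_unmapped(256 * MB, GB) // MB)  # 768
# 1GB memory block, same mapping size: nothing extra is unmapped.
print(collateral_unmapped(GB, GB) // MB)        # 0
```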
      
      Fixes: 4b5d62ca ("powerpc/mm: add radix__remove_section_mapping()")
      Cc: stable@vger.kernel.org # v4.11+
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      53ecde0b
    • Dave Airlie's avatar
      Merge tag 'drm-misc-fixes-2017-10-05' of git://anongit.freedesktop.org/git/drm-misc into drm-fixes · baf7c1f7
      Dave Airlie authored
      One bugfix in sun4i for 4.14
      
      * tag 'drm-misc-fixes-2017-10-05' of git://anongit.freedesktop.org/git/drm-misc:
        drm/sun4i: hdmi: Disable clks in bind function error path and unbind function
      baf7c1f7
    • Dave Airlie's avatar
      Merge tag 'drm-intel-fixes-2017-10-04' of... · 00bb09c4
      Dave Airlie authored
      Merge tag 'drm-intel-fixes-2017-10-04' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes
      
      drm/i915 fixes for 4.14-rc4:
      
      All 3 highest GLK bugs fixed by Imre:
      - GLK drv reload - Fix DDI Phy init if it was already on.
      - GLK suspend resume - Reprogram DMC firmware after s3/s4.
      - GLK DC states - Fix idleness calculation.
      
      * tag 'drm-intel-fixes-2017-10-04' of git://anongit.freedesktop.org/git/drm-intel:
        drm/i915/glk: Fix DMC/DC state idleness calculation
        drm/i915/cnl: Reprogram DMC firmware after S3/S4 resume
        drm/i915: Fix DDI PHY init if it was already on
      00bb09c4
  7. 05 Oct, 2017 10 commits
    • Linus Torvalds's avatar
      Merge tag 'pm-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 7a92616c
      Linus Torvalds authored
      Pull power management fix from Rafael Wysocki:
       "This fixes a code ordering issue in the main suspend-to-idle loop that
        causes some "low power S0 idle" conditions to be incorrectly reported
        as unmet with suspend/resume debug messages enabled"
      
      * tag 'pm-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        PM / s2idle: Invoke the ->wake() platform callback earlier
      7a92616c
    • Rafael J. Wysocki's avatar
      Merge branch 'pm-sleep' · ca935f8e
      Rafael J. Wysocki authored
      * pm-sleep:
        PM / s2idle: Invoke the ->wake() platform callback earlier
      ca935f8e
    • Linus Torvalds's avatar
      Merge tag 'for-4.14/dm-fixes' of... · 076264ad
      Linus Torvalds authored
      Merge tag 'for-4.14/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      
      Pull device mapper fixes from Mike Snitzer:
      
       - a stable fix for the alignment of the event number reported at the
         end of the 'DM_LIST_DEVICES' ioctl.
      
       - a couple stable fixes for the DM crypt target.
      
       - a DM raid health status reporting fix.
      
      * tag 'for-4.14/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm raid: fix incorrect status output at the end of a "recover" process
        dm crypt: reject sector_size feature if device length is not aligned to it
        dm crypt: fix memory leak in crypt_ctr_cipher_old()
        dm ioctl: fix alignment of event number in the device list
      076264ad
    • Jonathan Brassow's avatar
      dm raid: fix incorrect status output at the end of a "recover" process · 41dcf197
      Jonathan Brassow authored
      There are three important fields that indicate the overall health and
      status of an array: dev_health, sync_ratio, and sync_action.  They tell
      us the condition of the devices in the array, and the degree to which
      the array is synchronized.
      
      This commit fixes a condition that is reported incorrectly.  When a member
      of the array is being rebuilt or a new device is added, the "recover"
      process is used to synchronize it with the rest of the array.  When the
      process is complete, but the sync thread hasn't yet been reaped, it is
      possible for the state of MD to be:
       mddev->recovery = [ MD_RECOVERY_RUNNING MD_RECOVERY_RECOVER MD_RECOVERY_DONE ]
       curr_resync_completed = <max dev size> (but not MaxSector)
       and all rdevs to be In_sync.
      This causes the 'array_in_sync' output parameter that is passed to
      rs_get_progress() to be computed incorrectly and reported as 'false' --
      or not in-sync.  This in turn causes the dev_health status characters to
      be reported as all 'a', rather than the proper 'A'.
      
      This can cause erroneous output for several seconds at a time when tools
      will want to be checking the condition due to events that are raised at
      the end of a sync process.  Fix this by properly calculating the
      'array_in_sync' return parameter in rs_get_progress().
      
      Also, remove an unnecessary intermediate 'recovery_cp' variable in
      rs_get_progress().
      
      Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      41dcf197
    • Arnd Bergmann's avatar
      KVM: add X86_LOCAL_APIC dependency · e42eef4b
      Arnd Bergmann authored
      The rework of the posted interrupt handling broke building without
      support for the local APIC:
      
      ERROR: "boot_cpu_physical_apicid" [arch/x86/kvm/kvm-intel.ko] undefined!
      
      That configuration is probably not particularly useful anyway, so
      we can avoid the randconfig failures by adding a Kconfig dependency.
      
      Fixes: 8b306e2f ("KVM: VMX: avoid double list add with VT-d posted interrupts")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      e42eef4b
    • Linus Torvalds's avatar
      Merge tag 'sound-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 0f380715
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "A collection of small fixes, mostly with stable ones:
      
       - X32 ABI fix for PCM; likely not so many people suffer from it, but
         still better to fix
      
       - Two minor kernel warning fixes on USB audio devices spotted by
         syzkaller
      
       - Regression fix of echoaudio due to its inconsistent dimension
      
       - Fix for HBR support on Intel DP audio, on some recent chips
      
       - USB-audio quirk for more Plantronics devices
      
       - Fix for potential double-fetch in ASIHPI FIFO queue"
      
      * tag 'sound-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: usx2y: Suppress kernel warning at page allocation failures
        Revert "ALSA: echoaudio: purge contradictions between dimension matrix members and total number of members"
        ALSA: usb-audio: Check out-of-bounds access by corrupted buffer descriptor
        ALSA: pcm: Fix structure definition for X32 ABI
        ALSA: usb-audio: Add sample rate quirk for Plantronics C310/C520-M
        ALSA: hda - program ICT bits to support HBR audio
        ALSA: asihpi: fix a potential double-fetch bug when copying puhm
        ALSA: compress: Remove unused variable
      0f380715
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid · 77ede3a0
      Linus Torvalds authored
      Pull HID subsystem fixes from Jiri Kosina:
      
       - buffer management size fix for i2c-hid driver, from Adrian Salido
      
       - tool ID regression fixes for Wacom driver from Jason Gerecke
      
       - a few small assorted fixes and a few device ID additions
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
        Revert "HID: multitouch: Support ALPS PTP stick with pid 0x120A"
        HID: hidraw: fix power sequence when closing device
        HID: wacom: Always increment hdev refcount within wacom_get_hdev_data
        HID: wacom: generic: Clear ABS_MISC when tool leaves proximity
        HID: wacom: generic: Send MSC_SERIAL and ABS_MISC when leaving prox
        HID: i2c-hid: allocate hid buffers for real worst case
        HID: rmi: Make sure the HID device is opened on resume
        HID: multitouch: Support ALPS PTP stick with pid 0x120A
        HID: multitouch: support buttons and trackpoint on Lenovo X1 Tab Gen2
        HID: wacom: Correct coordinate system of touchring and pen twist
        HID: wacom: Properly report negative values from Intuos Pro 2 Bluetooth
        HID: multitouch: Fix system-control buttons not working
        HID: add multi-input quirk for IDC6680 touchscreen
        HID: wacom: leds: Don't try to control the EKR's read-only LEDs
        HID: wacom: bits shifted too much for 9th and 10th buttons
      77ede3a0
    • Jens Axboe's avatar
      Merge branch 'nvme-4.14' of git://git.infradead.org/nvme into for-linus · d7b544de
      Jens Axboe authored
      Pull NVMe fixes from Christoph:
      
      "A trivial one-liner from Martin to fix the visibility of the uuid attr,
      and another one (originally from Abhishek Shah, rewritten by me) to fix
      the CMB addresses passed back to the controller in case of a system that
      remaps BAR addresses between host and device."
      d7b544de
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 9a431ef9
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Check iwlwifi 9000 reorder buffer out-of-space condition properly,
          from Sara Sharon.
      
       2) Fix RCU splat in qualcomm rmnet driver, from Subash Abhinov
          Kasiviswanathan.
      
       3) Fix session and tunnel release races in l2tp, from Guillaume Nault
          and Sabrina Dubroca.
      
       4) Fix endian bug in sctp_diag_dump(), from Dan Carpenter.
      
       5) Several mlx5 driver fixes from the Mellanox folks (max flow counters
          cap check, invalid memory access in IPoIB support, etc.)
      
       6) tun_get_user() should bail if skb->len is zero, from Alexander
          Potapenko.
      
       7) Fix RCU lookups in inetpeer, from Eric Dumazet.
      
       8) Fix locking in packet_do_bind().
      
       9) Handle cb->start() error properly in netlink dump code, from Jason
          A. Donenfeld.
      
      10) Handle multicast properly in UDP socket early demux code. From Paolo
          Abeni.
      
      11) Several erspan bug fixes in ip_gre, from Xin Long.
      
      12) Fix use-after-free in socket filter code, in order to handle the
          fact that listener lock is no longer taken during the three-way TCP
          handshake. From Eric Dumazet.
      
      13) Fix infoleak in RTM_GETSTATS, from Nikolay Aleksandrov.
      
      14) Fix tail call generation in x86-64 BPF JIT, from Alexei Starovoitov.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (77 commits)
        net: 8021q: skip packets if the vlan is down
        bpf: fix bpf_tail_call() x64 JIT
        net: stmmac: dwmac-rk: Add RK3128 GMAC support
        rndis_host: support Novatel Verizon USB730L
        net: rtnetlink: fix info leak in RTM_GETSTATS call
        socket, bpf: fix possible use after free
        mlxsw: spectrum_router: Track RIF of IPIP next hops
        mlxsw: spectrum_router: Move VRF refcounting
        net: hns3: Fix an error handling path in 'hclge_rss_init_hw()'
        net: mvpp2: Fix clock resource by adding an optional bus clock
        r8152: add Linksys USB3GIGV1 id
        l2tp: fix l2tp_eth module loading
        ip_gre: erspan device should keep dst
        ip_gre: set tunnel hlen properly in erspan_tunnel_init
        ip_gre: check packet length and mtu correctly in erspan_xmit
        ip_gre: get key from session_id correctly in erspan_rcv
        tipc: use only positive error codes in messages
        ppp: fix __percpu annotation
        udp: perform source validation for mcast early demux
        IPv4: early demux can return an error code
        ...
      9a431ef9
    • Amir Goldstein's avatar
      ovl: fix regression caused by exclusive upper/work dir protection · 85fdee1e
      Amir Goldstein authored
      Enforcing exclusive ownership on upper/work dirs caused a docker
      regression: https://github.com/moby/moby/issues/34672.
      
      Euan spotted the regression and pointed to the offending commit.
      Vivek has brought the regression to my attention and provided this
      reproducer:
      
      Terminal 1:
      
        mount -t overlay -o workdir=work,lowerdir=lower,upperdir=upper none
              merged/
      
      Terminal 2:
      
        unshare -m
      
      Terminal 1:
      
        umount merged
        mount -t overlay -o workdir=work,lowerdir=lower,upperdir=upper none
              merged/
        mount: /root/overlay-testing/merged: none already mounted or mount point
               busy
      
      To fix the regression, I replaced the error with an alarming warning.
      With index feature enabled, mount does fail, but logs a suggestion to
      override exclusive dir protection by disabling index.
      Note that index=off mount does take the inuse locks, so a concurrent
      index=off will issue the warning and a concurrent index=on mount will fail.
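
      The resulting behaviour can be summarized with a small decision
      function. This is a sketch in Python of the outcomes described
      above, not the overlayfs implementation; the function name and the
      "ok"/"warn"/"fail" labels are made up for illustration.

```python
def overlay_mount_outcome(dirs_in_use, index):
    """Outcome of an overlay mount given the upper/work dir state.

    dirs_in_use: True if another mount already holds the inuse locks
                 on the upper/work dirs.
    index:       True for an index=on mount, False for index=off.
    """
    if not dirs_in_use:
        return "ok"    # dirs are free; the mount takes the inuse locks
    if index:
        return "fail"  # index=on: mount fails and suggests disabling index
    return "warn"      # index=off: mount succeeds with a warning


print(overlay_mount_outcome(dirs_in_use=True, index=False))
print(overlay_mount_outcome(dirs_in_use=True, index=True))
```

      The first call prints "warn" and the second prints "fail", matching
      the concurrent index=off and index=on cases respectively.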
      
      Documentation was updated to reflect this change.
      
      Fixes: 2cac0c00 ("ovl: get exclusive ownership on upper/work dirs")
      Cc: <stable@vger.kernel.org> # v4.13
      Reported-by: Euan Kemp <euank@euank.com>
      Reported-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      85fdee1e