1. 10 Apr, 2016 2 commits
    • kvm: x86: do not leak guest xcr0 into host interrupt handlers · fc5b7f3b
      David Matlack authored
      An interrupt handler that uses the fpu can kill a KVM VM, if it runs
      under the following conditions:
       - the guest's xcr0 register is loaded on the cpu
       - the guest's fpu context is not loaded
       - the host is using eagerfpu
      
      Note that the guest's xcr0 register and fpu context are not loaded as
      part of the atomic world switch into "guest mode". They are loaded by
      KVM while the cpu is still in "host mode".
      
      Usage of the fpu in interrupt context is gated by irq_fpu_usable(). The
      interrupt handler will look something like this:
      
      if (irq_fpu_usable()) {
              kernel_fpu_begin();
      
              [... code that uses the fpu ...]
      
              kernel_fpu_end();
      }
      
      As long as the guest's fpu is not loaded and the host is using eager
      fpu, irq_fpu_usable() returns true (interrupted_kernel_fpu_idle()
      returns true). The interrupt handler proceeds to use the fpu with
      the guest's xcr0 live.
      
      kernel_fpu_begin() saves the current fpu context. If this uses
      XSAVE[OPT], it may leave the xsave area in an undesirable state.
      According to the SDM, during XSAVE bit i of XSTATE_BV is not modified
      if bit i is 0 in xcr0. So it's possible that XSTATE_BV[i] == 1 and
      xcr0[i] == 0 following an XSAVE.
      
      kernel_fpu_end() restores the fpu context. Now if any bit i in
      XSTATE_BV == 1 while xcr0[i] == 0, XRSTOR generates a #GP. The
      fault is trapped and SIGSEGV is delivered to the current process.
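
The XSAVE/XRSTOR interaction described above can be modelled in a few lines of C. This is only a sketch under simplifying assumptions: the function names are hypothetical, the instruction's requested-feature mask is assumed to cover every component, and state components are reduced to bits in a 64-bit mask.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the XSTATE_BV update done by XSAVE: components enabled in xcr0
 * get their bit rewritten from the in-use state, while components disabled
 * in xcr0 keep whatever bit was already in the save area.
 * (Hypothetical helper, not a kernel function.)
 */
static uint64_t xsave_model(uint64_t old_xstate_bv, uint64_t xcr0,
			    uint64_t in_use)
{
	return (old_xstate_bv & ~xcr0) | (in_use & xcr0);
}

/*
 * Model of the XRSTOR check: a #GP is raised when XSTATE_BV names a
 * component that xcr0 has disabled.
 */
static bool xrstor_faults(uint64_t xstate_bv, uint64_t xcr0)
{
	return (xstate_bv & ~xcr0) != 0;
}
```

With an illustrative host xcr0 of 0x7 and guest xcr0 of 0x3, a save area whose XSTATE_BV was 0x7 keeps bit 2 set after an XSAVE under the guest's xcr0, so the matching XRSTOR faults under the guest's xcr0 but not under the host's.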
      
      Only pre-4.2 kernels appear to be vulnerable to this sequence of
      events. Commit 653f52c3 ("kvm,x86: load guest FPU context more eagerly")
      from 4.2 forces the guest's fpu to always be loaded on eagerfpu hosts.
      
      This patch fixes the bug by keeping the host's xcr0 loaded outside
      of the interrupts-disabled region where KVM switches into guest mode.
      
      Cc: stable@vger.kernel.org
      Suggested-by: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: David Matlack <dmatlack@google.com>
      [Move load after goto cancel_injection. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      fc5b7f3b
    • KVM: MMU: fix permission_fault() · 7a98205d
      Xiao Guangrong authored
      kvm-unit-tests complained that the PFEC is not set properly, e.g.:
      test pte.rw pte.d pte.nx pde.p pde.rw pde.pse user fetch: FAIL: error code 15
      expected 5
      Dump mapping: address: 0x123400000000
      ------L4: 3e95007
      ------L3: 3e96007
      ------L2: 2000083
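
The two error codes in the test output above can be compared bit by bit. The sketch below uses the architectural x86 page-fault error code layout (bit 0 present, bit 1 write, bit 2 user, bit 3 reserved-bit violation, bit 4 instruction fetch); the constant and function names are hypothetical and are not the identifiers KVM uses.

```c
#include <stdint.h>

/* x86 page-fault error code bits (names are illustrative only). */
enum {
	PFEC_PRESENT = 1u << 0,	/* fault on a present page */
	PFEC_WRITE   = 1u << 1,	/* access was a write */
	PFEC_USER    = 1u << 2,	/* access came from user mode */
	PFEC_RSVD    = 1u << 3,	/* reserved bit set in a paging entry */
	PFEC_FETCH   = 1u << 4,	/* access was an instruction fetch */
};

/* Bits present in the reported code but absent from the expected one. */
static uint32_t pfec_unexpected_bits(uint32_t got, uint32_t expected)
{
	return got & ~expected;
}
```

Here the expected code 5 is present plus user, while the reported code 15 additionally carries the write and reserved-bit flags.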
      
      This happens because the PFEC returned to the guest is copied from the
      PFEC triggered by the shadow page table.

      This patch fixes it and makes the errcode update logic cleaner.
      Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      [Do not assume pfec.p=1. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      7a98205d
  2. 08 Apr, 2016 1 commit
  3. 07 Apr, 2016 1 commit
  4. 06 Apr, 2016 3 commits
    • arm64: KVM: unregister notifiers in hyp mode teardown path · 06a71a24
      Sudeep Holla authored
      Commit 1e947bad ("arm64: KVM: Skip HYP setup when already running
      in HYP") re-organized the hyp init code and ended up leaving the CPU
      hotplug and PM notifiers registered even if hyp mode initialization fails.
      
      Since KVM is not yet supported with ACPI, the above mentioned commit
      breaks CPU hotplug in ACPI boot.
      
      This patch fixes teardown_hyp_mode to properly unregister both CPU
      hotplug and PM notifiers in the teardown path.
      
      Fixes: 1e947bad ("arm64: KVM: Skip HYP setup when already running in HYP")
      Cc: Christoffer Dall <christoffer.dall@linaro.org>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      06a71a24
    • arm64: KVM: Warn when PARange is less than 40 bits · 6141570c
      Marc Zyngier authored
      We always thought that 40bits of PA range would be the minimum people
      would actually build. Anything less is terrifyingly small.
      
      Turns out that we were both right and wrong. Nobody has ever built
      such a system, but the ARM Foundation Model has a PARange set to 36bits.
      Just because we can. Oh well. Now, the KVM API explicitly says that
      we offer a 40bit PA space to the VM, so we shouldn't run KVM on
      the Foundation Model at all.
      
      That being said, this patch offers a less aggressive alternative, and
      loudly warns about the configuration being unsupported. You'll still
      be able to run VMs (at your own risk, though).
      
      This is just a workaround until we have a proper userspace API where
      we report the PARange to userspace.
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      6141570c
    • KVM: arm/arm64: Handle forward time correction gracefully · 1c5631c7
      Marc Zyngier authored
      On a host that runs NTP, corrections can have a direct impact on
      the background timer that we program on behalf of a vcpu.
      
      In particular, NTP performing a forward correction will result in
      a timer expiring sooner than expected from a guest point of view.
      Not a big deal, we kick the vcpu anyway.
      
      But on wake-up, the vcpu thread is going to perform a check to
      find out whether or not it should block. And at that point, the
      timer check is going to say "timer has not expired yet, go back
      to sleep". This results in the timer event being lost forever.
      
      There are multiple ways to handle this. One would be to record that
      the timer has expired and let kvm_cpu_has_pending_timer return
      true in that case, but that would be fairly invasive. Another is
      to check for the "short sleep" condition in the hrtimer callback,
      and restart the timer for the remaining time when the condition
      is detected.
      
      This patch implements the latter, with a bit of refactoring in
      order to avoid too much code duplication.
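
The "short sleep" check can be sketched as follows. This is a minimal userspace model, not the code from the patch: the function name and the plain nanosecond counters are assumptions, and a real hrtimer callback would return a restart/no-restart code rather than a duration.

```c
#include <stdint.h>

/*
 * Model of the check made when the background timer fires.
 *
 * guest_now_ns:   current time as the guest's timer sees it
 * guest_deadline: time at which the guest's timer should expire
 *
 * Returns 0 when the guest deadline has really been reached (the vcpu
 * can be woken), otherwise the number of nanoseconds the timer should
 * be re-armed for because the host timer fired early.
 * (Hypothetical helper for illustration.)
 */
static uint64_t timer_callback_rearm_ns(uint64_t guest_now_ns,
					uint64_t guest_deadline)
{
	if (guest_deadline > guest_now_ns)
		return guest_deadline - guest_now_ns;	/* short sleep: re-arm */

	return 0;					/* expired: wake the vcpu */
}
```

Re-arming for the remaining time means the vcpu is only woken once its own expiry check will also succeed, so the event is not lost.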
      
      Cc: <stable@vger.kernel.org>
      Reported-by: Alexander Graf <agraf@suse.de>
      Reviewed-by: Alexander Graf <agraf@suse.de>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      1c5631c7
  5. 05 Apr, 2016 7 commits
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 541d8f4d
      Linus Torvalds authored
      Pull KVM fixes from Paolo Bonzini:
       "Miscellaneous bugfixes.
      
        The ARM and s390 fixes are for new regressions from the merge window,
        others are usual stable material"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        compiler-gcc: disable -ftracer for __noclone functions
        kvm: x86: make lapic hrtimer pinned
        s390/mm/kvm: fix mis-merge in gmap handling
        kvm: set page dirty only if page has been writable
        KVM: x86: reduce default value of halt_poll_ns parameter
        KVM: Hyper-V: do not do hypercall userspace exits if SynIC is disabled
        KVM: x86: Inject pending interrupt even if pending nmi exist
        arm64: KVM: Register CPU notifiers when the kernel runs at HYP
        arm64: kvm: 4.6-rc1: Fix VTCR_EL2 VS setting
      541d8f4d
    • Merge tag 'spi-fix-v4.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · 5003bc6c
      Linus Torvalds authored
      Pull spi fixes from Mark Brown:
       "A couple of driver specific fixes here that came in since the merge
        window plus one core fix for locking in cases where a client driver
        grabs a lock on the whole bus for an extended series of operations
        that was introduced by the changes to support accelerated flash
        operations"
      
      * tag 'spi-fix-v4.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
        spi: rockchip: fix probe deferral handling
        spi: omap2-mcspi: fix dma transfer for vmalloced buffer
        spi: fix possible deadlock between internal bus locks and bus_lock_flag
        spi: imx: Fix possible NULL pointer deref
        spi: imx: only do necessary changes to ECSPIx_CONFIGREG
        spi: rockchip: Spelling s/divsor/divisor/
      5003bc6c
    • Merge tag 'pinctrl-v4.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 1b5caa3e
      Linus Torvalds authored
      Pull pin control fixes from Linus Walleij:
       "Here is a set of pin control fixes for the v4.6 series.
      
        A bit bigger than what I hoped for, but all fixes are confined to
        drivers, a few of them also targeted to stable.
      
        Summary:
      
         - On Super-H PFC (Renesas) controllers: only use dummies on legacy
           systems.  This fixes a serious ethernet regression on a Renesas
           board.
         - Pistachio: Fix errors in the pin table.
         - Allwinner SunXi: fix the external interrupts to work.
         - Intel: fix so the high level interrupts start working, and fix a
           spurious interrupt issue.
         - Qualcomm ipq4019: fix the number of GPIOs provided (bump to 100),
           correct register offsets and handle GPIO mode properly.
         - Revert the revert on the revert so that Xway has a .to_irq()
           callback again.
         - Minor fixes to errorpaths and debug info.
         - A MAINTAINERS update"
      
      * tag 'pinctrl-v4.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        Revert "Revert "pinctrl: lantiq: Implement gpio_chip.to_irq""
        pinctrl: qcom: ipq4019: fix register offsets
        pinctrl: qcom: ipq4019: fix the function enum for gpio mode
        pinctrl: qcom: ipq4019: set ngpios to correct value
        pinctrl: nomadik: fix pull debug print inversion
        MAINTAINERS: pinctrl: samsung: Add two new maintainers
        pinctrl: intel: implement gpio_irq_enable
        pinctrl: intel: make the high level interrupt working
        pinctrl: freescale: imx: fix bogus check of of_iomap() return value
        pinctrl: sunxi: Fix A33 external interrupts not working
        pinctrl: pistachio: fix mfio84-89 function description and pinmux.
        pinctrl: sh-pfc: only use dummy states for non-DT platforms
      1b5caa3e
    • Merge tag 'media/v4.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 62d2def9
      Linus Torvalds authored
      Pull media fixes from Mauro Carvalho Chehab:
       "Some bug fixes on au0828 and snd-usb-audio:
      
         - the au0828+snd-usb-audio MC patch broke several things and produced
           some race conditions.  Better to revert the patches, and re-work on
           them for a next version
      
         - fix a regression at tuner disable links logic
      
         - properly handle dev_state as a bitmask"
      
      * tag 'media/v4.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
        [media] Revert "[media] media: au0828 change to use Managed Media Controller API"
        [media] Revert "[media] sound/usb: Use Media Controller API to share media resources"
        [media] au0828: Fix dev_state handling
        [media] au0828: fix au0828_v4l2_close() dev_state race condition
        [media] media: au0828 fix to clear enable/disable/change source handlers
        [media] v4l2-mc: cleanup a warning
        [media] au0828: disable tuner links and cache tuner/decoder
      62d2def9
    • compiler-gcc: disable -ftracer for __noclone functions · 95272c29
      Paolo Bonzini authored
      -ftracer can duplicate asm blocks causing compilation to fail in
      noclone functions.  For example, KVM declares a global variable
      in an asm like
      
          asm("2: ... \n
               .pushsection data \n
               .global vmx_return \n
               vmx_return: .long 2b");
      
      and -ftracer causes a double declaration.
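
The commit text does not show the change itself. One way to switch -ftracer off for a single function, sketched here as an assumption rather than as the patch's content, is GCC's `optimize` function attribute combined with `noclone`. The macro name `my_noclone` and the stub function are hypothetical, and the attributes are compiled out on non-GCC compilers.

```c
/*
 * Per-function opt-out of cloning and of the tracer pass.
 * GCC treats optimize("no-tracer") like -fno-tracer for this function only.
 */
#if defined(__GNUC__) && !defined(__clang__)
#define my_noclone __attribute__((__noclone__, __optimize__("no-tracer")))
#else
#define my_noclone
#endif

/*
 * Stand-in for a function whose body contains an asm block that defines
 * a global label and therefore must not be duplicated by the compiler.
 */
static my_noclone int run_guest_stub(int x)
{
	return x + 1;
}
```

Scoping the option to the annotated functions leaves -ftracer available for the rest of the build.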
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: stable@vger.kernel.org
      Cc: kvm@vger.kernel.org
      Reported-by: Linda Walsh <lkml@tlinx.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      95272c29
    • kvm: x86: make lapic hrtimer pinned · 61abdbe0
      Luiz Capitulino authored
      When a vCPU runs on a nohz_full core, the hrtimer used by
      the lapic emulation code can be migrated to another core.
      When this happens, it's possible to observe millisecond
      latency when delivering timer IRQs to KVM guests.
      
      The huge latency is mainly due to the fact that
      apic_timer_fn() expects to run during a kvm exit. It
      sets KVM_REQ_PENDING_TIMER and lets it be handled on kvm
      entry. However, if the timer fires on a different core,
      we have to wait until the next kvm exit for the guest
      to see KVM_REQ_PENDING_TIMER set.
      
      This problem became visible after commit 9642d18e. This
      commit changed the timer migration code to always attempt
      to migrate timers away from nohz_full cores. While it's
      debatable whether this is correct/desirable (I don't think
      it is), it's clear that the lapic emulation code has
      a requirement on firing the hrtimer in the same core
      where it was started. This is achieved by making the
      hrtimer pinned.
      
      Lastly, note that KVM has code to migrate timers when a
      vCPU is scheduled to run on a different core. However, this
      forced migration may fail. When this happens, we can have
      the same problem. If we want 100% correctness, we'll have
      to modify apic_timer_fn() to cause a kvm exit when it runs
      on a different core than the vCPU. Not sure if this is
      possible.
      
      Here's a reproducer for the issue being fixed:
      
       1. Set all cores but core0 to be nohz_full cores
       2. Start a guest with a single vCPU
       3. Trace apic_timer_fn() and kvm_inject_apic_timer_irqs()
      
      You'll see that apic_timer_fn() will run in core0 while
      kvm_inject_apic_timer_irqs() runs in a different core. If
      you get both on core0, try running a program that takes 100%
      of the CPU and pin it to core0 to force the vCPU out.
      Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      61abdbe0
    • s390/mm/kvm: fix mis-merge in gmap handling · 9c650d09
      Christian Borntraeger authored
      commit 1e133ab2 ("s390/mm: split arch/s390/mm/pgtable.c") dropped
      some changes from commit a3a92c31 ("KVM: s390: fix mismatch
      between user and in-kernel guest limit") - this breaks KVM for some
      memory sizes (kvm-s390: failed to commit memory region) like
      exactly 2GB.
      
      Cc: Dominik Dingel <dingel@linux.vnet.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      9c650d09
  6. 04 Apr, 2016 17 commits
  7. 03 Apr, 2016 9 commits
    • Linux 4.6-rc2 · 9735a227
      Linus Torvalds authored
      9735a227
    • Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 4c3b73c6
      Linus Torvalds authored
      Pull perf fixes from Ingo Molnar:
       "Misc kernel side fixes:
      
         - fix event leak
         - fix AMD PMU driver bug
         - fix core event handling bug
         - fix build bug on certain randconfigs
      
        Plus misc tooling fixes"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/amd/ibs: Fix pmu::stop() nesting
        perf/core: Don't leak event in the syscall error path
        perf/core: Fix time tracking bug with multiplexing
        perf jit: genelf makes assumptions about endian
        perf hists: Fix determination of a callchain node's childlessness
        perf tools: Add missing initialization of perf_sample.cpumode in synthesized samples
        perf tools: Fix build break on powerpc
        perf/x86: Move events_sysfs_show() outside CPU_SUP_INTEL
        perf bench: Fix detached tarball building due to missing 'perf bench memcpy' headers
        perf tests: Fix tarpkg build test error output redirection
      4c3b73c6
    • Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7b367f5d
      Linus Torvalds authored
      Pull core kernel fixes from Ingo Molnar:
       "This contains the nohz/atomic cleanup/fix for the fetch_or() ugliness
        you noted during the original nohz pull request, plus there's also
        misc fixes:
      
         - fix liblockdep build bug
         - fix uapi header build bug
         - print more lockdep hash collision info to help debug recent reports
           of hash collisions
         - update MAINTAINERS email address"
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        MAINTAINERS: Update my email address
        locking/lockdep: Print chain_key collision information
        uapi/linux/stddef.h: Provide __always_inline to userspace headers
        tools/lib/lockdep: Fix unsupported 'basename -s' in run_tests.sh
        locking/atomic, sched: Unexport fetch_or()
        timers/nohz: Convert tick dependency mask to atomic_t
        locking/atomic: Introduce atomic_fetch_or()
      7b367f5d
    • v4l2-mc: avoid warning about unused variable · 17084b7e
      Linus Torvalds authored
      Commit 840f5b05 ("media: au0828 disable tuner to demod link in
      au0828_media_device_register()") removed all uses of the 'dtv_demod',
      but left the variable itself around.
      
      Remove it.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      17084b7e
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 30cebb6c
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "This lot contains:
      
         - Some fixups for the fallout of the topology consolidation which
           unearthed AMD/Intel inconsistencies
         - Documentation for the x86 topology management
         - Support for AMD advanced power management bits
         - Two simple cleanups removing duplicated code"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/cpu: Add advanced power management bits
        x86/thread_info: Merge two !__ASSEMBLY__ sections
        x86/cpufreq: Remove duplicated TDP MSR macro definitions
        x86/Documentation: Start documenting x86 topology
        x86/cpu: Get rid of compute_unit_id
        perf/x86/amd: Cleanup Fam10h NB event constraints
        x86/topology: Fix AMD core count
      30cebb6c
    • MIPS: Bail on unsupported module relocs · 04211a57
      Paul Burton authored
      When an unsupported reloc is encountered in a module, we currently
      blindly branch to whatever would be at its entry in the reloc handler
      function pointer arrays. This may be NULL, or if the unsupported reloc
      has a type greater than that of the supported reloc with the highest
      type then we'll dereference some value after the function pointer array
      & branch to that. The result is at best a kernel oops.
      
      Fix this by checking that the reloc type has an entry in the function
      pointer array (ie. is less than the number of items in the array) and
      that the handler is non-NULL, returning an error code to fail the module
      load if no handler is found.
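
The check described above can be sketched in userspace C. The table contents, the handler, and the choice of -ENOEXEC as the error code are assumptions made for illustration; the commit text only says that an error code is returned.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

typedef int (*reloc_handler_t)(uint32_t *location, uint32_t value);

/* Example handler: add the symbol value to the word being relocated. */
static int apply_add32(uint32_t *location, uint32_t value)
{
	*location += value;
	return 0;
}

/* Hypothetical handler table: type 0 deliberately has no handler. */
static const reloc_handler_t reloc_handlers[] = {
	[1] = apply_add32,
};

static int apply_reloc(unsigned int type, uint32_t *location, uint32_t value)
{
	reloc_handler_t handler = NULL;

	/* Only index the table when the type is within its bounds. */
	if (type < ARRAY_SIZE(reloc_handlers))
		handler = reloc_handlers[type];

	/* Out-of-range types and empty slots both fail the load. */
	if (!handler)
		return -ENOEXEC;

	return handler(location, value);
}
```

Both failure modes from the description (a NULL slot and a type past the end of the array) end up on the same error path.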
      Signed-off-by: Paul Burton <paul.burton@imgtec.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Steven J. Hill <Steven.Hill@imgtec.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/12432/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      04211a57
    • MIPS: dts: qca: ar9132_tl_wr1043nd_v1.dts: use "ref" for reference clock name · f7f797cf
      Antony Pavlov authored
      Current ath79 clock.c code does not read reference clock and
      pll setup from devicetree. The ar724x_clocks_init() function
      recreates the clocks from scratch so devicetree clock
      information is dropped. After adding code that picks up the
      reference clock from devicetree, I found that the kernel no
      longer boots. The SPI and UART drivers can't get their clk;
      here are the bootlog error messages:
      
          of_serial: probe of 18020000.uart failed with error -22
          ath79-spi: probe of 1f000000.spi failed with error -22
      
      The problem is that clock code assumes that reference clock
      name is "ref" but current dts-file uses another name: "oscillator".
      
      This patch fixes the problem by changing external oscillator
      dt node name to "ref".
      
      Please note that there is an alternative solution for the problem:
      
          > --- a/arch/mips/boot/dts/qca/ar9132_tl_wr1043nd_v1.dts
          > +++ b/arch/mips/boot/dts/qca/ar9132_tl_wr1043nd_v1.dts
          > @@ -16,6 +16,7 @@
          >
          >         extosc: oscillator {
          >                 compatible = "fixed-clock";
          > +               clock-output-names = "ref";
          >                 #clock-cells = <0>;
          >                 clock-frequency = <40000000>;
          >         };
      Signed-off-by: Antony Pavlov <antonynpavlov@gmail.com>
      Cc: Alban Bedel <albeu@free.fr>
      Cc: Michael Turquette <mturquette@baylibre.com>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: linux-clk@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Cc: devicetree@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/12874/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      f7f797cf
    • MIPS: ath79: Fix the ar913x reference clock rate · f4c87b7a
      Alban Bedel authored
      The reference clock on ar913x is at 40MHz and not 5MHz. The current
      implementation uses the wrong reference rate because it doesn't take
      the PLL divider into account. But if we fix the code to use the divider
      it becomes identical with the implementation for ar724x, so just drop
      the broken ar913x implementation.
      Signed-off-by: Alban Bedel <albeu@free.fr>
      Tested-by: Antony Pavlov <antonynpavlov@gmail.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/12871/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      f4c87b7a
    • MIPS: ath79: Fix the ar724x clock calculation · c338d59d
      Weijie Gao authored
      According to the AR7242 datasheet section 2.8, AR724X CPUs use a 40MHz
      input clock as the REF_CLK instead of 5MHz.
      
      The correct CPU PLL calculation procedure is as follows:
      CPU_PLL = (FB * REF_CLK) / REF_DIV / 2.
      
      This patch is compatible with the current calculation procedure with
      default FB and REF_DIV values.
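
The formula above can be written out as a small helper. This is a sketch only: the function name is hypothetical, the FB and REF_DIV values used with it below are illustrative rather than taken from the datasheet, and the caller is assumed to pass a non-zero divider.

```c
#include <stdint.h>

/*
 * CPU_PLL = (FB * REF_CLK) / REF_DIV / 2
 *
 * fb:         feedback multiplier read from the PLL configuration
 * ref_clk_hz: reference clock in Hz (40 MHz according to the text above)
 * ref_div:    reference divider, must be non-zero
 */
static uint64_t cpu_pll_hz(uint32_t fb, uint64_t ref_clk_hz, uint32_t ref_div)
{
	return (uint64_t)fb * ref_clk_hz / ref_div / 2;
}
```

For example, with FB = 80 and REF_DIV = 4 the helper gives 400 MHz from a 40 MHz reference, which is the same figure as FB multiplied by 5 MHz. Whether 4 is the hardware default divider is not stated here.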
      
      Tested on AR7240, AR7241 and AR7242.
      Signed-off-by: Weijie Gao <hackpascal@gmail.com>
      Signed-off-by: Alban Bedel <albeu@free.fr> (Fixed the commit log message)
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/12870/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      c338d59d