1. 16 Apr, 2019 7 commits
    • WANG Chao's avatar
      x86/kvm: move kvm_load/put_guest_xcr0 into atomic context · 1811d979
      WANG Chao authored
      guest xcr0 could leak into host when MCE happens in guest mode. Because
      do_machine_check() could schedule out at a few places.
      
      For example:
      
      kvm_load_guest_xcr0
      ...
      kvm_x86_ops->run(vcpu) {
        vmx_vcpu_run
          vmx_complete_atomic_exit
            kvm_machine_check
              do_machine_check
                do_memory_failure
                  memory_failure
                    lock_page
      
      In this case, host_xcr0 is 0x2ff, guest vcpu xcr0 is 0xff. After schedule
      out, host cpu has guest xcr0 loaded (0xff).
      
      In __switch_to {
           switch_fpu_finish
             copy_kernel_to_fpregs
               XRSTORS
      
      If any bit i in XSTATE_BV[i] == 1 and xcr0[i] == 0, XRSTORS will
      generate #GP (In this case, bit 9). Then ex_handler_fprestore kicks in
      and tries to reinitialize fpu by restoring init fpu state. Same story as
      last #GP, except we get DOUBLE FAULT this time.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarWANG Chao <chao.wang@ucloud.cn>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      1811d979
    • Vitaly Kuznetsov's avatar
      KVM: x86: svm: make sure NMI is injected after nmi_singlestep · 99c22179
      Vitaly Kuznetsov authored
      I noticed that apic test from kvm-unit-tests always hangs on my EPYC 7401P,
      the hanging test nmi-after-sti is trying to deliver 30000 NMIs and tracing
      shows that we're sometimes able to deliver a few but never all.
      
      When we're trying to inject an NMI we may fail to do so immediately for
      various reasons, however, we still need to inject it so enable_nmi_window()
      arms nmi_singlestep mode. #DB occurs as expected, but we're not checking
      for pending NMIs before entering the guest and unless there's a different
      event to process, the NMI will never get delivered.
      
      Make KVM_REQ_EVENT request on the vCPU from db_interception() to make sure
      pending NMIs are checked and possibly injected.
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      99c22179
    • Suthikulpanit, Suravee's avatar
      svm/avic: Fix invalidate logical APIC id entry · e44e3eac
      Suthikulpanit, Suravee authored
      Only clear the valid bit when invalidate logical APIC id entry.
      The current logic clear the valid bit, but also set the rest of
      the bits (including reserved bits) to 1.
      
      Fixes: 98d90582 ('svm: Fix AVIC DFR and LDR handling')
      Signed-off-by: default avatarSuravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      e44e3eac
    • Suthikulpanit, Suravee's avatar
      Revert "svm: Fix AVIC incomplete IPI emulation" · 4a58038b
      Suthikulpanit, Suravee authored
      This reverts commit bb218fbc.
      
      As Oren Twaig pointed out the old discussion:
      
        https://patchwork.kernel.org/patch/8292231/
      
      that the change coud potentially cause an extra IPI to be sent to
      the destination vcpu because the AVIC hardware already set the IRR bit
      before the incomplete IPI #VMEXIT with id=1 (target vcpu is not running).
      Since writting to ICR and ICR2 will also set the IRR. If something triggers
      the destination vcpu to get scheduled before the emulation finishes, then
      this could result in an additional IPI.
      
      Also, the issue mentioned in the commit bb218fbc was misdiagnosed.
      
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Reported-by: default avatarOren Twaig <oren@scalemp.com>
      Signed-off-by: default avatarSuravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      4a58038b
    • Ben Gardon's avatar
      kvm: mmu: Fix overflow on kvm mmu page limit calculation · bc8a3d89
      Ben Gardon authored
      KVM bases its memory usage limits on the total number of guest pages
      across all memslots. However, those limits, and the calculations to
      produce them, use 32 bit unsigned integers. This can result in overflow
      if a VM has more guest pages that can be represented by a u32. As a
      result of this overflow, KVM can use a low limit on the number of MMU
      pages it will allocate. This makes KVM unable to map all of guest memory
      at once, prompting spurious faults.
      
      Tested: Ran all kvm-unit-tests on an Intel Haswell machine. This patch
      	introduced no new failures.
      Signed-off-by: default avatarBen Gardon <bgardon@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      bc8a3d89
    • Paolo Bonzini's avatar
      KVM: nVMX: always use early vmcs check when EPT is disabled · 2b27924b
      Paolo Bonzini authored
      The remaining failures of vmx.flat when EPT is disabled are caused by
      incorrectly reflecting VMfails to the L1 hypervisor.  What happens is
      that nested_vmx_restore_host_state corrupts the guest CR3, reloading it
      with the host's shadow CR3 instead, because it blindly loads GUEST_CR3
      from the vmcs01.
      
      For simplicity let's just always use hardware VMCS checks when EPT is
      disabled.  This way, nested_vmx_restore_host_state is not reached at
      all (or at least shouldn't be reached).
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      2b27924b
    • Paolo Bonzini's avatar
      KVM: nVMX: allow tests to use bad virtual-APIC page address · 69090810
      Paolo Bonzini authored
      As mentioned in the comment, there are some special cases where we can simply
      clear the TPR shadow bit from the CPU-based execution controls in the vmcs02.
      Handle them so that we can remove some XFAILs from vmx.flat.
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      69090810
  2. 15 Apr, 2019 1 commit
    • Sean Christopherson's avatar
      KVM: x86/mmu: Fix an inverted list_empty() check when zapping sptes · cfd32acf
      Sean Christopherson authored
      A recently introduced helper for handling zap vs. remote flush
      incorrectly bails early, effectively leaking defunct shadow pages.
      Manifests as a slab BUG when exiting KVM due to the shadow pages
      being alive when their associated cache is destroyed.
      
      ==========================================================================
      BUG kvm_mmu_page_header: Objects remaining in kvm_mmu_page_header on ...
      --------------------------------------------------------------------------
      Disabling lock debugging due to kernel taint
      INFO: Slab 0x00000000fc436387 objects=26 used=23 fp=0x00000000d023caee ...
      CPU: 6 PID: 4315 Comm: rmmod Tainted: G    B             5.1.0-rc2+ #19
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
      Call Trace:
       dump_stack+0x46/0x5b
       slab_err+0xad/0xd0
       ? on_each_cpu_mask+0x3c/0x50
       ? ksm_migrate_page+0x60/0x60
       ? on_each_cpu_cond_mask+0x7c/0xa0
       ? __kmalloc+0x1ca/0x1e0
       __kmem_cache_shutdown+0x13a/0x310
       shutdown_cache+0xf/0x130
       kmem_cache_destroy+0x1d5/0x200
       kvm_mmu_module_exit+0xa/0x30 [kvm]
       kvm_arch_exit+0x45/0x60 [kvm]
       kvm_exit+0x6f/0x80 [kvm]
       vmx_exit+0x1a/0x50 [kvm_intel]
       __x64_sys_delete_module+0x153/0x1f0
       ? exit_to_usermode_loop+0x88/0xc0
       do_syscall_64+0x4f/0x100
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: a2113634 ("KVM: x86/mmu: Split remote_flush+zap case out of kvm_mmu_flush_or_zap()")
      Signed-off-by: default avatarSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      cfd32acf
  3. 10 Apr, 2019 3 commits
  4. 09 Apr, 2019 3 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 869e3305
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Off by one and bounds checking fixes in NFC, from Dan Carpenter.
      
       2) There have been many weird regressions in r8169 since we turned ASPM
          support on, some are still not understood nor completely resolved.
          Let's turn this back off for now. From Heiner Kallweit.
      
       3) Signess fixes for ethtool speed value handling, from Michael
          Zhivich.
      
       4) Handle timestamps properly in macb driver, from Paul Thomas.
      
       5) Two erspan fixes, it's the usual "skb ->data potentially reallocated
          and we're holding a stale protocol header pointer". From Lorenzo
          Bianconi.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        bnxt_en: Reset device on RX buffer errors.
        bnxt_en: Improve RX consumer index validity check.
        net: macb driver, check for SKBTX_HW_TSTAMP
        qlogic: qlcnic: fix use of SPEED_UNKNOWN ethtool constant
        broadcom: tg3: fix use of SPEED_UNKNOWN ethtool constant
        ethtool: avoid signed-unsigned comparison in ethtool_validate_speed()
        net: ip6_gre: fix possible use-after-free in ip6erspan_rcv
        net: ip_gre: fix possible use-after-free in erspan_rcv
        r8169: disable ASPM again
        MAINTAINERS: ieee802154: update documentation file pattern
        net: vrf: Fix ping failed when vrf mtu is set to 0
        selftests: add a tc matchall test case
        nfc: nci: Potential off by one in ->pipes[] array
        NFC: nci: Add some bounds checking in nci_hci_cmd_received()
      869e3305
    • Linus Torvalds's avatar
      Merge branch 'fixes-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · a556810d
      Linus Torvalds authored
      Pull TPM fixes from James Morris:
       "From Jarkko: These are critical fixes for v5.1. Contains also couple
        of new selftests for v5.1 features (partial reads in /dev/tpm0)"
      
      * 'fixes-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
        selftests/tpm2: Open tpm dev in unbuffered mode
        selftests/tpm2: Extend tests to cover partial reads
        KEYS: trusted: fix -Wvarags warning
        tpm: Fix the type of the return value in calc_tpm2_event_size()
        KEYS: trusted: allow trusted.ko to initialize w/o a TPM
        tpm: fix an invalid condition in tpm_common_poll
        tpm: turn on TPM on suspend for TPM 1.x
      a556810d
    • Linus Torvalds's avatar
      Merge tag 'xtensa-20190408' of git://github.com/jcmvbkbc/linux-xtensa · 10d43397
      Linus Torvalds authored
      Pull xtensa fixes from Max Filippov:
      
       - fix syscall number passed to trace_sys_exit
      
       - fix syscall number initialization in start_thread
      
       - fix level interpretation in the return_address
      
       - fix format string warning in init_pmd
      
      * tag 'xtensa-20190408' of git://github.com/jcmvbkbc/linux-xtensa:
        xtensa: fix format string warning in init_pmd
        xtensa: fix return_address
        xtensa: fix initialization of pt_regs::syscall in start_thread
        xtensa: use actual syscall number in do_syscall_trace_leave
      10d43397
  5. 08 Apr, 2019 25 commits
  6. 07 Apr, 2019 1 commit
    • Linus Torvalds's avatar
      Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc · d8491223
      Linus Torvalds authored
      Pull ARM SoC fixes from Olof Johansson:
       "A collection of fixes from the last few weeks. Most of them are
        smaller tweaks and fixes to DT and hardware descriptions for boards.
        Some of the more significant ones are:
      
         - eMMC and RGMII stability tweaks for rk3288
      
         - DDC fixes for Rock PI 4
      
         - Audio fixes for two TI am335x eval boards
      
         - D_CAN clock fix for am335x
      
         - Compilation fixes for clang
      
         - !HOTPLUG_CPU compilation fix for one of the new platforms this
           release (milbeaut)
      
         - A revert of a gpio fix for nomadik that instead was fixed in the
           gpio subsystem
      
         - Whitespace fix for the DT JSON schema (no tabs allowed)"
      
      * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (25 commits)
        ARM: milbeaut: fix build with !CONFIG_HOTPLUG_CPU
        ARM: iop: don't use using 64-bit DMA masks
        ARM: orion: don't use using 64-bit DMA masks
        Revert "ARM: dts: nomadik: Fix polarity of SPI CS"
        dt-bindings: cpu: Fix JSON schema
        arm/mach-at91/pm : fix possible object reference leak
        ARM: dts: at91: Fix typo in ISC_D0 on PC9
        ARM: dts: Fix dcan clkctrl clock for am3
        reset: meson-audio-arb: Fix missing .owner setting of reset_controller_dev
        dt-bindings: reset: meson-g12a: Add missing USB2 PHY resets
        ARM: dts: rockchip: Remove #address/#size-cells from rk3288-veyron gpio-keys
        ARM: dts: rockchip: Remove #address/#size-cells from rk3288 mipi_dsi
        ARM: dts: rockchip: Fix gpu opp node names for rk3288
        ARM: dts: am335x-evmsk: Correct the regulators for the audio codec
        ARM: dts: am335x-evm: Correct the regulators for the audio codec
        ARM: OMAP2+: add missing of_node_put after of_device_is_available
        ARM: OMAP1: ams-delta: Fix broken GPIO ID allocation
        arm64: dts: stratix10: add the sysmgr-syscon property from the gmac's
        arm64: dts: rockchip: fix rk3328 sdmmc0 write errors
        arm64: dts: rockchip: fix rk3328 rgmii high tx error rate
        ...
      d8491223