1. 21 Feb, 2020 6 commits
    • KVM: nVMX: clear PIN_BASED_POSTED_INTR from nested pinbased_ctls only when apicv is globally disabled · a4443267
      Vitaly Kuznetsov authored
      
      When apicv is disabled on a vCPU (e.g. by enabling KVM_CAP_HYPERV_SYNIC*),
      nothing happens to the VMX MSRs on the already existing vCPUs; however,
      all new ones are created with PIN_BASED_POSTED_INTR filtered out. This is
      very confusing and results in the following picture inside the guest:
      
      $ rdmsr -ax 0x48d
      ff00000016
      7f00000016
      7f00000016
      7f00000016
      
      This is observed with QEMU and a 4-vCPU guest: QEMU creates vCPU0, enables
      KVM_CAP_HYPERV_SYNIC2 and then creates the remaining three.
      
      An L1 hypervisor may check only CPU0's controls to find out what features
      are available, and it will be very confused later. Switch to setting the
      PIN_BASED_POSTED_INTR control based on the global 'enable_apicv' setting.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
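The rdmsr values quoted in this commit message can be decoded with a few lines of Python. This is only an illustrative sketch, not part of the patch: it assumes that MSR 0x48d is IA32_VMX_TRUE_PINBASED_CTLS, that the high 32 bits of the value hold the allowed-1 settings, and that "process posted interrupts" is bit 7 of the pin-based controls. The helper names are hypothetical.

```python
# Bit 7 of the pin-based VM-execution controls is "process posted interrupts"
# (assumed to match PIN_BASED_POSTED_INTR in the kernel headers).
PIN_BASED_POSTED_INTR = 1 << 7


def allowed1(msr_value: int) -> int:
    """Return the high 32 bits of a VMX capability MSR (allowed-1 settings)."""
    return (msr_value >> 32) & 0xFFFFFFFF


def posted_intr_supported(msr_value: int) -> bool:
    """True if the capability MSR allows posted interrupts to be enabled."""
    return bool(allowed1(msr_value) & PIN_BASED_POSTED_INTR)


# Per-vCPU values from the `rdmsr -ax 0x48d` output in the commit message.
values = [0xFF00000016, 0x7F00000016, 0x7F00000016, 0x7F00000016]
flags = [posted_intr_supported(v) for v in values]

for cpu, (value, flag) in enumerate(zip(values, flags)):
    print(f"vCPU{cpu}: allowed-1 = {allowed1(value):#x}, posted interrupts: {flag}")
```

Under these assumptions only vCPU0 advertises posted interrupts, which is the inconsistency the commit describes.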
    • KVM: nVMX: handle nested posted interrupts when apicv is disabled for L1 · 91a5f413
      Vitaly Kuznetsov authored
      Even when APICv is disabled for L1, it can be (and actually is) still
      available for L2. This means we always need to call
      vmx_deliver_nested_posted_interrupt() when attempting an interrupt
      delivery.
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • kvm: x86: svm: Fix NULL pointer dereference when AVIC not enabled · 93fd9666
      Suravee Suthikulpanit authored
      Launching a VM with AVIC disabled together with a pass-through device
      results in a NULL pointer dereference with the following call trace:
      
          RIP: 0010:svm_refresh_apicv_exec_ctrl+0x17e/0x1a0 [kvm_amd]
      
          Call Trace:
           kvm_vcpu_update_apicv+0x44/0x60 [kvm]
           kvm_arch_vcpu_ioctl_run+0x3f4/0x1c80 [kvm]
           kvm_vcpu_ioctl+0x3d8/0x650 [kvm]
           do_vfs_ioctl+0xaa/0x660
           ? tomoyo_file_ioctl+0x19/0x20
           ksys_ioctl+0x67/0x90
           __x64_sys_ioctl+0x1a/0x20
           do_syscall_64+0x57/0x190
           entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Investigation shows that this is due to the use of the uninitialized
      struct vcpu_svm.ir_list in svm_set_pi_irte_mode(), which is called
      from svm_refresh_apicv_exec_ctrl().
      
      The ir_list is initialized only if AVIC is enabled. Fix this by adding
      a check in svm_refresh_apicv_exec_ctrl() for whether AVIC is enabled.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206579
      Fixes: 8937d762 ("kvm: x86: svm: Add support to (de)activate posted interrupts.")
      Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Tested-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: VMX: Add VMX_FEATURE_USR_WAIT_PAUSE · 624e18f9
      Xiaoyao Li authored
      Commit 15934878 ("x86/vmx: Introduce VMX_FEATURES_*") missed
      bit 26 (enable user wait and pause) of Secondary Processor-based
      VM-Execution Controls.
      
      Add VMX_FEATURE_USR_WAIT_PAUSE flag so that it shows up in /proc/cpuinfo,
      and use it to define SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE to make them
      uniform.
      Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
      Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
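The relationship between the feature flag and the control bit described in this commit can be sketched as plain arithmetic. This is a rough illustration, not the kernel's code: the "word * 32 + bit" encoding, the word index 2 and the helper name are assumptions. The only fact taken from the commit message is that the control is bit 26 of the secondary processor-based VM-execution controls.

```python
# Assumed encoding of a feature flag: word index * 32 + bit position.
# Word 2 is a guess for the secondary controls; bit 26 is from the commit.
VMX_FEATURE_USR_WAIT_PAUSE = 2 * 32 + 26


def vmcs_control_bit(feature: int) -> int:
    """Derive the control mask from a feature flag (hypothetical helper)."""
    # The low five bits select the bit position within the 32-bit word.
    return 1 << (feature & 0x1F)


SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE = vmcs_control_bit(VMX_FEATURE_USR_WAIT_PAUSE)

print(hex(SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE))
```

Deriving the mask from the flag, rather than spelling out a separate constant, keeps the two definitions from drifting apart, which appears to be the uniformity the commit aims for.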
    • KVM: nVMX: Hold KVM's srcu lock when syncing vmcs12->shadow · c9dfd3fb
      wanpeng li authored
      While mapping the eVMCS, KVM dereferences ->memslots without holding
      ->srcu or ->slots_lock when accessing the hv assist page. Fix this by
      moving nested_sync_vmcs12_to_shadow() to prepare_guest_switch(), where
      the SRCU lock is already held.
      
      It can be reproduced by running KVM's evmcs_test selftest.
      
        =============================
        WARNING: suspicious RCU usage
        5.6.0-rc1+ #53 Tainted: G        W IOE
        -----------------------------
        ./include/linux/kvm_host.h:623 suspicious rcu_dereference_check() usage!
      
        other info that might help us debug this:
      
         rcu_scheduler_active = 2, debug_locks = 1
        1 lock held by evmcs_test/8507:
         #0: ffff9ddd156d00d0 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x85/0x680 [kvm]
      
        stack backtrace:
        CPU: 6 PID: 8507 Comm: evmcs_test Tainted: G        W IOE     5.6.0-rc1+ #53
        Hardware name: Dell Inc. OptiPlex 7040/0JCTF8, BIOS 1.4.9 09/12/2016
        Call Trace:
         dump_stack+0x68/0x9b
         kvm_read_guest_cached+0x11d/0x150 [kvm]
         kvm_hv_get_assist_page+0x33/0x40 [kvm]
         nested_enlightened_vmentry+0x2c/0x60 [kvm_intel]
         nested_vmx_handle_enlightened_vmptrld.part.52+0x32/0x1c0 [kvm_intel]
         nested_sync_vmcs12_to_shadow+0x439/0x680 [kvm_intel]
         vmx_vcpu_run+0x67a/0xe60 [kvm_intel]
         vcpu_enter_guest+0x35e/0x1bc0 [kvm]
         kvm_arch_vcpu_ioctl_run+0x40b/0x670 [kvm]
         kvm_vcpu_ioctl+0x370/0x680 [kvm]
         ksys_ioctl+0x235/0x850
         __x64_sys_ioctl+0x16/0x20
         do_syscall_64+0x77/0x780
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: don't notify userspace IOAPIC on edge-triggered interrupt EOI · 7455a832
      Miaohe Lin authored
      Commit 13db7734 ("KVM: x86: don't notify userspace IOAPIC on edge
      EOI") stated that edge-triggered interrupts don't set a bit in TMR, which
      means that the IOAPIC isn't notified on EOI. In that commit, the variable
      level indicated a level-triggered interrupt.
      However, commit 3159d36a ("KVM: x86: use generic function for MSI parsing")
      replaced the variable level with irq.level by mistake. Fix it by changing
      irq.level to irq.trig_mode.
      
      Cc: stable@vger.kernel.org
      Fixes: 3159d36a ("KVM: x86: use generic function for MSI parsing")
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
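The distinction this commit restores can be illustrated with a toy model. This is a sketch under stated assumptions, not the kernel code: the LapicIrq class and the vectors_needing_eoi_notification() helper are hypothetical stand-ins for the parsed MSI fields and the scan over routes, and the sample values are made up. The point is that the decision should key on the trigger mode, since an edge-triggered message can still carry a non-zero level field.

```python
from dataclasses import dataclass


@dataclass
class LapicIrq:
    """Hypothetical stand-in for the fields parsed out of an MSI route."""
    vector: int
    level: int      # assertion level carried by the message
    trig_mode: int  # 0 = edge-triggered, 1 = level-triggered


def vectors_needing_eoi_notification(irqs):
    """Collect vectors whose EOI should be reported to the userspace IOAPIC.

    Only level-triggered interrupts qualify, so test trig_mode, not level.
    """
    return {irq.vector for irq in irqs if irq.trig_mode}


irqs = [
    LapicIrq(vector=0x30, level=1, trig_mode=0),  # edge, level field set
    LapicIrq(vector=0x31, level=1, trig_mode=1),  # level-triggered
]
handled = vectors_needing_eoi_notification(irqs)
print(sorted(hex(v) for v in handled))
```

Filtering on irq.level instead would also select vector 0x30 in this toy input, which is the kind of spurious notification the fix avoids.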
  2. 20 Feb, 2020 2 commits
  3. 17 Feb, 2020 2 commits
  4. 12 Feb, 2020 30 commits