1. 19 Dec, 2018 1 commit
    • Vitaly Kuznetsov's avatar
      KVM: x86: nSVM: fix switch to guest mmu · 3cf85f9f
      Vitaly Kuznetsov authored
      Recent optimizations in MMU code broke nested SVM with NPT in L1
      completely: when we do nested_svm_{,un}init_mmu_context() we want
      to switch from TDP MMU to shadow MMU, both init_kvm_tdp_mmu() and
      kvm_init_shadow_mmu() check if re-configuration is needed by looking
      at cache source data. The data, however, doesn't change - it's only
      the type of the MMU which changes. We end up not re-initializing
      guest MMU as shadow and everything goes off the rails.
      
      The issue could have been fixed by putting MMU type into extended MMU
      role but this is not really needed. We can just split root and guest MMUs
      the exact same way we did for nVMX, their types never change in the
      lifetime of a vCPU.
      
      There is still room for improvement: currently, we reset all MMU roots
      when switching from L1 to L2 and back and this is not needed.
      
      Fixes: 7dcd5755 ("x86/kvm/mmu: check if tdp/shadow MMU reconfiguration is needed")
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      3cf85f9f
  2. 18 Dec, 2018 4 commits
    • Eduardo Habkost's avatar
      kvm: x86: Add AMD's EX_CFG to the list of ignored MSRs · 0e1b869f
      Eduardo Habkost authored
      Some guests OSes (including Windows 10) write to MSR 0xc001102c
      on some cases (possibly while trying to apply a CPU errata).
      Make KVM ignore reads and writes to that MSR, so the guest won't
      crash.
      
      The MSR is documented as "Execution Unit Configuration (EX_CFG)",
      at AMD's "BIOS and Kernel Developer's Guide (BKDG) for AMD Family
      15h Models 00h-0Fh Processors".
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarEduardo Habkost <ehabkost@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      0e1b869f
    • Wanpeng Li's avatar
      KVM: X86: Fix NULL deref in vcpu_scan_ioapic · dcbd3e49
      Wanpeng Li authored
      Reported by syzkaller:
      
          CPU: 1 PID: 5962 Comm: syz-executor118 Not tainted 4.20.0-rc6+ #374
          Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
          RIP: 0010:kvm_apic_hw_enabled arch/x86/kvm/lapic.h:169 [inline]
          RIP: 0010:vcpu_scan_ioapic arch/x86/kvm/x86.c:7449 [inline]
          RIP: 0010:vcpu_enter_guest arch/x86/kvm/x86.c:7602 [inline]
          RIP: 0010:vcpu_run arch/x86/kvm/x86.c:7874 [inline]
          RIP: 0010:kvm_arch_vcpu_ioctl_run+0x5296/0x7320 arch/x86/kvm/x86.c:8074
          Call Trace:
      	 kvm_vcpu_ioctl+0x5c8/0x1150 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2596
      	 vfs_ioctl fs/ioctl.c:46 [inline]
      	 file_ioctl fs/ioctl.c:509 [inline]
      	 do_vfs_ioctl+0x1de/0x1790 fs/ioctl.c:696
      	 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:713
      	 __do_sys_ioctl fs/ioctl.c:720 [inline]
      	 __se_sys_ioctl fs/ioctl.c:718 [inline]
      	 __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
      	 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
      	 entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The reason is that the testcase writes hyperv synic HV_X64_MSR_SINT14 msr
      and triggers scan ioapic logic to load synic vectors into EOI exit bitmap.
      However, irqchip is not initialized by this simple testcase, ioapic/apic
      objects should not be accessed.
      
      This patch fixes it by also considering whether or not apic is present.
      
      Reported-by: syzbot+39810e6c400efadfef71@syzkaller.appspotmail.com
      Cc: stable@vger.kernel.org
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarWanpeng Li <wanpengli@tencent.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      dcbd3e49
    • Cfir Cohen's avatar
      KVM: Fix UAF in nested posted interrupt processing · c2dd5146
      Cfir Cohen authored
      nested_get_vmcs12_pages() processes the posted_intr address in vmcs12. It
      caches the kmap()ed page object and pointer, however, it doesn't handle
      errors correctly: it's possible to cache a valid pointer, then release
      the page and later dereference the dangling pointer.
      
      I was able to reproduce with the following steps:
      
      1. Call vmlaunch with valid posted_intr_desc_addr but an invalid
      MSR_EFER. This causes nested_get_vmcs12_pages() to cache the kmap()ed
      pi_desc_page and pi_desc. Later the invalid EFER value fails
      check_vmentry_postreqs() which fails the first vmlaunch.
      
      2. Call vmlanuch with a valid EFER but an invalid posted_intr_desc_addr
      (I set it to 2G - 0x80). The second time we call nested_get_vmcs12_pages
      pi_desc_page is unmapped and released and pi_desc_page is set to NULL
      (the "shouldn't happen" clause). Due to the invalid
      posted_intr_desc_addr, kvm_vcpu_gpa_to_page() fails and
      nested_get_vmcs12_pages() returns. It doesn't return an error value so
      vmlaunch proceeds. Note that at this time we have a dangling pointer in
      vmx->nested.pi_desc and POSTED_INTR_DESC_ADDR in L0's vmcs.
      
      3. Issue an IPI in L2 guest code. This triggers a call to
      vmx_complete_nested_posted_interrupt() and pi_test_and_clear_on() which
      dereferences the dangling pointer.
      
      Vulnerable code requires nested and enable_apicv variables to be set to
      true. The host CPU must also support posted interrupts.
      
      Fixes: 5e2f30b7 "KVM: nVMX: get rid of nested_get_page()"
      Cc: stable@vger.kernel.org
      Reviewed-by: default avatarAndy Honig <ahonig@google.com>
      Signed-off-by: default avatarCfir Cohen <cfir@google.com>
      Reviewed-by: default avatarLiran Alon <liran.alon@oracle.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      c2dd5146
    • Eric Biggers's avatar
      KVM: fix unregistering coalesced mmio zone from wrong bus · 987d1149
      Eric Biggers authored
      If you register a kvm_coalesced_mmio_zone with '.pio = 0' but then
      unregister it with '.pio = 1', KVM_UNREGISTER_COALESCED_MMIO will try to
      unregister it from KVM_PIO_BUS rather than KVM_MMIO_BUS, which is a
      no-op.  But it frees the kvm_coalesced_mmio_dev anyway, causing a
      use-after-free.
      
      Fix it by only unregistering and freeing the zone if the correct value
      of 'pio' is provided.
      
      Reported-by: syzbot+f87f60bb6f13f39b54e3@syzkaller.appspotmail.com
      Fixes: 0804c849 ("kvm/x86 : add coalesced pio support")
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      987d1149
  3. 16 Dec, 2018 1 commit
  4. 14 Dec, 2018 20 commits
  5. 13 Dec, 2018 13 commits
  6. 12 Dec, 2018 1 commit