1. 08 Jun, 2022 30 commits
  2. 07 Jun, 2022 10 commits
    • Paolo Bonzini's avatar
      Merge tag 'kvm-s390-next-5.19-2' of... · 5552de7b
      Paolo Bonzini authored
      Merge tag 'kvm-s390-next-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
      
      KVM: s390: pvdump and selftest improvements
      
      - add an interface to provide a hypervisor dump for secure guests
      - improve selftests to show tests
      5552de7b
    • Paolo Bonzini's avatar
      b31455e9
    • Paolo Bonzini's avatar
      a280e358
    • Maxim Levitsky's avatar
      KVM: SVM: fix tsc scaling cache logic · 11d39e8c
      Maxim Levitsky authored
      SVM uses a per-cpu variable to cache the current value of the
      tsc scaling multiplier msr on each cpu.
      
      Commit 1ab9287a
      ("KVM: X86: Add vendor callbacks for writing the TSC multiplier")
      broke this caching logic.
      
      Refactor the code so that all TSC scaling multiplier writes go through
      a single function which checks and updates the cache.
      
      This fixes the following scenario:
      
      1. A CPU runs a guest with some tsc scaling ratio.
      
      2. New guest with different tsc scaling ratio starts on this CPU
         and terminates almost immediately.
      
         This ensures that the short running guest had set the tsc scaling ratio just
         once when it was set via KVM_SET_TSC_KHZ. Due to the bug,
         the per-cpu cache is not updated.
      
      3. The original guest continues to run, it doesn't restore the msr
         value back to its own value, because the cache matches,
         and thus continues to run with a wrong tsc scaling ratio.
      
      Fixes: 1ab9287a ("KVM: X86: Add vendor callbacks for writing the TSC multiplier")
      Signed-off-by: default avatarMaxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20220606181149.103072-1-mlevitsk@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      11d39e8c
    • Vitaly Kuznetsov's avatar
      KVM: selftests: Make hyperv_clock selftest more stable · eae260be
      Vitaly Kuznetsov authored
      hyperv_clock doesn't always give a stable test result, especially with
      AMD CPUs. The test compares Hyper-V MSR clocksource (acquired either
      with rdmsr() from within the guest or KVM_GET_MSRS from the host)
      against rdtsc(). To increase the accuracy, increase the measured delay
      (done with nop loop) by two orders of magnitude and take the mean rdtsc()
      value before and after rdmsr()/KVM_GET_MSRS.
      Reported-by: default avatarMaxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Reviewed-by: default avatarMaxim Levitsky <mlevitsk@redhat.com>
      Tested-by: default avatarMaxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20220601144322.1968742-1-vkuznets@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      eae260be
    • Ben Gardon's avatar
      KVM: x86/MMU: Zap non-leaf SPTEs when disabling dirty logging · 5ba7c4c6
      Ben Gardon authored
      Currently disabling dirty logging with the TDP MMU is extremely slow.
      On a 96 vCPU / 96G VM backed with gigabyte pages, it takes ~200 seconds
      to disable dirty logging with the TDP MMU, as opposed to ~4 seconds with
      the shadow MMU.
      
      When disabling dirty logging, zap non-leaf parent entries to allow
      replacement with huge pages instead of recursing and zapping all of the
      child, leaf entries. This reduces the number of TLB flushes required.
      and reduces the disable dirty log time with the TDP MMU to ~3 seconds.
      
      Opportunistically add a WARN() to catch GFNs that are mapped at a
      higher level than their max level.
      Signed-off-by: default avatarBen Gardon <bgardon@google.com>
      Message-Id: <20220525230904.1584480-1-bgardon@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      5ba7c4c6
    • Jan Beulich's avatar
      x86: drop bogus "cc" clobber from __try_cmpxchg_user_asm() · 1df931d9
      Jan Beulich authored
      As noted (and fixed) a couple of times in the past, "=@cc<cond>" outputs
      and clobbering of "cc" don't work well together. The compiler appears to
      mean to reject such, but doesn't - in its upstream form - quite manage
      to yet for "cc". Furthermore two similar macros don't clobber "cc", and
      clobbering "cc" is pointless in asm()-s for x86 anyway - the compiler
      always assumes status flags to be clobbered there.
      
      Fixes: 989b5db2 ("x86/uaccess: Implement macros for CMPXCHG on user addresses")
      Signed-off-by: default avatarJan Beulich <jbeulich@suse.com>
      Message-Id: <485c0c0b-a3a7-0b7c-5264-7d00c01de032@suse.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      1df931d9
    • Shaoqin Huang's avatar
      KVM: x86/mmu: Check every prev_roots in __kvm_mmu_free_obsolete_roots() · cf4a8693
      Shaoqin Huang authored
      When freeing obsolete previous roots, check prev_roots as intended, not
      the current root.
      Signed-off-by: default avatarShaoqin Huang <shaoqin.huang@intel.com>
      Fixes: 527d5cd7 ("KVM: x86/mmu: Zap only obsolete roots if a root shadow page is zapped")
      Message-Id: <20220607005905.2933378-1-shaoqin.huang@intel.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: default avatarSean Christopherson <seanjc@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      cf4a8693
    • Seth Forshee's avatar
      entry/kvm: Exit to user mode when TIF_NOTIFY_SIGNAL is set · 3e684903
      Seth Forshee authored
      A livepatch transition may stall indefinitely when a kvm vCPU is heavily
      loaded. To the host, the vCPU task is a user thread which is spending a
      very long time in the ioctl(KVM_RUN) syscall. During livepatch
      transition, set_notify_signal() will be called on such tasks to
      interrupt the syscall so that the task can be transitioned. This
      interrupts guest execution, but when xfer_to_guest_mode_work() sees that
      TIF_NOTIFY_SIGNAL is set but not TIF_SIGPENDING it concludes that an
      exit to user mode is unnecessary, and guest execution is resumed without
      transitioning the task for the livepatch.
      
      This handling of TIF_NOTIFY_SIGNAL is incorrect, as set_notify_signal()
      is expected to break tasks out of interruptible kernel loops and cause
      them to return to userspace. Change xfer_to_guest_mode_work() to handle
      TIF_NOTIFY_SIGNAL the same as TIF_SIGPENDING, signaling to the vCPU run
      loop that an exit to userpsace is needed. Any pending task_work will be
      run when get_signal() is called from exit_to_user_mode_loop(), so there
      is no longer any need to run task work from xfer_to_guest_mode_work().
      Suggested-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Signed-off-by: default avatarSeth Forshee <sforshee@digitalocean.com>
      Message-Id: <20220504180840.2907296-1-sforshee@digitalocean.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      3e684903
    • Alexey Kardashevskiy's avatar
      KVM: Don't null dereference ops->destroy · e8bc2427
      Alexey Kardashevskiy authored
      A KVM device cleanup happens in either of two callbacks:
      1) destroy() which is called when the VM is being destroyed;
      2) release() which is called when a device fd is closed.
      
      Most KVM devices use 1) but Book3s's interrupt controller KVM devices
      (XICS, XIVE, XIVE-native) use 2) as they need to close and reopen during
      the machine execution. The error handling in kvm_ioctl_create_device()
      assumes destroy() is always defined which leads to NULL dereference as
      discovered by Syzkaller.
      
      This adds a checks for destroy!=NULL and adds a missing release().
      
      This is not changing kvm_destroy_devices() as devices with defined
      release() should have been removed from the KVM devices list by then.
      Suggested-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarAlexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      e8bc2427