1. 15 Nov, 2019 4 commits
    • Andrea Arcangeli's avatar
      x86: retpolines: eliminate retpoline from msr event handlers · 74c504a6
      Andrea Arcangeli authored
      It's enough to check the value and issue the direct call.
      
      After this commit is applied, here the most common retpolines executed
      under a high resolution timer workload in the guest on a VMX host:
      
      [..]
      @[
          trace_retpoline+1
          __trace_retpoline+30
          __x86_indirect_thunk_rax+33
          do_syscall_64+89
          entry_SYSCALL_64_after_hwframe+68
      ]: 267
      @[]: 2256
      @[
          trace_retpoline+1
          __trace_retpoline+30
          __x86_indirect_thunk_rax+33
          __kvm_wait_lapic_expire+284
          vmx_vcpu_run.part.97+1091
          vcpu_enter_guest+377
          kvm_arch_vcpu_ioctl_run+261
          kvm_vcpu_ioctl+559
          do_vfs_ioctl+164
          ksys_ioctl+96
          __x64_sys_ioctl+22
          do_syscall_64+89
          entry_SYSCALL_64_after_hwframe+68
      ]: 2390
      @[]: 33410
      
      @total: 315707
      
      Note the highest hit above is __delay so probably not worth optimizing
      even if it would be more frequent than 2k hits per sec.
      Signed-off-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      74c504a6
    • Andrea Arcangeli's avatar
      KVM: retpolines: x86: eliminate retpoline from svm.c exit handlers · 3dcb2a3f
      Andrea Arcangeli authored
      It's enough to check the exit value and issue a direct call to avoid
      the retpoline for all the common vmexit reasons.
      
      After this commit is applied, here the most common retpolines executed
      under a high resolution timer workload in the guest on a SVM host:
      
      [..]
      @[
          trace_retpoline+1
          __trace_retpoline+30
          __x86_indirect_thunk_rax+33
          ktime_get_update_offsets_now+70
          hrtimer_interrupt+131
          smp_apic_timer_interrupt+106
          apic_timer_interrupt+15
          start_sw_timer+359
          restart_apic_timer+85
          kvm_set_msr_common+1497
          msr_interception+142
          vcpu_enter_guest+684
          kvm_arch_vcpu_ioctl_run+261
          kvm_vcpu_ioctl+559
          do_vfs_ioctl+164
          ksys_ioctl+96
          __x64_sys_ioctl+22
          do_syscall_64+89
          entry_SYSCALL_64_after_hwframe+68
      ]: 1940
      @[
          trace_retpoline+1
          __trace_retpoline+30
          __x86_indirect_thunk_r12+33
          force_qs_rnp+217
          rcu_gp_kthread+1270
          kthread+268
          ret_from_fork+34
      ]: 4644
      @[]: 25095
      @[
          trace_retpoline+1
          __trace_retpoline+30
          __x86_indirect_thunk_rax+33
          lapic_next_event+28
          clockevents_program_event+148
          hrtimer_start_range_ns+528
          start_sw_timer+356
          restart_apic_timer+85
          kvm_set_msr_common+1497
          msr_interception+142
          vcpu_enter_guest+684
          kvm_arch_vcpu_ioctl_run+261
          kvm_vcpu_ioctl+559
          do_vfs_ioctl+164
          ksys_ioctl+96
          __x64_sys_ioctl+22
          do_syscall_64+89
          entry_SYSCALL_64_after_hwframe+68
      ]: 41474
      @[
          trace_retpoline+1
          __trace_retpoline+30
          __x86_indirect_thunk_rax+33
          clockevents_program_event+148
          hrtimer_start_range_ns+528
          start_sw_timer+356
          restart_apic_timer+85
          kvm_set_msr_common+1497
          msr_interception+142
          vcpu_enter_guest+684
          kvm_arch_vcpu_ioctl_run+261
          kvm_vcpu_ioctl+559
          do_vfs_ioctl+164
          ksys_ioctl+96
          __x64_sys_ioctl+22
          do_syscall_64+89
          entry_SYSCALL_64_after_hwframe+68
      ]: 41474
      @[
          trace_retpoline+1
          __trace_retpoline+30
          __x86_indirect_thunk_rax+33
          ktime_get+58
          clockevents_program_event+84
          hrtimer_start_range_ns+528
          start_sw_timer+356
          restart_apic_timer+85
          kvm_set_msr_common+1497
          msr_interception+142
          vcpu_enter_guest+684
          kvm_arch_vcpu_ioctl_run+261
          kvm_vcpu_ioctl+559
          do_vfs_ioctl+164
          ksys_ioctl+96
          __x64_sys_ioctl+22
          do_syscall_64+89
          entry_SYSCALL_64_after_hwframe+68
      ]: 41887
      @[
          trace_retpoline+1
          __trace_retpoline+30
          __x86_indirect_thunk_rax+33
          lapic_next_event+28
          clockevents_program_event+148
          hrtimer_try_to_cancel+168
          hrtimer_cancel+21
          kvm_set_lapic_tscdeadline_msr+43
          kvm_set_msr_common+1497
          msr_interception+142
          vcpu_enter_guest+684
          kvm_arch_vcpu_ioctl_run+261
          kvm_vcpu_ioctl+559
          do_vfs_ioctl+164
          ksys_ioctl+96
          __x64_sys_ioctl+22
          do_syscall_64+89
          entry_SYSCALL_64_after_hwframe+68
      ]: 42723
      @[
          trace_retpoline+1
          __trace_retpoline+30
          __x86_indirect_thunk_rax+33
          clockevents_program_event+148
          hrtimer_try_to_cancel+168
          hrtimer_cancel+21
          kvm_set_lapic_tscdeadline_msr+43
          kvm_set_msr_common+1497
          msr_interception+142
          vcpu_enter_guest+684
          kvm_arch_vcpu_ioctl_run+261
          kvm_vcpu_ioctl+559
          do_vfs_ioctl+164
          ksys_ioctl+96
          __x64_sys_ioctl+22
          do_syscall_64+89
          entry_SYSCALL_64_after_hwframe+68
      ]: 42766
      @[
          trace_retpoline+1
          __trace_retpoline+30
          __x86_indirect_thunk_rax+33
          ktime_get+58
          clockevents_program_event+84
          hrtimer_try_to_cancel+168
          hrtimer_cancel+21
          kvm_set_lapic_tscdeadline_msr+43
          kvm_set_msr_common+1497
          msr_interception+142
          vcpu_enter_guest+684
          kvm_arch_vcpu_ioctl_run+261
          kvm_vcpu_ioctl+559
          do_vfs_ioctl+164
          ksys_ioctl+96
          __x64_sys_ioctl+22
          do_syscall_64+89
          entry_SYSCALL_64_after_hwframe+68
      ]: 42848
      @[
          trace_retpoline+1
          __trace_retpoline+30
          __x86_indirect_thunk_rax+33
          ktime_get+58
          start_sw_timer+279
          restart_apic_timer+85
          kvm_set_msr_common+1497
          msr_interception+142
          vcpu_enter_guest+684
          kvm_arch_vcpu_ioctl_run+261
          kvm_vcpu_ioctl+559
          do_vfs_ioctl+164
          ksys_ioctl+96
          __x64_sys_ioctl+22
          do_syscall_64+89
          entry_SYSCALL_64_after_hwframe+68
      ]: 499845
      
      @total: 1780243
      
      SVM has no TSC based programmable preemption timer so it is invoking
      ktime_get() frequently.
      Signed-off-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      3dcb2a3f
    • Andrea Arcangeli's avatar
      KVM: retpolines: x86: eliminate retpoline from vmx.c exit handlers · 4289d272
      Andrea Arcangeli authored
      It's enough to check the exit value and issue a direct call to avoid
      the retpoline for all the common vmexit reasons.
      
      Of course CONFIG_RETPOLINE already forbids gcc to use indirect jumps
      while compiling all switch() statements, however switch() would still
      allow the compiler to bisect the case value. It's more efficient to
      prioritize the most frequent vmexits instead.
      
      The halt may be slow paths from the point of the guest, but not
      necessarily so from the point of the host if the host runs at full CPU
      capacity and no host CPU is ever left idle.
      Signed-off-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      4289d272
    • Andrea Arcangeli's avatar
      KVM: x86: optimize more exit handlers in vmx.c · f399e60c
      Andrea Arcangeli authored
      Eliminate wasteful call/ret non RETPOLINE case and unnecessary fentry
      dynamic tracing hooking points.
      Signed-off-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      f399e60c
  2. 11 Nov, 2019 1 commit
  3. 02 Nov, 2019 1 commit
    • Marcelo Tosatti's avatar
      KVM: x86: switch KVMCLOCK base to monotonic raw clock · 53fafdbb
      Marcelo Tosatti authored
      Commit 0bc48bea ("KVM: x86: update master clock before computing
      kvmclock_offset")
      switches the order of operations to avoid the conversion
      
      TSC (without frequency correction) ->
      system_timestamp (with frequency correction),
      
      which might cause a time jump.
      
      However, it leaves any other masterclock update unsafe, which includes,
      at the moment:
      
              * HV_X64_MSR_REFERENCE_TSC MSR write.
              * TSC writes.
              * Host suspend/resume.
      
      Avoid the time jump issue by using frequency uncorrected
      CLOCK_MONOTONIC_RAW clock.
      
      Its the guests time keeping software responsability
      to track and correct a reference clock such as UTC.
      
      This fixes forward time jump (which can result in
      failure to bring up a vCPU) during vCPU hotplug:
      
      Oct 11 14:48:33 storage kernel: CPU2 has been hot-added
      Oct 11 14:48:34 storage kernel: CPU3 has been hot-added
      Oct 11 14:49:22 storage kernel: smpboot: Booting Node 0 Processor 2 APIC 0x2          <-- time jump of almost 1 minute
      Oct 11 14:49:22 storage kernel: smpboot: do_boot_cpu failed(-1) to wakeup CPU#2
      Oct 11 14:49:23 storage kernel: smpboot: Booting Node 0 Processor 3 APIC 0x3
      Oct 11 14:49:23 storage kernel: kvm-clock: cpu 3, msr 0:7ff640c1, secondary cpu clock
      
      Which happens because:
      
                      /*
                       * Wait 10s total for a response from AP
                       */
                      boot_error = -1;
                      timeout = jiffies + 10*HZ;
                      while (time_before(jiffies, timeout)) {
                               ...
                      }
      Analyzed-by: default avatarIgor Mammedov <imammedo@redhat.com>
      Signed-off-by: default avatarMarcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      53fafdbb
  4. 31 Oct, 2019 1 commit
    • Paolo Bonzini's avatar
      Merge tag 'kvm-ppc-next-5.5-1' of... · e7011c5d
      Paolo Bonzini authored
      Merge tag 'kvm-ppc-next-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD
      
      KVM PPC update for 5.5
      
      * Add capability to tell userspace whether we can single-step the guest.
      
      * Improve the allocation of XIVE virtual processor IDs, to reduce the
        risk of running out of IDs when running many VMs on POWER9.
      
      * Rewrite interrupt synthesis code to deliver interrupts in virtual
        mode when appropriate.
      
      * Minor cleanups and improvements.
      e7011c5d
  5. 25 Oct, 2019 1 commit
  6. 22 Oct, 2019 32 commits