1. 25 Sep, 2019 2 commits
    • Vitaly Kuznetsov's avatar
      KVM: selftests: fix ucall on x86 · 90a48843
      Vitaly Kuznetsov authored
      After commit e8bb4755eea2("KVM: selftests: Split ucall.c into architecture
      specific files") selftests which use ucall on x86 started segfaulting and
      apparently it's gcc to blame: it "optimizes" ucall() function throwing away
      va_start/va_end part because it thinks the structure is not being used.
      Previously, it couldn't do that because the there was also MMIO version and
      the decision which particular implementation to use was done at runtime.
      
      With older gccs it's possible to solve the problem by adding 'volatile'
      to 'struct ucall' but at least with gcc-8.3 this trick doesn't work.
      
      'memory' clobber seems to do the job.
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      90a48843
    • Wanpeng Li's avatar
      Revert "locking/pvqspinlock: Don't wait if vCPU is preempted" · 89340d09
      Wanpeng Li authored
      This patch reverts commit 75437bb3 (locking/pvqspinlock: Don't
      wait if vCPU is preempted).  A large performance regression was caused
      by this commit.  on over-subscription scenarios.
      
      The test was run on a Xeon Skylake box, 2 sockets, 40 cores, 80 threads,
      with three VMs of 80 vCPUs each.  The score of ebizzy -M is reduced from
      13000-14000 records/s to 1700-1800 records/s:
      
                Host                Guest                score
      
      vanilla w/o kvm optimizations     upstream    1700-1800 records/s
      vanilla w/o kvm optimizations     revert      13000-14000 records/s
      vanilla w/ kvm optimizations      upstream    4500-5000 records/s
      vanilla w/ kvm optimizations      revert      14000-15500 records/s
      
      Exit from aggressive wait-early mechanism can result in premature yield
      and extra scheduling latency.
      
      Actually, only 6% of wait_early events are caused by vcpu_is_preempted()
      being true.  However, when one vCPU voluntarily releases its vCPU, all
      the subsequently waiters in the queue will do the same and the cascading
      effect leads to bad performance.
      
      kvm optimizations:
      [1] commit d73eb57b (KVM: Boost vCPUs that are delivering interrupts)
      [2] commit 266e85a5 (KVM: X86: Boost queue head vCPU to mitigate lock waiter preemption)
      
      Tested-by: loobinliu@tencent.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Waiman Long <longman@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: loobinliu@tencent.com
      Cc: stable@vger.kernel.org
      Fixes: 75437bb3 (locking/pvqspinlock: Don't wait if vCPU is preempted)
      Signed-off-by: default avatarWanpeng Li <wanpengli@tencent.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      89340d09
  2. 24 Sep, 2019 38 commits