1. 17 Nov, 2022 21 commits
  2. 16 Nov, 2022 14 commits
  3. 09 Nov, 2022 5 commits
    • KVM: replace direct irq.h inclusion · d663b8a2
      Paolo Bonzini authored
      virt/kvm/irqchip.c is including "irq.h" from the arch-specific KVM source
      directory (i.e. not from arch/*/include) for the sole purpose of retrieving
      irqchip_in_kernel.
      
      Making the function inline in a header that is already included,
      such as asm/kvm_host.h, is not possible because it needs to look at
      struct kvm which is defined after asm/kvm_host.h is included.  So add a
      kvm_arch_irqchip_in_kernel non-inline function; irqchip_in_kernel() is
      only performance critical on arm64 and x86, and the non-inline function
      is enough on all other architectures.
      
      irq.h can then be deleted from all architectures except x86.
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
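The header-ordering problem described in this commit can be sketched in plain C. This is a minimal userspace illustration, not the kernel code: the `has_irqchip` field and the `generic_irq_routing_allowed()` caller are hypothetical, and the two "headers" are simulated by declaration order within one file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* --- stand-in for an early arch header: struct kvm is not defined yet --- */
struct kvm;                                        /* incomplete type here */
bool kvm_arch_irqchip_in_kernel(struct kvm *kvm);  /* declaration only */

/* --- stand-in for the later generic header: the full definition --- */
struct kvm_arch {
    bool has_irqchip;   /* hypothetical field, for illustration only */
};
struct kvm {
    struct kvm_arch arch;
};

/* --- arch code: may keep an inline helper for its own hot paths --- */
static inline bool irqchip_in_kernel(struct kvm *kvm)
{
    return kvm->arch.has_irqchip;
}

/* Non-inline wrapper: the only symbol generic code needs to see. */
bool kvm_arch_irqchip_in_kernel(struct kvm *kvm)
{
    assert(kvm != NULL);
    return irqchip_in_kernel(kvm);
}

/* --- generic code: no arch-private "irq.h" required (hypothetical caller) --- */
bool generic_irq_routing_allowed(struct kvm *kvm)
{
    return kvm_arch_irqchip_in_kernel(kvm);
}
```

An inline body placed at the point of the early declaration could not dereference `kvm`, because the type is still incomplete there; the out-of-line function avoids that at the cost of one call.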
    • KVM: x86/pmu: Defer counter emulated overflow via pmc->prev_counter · de0f6195
      Like Xu authored
      Defer reprogramming counters and handling overflow via KVM_REQ_PMU
      when incrementing counters.  KVM skips emulated WRMSR in the VM-Exit
      fastpath; the fastpath runs with IRQs disabled; skipping instructions
      can increment and reprogram counters; reprogramming counters can
      sleep; and sleeping is disallowed while IRQs are disabled.
      
       [*] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
       [*] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 2981888, name: CPU 15/KVM
       [*] preempt_count: 1, expected: 0
       [*] RCU nest depth: 0, expected: 0
       [*] INFO: lockdep is turned off.
       [*] irq event stamp: 0
       [*] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
       [*] hardirqs last disabled at (0): [<ffffffff8121222a>] copy_process+0x146a/0x62d0
       [*] softirqs last  enabled at (0): [<ffffffff81212269>] copy_process+0x14a9/0x62d0
       [*] softirqs last disabled at (0): [<0000000000000000>] 0x0
       [*] Preemption disabled at:
       [*] [<ffffffffc2063fc1>] vcpu_enter_guest+0x1001/0x3dc0 [kvm]
       [*] CPU: 17 PID: 2981888 Comm: CPU 15/KVM Kdump: 5.19.0-rc1-g239111db364c-dirty #2
       [*] Call Trace:
       [*]  <TASK>
       [*]  dump_stack_lvl+0x6c/0x9b
       [*]  __might_resched.cold+0x22e/0x297
       [*]  __mutex_lock+0xc0/0x23b0
       [*]  perf_event_ctx_lock_nested+0x18f/0x340
       [*]  perf_event_pause+0x1a/0x110
       [*]  reprogram_counter+0x2af/0x1490 [kvm]
       [*]  kvm_pmu_trigger_event+0x429/0x950 [kvm]
       [*]  kvm_skip_emulated_instruction+0x48/0x90 [kvm]
       [*]  handle_fastpath_set_msr_irqoff+0x349/0x3b0 [kvm]
       [*]  vmx_vcpu_run+0x268e/0x3b80 [kvm_intel]
       [*]  vcpu_enter_guest+0x1d22/0x3dc0 [kvm]
      
      Add a field to kvm_pmc to track the previous counter value in order
      to defer overflow detection to kvm_pmu_handle_event() (the counter must
      be paused before handling overflow, and that may increment the counter).
      
      Opportunistically shrink sizeof(struct kvm_pmc) a bit.
      
      Suggested-by: Wanpeng Li <wanpengli@tencent.com>
      Fixes: 9cd803d4 ("KVM: x86: Update vPMCs when retiring instructions")
      Signed-off-by: Like Xu <likexu@tencent.com>
      Link: https://lore.kernel.org/r/20220831085328.45489-6-likexu@tencent.com
      [sean: avoid re-triggering KVM_REQ_PMU on overflow, tweak changelog]
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220923001355.3741194-5-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
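The deferral described in this commit can be sketched as follows. This is a simplified userspace model under stated assumptions, not the kernel implementation: the struct layouts, the `mask` and `overflow_pending` fields, and the function names are hypothetical. Only the idea of remembering the previous counter value and comparing it later comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct pmc {
    uint64_t counter;        /* current emulated counter value */
    uint64_t prev_counter;   /* value before the last emulated increment */
    uint64_t mask;           /* counter width mask (hypothetical) */
    bool overflow_pending;   /* hypothetical: set once overflow is handled */
};

struct vcpu {
    struct pmc pmc;
    bool req_pmu;            /* stands in for a pending KVM_REQ_PMU */
};

/* Fast path (IRQs disabled in the real code): only touch plain fields. */
void pmc_incr_counter(struct vcpu *vcpu)
{
    struct pmc *pmc = &vcpu->pmc;

    assert(vcpu != NULL);
    pmc->prev_counter = pmc->counter;
    pmc->counter = (pmc->counter + 1) & pmc->mask;
    vcpu->req_pmu = true;    /* defer everything that might sleep */
}

/* Slow path: runs later, in a context where sleeping would be allowed. */
void pmu_handle_event(struct vcpu *vcpu)
{
    struct pmc *pmc = &vcpu->pmc;

    if (!vcpu->req_pmu)
        return;
    vcpu->req_pmu = false;

    /* A wrapped counter is smaller than the value it had before. */
    if (pmc->counter < pmc->prev_counter)
        pmc->overflow_pending = true;
    pmc->prev_counter = 0;
}
```

The fast path records state and raises a request; the wrap-around check happens only when the request is serviced.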
    • KVM: x86/pmu: Defer reprogram_counter() to kvm_pmu_handle_event() · 68fb4757
      Like Xu authored
      Batch reprogramming PMU counters by setting KVM_REQ_PMU and thus
      deferring reprogramming to kvm_pmu_handle_event(), to avoid reprogramming
      a counter multiple times during a single VM-Exit.
      
      Deferring programming will also allow KVM to fix a bug where immediately
      reprogramming a counter can result in sleeping (taking a mutex) while
      interrupts are disabled in the VM-Exit fastpath.
      
      Introduce kvm_pmu_request_counter_reprogam() to make it obvious that
      KVM is _requesting_ a reprogram and not actually doing the reprogram.
      
      Opportunistically refine related comments to avoid misunderstandings.
      
      Signed-off-by: Like Xu <likexu@tencent.com>
      Link: https://lore.kernel.org/r/20220831085328.45489-5-likexu@tencent.com
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220923001355.3741194-4-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
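The batching effect of requesting a reprogram instead of doing it immediately can be illustrated with a small model. This is a sketch under assumptions, not the kernel code: the `sketch_` function names, the struct layout, and the `reprogram_calls` bookkeeping are hypothetical. The `reprogram_pmi` bitmap name is borrowed from a later commit message in this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { NUM_COUNTERS = 8 };   /* hypothetical counter count */

struct pmu {
    uint64_t reprogram_pmi;                  /* one pending bit per counter */
    bool req_pmu;                            /* stands in for KVM_REQ_PMU */
    unsigned reprogram_calls[NUM_COUNTERS];  /* bookkeeping for the demo */
};

/* Request only: cheap, idempotent, safe to call many times per exit. */
void sketch_request_counter_reprogram(struct pmu *pmu, unsigned idx)
{
    assert(idx < NUM_COUNTERS);
    pmu->reprogram_pmi |= UINT64_C(1) << idx;
    pmu->req_pmu = true;
}

/* Deferred handler: each pending counter is reprogrammed exactly once. */
void sketch_handle_event(struct pmu *pmu)
{
    pmu->req_pmu = false;
    for (unsigned idx = 0; idx < NUM_COUNTERS; idx++) {
        uint64_t bit = UINT64_C(1) << idx;

        if (!(pmu->reprogram_pmi & bit))
            continue;
        pmu->reprogram_pmi &= ~bit;
        pmu->reprogram_calls[idx]++;   /* the expensive work would go here */
    }
}
```

Because a request only sets a bit, three requests for the same counter within one exit collapse into a single reprogram when the handler runs.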
    • KVM: x86/pmu: Clear "reprogram" bit if counter is disabled or disallowed · dcbb816a
      Sean Christopherson authored
      When reprogramming a counter, clear the counter's "reprogram pending" bit
      if the counter is disabled (by the guest) or is disallowed (by the
      userspace filter).  In both cases, there's no need to re-attempt
      programming on the next coincident KVM_REQ_PMU as enabling the counter by
      either method will trigger reprogramming.
      
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220923001355.3741194-3-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
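The rule in this commit can be sketched as a single function with early exits. This is a hypothetical model, not the kernel code: the field names are invented, and the third case (leaving the bit set when the backend fails, so that a later request retries) is an assumption added for contrast that the commit message does not state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { NUM_COUNTERS = 8 };   /* hypothetical counter count */

struct pmc {
    bool enabled;      /* guest has enabled the counter */
    bool allowed;      /* userspace filter permits the event */
    bool backend_ok;   /* hypothetical: programming the backend succeeds */
    bool programmed;   /* result of the last attempt */
};

struct pmu {
    uint64_t reprogram_pmi;   /* "reprogram pending" bits */
    struct pmc pmc[NUM_COUNTERS];
};

void sketch_reprogram_counter(struct pmu *pmu, unsigned idx)
{
    assert(idx < NUM_COUNTERS);
    struct pmc *pmc = &pmu->pmc[idx];
    uint64_t bit = UINT64_C(1) << idx;

    pmc->programmed = false;

    /* Disabled or disallowed: nothing to retry, so clear the pending bit.
     * Enabling the counter later triggers a fresh reprogram request. */
    if (!pmc->enabled || !pmc->allowed) {
        pmu->reprogram_pmi &= ~bit;
        return;
    }

    /* Assumption for this sketch: a backend failure keeps the bit set. */
    if (!pmc->backend_ok)
        return;

    pmc->programmed = true;
    pmu->reprogram_pmi &= ~bit;
}
```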
    • KVM: x86/pmu: Force reprogramming of all counters on PMU filter change · f1c5651f
      Sean Christopherson authored
      Force vCPUs to reprogram all counters on a PMU filter change to provide
      a sane ABI for userspace.  Use the existing KVM_REQ_PMU to do the
      programming, and take advantage of the fact that the reprogram_pmi bitmap
      fits in a u64 to set all bits in a single atomic update.  Note, setting
      the bitmap and making the request needs to be done _after_ the SRCU
      synchronization to ensure that vCPUs will reprogram using the new filter.
      
      KVM's current "lazy" approach is confusing and non-deterministic.  It's
      confusing because, from a developer perspective, the code is buggy as it
      makes zero sense to let userspace modify the filter but then not actually
      enforce the new filter.  The lazy approach is non-deterministic because
      KVM enforces the filter whenever a counter is reprogrammed, not just on
      guest WRMSRs, i.e. a guest might gain/lose access to an event at random
      times depending on what is going on in the host.
      
      Note, the resulting behavior is still non-deterministic while the filter
      is in flux.  If userspace wants to guarantee deterministic behavior, all
      vCPUs should be paused during the filter update.
      
      Jim Mattson <jmattson@google.com>
      
      Fixes: 66bb8a06 ("KVM: x86: PMU Event Filter")
      Cc: Aaron Lewis <aaronlewis@google.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220923001355.3741194-2-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
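The ordering this commit insists on (publish the new filter, wait for readers of the old one, then mark every counter pending in one atomic store) can be sketched with C11 atomics. This is a userspace model under assumptions, not the kernel code: `filter_id`, `syncs`, and all function names are hypothetical, and `sketch_wait_for_old_readers()` is only a placeholder for the SRCU synchronization named in the commit message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { NUM_VCPUS = 4 };   /* hypothetical vCPU count */

struct vcpu {
    _Atomic uint64_t reprogram_pmi;   /* pending bits, fits in one u64 */
    atomic_bool req_pmu;              /* stands in for KVM_REQ_PMU */
};

struct vm {
    _Atomic int filter_id;   /* hypothetical stand-in for the filter pointer */
    int syncs;               /* counts calls to the placeholder below */
    struct vcpu vcpus[NUM_VCPUS];
};

/* Placeholder: the real code waits for readers of the old filter here. */
void sketch_wait_for_old_readers(struct vm *vm)
{
    vm->syncs++;
}

void sketch_set_pmu_filter(struct vm *vm, int new_filter_id)
{
    assert(vm != NULL);

    /* 1. Publish the new filter. */
    atomic_store(&vm->filter_id, new_filter_id);

    /* 2. Only after this point can no vCPU still be using the old filter. */
    sketch_wait_for_old_readers(vm);

    /* 3. Mark all counters pending with one atomic store per vCPU. */
    for (int i = 0; i < NUM_VCPUS; i++) {
        atomic_store(&vm->vcpus[i].reprogram_pmi, UINT64_MAX);
        atomic_store(&vm->vcpus[i].req_pmu, true);
    }
}
```

If step 3 ran before step 2, a vCPU could service the request while still reading the old filter and then have no reason to reprogram again.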