  1. 21 Apr, 2021 2 commits
    • KVM: selftests: Always run vCPU thread with blocked SIG_IPI · bf1e15a8
      Paolo Bonzini authored
      The main thread could start to send SIG_IPI at any time, even before the
      signal is blocked on the vcpu thread.  Therefore, start the vcpu thread
      with the signal already blocked.
      
      Without this patch, on very busy cores the dirty_log_test could fail directly
      on receiving a SIGUSR1 without a handler (when vcpu runs far slower than main).
      Reported-by: Peter Xu <peterx@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: selftests: Sync data verify of dirty logging with guest sync · 016ff1a4
      Peter Xu authored
      This fixes a bug that can trigger with e.g. "taskset -c 0 ./dirty_log_test" or
      when the testing host is very busy.
      
      A similar attempt was made previously [1], but it was not enough; the
      reason is given in the reply [2].
      
      In summary (partly quoting from [2]):
      
      The problem, I think, is that one guest memory write operation (of this
      specific test) consists of a few micro-steps when the page is under KVM
      dirty tracking (here I'm only considering write-protect rather than PML,
      but PML should be similar, at least when the log buffer is full):
      
        (1) Guest reads the 'iteration' number into a register, prepares to
            write, page fault
        (2) Set the dirty bit in either the dirty bitmap or the dirty ring
        (3) Return to the guest, data written
      
      When we verify the data, we assume that all these steps are "atomic":
      when (1) has happened for a page, we assume (2) and (3) must have happened
      too.  We had a trick to work around the non-atomicity of these three steps:
      the previous version of this patch tried to fix the atomicity of steps
      (2)+(3) by explicitly letting the main thread wait for at least one vmenter
      of the vcpu thread, which should work.  However, what I overlooked is that
      we still have a race when the vcpu is interrupted between (1) and (2).
      
      One example call trace where this could happen: we read an old iteration
      and got interrupted before even setting the dirty bit and flushing the data:
      
          __schedule+1742
          __cond_resched+52
          __get_user_pages+530
          get_user_pages_unlocked+197
          hva_to_pfn+206
          try_async_pf+132
          direct_page_fault+320
          kvm_mmu_page_fault+103
          vmx_handle_exit+288
          vcpu_enter_guest+2460
          kvm_arch_vcpu_ioctl_run+325
          kvm_vcpu_ioctl+526
          __x64_sys_ioctl+131
          do_syscall_64+51
          entry_SYSCALL_64_after_hwframe+68
      
      It means the iteration number cached in the vcpu register can be very old
      by the time the dirty bit is set and the data is flushed.
      
      So far I don't see an easy way to guarantee atomicity of steps 1-3 other
      than to sync at the GUEST_SYNC() point of the guest code when we verify the
      dirty bits, which is what this patch does.
      
      [1] https://lore.kernel.org/lkml/20210413213641.23742-1-peterx@redhat.com/
      [2] https://lore.kernel.org/lkml/20210417140956.GV4440@xz-x1/
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Andrew Jones <drjones@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Message-Id: <20210417143602.215059-2-peterx@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
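The race between micro-steps (1) to (3) can be modelled in a few lines. The following is a deterministic, single-threaded simulation rather than the selftest's code: `struct page_model`, the `guest_step*()` helpers and the "current or previous iteration" acceptance rule in `host_verify()` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical single-page model of the guest write path. */
struct page_model {
	uint64_t iteration;	/* shared counter, advanced by the host */
	uint64_t reg;		/* value cached in a guest register (step 1) */
	uint64_t data;		/* contents of the guest page (step 3) */
	bool dirty;		/* dirty bit for the page (step 2) */
};

static void guest_step1_read(struct page_model *m)  { m->reg = m->iteration; }
static void guest_step2_mark(struct page_model *m)  { m->dirty = true; }
static void guest_step3_write(struct page_model *m) { m->data = m->reg; }

/* Assumed host rule: a dirty page holds the current or previous iteration. */
static bool host_verify(const struct page_model *m)
{
	if (!m->dirty)
		return true;
	return m->data == m->iteration || m->data + 1 == m->iteration;
}

/* No sync: the host advances twice while the write is stuck after step 1. */
bool run_unsynced(void)
{
	struct page_model m = { .iteration = 1 };

	guest_step1_read(&m);	/* reg = 1, then the vcpu is scheduled out */
	m.iteration++;		/* host moves on to iteration 2 */
	m.iteration++;		/* host moves on to iteration 3 */
	guest_step2_mark(&m);
	guest_step3_write(&m);	/* stale value 1 lands in a dirty page */
	return host_verify(&m);
}

/* Sync: the host verifies and advances only while the guest is parked. */
bool run_synced(void)
{
	struct page_model m = { .iteration = 1 };

	for (int i = 0; i < 3; i++) {
		guest_step1_read(&m);
		guest_step2_mark(&m);
		guest_step3_write(&m);
		/* Guest is at its sync point: no write is in flight. */
		if (!host_verify(&m))
			return false;
		m.dirty = false;
		m.iteration++;
	}
	return true;
}
```

In this model `run_unsynced()` reports a verification failure even though nothing is actually wrong with dirty tracking, while `run_synced()` passes because no write can straddle an iteration change.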
  2. 07 Jan, 2021 1 commit
  3. 16 Nov, 2020 1 commit
  4. 15 Nov, 2020 4 commits
    • KVM: selftests: Add "-c" parameter to dirty log test · edd3de6f
      Peter Xu authored
      It's only used to override the existing dirty ring size/count.  With a
      bigger ring count, we test the async path of the dirty ring.  With a
      smaller ring count, we test the ring-full code path.  Async is the default.
      
      It has no effect on non-dirty-ring tests.
      Reviewed-by: Andrew Jones <drjones@redhat.com>
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Message-Id: <20201001012241.6208-1-peterx@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: selftests: Run dirty ring test asynchronously · 019d321a
      Peter Xu authored
      Previously the dirty ring test worked synchronously, because only
      after a vmexit (here, the ring full event) do we know that the
      hardware dirty bits have been flushed to the dirty ring.
      
      With this patch we first introduce a vcpu kick mechanism using SIGUSR1,
      which guarantees a vmexit and therefore also the flushing of hardware
      dirty bits.  Once this is in place, we can keep the vcpu's dirtying work
      asynchronous to the whole collection procedure.  Still, we need
      to be very careful: when reaching the ring buffer soft limit
      (KVM_EXIT_DIRTY_RING_FULL) we must collect the dirty bits before
      continuing the vcpu.
      
      Further increase the dirty ring size to the current maximum to make sure
      we stress the no-ring-full case more, which should be the main
      scenario when hypervisors like QEMU use this feature.
      Reviewed-by: Andrew Jones <drjones@redhat.com>
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Message-Id: <20201001012239.6159-1-peterx@redhat.com>
      [Use KVM_SET_SIGNAL_MASK+sigwait instead of a signal handler. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: selftests: Add dirty ring buffer test · 84292e56
      Peter Xu authored
      Add the initial dirty ring buffer test.
      
      The current test implements the userspace dirty ring collection, by
      only reaping the dirty ring when the ring is full.
      
      So it's still running synchronously like this:
      
                  vcpu                             main thread
      
        1. vcpu dirties pages
        2. vcpu gets dirty ring full
           (userspace exit)
      
                                             3. main thread waits until full
                                                (so hardware buffers flushed)
                                             4. main thread collects
                                             5. main thread continues vcpu
      
        6. vcpu continues, goes back to 1
      
      We can't directly collect dirty bits during vcpu execution because
      otherwise we can't guarantee that the hardware dirty bits were flushed
      when we collect, and since we're very strict about the dirty bits, the
      later verify procedure could then fail.  A follow-up patch will make
      this test support async just like the existing dirty log test, by adding
      a vcpu kick mechanism.
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Message-Id: <20201001012237.6111-1-peterx@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
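The synchronous loop in steps 1 to 6 above can be sketched as a toy ring with a soft limit. This is a simplified model under stated assumptions, not the KVM dirty ring ABI: `struct dirty_ring`, `RING_SIZE`, `SOFT_LIMIT` and the helper names are hypothetical, and a single thread plays both the vcpu and the main thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE  8u	/* hypothetical ring capacity (power of two) */
#define SOFT_LIMIT 6u	/* hypothetical soft-full threshold */

struct dirty_ring {
	uint64_t gfn[RING_SIZE];
	uint32_t head;	/* next entry to reap (main thread side) */
	uint32_t tail;	/* next entry to fill (vcpu side) */
};

static uint32_t ring_used(const struct dirty_ring *r)
{
	return r->tail - r->head;
}

/* vcpu side: returns false when the vcpu must exit to userspace. */
static bool ring_push(struct dirty_ring *r, uint64_t gfn)
{
	if (ring_used(r) >= SOFT_LIMIT)
		return false;	/* models the "ring full" userspace exit */
	r->gfn[r->tail++ % RING_SIZE] = gfn;
	return true;
}

/* Main thread side: collect up to max entries, return how many. */
static uint32_t ring_reap(struct dirty_ring *r, uint64_t *out, uint32_t max)
{
	uint32_t n = 0;

	while (r->head != r->tail && n < max)
		out[n++] = r->gfn[r->head++ % RING_SIZE];
	return n;
}

/* Dirty `pages` pages, reaping only on ring full; return pages collected. */
uint32_t ring_demo(uint32_t pages)
{
	struct dirty_ring ring = { .head = 0, .tail = 0 };
	uint64_t out[RING_SIZE];
	uint32_t collected = 0;
	uint64_t gfn = 0;

	while (gfn < pages) {
		if (ring_push(&ring, gfn)) {
			gfn++;		/* step 1: vcpu dirties a page */
			continue;
		}
		/* steps 2-5: ring full, collect before continuing the vcpu */
		collected += ring_reap(&ring, out, RING_SIZE);
	}
	/* Final collection of whatever is left in the ring. */
	collected += ring_reap(&ring, out, RING_SIZE);
	return collected;
}
```

Because the producer is only resumed after the reap, every pushed entry is collected exactly once, which is the property the strict verification relies on.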
    • KVM: selftests: Introduce after_vcpu_run hook for dirty log test · 60f644fb
      Peter Xu authored
      Provide a hook for the checks after vcpu_run() completes.  Preparation
      for the dirty ring test because we'll need to take care of another
      exit reason.
      Reviewed-by: Andrew Jones <drjones@redhat.com>
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Message-Id: <20201001012235.6063-1-peterx@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
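A per-mode hook of this kind is commonly expressed as a function pointer in an ops table. The sketch below only illustrates the pattern; `enum exit_reason`, `struct log_mode_ops` and the two handlers are hypothetical and do not reproduce the selftest's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical exit reasons seen after a vcpu run. */
enum exit_reason { EXIT_SYNC, EXIT_RING_FULL, EXIT_OTHER };

/* Hypothetical per-mode ops table with an after-run hook. */
struct log_mode_ops {
	const char *name;
	/* Returns 0 if the exit reason is expected for this mode. */
	int (*after_vcpu_run)(enum exit_reason reason);
};

/* Plain dirty log: only the guest sync exit is expected. */
static int default_after_vcpu_run(enum exit_reason reason)
{
	return reason == EXIT_SYNC ? 0 : -1;
}

/* Dirty ring: a ring-full exit is also legitimate. */
static int ring_after_vcpu_run(enum exit_reason reason)
{
	return (reason == EXIT_SYNC || reason == EXIT_RING_FULL) ? 0 : -1;
}

static const struct log_mode_ops modes[] = {
	{ "dirty-log",  default_after_vcpu_run },
	{ "dirty-ring", ring_after_vcpu_run },
};

/* Dispatch to the hook of the selected mode. */
int check_exit(size_t mode, enum exit_reason reason)
{
	if (mode >= sizeof(modes) / sizeof(modes[0]) ||
	    modes[mode].after_vcpu_run == NULL)
		return -1;
	return modes[mode].after_vcpu_run(reason);
}
```

Keeping the exit-reason check behind a hook means the common run loop does not need to know which extra exits a given logging mode produces.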
  5. 08 Nov, 2020 3 commits
  6. 16 Mar, 2020 6 commits
  7. 24 Feb, 2020 3 commits
  8. 24 Sep, 2019 4 commits
    • KVM: selftests: Remove duplicate guest mode handling · 52200d0d
      Peter Xu authored
      Remove the duplicated code in run_test() of dirty_log_test because,
      after some reordering of functions, we can now directly use the outcome
      of vm_create().
      
      Meanwhile, with the new VM_MODE_PXXV48_4K, we can safely revert
      b442324b too, where we pinned the x86_64 PA width to 39 bits for
      dirty_log_test.
      Reviewed-by: Andrew Jones <drjones@redhat.com>
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: selftests: Introduce VM_MODE_PXXV48_4K · 567a9f1e
      Peter Xu authored
      The name VM_MODE_P52V48_4K is explicit but unclear when used on
      x86_64 machines, because x86_64 machines have various physical
      address widths rather than a single static value.  Here are some examples:
      
        - Intel Xeon E3-1220:  36 bits
        - Intel Core i7-8650:  39 bits
        - AMD   EPYC 7251:     48 bits
      
      All of them use a 48-bit linear address width but have totally
      different physical address widths (and most of the older machines
      have fewer than 52 bits).
      
      Let's create a new guest mode called VM_MODE_PXXV48_4K for the current
      x86_64 tests and make it the default, replacing the old name
      VM_MODE_P52V48_4K, because it shows more clearly that the PA width is
      not really a constant.  Meanwhile, we also stop assuming that all x86
      machines have a 52-bit PA width; instead we fetch the real
      vm->pa_bits from CPUID 0x80000008 at runtime.
      
      For now this mode is used exclusively by x86_64 and no other arch.
      
      As a slight touch-up, move the DEBUG macro from dirty_log_test.c to
      kvm_util.h so the lib can use it too.
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
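Reading the address widths from CPUID leaf 0x80000008 can be sketched as follows. This is an illustration, not the selftest library's code: `struct addr_widths`, `decode_addr_widths()` and `query_addr_widths()` are hypothetical names. EAX bits 7:0 of that leaf hold the physical address width and bits 15:8 the linear address width; the `__get_cpuid()` helper comes from the compiler's `<cpuid.h>` and is only used when building for x86.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical container for the two widths. */
struct addr_widths {
	unsigned int pa_bits;	/* physical address width */
	unsigned int va_bits;	/* linear (virtual) address width */
};

/* Decode EAX of CPUID leaf 0x80000008. */
struct addr_widths decode_addr_widths(uint32_t eax)
{
	struct addr_widths w = { eax & 0xff, (eax >> 8) & 0xff };

	return w;
}

/* Query the host CPU; returns 0 on success, -1 if the leaf is unavailable. */
int query_addr_widths(struct addr_widths *w)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(0x80000008, &eax, &ebx, &ecx, &edx))
		return -1;
	*w = decode_addr_widths(eax);
	return 0;
#else
	(void)w;
	return -1;	/* not an x86 build: nothing to query */
#endif
}
```

For example, an EAX value of 0x3027 decodes to 39 physical and 48 linear address bits, the combination listed above for the Core i7-8650.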
    • KVM: selftests: Create VM earlier for dirty log test · 338eb298
      Peter Xu authored
      Since we've just removed the dependency on the vm type in the previous
      patch, we can now create the vm much earlier.  Note that to move it
      earlier we use an approximation of the number of extra pages, but that
      should be fine.
      
      This prepares for the follow-up patches that finally remove the
      duplicated guest mode parsing.
      Reviewed-by: Andrew Jones <drjones@redhat.com>
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: selftests: Move vm type into _vm_create() internally · 12c386b2
      Peter Xu authored
      Rather than passing the vm type from the top level down to the end of vm
      creation, let's simply keep it internal to the kvm_vm struct and
      decide the type in _vm_create().  Several reasons for doing this:
      
      - The vm type is decided only by the physical address width and is
        currently only used on aarch64, so we have enough information as long
        as we pass vm_guest_mode into _vm_create().
      
      - This removes a circular dependency between vm->type and the creation
        of vms.  That dependency is why we currently sometimes need to parse
        vm_guest_mode twice, once in run_test() and then again in
        _vm_create().  The follow-up patches will clean that up as well so we
        have a single place to decide guest machine types and so on.
      
      Note that this patch slightly changes the behavior of aarch64
      tests: previously most vm_create() callers directly passed
      type==0 into _vm_create(), but now the type depends on
      vm_guest_mode.  However, it shouldn't affect any user, because all
      vm_create() users on aarch64 use the VM_MODE_DEFAULT guest
      mode (which is VM_MODE_P40V48_4K), so in the end the type is still zero.
      Reviewed-by: Andrew Jones <drjones@redhat.com>
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  9. 02 Aug, 2019 2 commits
  10. 04 Jun, 2019 1 commit
  11. 24 May, 2019 2 commits
  12. 08 May, 2019 1 commit
  13. 30 Apr, 2019 2 commits
    • KVM: fix KVM_CLEAR_DIRTY_LOG for memory slots of unaligned size · 65c4189d
      Paolo Bonzini authored
      If a memory slot's size is not a multiple of 64 pages (256K), then
      the KVM_CLEAR_DIRTY_LOG API is unusable: clearing the final 64 pages
      either requires the requested page range to go beyond memslot->npages,
      or requires log->num_pages to be unaligned, and kvm_clear_dirty_log_protect
      requires log->num_pages to be both in range and aligned.
      
      To allow this case, allow log->num_pages not to be a multiple of 64 if
      it ends exactly on the last page of the slot.
      Reported-by: Peter Xu <peterx@redhat.com>
      Fixes: 98938aa8 ("KVM: validate userspace input in kvm_clear_dirty_log_protect()", 2019-01-02)
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: fix KVM_CLEAR_DIRTY_LOG for memory slots of unaligned size · 76d58e0f
      Paolo Bonzini authored
      If a memory slot's size is not a multiple of 64 pages (256K), then
      the KVM_CLEAR_DIRTY_LOG API is unusable: clearing the final 64 pages
      either requires the requested page range to go beyond memslot->npages,
      or requires log->num_pages to be unaligned, and kvm_clear_dirty_log_protect
      requires log->num_pages to be both in range and aligned.
      
      To allow this case, allow log->num_pages not to be a multiple of 64 if
      it ends exactly on the last page of the slot.
      Reported-by: Peter Xu <peterx@redhat.com>
      Fixes: 98938aa8 ("KVM: validate userspace input in kvm_clear_dirty_log_protect()", 2019-01-02)
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
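The relaxed range check described in these two commits can be sketched as a standalone predicate. This is a minimal userspace model and not the kernel's kvm_clear_dirty_log_protect(); the function name `clear_range_ok()` and its parameters are hypothetical, and the 64-page alignment of the first page is taken as an assumption of the API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the range validation: slot_npages is the size of the
 * memory slot, first_page/num_pages describe the range to clear.
 */
bool clear_range_ok(uint64_t slot_npages, uint64_t first_page,
		    uint64_t num_pages)
{
	/* Assumed: the start of the range is always 64-page aligned. */
	if (first_page & 63)
		return false;

	/* The range must lie inside the slot. */
	if (first_page > slot_npages ||
	    num_pages > slot_npages - first_page)
		return false;

	/* An unaligned length is allowed only if it ends on the last page. */
	if ((num_pages & 63) && first_page + num_pages != slot_npages)
		return false;

	return true;
}
```

With a 100-page slot, clearing pages 64 to 99 (`first_page` 64, `num_pages` 36) is accepted under this rule, whereas before the fix the caller had to choose between an out-of-range and an unaligned request, both of which were rejected.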
  14. 21 Dec, 2018 6 commits
  15. 14 Dec, 2018 1 commit
    • kvm: introduce manual dirty log reprotect · 2a31b9db
      Paolo Bonzini authored
      There are two problems with KVM_GET_DIRTY_LOG.  First, and less important,
      it can take kvm->mmu_lock for an extended period of time.  Second, its user
      can actually see many false positives in some cases.  The latter is due
      to a benign race like this:
      
        1. KVM_GET_DIRTY_LOG returns a set of dirty pages and write protects
           them.
        2. The guest modifies the pages, causing them to be marked dirty.
        3. Userspace actually copies the pages.
        4. KVM_GET_DIRTY_LOG returns those pages as dirty again, even though
           they were not written to since (3).
      
      This is especially a problem for large guests, where the time between
      (1) and (3) can be substantial.  This patch introduces a new
      capability which, when enabled, makes KVM_GET_DIRTY_LOG not
      write-protect the pages it returns.  Instead, userspace has to
      explicitly clear the dirty log bits just before using the content
      of the page.  The new KVM_CLEAR_DIRTY_LOG ioctl can also operate on a
      64-page granularity rather than requiring to sync a full memslot;
      this way, the mmu_lock is taken for small amounts of time, and
      only a small amount of time will pass between write protection
      of pages and the sending of their content.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
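The userspace side of this scheme, clearing a small chunk right before its pages are copied, can be sketched as below. This is an illustrative model only: `clear_dirty_chunk()` and `send_page()` are hypothetical stand-ins (a real caller would issue the KVM_CLEAR_DIRTY_LOG ioctl and copy page contents), and `struct sync_stats` exists just to make the behaviour observable.

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK_PAGES 64u	/* granularity described in the commit message */

/* Hypothetical counters so the sketch has an observable result. */
struct sync_stats {
	unsigned int clears;		/* number of per-chunk clear calls */
	unsigned int pages_sent;	/* number of pages copied out */
};

/* Stand-in for clearing (re-protecting) one chunk of the dirty log. */
static void clear_dirty_chunk(struct sync_stats *s, uint64_t first_page,
			      uint32_t num_pages)
{
	(void)first_page;
	(void)num_pages;
	s->clears++;
}

/* Stand-in for copying one page to its destination. */
static void send_page(struct sync_stats *s, uint64_t page)
{
	(void)page;
	s->pages_sent++;
}

/* Walk a dirty bitmap (one bit per page) chunk by chunk. */
struct sync_stats sync_dirty_pages(const uint64_t *bitmap, uint64_t npages)
{
	struct sync_stats s = { 0, 0 };

	for (uint64_t first = 0; first < npages; first += CHUNK_PAGES) {
		uint64_t word = bitmap[first / CHUNK_PAGES];
		uint64_t left = npages - first;
		uint32_t n = left < CHUNK_PAGES ? (uint32_t)left : CHUNK_PAGES;

		if (!word)
			continue;	/* nothing dirty in this chunk */

		/* Re-protect just this chunk, right before reading it. */
		clear_dirty_chunk(&s, first, n);
		for (uint32_t bit = 0; bit < n; bit++)
			if (word & (UINT64_C(1) << bit))
				send_page(&s, first + bit);
	}
	return s;
}
```

Clearing per chunk keeps the window between write protection and the copy short, so a guest write that lands during the copy of one chunk does not make pages in other chunks show up as dirty again.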
  16. 24 Oct, 2018 1 commit
    • selftests: kvm: Fix -Wformat warnings · fb363e2d
      Andrea Parri authored
      Fixes the following warnings:
      
      dirty_log_test.c: In function ‘help’:
      dirty_log_test.c:216:9: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 2 has type ‘int’ [-Wformat=]
        printf(" -i: specify iteration counts (default: %"PRIu64")\n",
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      In file included from include/test_util.h:18:0,
                       from dirty_log_test.c:16:
      /usr/include/inttypes.h:105:34: note: format string is defined here
       # define PRIu64  __PRI64_PREFIX "u"
      dirty_log_test.c:218:9: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 2 has type ‘int’ [-Wformat=]
        printf(" -I: specify interval in ms (default: %"PRIu64" ms)\n",
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      In file included from include/test_util.h:18:0,
                       from dirty_log_test.c:16:
      /usr/include/inttypes.h:105:34: note: format string is defined here
       # define PRIu64  __PRI64_PREFIX "u"
      Signed-off-by: Andrea Parri <andrea.parri@amarulasolutions.com>
      Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
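The warnings above arise because PRIu64 expands to a conversion for a 64-bit unsigned value while the argument is a plain int. One way to make the types agree, shown here as a sketch and not necessarily the change this commit made, is to pass a uint64_t; `DEFAULT_ITERATIONS` and `iteration_help_line()` are hypothetical names.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DEFAULT_ITERATIONS 32	/* hypothetical default; a plain int constant */

/* Format the help line into a static buffer and return it. */
const char *iteration_help_line(void)
{
	static char buf[128];

	/* The cast makes the argument type match the PRIu64 conversion. */
	snprintf(buf, sizeof(buf),
		 " -i: specify iteration counts (default: %" PRIu64 ")\n",
		 (uint64_t)DEFAULT_ITERATIONS);
	return buf;
}
```

Defining the constant with a 64-bit unsigned type in the first place would work equally well; the point is only that the format macro and the argument type must match.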