1. 26 Sep, 2022 10 commits
  2. 30 Aug, 2022 2 commits
  3. 24 Aug, 2022 4 commits
  4. 19 Aug, 2022 17 commits
    • KVM: selftests: Fix ambiguous mov in KVM_ASM_SAFE() · 372d0708
      David Matlack authored
      Change the mov in KVM_ASM_SAFE() that zeroes @vector to a movb to
      make it unambiguous.
      
      This fixes a build failure with Clang since, unlike the GNU assembler,
      the LLVM integrated assembler rejects ambiguous X86 instructions that
      don't have suffixes:
      
        In file included from x86_64/hyperv_features.c:13:
        include/x86_64/processor.h:825:9: error: ambiguous instructions require an explicit suffix (could be 'movb', 'movw', 'movl', or 'movq')
                return kvm_asm_safe("wrmsr", "a"(val & -1u), "d"(val >> 32), "c"(msr));
                       ^
        include/x86_64/processor.h:802:15: note: expanded from macro 'kvm_asm_safe'
                asm volatile(KVM_ASM_SAFE(insn)                 \
                             ^
        include/x86_64/processor.h:788:16: note: expanded from macro 'KVM_ASM_SAFE'
                "1: " insn "\n\t"                                       \
                              ^
        <inline asm>:5:2: note: instantiated into assembly here
                mov $0, 15(%rsp)
                ^
      
      This change could introduce undesirable behavior in the future, e.g. if
      someone used a type larger than a u8 for @vector, since KVM_ASM_SAFE()
      will only zero the bottom byte. I tried changing the type of @vector to
      an int to see what would happen. GCC failed to compile due to a size
      mismatch between `movb` and `%eax`. Clang compiled successfully and the
      generated code looked correct, so perhaps it will not be an issue. That
      said, there may be a better solution to this issue that does not assume
      @vector is a u8.
      
      Fixes: 3b23054c ("KVM: selftests: Add x86-64 support for exception fixup")
      Signed-off-by: David Matlack <dmatlack@google.com>
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220722234838.2160385-3-dmatlack@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
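The ambiguity described in the commit above can be illustrated with a minimal sketch. It is not the selftest macro itself: `zero_byte()` is a hypothetical helper, and the plain C fallback exists only so the snippet builds on non-x86 targets. With an immediate source and a memory destination, nothing in the operands tells the assembler how many bytes to store, so the suffix has to carry the size.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the KVM selftests. */
static void zero_byte(uint8_t *p)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
	/*
	 * Immediate source, memory destination: the operand size is not
	 * implied by any register, so the explicit "b" suffix selects a
	 * one-byte store. A bare "mov" here is what Clang rejected.
	 */
	__asm__ volatile("movb $0, %0" : "=m"(*p));
#else
	/* Portable fallback so the sketch compiles elsewhere. */
	*p = 0;
#endif
}
```

Only one byte is written, which is why the commit message worries about a wider type being used for @vector later.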
    • KVM: selftests: Fix KVM_EXCEPTION_MAGIC build with Clang · 67ef8664
      David Matlack authored
      Change KVM_EXCEPTION_MAGIC to use the all-caps "ULL", rather than lower
      case. This fixes a build failure with Clang:
      
        In file included from x86_64/hyperv_features.c:13:
        include/x86_64/processor.h:825:9: error: unexpected token in argument list
                return kvm_asm_safe("wrmsr", "a"(val & -1u), "d"(val >> 32), "c"(msr));
                       ^
        include/x86_64/processor.h:802:15: note: expanded from macro 'kvm_asm_safe'
                asm volatile(KVM_ASM_SAFE(insn)                 \
                             ^
        include/x86_64/processor.h:785:2: note: expanded from macro 'KVM_ASM_SAFE'
                "mov $" __stringify(KVM_EXCEPTION_MAGIC) ", %%r9\n\t"   \
                ^
        <inline asm>:1:18: note: instantiated into assembly here
                mov $0xabacadabaull, %r9
                                ^
      
      Fixes: 3b23054c ("KVM: selftests: Add x86-64 support for exception fixup")
      Signed-off-by: David Matlack <dmatlack@google.com>
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220722234838.2160385-2-dmatlack@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
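The reason the suffix spelling reaches the assembler at all is that stringification copies the literal verbatim, suffix included. The sketch below reproduces that with a two-level macro of the same shape as the kernel's `__stringify()`. The macro and function names (`TOY_STR`, `imm_lower`, `imm_upper`) are made up for illustration.

```c
#include <string.h>

/* Two-level stringification: expand the argument first, then quote it. */
#define TOY_STR_1(x) #x
#define TOY_STR(x)   TOY_STR_1(x)

#define TOY_MAGIC_LOWER 0xabacadabaull
#define TOY_MAGIC_UPPER 0xabacadabaULL

/* The immediate operand exactly as it would appear in the asm text. */
static const char *imm_lower(void) { return "$" TOY_STR(TOY_MAGIC_LOWER); }
static const char *imm_upper(void) { return "$" TOY_STR(TOY_MAGIC_UPPER); }
```

`imm_lower()` yields `$0xabacadabaull`, matching the operand in the Clang diagnostic quoted above. `imm_upper()` yields `$0xabacadabaULL`, the spelling that the commit reports as building with Clang.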
    • KVM: VMX: Heed the 'msr' argument in msr_write_intercepted() · 020dac41
      Jim Mattson authored
      Regardless of the 'msr' argument passed to the VMX version of
      msr_write_intercepted(), the function always checks to see if a
      specific MSR (IA32_SPEC_CTRL) is intercepted for write.  This behavior
      seems unintentional and unexpected.
      
      Modify the function so that it checks to see if the provided 'msr'
      index is intercepted for write.
      
      Fixes: 67f4b996 ("KVM: nVMX: Handle dynamic MSR intercept toggling")
      Cc: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Jim Mattson <jmattson@google.com>
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220810213050.2655000-1-jmattson@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
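A standalone sketch of the class of bug described in the commit above, assuming the architectural VMX MSR-bitmap layout (write bitmaps at byte offsets 0x800 and 0xc00 of a 4 KiB page). All `toy_*` names are hypothetical and this is not the kernel's code; the "buggy" variant is a reconstruction of the behaviour the commit message describes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MSR_BITMAP_SIZE    4096
#define TOY_WRITE_LOW          0x800   /* MSRs 0x00000000..0x00001fff */
#define TOY_WRITE_HIGH         0xc00   /* MSRs 0xc0000000..0xc0001fff */
#define TOY_MSR_IA32_SPEC_CTRL 0x48

/* Map an MSR index to its write-intercept bit; false if not covered. */
static bool toy_locate(uint32_t msr, size_t *byte, unsigned int *bit)
{
	size_t base;

	if (msr <= 0x1fff) {
		base = TOY_WRITE_LOW;
	} else if (msr >= 0xc0000000u && msr <= 0xc0001fffu) {
		base = TOY_WRITE_HIGH;
		msr &= 0x1fff;
	} else {
		return false;
	}
	*byte = base + msr / 8;
	*bit = msr % 8;
	return true;
}

/* Correct behaviour: consult the bit that belongs to the given MSR. */
static bool toy_write_intercepted(const uint8_t *bitmap, uint32_t msr)
{
	size_t byte;
	unsigned int bit;

	if (!toy_locate(msr, &byte, &bit))
		return true;	/* MSRs outside the bitmap always exit */
	return (bitmap[byte] >> bit) & 1;
}

/* Described bug: the argument is ignored and one fixed MSR is checked. */
static bool toy_buggy_write_intercepted(const uint8_t *bitmap, uint32_t msr)
{
	(void)msr;
	return toy_write_intercepted(bitmap, TOY_MSR_IA32_SPEC_CTRL);
}

/* Clear the write-intercept bit, i.e. let the guest write the MSR. */
static void toy_allow_write(uint8_t *bitmap, uint32_t msr)
{
	size_t byte;
	unsigned int bit;

	if (toy_locate(msr, &byte, &bit))
		bitmap[byte] &= (uint8_t)~(1u << bit);
}
```

With every MSR intercepted except IA32_SPEC_CTRL, the buggy variant reports "not intercepted" for any index it is asked about, while the corrected variant answers per MSR.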
    • kvm: x86: mmu: Always flush TLBs when enabling dirty logging · b64d740e
      Junaid Shahid authored
      When A/D bits are not available, KVM uses a software access tracking
      mechanism, which involves making the SPTEs inaccessible. However,
      the clear_young() MMU notifier does not flush TLBs. So it is possible
      that there may still be stale, potentially writable, TLB entries.
      This is usually fine, but can be problematic when enabling dirty
      logging, because it currently only does a TLB flush if any SPTEs were
      modified. But if all SPTEs are in access-tracked state, then there
      won't be a TLB flush, which means that the guest could still possibly
      write to memory and not have it reflected in the dirty bitmap.
      
      So just unconditionally flush the TLBs when enabling dirty logging.
      As an alternative, KVM could explicitly check the MMU-Writable bit when
      write-protecting SPTEs to decide if a flush is needed (instead of
      checking the Writable bit), but given that a flush almost always happens
      anyway, just making it unconditional seems simpler.
      Signed-off-by: Junaid Shahid <junaids@google.com>
      Message-Id: <20220810224939.2611160-1-junaids@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
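The reasoning in the commit above can be traced with a deliberately simplified model. Everything here is an assumption made for illustration: real SPTEs and TLBs are far richer, and the `toy_*` names do not exist in KVM. Each entry records whether the Writable bit is set in the SPTE and whether a stale writable translation may still be cached.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_spte {
	bool spte_writable;	/* Writable bit currently set in the SPTE */
	bool tlb_writable;	/* stale writable translation may be cached */
};

/* Clear the Writable bit; report whether any SPTE was modified. */
static bool toy_write_protect(struct toy_spte *sptes, size_t n)
{
	bool modified = false;

	for (size_t i = 0; i < n; i++) {
		if (sptes[i].spte_writable) {
			sptes[i].spte_writable = false;
			modified = true;
		}
	}
	return modified;
}

static void toy_flush_tlb(struct toy_spte *sptes, size_t n)
{
	for (size_t i = 0; i < n; i++)
		sptes[i].tlb_writable = false;
}

/* True if the guest could still write without KVM noticing. */
static bool toy_stale_writable(const struct toy_spte *sptes, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (sptes[i].tlb_writable)
			return true;
	return false;
}

static void toy_enable_dirty_logging(struct toy_spte *sptes, size_t n,
				     bool always_flush)
{
	bool modified = toy_write_protect(sptes, n);

	/* Old policy: flush only if something changed. New: always. */
	if (modified || always_flush)
		toy_flush_tlb(sptes, n);
}
```

If every entry is already non-writable in the SPTE (the access-tracked case) but still has a stale writable translation, the conditional policy skips the flush and the stale entries survive. The unconditional policy removes them.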
    • kvm: x86: mmu: Drop the need_remote_flush() function · 1441ca14
      Junaid Shahid authored
      This is only used by kvm_mmu_pte_write(), which no longer actually
      creates the new SPTE and instead just clears the old SPTE. So we
      just need to check if the old SPTE was shadow-present instead of
      calling need_remote_flush(). Hence we can drop this function. It was
      incomplete anyway as it didn't take access-tracking into account.
      
      This patch should not result in any functional change.
      Signed-off-by: Junaid Shahid <junaids@google.com>
      Reviewed-by: David Matlack <dmatlack@google.com>
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220723024316.2725328-1-junaids@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • Merge tag 'kvmarm-fixes-6.0-1' of... · 959d6c4a
      Paolo Bonzini authored
      Merge tag 'kvmarm-fixes-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
      
      KVM/arm64 fixes for 6.0, take #1
      
      - Fix unexpected sign extension of KVM_ARM_DEVICE_ID_MASK
      
      - Tidy-up handling of AArch32 on asymmetric systems
    • KVM: Drop unnecessary initialization of "ops" in kvm_ioctl_create_device() · eceb6e1d
      Li kunyu authored
      The initial value is never read; the variable is only used after it is
      assigned later in the function.
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Li kunyu <kunyu@nfschina.com>
      Message-Id: <20220819021535.483702-1-kunyu@nfschina.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: Drop unnecessary initialization of "npages" in hva_to_pfn_slow() · 28249139
      Li kunyu authored
      The initial value is never read; the variable is only used after it is
      assigned later in the function.
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Li kunyu <kunyu@nfschina.com>
      Message-Id: <20220819022804.483914-1-kunyu@nfschina.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • x86/kvm: Fix "missing ENDBR" BUG for fastop functions · 3d9606b0
      Josh Poimboeuf authored
      The following BUG was reported:
      
        traps: Missing ENDBR: andw_ax_dx+0x0/0x10 [kvm]
        ------------[ cut here ]------------
        kernel BUG at arch/x86/kernel/traps.c:253!
        invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
         <TASK>
         asm_exc_control_protection+0x2b/0x30
        RIP: 0010:andw_ax_dx+0x0/0x10 [kvm]
        Code: c3 cc cc cc cc 0f 1f 44 00 00 66 0f 1f 00 48 19 d0 c3 cc cc cc
              cc 0f 1f 40 00 f3 0f 1e fa 20 d0 c3 cc cc cc cc 0f 1f 44 00 00
              <66> 0f 1f 00 66 21 d0 c3 cc cc cc cc 0f 1f 40 00 66 0f 1f 00 21
              d0
      
         ? andb_al_dl+0x10/0x10 [kvm]
         ? fastop+0x5d/0xa0 [kvm]
         x86_emulate_insn+0x822/0x1060 [kvm]
         x86_emulate_instruction+0x46f/0x750 [kvm]
         complete_emulated_mmio+0x216/0x2c0 [kvm]
         kvm_arch_vcpu_ioctl_run+0x604/0x650 [kvm]
         kvm_vcpu_ioctl+0x2f4/0x6b0 [kvm]
         ? wake_up_q+0xa0/0xa0
      
      The BUG occurred because the ENDBR in the andw_ax_dx() fastop function
      had been incorrectly "sealed" (converted to a NOP) by apply_ibt_endbr().
      
      Objtool marked it to be sealed because KVM has no compile-time
      references to the function.  Instead KVM calculates its address at
      runtime.
      
      Prevent objtool from annotating fastop functions as sealable by creating
      throwaway dummy compile-time references to the functions.
      
      Fixes: 6649fa87 ("x86/ibt,kvm: Add ENDBR to fastops")
      Reported-by: Pengfei Xu <pengfei.xu@intel.com>
      Debugged-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
      Message-Id: <0d4116f90e9d0c1b754bb90c585e6f0415a1c508.1660837839.git.jpoimboe@kernel.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • x86/kvm: Simplify FOP_SETCC() · 22472d12
      Josh Poimboeuf authored
      SETCC_ALIGN and FOP_ALIGN are both 16.  Remove the special casing for
      FOP_SETCC() and just make it a normal fastop.
      Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
      Message-Id: <7c13d94d1a775156f7e36eed30509b274a229140.1660837839.git.jpoimboe@kernel.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • x86/ibt, objtool: Add IBT_NOSEAL() · e27e5bea
      Josh Poimboeuf authored
      Add a macro which prevents a function from getting sealed if there are
      no compile-time references to it.
      Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
      Message-Id: <20220818213927.e44fmxkoq4yj6ybn@treble>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: Rename mmu_notifier_* to mmu_invalidate_* · 20ec3ebd
      Chao Peng authored
      The motivation for this renaming is to make these variables and related
      helper functions less tied to mmu_notifier, so that they can also be
      used for page invalidation that is not based on mmu_notifier.
      mmu_invalidate_* was chosen to better describe the purpose of those
      variables, i.e. 'invalidating' a page.
      
        - mmu_notifier_seq/range_start/range_end are renamed to
          mmu_invalidate_seq/range_start/range_end.
      
        - mmu_notifier_retry{_hva} helper functions are renamed to
          mmu_invalidate_retry{_hva}.
      
        - mmu_notifier_count is renamed to mmu_invalidate_in_progress to
          avoid confusion with mn_active_invalidate_count.
      
        - While here, also update kvm_inc/dec_notifier_count() to
          kvm_mmu_invalidate_begin/end() to match the change for
          mmu_notifier_count.
      
      No functional change intended.
      Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
      Message-Id: <20220816125322.1110439-3-chao.p.peng@linux.intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: Rename KVM_PRIVATE_MEM_SLOTS to KVM_INTERNAL_MEM_SLOTS · bdd1c37a
      Chao Peng authored
      KVM_INTERNAL_MEM_SLOTS better reflects the fact that those slots are
      used internally by KVM (invisible to userspace) and avoids confusion
      with future private slots, which can have a different meaning.
      Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
      Message-Id: <20220816125322.1110439-2-chao.p.peng@linux.intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: MIPS: remove unnecessary definition of KVM_PRIVATE_MEM_SLOTS · b0754508
      Paolo Bonzini authored
      KVM_PRIVATE_MEM_SLOTS defaults to zero, so it is not necessary to
      define it in MIPS's asm/kvm_host.h.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: Move coalesced MMIO initialization (back) into kvm_create_vm() · c2b82397
      Sean Christopherson authored
      Invoke kvm_coalesced_mmio_init() from kvm_create_vm() now that allocating
      and initializing coalesced MMIO objects is separate from registering any
      associated devices.  Moving coalesced MMIO cleans up the last oddity
      where KVM does VM creation/initialization after kvm_create_vm(), and more
      importantly after kvm_arch_post_init_vm() is called and the VM is added
      to the global vm_list, i.e. after the VM is fully created as far as KVM
      is concerned.
      
      Originally, kvm_coalesced_mmio_init() was called by kvm_create_vm(), but
      the original implementation was completely devoid of error handling.
      Commit 6ce5a090 ("KVM: coalesced_mmio: fix kvm_coalesced_mmio_init()'s
      error handling") fixed the various bugs, and in doing so rightly moved the
      call to after kvm_create_vm() because kvm_coalesced_mmio_init() also
      registered the coalesced MMIO device.  Commit 2b3c246a ("KVM: Make
      coalesced mmio use a device per zone") cleaned up that mess by having
      each zone register a separate device, i.e. moved device registration to
      its logical home in kvm_vm_ioctl_register_coalesced_mmio().  As a result,
      kvm_coalesced_mmio_init() is now a "pure" initialization helper and can
      be safely called from kvm_create_vm().
      
      Opportunistically drop the #ifdef; KVM provides stubs for
      kvm_coalesced_mmio_{init,free}() when CONFIG_KVM_MMIO=n (s390).
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220816053937.2477106-4-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: Unconditionally get a ref to /dev/kvm module when creating a VM · 405294f2
      Sean Christopherson authored
      Unconditionally get a reference to the /dev/kvm module when creating a VM
      instead of using try_module_get(), which will fail if the module is in
      the process of being forcefully unloaded.  The error handling when
      try_module_get() fails doesn't properly unwind all that has been done,
      e.g. doesn't call kvm_arch_pre_destroy_vm() and doesn't remove the VM
      from the global list.  Not removing VMs from the global list tends to be
      fatal, e.g. leads to use-after-free explosions.
      
      The obvious alternative would be to add proper unwinding, but the
      justification for using try_module_get(), "rmmod --wait", is completely
      bogus as support for "rmmod --wait", i.e. delete_module() without
      O_NONBLOCK, was removed by commit 3f2b9c9c ("module: remove rmmod
      --wait option.") nearly a decade ago.
      
      It's still possible for try_module_get() to fail due to the module dying
      (more like being killed), as the module will be tagged MODULE_STATE_GOING
      by "rmmod --force", i.e. delete_module(..., O_TRUNC), but playing nice
      with forced unloading is an exercise in futility and gives a false sense
      of security.  Using try_module_get() only prevents acquiring _new_
      references, it doesn't magically put the references held by other VMs,
      and forced unloading doesn't wait, i.e. "rmmod --force" on KVM is all but
      guaranteed to cause spectacular fireworks; the window where KVM will fail
      try_module_get() is tiny compared to the window where KVM is building and
      running the VM with an elevated module refcount.
      
      Addressing KVM's inability to play nice with "rmmod --force" is firmly
      out-of-scope.  Forcefully unloading any module taints the kernel (for
      obvious reasons) _and_ requires the kernel to be built with
      CONFIG_MODULE_FORCE_UNLOAD=y, which is off by default and comes with the
      amusing disclaimer that it's "mainly for kernel developers and desperate
      users".  In other words, KVM is free to scoff at bug reports due to using
      "rmmod --force" while VMs may be running.
      
      Fixes: 5f6de5cb ("KVM: Prevent module exit until all VMs are freed")
      Cc: stable@vger.kernel.org
      Cc: David Matlack <dmatlack@google.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220816053937.2477106-3-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
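The argument in the commit above, that a failing "try" only blocks new references and does nothing about references already held, can be shown with a small userspace model. This is an illustration under assumed semantics, not the kernel's module refcounting code; every `toy_*` name is hypothetical.

```c
#include <stdbool.h>

enum toy_module_state { TOY_MODULE_LIVE, TOY_MODULE_GOING };

struct toy_module {
	enum toy_module_state state;
	int refcnt;
};

/* "Try" flavour: refuses a new reference once the module is going away. */
static bool toy_try_get(struct toy_module *m)
{
	if (m->state == TOY_MODULE_GOING)
		return false;
	m->refcnt++;
	return true;
}

/* Unconditional flavour: always takes the reference. */
static void toy_get(struct toy_module *m)
{
	m->refcnt++;
}
```

In this model a module in the GOING state with three outstanding references still has three references after `toy_try_get()` fails. The failure changes nothing for the existing holders, and the caller is left with an error path that must itself be unwound correctly.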
    • KVM: Properly unwind VM creation if creating debugfs fails · 4ba4f419
      Sean Christopherson authored
      Properly unwind VM creation if kvm_create_vm_debugfs() fails.  A recent
      change to invoke kvm_create_vm_debugfs() in kvm_create_vm() was led astray
      by buggy try_module_get() handling added by commit 5f6de5cb ("KVM:
      Prevent module exit until all VMs are freed").  The debugfs error path
      effectively inherits the bad error path of try_module_get(), e.g. KVM
      leaves the to-be-freed VM on vm_list even though KVM appears to do the
      right thing by calling module_put() and falling through.
      
      Opportunistically hoist kvm_create_vm_debugfs() above the call to
      kvm_arch_post_init_vm() so that the "post-init" arch hook is actually
      invoked after the VM is initialized (ignoring kvm_coalesced_mmio_init()
      for the moment).  x86 is the only non-nop implementation of the post-init
      hook, and it doesn't allocate/initialize any objects that are reachable
      via debugfs code (spawns a kthread worker for the NX huge page mitigation).
      
      Leave the buggy try_module_get() alone for now; it will be fixed in a
      separate commit.
      
      Fixes: b74ed7a6 ("KVM: Actually create debugfs in kvm_create_vm()")
      Reported-by: syzbot+744e173caec2e1627ee0@syzkaller.appspotmail.com
      Cc: Oliver Upton <oliver.upton@linux.dev>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
      Message-Id: <20220816053937.2477106-2-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
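The failure mode in the commit above, a late step failing while an earlier step's side effect (the VM being on a global list) is left in place, is the usual motivation for a goto-based unwind ladder. A minimal sketch follows, with hypothetical steps and names that only loosely mirror the commit and are not taken from kvm_create_vm().

```c
#include <stdbool.h>

struct toy_vm {
	bool initialized;
	bool has_debugfs;
	bool on_list;
};

/*
 * Steps run in order; each failure jumps to a label that undoes the
 * steps already completed, in reverse order.
 */
static int toy_create_vm(struct toy_vm *vm, bool debugfs_fails,
			 bool list_add_fails)
{
	vm->initialized = true;			/* step 1 */

	if (debugfs_fails)			/* step 2 */
		goto out_uninit;
	vm->has_debugfs = true;

	if (list_add_fails)			/* step 3 */
		goto out_debugfs;
	vm->on_list = true;

	return 0;

out_debugfs:
	vm->has_debugfs = false;
out_uninit:
	vm->initialized = false;
	return -1;
}
```

Because each label falls through to the ones below it, a failure at any step leaves no partial state behind. A half-created object never remains reachable from a global list.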
  5. 17 Aug, 2022 2 commits
  6. 14 Aug, 2022 5 commits
    • Linux 6.0-rc1 · 568035b0
      Linus Torvalds authored
    • radix-tree: replace gfp.h inclusion with gfp_types.h · 9f162193
      Yury Norov authored
      Radix tree header includes gfp.h for __GFP_BITS_SHIFT only. Now we
      have gfp_types.h for this.
      
      Fixes powerpc allmodconfig build:
      
         In file included from include/linux/nodemask.h:97,
                          from include/linux/mmzone.h:17,
                          from include/linux/gfp.h:7,
                          from include/linux/radix-tree.h:12,
                          from include/linux/idr.h:15,
                          from include/linux/kernfs.h:12,
                          from include/linux/sysfs.h:16,
                          from include/linux/kobject.h:20,
                          from include/linux/pci.h:35,
                          from arch/powerpc/kernel/prom_init.c:24:
         include/linux/random.h: In function 'add_latent_entropy':
      >> include/linux/random.h:25:46: error: 'latent_entropy' undeclared (first use in this function); did you mean 'add_latent_entropy'?
            25 |         add_device_randomness((const void *)&latent_entropy, sizeof(latent_entropy));
               |                                              ^~~~~~~~~~~~~~
               |                                              add_latent_entropy
         include/linux/random.h:25:46: note: each undeclared identifier is reported only once for each function it appears in
      Reported-by: kernel test robot <lkp@intel.com>
      CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      CC: Andrew Morton <akpm@linux-foundation.org>
      CC: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Yury Norov <yury.norov@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 74cbb480
      Linus Torvalds authored
      Pull vfs lseek fix from Al Viro:
       "Fix proc_reg_llseek() breakage. Always had been possible if somebody
        left NULL ->proc_lseek, became a practical issue now"
      
      * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        take care to handle NULL ->proc_lseek()
    • take care to handle NULL ->proc_lseek() · 3f61631d
      Al Viro authored
      Easily done now, just by clearing FMODE_LSEEK in ->f_mode
      during proc_reg_open() for such entries.
      
      Fixes: 868941b1 "fs: remove no_llseek"
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
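The approach named in the commit above, clearing a "seekable" mode flag at open time when no seek handler was supplied, can be sketched in userspace as follows. The flag value, the structures and the `toy_*` functions are stand-ins chosen for illustration; they are not the kernel's definitions.

```c
#include <stddef.h>

#define TOY_FMODE_LSEEK 0x4u	/* hypothetical flag value */

struct toy_proc_ops {
	long (*proc_lseek)(long offset, int whence);
};

struct toy_file {
	unsigned int f_mode;
};

/* At open time, mark the file unseekable if no handler exists. */
static void toy_proc_reg_open(struct toy_file *file,
			      const struct toy_proc_ops *ops)
{
	if (!ops->proc_lseek)
		file->f_mode &= ~TOY_FMODE_LSEEK;
}

/* The seek path checks the flag and never reaches a NULL handler. */
static long toy_llseek(struct toy_file *file, const struct toy_proc_ops *ops,
		       long offset, int whence)
{
	if (!(file->f_mode & TOY_FMODE_LSEEK))
		return -1;	/* stand-in for an error return */
	return ops->proc_lseek(offset, whence);
}
```

Deciding once at open time keeps the seek path free of a NULL check on every call.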
    • Merge tag 'for-linus-6.0-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 5d6a0f4d
      Linus Torvalds authored
      Pull more xen updates from Juergen Gross:
      
       - fix the handling of the "persistent grants" feature negotiation
         between Xen blkfront and Xen blkback drivers
      
       - a cleanup of xen.config and adding xen.config to Xen section in
         MAINTAINERS
      
       - support HVMOP_set_evtchn_upcall_vector, which is more compliant to
         "normal" interrupt handling than the global callback used up to now
      
       - further small cleanups
      
      * tag 'for-linus-6.0-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        MAINTAINERS: add xen config fragments to XEN HYPERVISOR sections
        xen: remove XEN_SCRUB_PAGES in xen.config
        xen/pciback: Fix comment typo
        xen/xenbus: fix return type in xenbus_file_read()
        xen-blkfront: Apply 'feature_persistent' parameter when connect
        xen-blkback: Apply 'feature_persistent' parameter when connect
        xen-blkback: fix persistent grants negotiation
        x86/xen: Add support for HVMOP_set_evtchn_upcall_vector