1. 01 Aug, 2016 14 commits
    • MIPS: KVM: Reset CP0_PageMask during host TLB flush · a700434d
      James Hogan authored
      KVM sometimes flushes host TLB entries, reading each one to check if it
      corresponds to a guest KSeg0 address. In the absence of EntryHi.EHInv
      bits to invalidate the whole entry, the entries will be set to unique
      virtual addresses in KSeg0 (which is not TLB mapped), spaced 2*PAGE_SIZE
      apart.
      
      The TLB read however will clobber the CP0_PageMask register with
      whatever page size that TLB entry had, and that same page size will be
      written back into the TLB entry along with the unique address.
      
      This would cause breakage when transparent huge pages are enabled on
      64-bit host kernels, since huge page entries will overlap other nearby
      entries when separated by only 2*PAGE_SIZE, causing a machine check
      exception.
      
      Fix this by restoring the old CP0_PageMask value (which should be set to
      the normal page size) after reading the TLB entry if we're going to go
      ahead and invalidate it.
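
      In code, the fix brackets the TLB read with a save/restore of
      CP0_PageMask. A minimal sketch of the flush-loop body, assuming the
      MIPS KVM helpers (UNIQUE_ENTRYHI() and the hazard barriers are the
      kernel's; the exact flow is illustrative):

        unsigned long old_pagemask = read_c0_pagemask();

        write_c0_index(entry);
        mtc0_tlbw_hazard();
        tlb_read();                             /* clobbers CP0_PageMask */
        tlbw_use_hazard();

        if (KVM_GUEST_KSEGX(read_c0_entryhi()) == KVM_GUEST_KSEG0) {
                /* invalidate: unique KSeg0 address, normal page size */
                write_c0_entryhi(UNIQUE_ENTRYHI(entry));
                write_c0_entrylo0(0);
                write_c0_entrylo1(0);
                write_c0_pagemask(old_pagemask);
                mtc0_tlbw_hazard();
                tlb_write_indexed();
        }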
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • MIPS: KVM: Fix ptr->int cast via KVM_GUEST_KSEGX() · 8296963e
      James Hogan authored
      kvm_mips_trans_replace() passes a pointer to KVM_GUEST_KSEGX(). This
      breaks on 64-bit builds due to the cast of that 64-bit pointer to a
      differently sized 32-bit int. Cast the pointer argument to an
      unsigned long to work around the warning.
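
      The workaround is a cast at the macro's argument; a tiny sketch
      (opc_in_guest_kseg0() is a hypothetical helper, not the kernel's):

        static bool opc_in_guest_kseg0(u32 *opc)
        {
                /* cast first, so KVM_GUEST_KSEGX() masks an integer
                 * rather than truncating a 64-bit pointer */
                return KVM_GUEST_KSEGX((unsigned long)opc) ==
                       KVM_GUEST_KSEG0;
        }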
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • MIPS: KVM: Sign extend MFC0/RDHWR results · 172e02d1
      James Hogan authored
      When emulating MFC0 instructions to load 32-bit values from guest COP0
      registers and the RDHWR instruction to read the CC (Count) register,
      sign extend the result to comply with the MIPS64 architecture. The
      result must be in canonical 32-bit form or the guest may malfunction.
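
      The emulation change boils down to a sign-extending cast before the
      value lands in the 64-bit GPR; a sketch for one MFC0 case (accessor
      per the MIPS KVM headers, surrounding context assumed):

        /* canonical 32-bit form: bits 63:32 mirror bit 31 */
        vcpu->arch.gprs[rt] = (s32)kvm_read_c0_guest_status(vcpu->arch.cop0);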
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • MIPS: KVM: Fix 64-bit big endian dynamic translation · 5808844f
      James Hogan authored
      The MFC0 and MTC0 instructions in the guest which cause traps can be
      replaced with 32-bit loads and stores to the commpage; however, on
      big endian 64-bit builds the offset needs to have 4 added so as to
      load/store the least significant half of the long instead of the
      most significant half.
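
      A sketch of the offset fix-up (field and index names are
      illustrative, not the exact dyntrans code):

        u32 offset = offsetof(struct kvm_mips_commpage, cop0.reg[rd][sel]);

        #if defined(CONFIG_CPU_BIG_ENDIAN) && defined(CONFIG_64BIT)
        /* lw/sw move 32 bits, so aim at the least significant
         * half of the 64-bit commpage slot */
        offset += 4;
        #endif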
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • MIPS: KVM: Fail if ebase doesn't fit in CP0_EBase · 2a06dab8
      James Hogan authored
      Fail if the address of the allocated exception base doesn't fit into the
      CP0_EBase register. This can happen on MIPS64 if CP0_EBase.WG isn't
      implemented but RAM is available outside of the range of KSeg0.
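
      The test itself is short; a sketch close to the shape of the real
      check (gebase being the allocated exception base):

        /* without CP0_EBase.WG, only exception bases in the low 512MB
         * of physical memory (reachable via KSeg0) can be programmed */
        if (!cpu_has_ebase_wg && virt_to_phys(gebase) >= 0x20000000) {
                err = -ENOMEM;
                goto out_free_gebase;
        }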
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • MIPS: KVM: Use 64-bit CP0_EBase when appropriate · 0d17aea5
      James Hogan authored
      Update the KVM entry point to write CP0_EBase as a 64-bit register when
      it is 64-bits wide, and to set the WG (write gate) bit if it exists in
      order to write bits 63:30 (or 31:30 on MIPS32).
      
      Prior to MIPS64r6 it was UNDEFINED to perform a 64-bit read or write of
      a 32-bit COP0 register. Since this is dynamically generated code,
      generate the right type of access depending on whether the kernel is
      64-bit and cpu_has_ebase_wg.
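
      Since the entry code is generated with uasm, the access width can be
      chosen at generation time; a sketch of the idea (K0 and C0_EBASE
      follow entry.c's local macro conventions; treat it as illustrative):

        if (cpu_has_ebase_wg) {
                /* set WG so bits 63:30 (31:30 on MIPS32) get written */
                uasm_i_ori(&p, K0, K0, MIPS_EBASE_WG);
                /* EBase is full width here: UASM_i_MTC0 emits dmtc0
                 * on 64-bit kernels and mtc0 on 32-bit ones */
                UASM_i_MTC0(&p, K0, C0_EBASE);
        } else {
                /* no WG: EBase stays a 32-bit register, use mtc0 */
                uasm_i_mtc0(&p, K0, C0_EBASE);
        }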
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • MIPS: KVM: Set CP0_Status.KX on MIPS64 · 1d756942
      James Hogan authored
      Update the KVM entry code to set the CP0_Status.KX bit on 64-bit kernels.
      This is important to allow the entry code, running in kernel mode, to
      access the full 64-bit address space right up to the point of entering
      the guest, and immediately after exiting the guest, so it can safely
      restore & save the guest context from 64-bit segments.
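
      One way to encode this, sketched along the lines of the entry code
      (the _IF_64 wrapper spelling is an assumption; ST0_KX itself is the
      architectural CP0_Status bit):

        #ifdef CONFIG_64BIT
        # define ST0_KX_IF_64   ST0_KX  /* keep 64-bit segments usable */
        #else
        # define ST0_KX_IF_64   0
        #endif

        /* e.g. when composing the status word used around guest entry */
        val = ST0_EXL | KSU_USER | ST0_BEV | ST0_KX_IF_64;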
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • MIPS: KVM: Make entry code MIPS64 friendly · e41637d8
      James Hogan authored
      The MIPS KVM entry code (originally kvm_locore.S, later locore.S, and
      now entry.c) has never quite been right when built for 64-bit, using
      32-bit instructions when 64-bit instructions were needed for handling
      64-bit registers and pointers. Fix several cases of this now.
      
      The changes roughly fall into the following categories; a short uasm
      sketch follows the list.
      
      - COP0 scratch registers contain guest register values and the VCPU
        pointer, and are themselves full width. Similarly CP0_EPC and
        CP0_BadVAddr registers are full width (even though technically we
        don't support 64-bit guest address spaces with trap & emulate KVM).
        Use MFC0/MTC0 for accessing them.
      
      - Handling of stack pointers and the VCPU pointer must match the pointer
        size of the kernel ABI (always o32 or n64), so use ADDIU.
      
      - The CPU number in thread_info, and the guest_{user,kernel}_asid arrays
        in kvm_vcpu_arch are all 32 bit integers, so use lw (instead of LW) to
        load them.
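
      A uasm-flavoured sketch of the three width rules (K0/K1/T0/GP, SP
      and the C0_* register pairs follow entry.c's local macros; purely
      illustrative):

        /* full-width COP0 register: UASM_i_MFC0 expands to dmfc0
         * on 64-bit kernels, mfc0 on 32-bit ones */
        UASM_i_MFC0(&p, K0, C0_EPC);

        /* pointer arithmetic matches the kernel ABI pointer size */
        UASM_i_ADDIU(&p, K1, SP, -(int)sizeof(struct pt_regs));

        /* 32-bit field (the CPU number): always a plain lw */
        uasm_i_lw(&p, T0, offsetof(struct thread_info, cpu), GP);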
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • MIPS: KVM: Use kmap instead of CKSEG0ADDR() · 28cc5bd5
      James Hogan authored
      There are several unportable uses of CKSEG0ADDR() in MIPS KVM, which
      implicitly assume that a host physical address will be in the low 512MB
      of the physical address space (accessible in KSeg0). These assumptions
      don't hold for highmem or on 64-bit kernels.
      
      When reading or overwriting a trapping instruction, use kmap_atomic()
      to map the guest physical address to a usable virtual address for
      accessing guest memory, which is portable to 64-bit and highmem
      kernels.
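
      A simplified sketch of the overwrite path (gpa and replace come from
      the surrounding emulation context; error handling omitted):

        kvm_pfn_t pfn = gfn_to_pfn(vcpu->kvm, gpa >> PAGE_SHIFT);
        void *vaddr = kmap_atomic(pfn_to_page(pfn));
        u32 *opc = vaddr + offset_in_page(gpa);

        *opc = replace.word;            /* patch the trapping instruction */
        local_flush_icache_range((unsigned long)opc,
                                 (unsigned long)opc + sizeof(u32));
        kunmap_atomic(vaddr);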
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • MIPS: KVM: Use virt_to_phys() to get commpage PFN · cfacaced
      James Hogan authored
      Calculate the PFN of the commpage using virt_to_phys() instead of
      CPHYSADDR(). This is more portable as kzalloc() may allocate from XKPhys
      instead of KSeg0 on 64-bit kernels, which CPHYSADDR() doesn't handle.
      This is sufficient for highmem kernels too since kzalloc() will allocate
      from lowmem in KSeg0.
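
      The change is essentially one expression (kseg0_commpage is the MIPS
      KVM field name):

        /* kzalloc() memory is lowmem/XKPhys mapped, so virt_to_phys()
         * is valid where CPHYSADDR() was not */
        kvm_pfn_t pfn = PFN_DOWN(virt_to_phys(vcpu->arch.kseg0_commpage));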
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • MIPS: Fix definition of KSEGX() for 64-bit · 6002bdd3
      James Hogan authored
      The KSEGX() macro is defined to 32-bit sign extend the address argument
      and logically AND the result with 0xe0000000, with the final result
      usually compared against one of the CKSEG macros. However the literal
      0xe0000000 is unsigned as the high bit is set, and is therefore
      zero-extended on 64-bit kernels, resulting in the sign extension bits of
      the argument being masked to zero. This results in the odd situation
      where:
      
        KSEGX(CKSEG0) != CKSEG0
        (0xffffffff80000000 & 0x00000000e0000000) != 0xffffffff80000000
      
      Fix this by 32-bit sign extending the 0xe0000000 literal using
      _ACAST32_.
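
      The resulting definition looks roughly like (where _ACAST32_ expands
      to a sign-extending 32-bit cast):

        #define KSEGX(a)        ((_ACAST32_(a)) & _ACAST32_ 0xe0000000)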
      
      This will help some MIPS KVM code handling 32-bit guest addresses to
      work on 64-bit host kernels, but will also affect KSEGX in
      dec_kn01_be_backend() on a 64-bit DECstation kernel, and the SiByte DMA
      page ops KSEGX check in clear_page() and copy_page() on 64-bit SB1
      kernels, neither of which appears to be designed with 64-bit segments in
      mind anyway.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Acked-by: Ralf Baechle <ralf@linux-mips.org>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: VMX: Add VMCS to CPU's loaded VMCSs before VMPTRLD · b80c76ec
      Jim Mattson authored
      Kexec needs to know the addresses of all VMCSs that are active on
      each CPU, so that it can flush them from the VMCS caches. It is
      safe to record superfluous addresses that are not associated with
      an active VMCS, but it is not safe to omit an address associated
      with an active VMCS.
      
      After a call to vmcs_load, the VMCS that was loaded is active on
      the CPU. The VMCS should be added to the CPU's list of active
      VMCSs before it is loaded.
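
      The resulting ordering in the vcpu-load path, sketched with the
      vmx.c list names (simplified):

        /* publish first, so a kexec/crash VMCLEAR pass sees this VMCS */
        local_irq_disable();
        list_add(&vmx->loaded_vmcs->loaded_vmcss_on_cpu_link,
                 &per_cpu(loaded_vmcss_on_cpu, cpu));
        local_irq_enable();

        /* only then make it active on this CPU */
        vmcs_load(vmx->loaded_vmcs->vmcs);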
      Signed-off-by: Jim Mattson <jmattson@google.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
    • kvm: x86: nVMX: maintain internal copy of current VMCS · 4f2777bc
      David Matlack authored
      KVM maintains L1's current VMCS in guest memory, at the guest physical
      page identified by the argument to VMPTRLD. This makes hairy
      time-of-check to time-of-use bugs possible, as VCPUs can be writing
      the VMCS page in memory while KVM is emulating VMLAUNCH and
      VMRESUME.
      
      The spec documents that writing to the VMCS page while it is loaded is
      "undefined". Therefore it is reasonable to load the entire VMCS into
      an internal cache during VMPTRLD and ignore writes to the VMCS page
      -- the guest should be using VMREAD and VMWRITE to access the current
      VMCS.
      
      To adhere to the spec, KVM should flush the current VMCS during VMPTRLD,
      and the target VMCS during VMCLEAR (as given by the operand to VMCLEAR).
      Since this implementation of VMCS caching only maintains the current
      VMCS, VMCLEAR will only do a flush if the operand to VMCLEAR is the
      current VMCS pointer.
      
      KVM will also flush during VMXOFF, which is not mandated by the spec,
      but also not in conflict with the spec.
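
      A sketch of the VMPTRLD side (cached_vmcs12 and VMCS12_SIZE are the
      names this patch introduces; new_vmcs12 is the mapped guest page,
      and the error handling is simplified):

        /* snapshot vmcs12 so emulation never re-reads guest memory
         * that the guest may be modifying concurrently */
        vmx->nested.cached_vmcs12 = kmalloc(VMCS12_SIZE, GFP_KERNEL);
        if (!vmx->nested.cached_vmcs12)
                return -ENOMEM;
        memcpy(vmx->nested.cached_vmcs12, new_vmcs12, VMCS12_SIZE);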
      Signed-off-by: David Matlack <dmatlack@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • Merge branch 'kvm-ppc-next' of... · 601045bf
      Radim Krčmář authored
      Merge branch 'kvm-ppc-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into next
      
      Fix for CVE-2016-5412, a denial-of-service vulnerability in HV KVM on
      POWER8 machines
  2. 28 Jul, 2016 2 commits
    • KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE · 93d17397
      Paul Mackerras authored
      It turns out that if the guest does a H_CEDE while the CPU is in
      a transactional state, and the H_CEDE does a nap, and the nap
      loses the architected state of the CPU (which it is allowed to do),
      then we lose the checkpointed state of the virtual CPU.  In addition,
      the transactional-memory state recorded in the MSR gets reset back
      to non-transactional, and when we try to return to the guest, we take
      a TM bad thing type of program interrupt because we are trying to
      transition from non-transactional to transactional with a hrfid
      instruction, which is not permitted.
      
      The result of the program interrupt occurring at that point is that
      the host CPU will hang in an infinite loop with interrupts disabled.
      Thus this is a denial of service vulnerability in the host which can
      be triggered by any guest (and depending on the guest kernel, it can
      potentially be triggered by unprivileged userspace in the guest).
      
      This vulnerability has been assigned the ID CVE-2016-5412.
      
      To fix this, we save the TM state before napping and restore it
      on exit from the nap, when handling a H_CEDE in real mode.  The
      case where H_CEDE exits to host virtual mode is already OK (as are
      other hcalls which exit to host virtual mode) because the exit
      path saves the TM state.
      
      Cc: stable@vger.kernel.org # v3.15+
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
    • KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures · f024ee09
      Paul Mackerras authored
      This moves the transactional memory state save and restore sequences
      out of the guest entry/exit paths into separate procedures.  This is
      so that these sequences can be used in going into and out of nap
      in a subsequent patch.
      
      The only code changes here are (a) saving and restoring LR on the
      stack, since these new procedures get called with a bl instruction,
      (b) explicitly saving r1 into the PACA instead of assuming that
      HSTATE_HOST_R1(r13) is already set, and (c) removing an unnecessary
      and redundant setting of MSR[TM] that should have been removed by
      commit 9d4d0bdd9e0a ("KVM: PPC: Book3S HV: Add transactional memory
      support", 2013-09-24) but wasn't.
      
      Cc: stable@vger.kernel.org # v3.15+
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
  3. 22 Jul, 2016 1 commit
  4. 21 Jul, 2016 1 commit
  5. 18 Jul, 2016 22 commits