Commits · 324b3e63167bce69e6622c2be182595790bf7e38 · Kirill Smelkov / linux

10 Jan, 2013 8 commits

KVM: PPC: BookE: Add EPR ONE_REG sync · 324b3e63

Alexander Graf authored Jan 04, 2013

We need to be able to read and write the contents of the EPR register
from user space.

This patch implements that logic through the ONE_REG API and declares
its (never implemented) SREGS counterpart as deprecated.
Signed-off-by: Alexander Graf <agraf@suse.de>

324b3e63

KVM: PPC: BookE: Implement EPR exit · 1c810636

Alexander Graf authored Jan 04, 2013

The External Proxy Facility in FSL BookE chips allows the interrupt
controller to automatically acknowledge an interrupt as soon as a
core gets its pending external interrupt delivered.

Today, user space implements the interrupt controller, so we need to
check on it during such a cycle.

This patch implements logic for user space to enable EPR exiting,
disable EPR exiting and EPR exiting itself, so that user space can
acknowledge an interrupt when an external interrupt has successfully
been delivered into the guest vcpu.
Signed-off-by: Alexander Graf <agraf@suse.de>

1c810636

KVM: PPC: BookE: Emulate mfspr on EPR · 37ecb257

Alexander Graf authored Jan 04, 2013

The EPR register is potentially valid for PR KVM as well, so we need
to emulate accesses to it. It's only defined for reading, so only
handle the mfspr case.
Signed-off-by: Alexander Graf <agraf@suse.de>

37ecb257

KVM: PPC: BookE: Allow irq deliveries to inject requests · b8c649a9

Alexander Graf authored Dec 20, 2012

When injecting an interrupt into guest context, we usually don't need
to check for requests anymore. At least not until today.

With the introduction of EPR, we will have to create a request when the
guest has successfully accepted an external interrupt though.

So we need to prepare the interrupt delivery to abort guest entry
gracefully. Otherwise we'd delay the EPR request.
Signed-off-by: Alexander Graf <agraf@suse.de>

b8c649a9

KVM: PPC: Fix mfspr/mtspr MMUCFG emulation · f2be6550

Mihai Caraman authored Dec 20, 2012

On mfspr/mtspr emulation path Book3E's MMUCFG SPR with value 1015 clashes
with G4's MSSSR0 SPR. Move MSSSR0 emulation from generic part to Books3S.
MSSSR0 also clashes with Book3S's DABRX SPR. DABRX was not explicitly
handled so Book3S execution flow will behave as before.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

f2be6550

KVM: PPC: Book3S: PR: Enable alternative instruction for SC 1 · 50c7bb80

Alexander Graf authored Dec 14, 2012

When running on top of pHyp, the hypercall instruction "sc 1" goes
straight into pHyp without trapping in supervisor mode.

So if we want to support PAPR guest in this configuration we need to
add a second way of accessing PAPR hypercalls, preferably with the
exact same semantics except for the instruction.

So let's overlay an officially reserved instruction and emulate PAPR
hypercalls whenever we hit that one.
Signed-off-by: Alexander Graf <agraf@suse.de>

50c7bb80

KVM: PPC: Only WARN on invalid emulation · 5a33169e

Alexander Graf authored Dec 14, 2012

When we hit an emulation result that we didn't expect, that is an error,
but it's nothing that warrants a BUG(), because it can be guest triggered.

So instead, let's only WARN() the user that this happened.
Signed-off-by: Alexander Graf <agraf@suse.de>

5a33169e

KVM: PPC: Fix SREGS documentation reference · 68e2ffed

Mihai Caraman authored Dec 11, 2012

Reflect the uapi folder change in SREGS API documentation.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Reviewed-by: Amos Kong <kongjianjun@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>

68e2ffed

07 Jan, 2013 9 commits

KVM: MMU: simplify folding of dirty bit into accessed_dirty · 908e7d79

Gleb Natapov authored Dec 27, 2012

MMU code tries to avoid if()s HW is not able to predict reliably by using
bitwise operation to streamline code execution, but in case of a dirty bit
folding this gives us nothing since write_fault is checked right before
the folding code. Lets just piggyback onto the if() to make code more clear.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

908e7d79

KVM: mmu: remove unused trace event · ee04e0ce

Gleb Natapov authored Dec 25, 2012

trace_kvm_mmu_delay_free_pages() is no longer used.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

ee04e0ce

KVM: s390: Add support for channel I/O instructions. · fa6b7fe9

Cornelia Huck authored Dec 20, 2012

Add a new capability, KVM_CAP_S390_CSS_SUPPORT, which will pass
intercepts for channel I/O instructions to userspace. Only I/O
instructions interacting with I/O interrupts need to be handled
in-kernel:

- TEST PENDING INTERRUPTION (tpi) dequeues and stores pending
  interrupts entirely in-kernel.
- TEST SUBCHANNEL (tsch) dequeues pending interrupts in-kernel
  and exits via KVM_EXIT_S390_TSCH to userspace for subchannel-
  related processing.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

fa6b7fe9

KVM: s390: Base infrastructure for enabling capabilities. · d6712df9

Cornelia Huck authored Dec 20, 2012

Make s390 support KVM_ENABLE_CAP.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

d6712df9

KVM: s390: In-kernel handling of I/O instructions. · f379aae5

Cornelia Huck authored Dec 20, 2012

Explicitely catch all channel I/O related instructions intercepts
in the kernel and set condition code 3 for them.

This paves the way for properly handling these instructions later
on.

Note: This is not architecture compliant (the previous code wasn't
either) since setting cc 3 is not the correct thing to do for some
of these instructions. For Linux guests, however, it still has the
intended effect of stopping css probing.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

f379aae5

KVM: s390: Add support for machine checks. · 48a3e950

Cornelia Huck authored Dec 20, 2012

Add support for injecting machine checks (only repressible
conditions for now).

This is a bit more involved than I/O interrupts, for these reasons:

- Machine checks come in both floating and cpu varieties.
- We don't have a bit for machine checks enabling, but have to use
  a roundabout approach with trapping PSW changing instructions and
  watching for opened machine checks.
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

48a3e950

KVM: s390: Support for I/O interrupts. · d8346b7d

Cornelia Huck authored Dec 20, 2012

Add support for handling I/O interrupts (standard, subchannel-related
ones and rudimentary adapter interrupts).

The subchannel-identifying parameters are encoded into the interrupt
type.

I/O interrupts are floating, so they can't be injected on a specific
vcpu.
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

d8346b7d

KVM: s390: Decoding helper functions. · b1c571a5

Cornelia Huck authored Dec 20, 2012

Introduce helper functions for decoding the various base/displacement
instruction formats.
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

b1c571a5

KVM: s390: Constify intercept handler tables. · 77975357

Cornelia Huck authored Dec 20, 2012

These tables are never modified.
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

77975357

02 Jan, 2013 7 commits

KVM: VMX: handle IO when emulation is due to #GP in real mode. · 0ca1b4f4

Gleb Natapov authored Dec 20, 2012

With emulate_invalid_guest_state=0 if a vcpu is in real mode VMX can
enter the vcpu with smaller segment limit than guest configured.  If the
guest tries to access pass this limit it will get #GP at which point
instruction will be emulated with correct segment limit applied. If
during the emulation IO is detected it is not handled correctly. Vcpu
thread should exit to userspace to serve the IO, but it returns to the
guest instead.  Since emulation is not completed till userspace completes
the IO the faulty instruction is re-executed ad infinitum.

The patch fixes that by exiting to userspace if IO happens during
instruction emulation.
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

0ca1b4f4

KVM: VMX: Do not fix segment register during vcpu initialization. · d54d07b2

Gleb Natapov authored Dec 20, 2012

Segment registers will be fixed according to current emulation policy
during switching to real mode for the first time.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

d54d07b2

KVM: VMX: fix emulation of invalid guest state. · d99e4152

Gleb Natapov authored Dec 20, 2012

Currently when emulation of invalid guest state is enable
(emulate_invalid_guest_state=1) segment registers are still fixed for
entry to vm86 mode some times. Segment register fixing is avoided in
enter_rmode(), but vmx_set_segment() still does it unconditionally.
The patch fixes it.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

d99e4152

KVM: VMX: make rmode_segment_valid() more strict. · 89efbed0

Gleb Natapov authored Dec 20, 2012

Currently it allows entering vm86 mode if segment limit is greater than
0xffff and db bit is set. Both of those can cause incorrect execution of
instruction by cpu since in vm86 mode limit will be set to 0xffff and db
will be forced to 0.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

89efbed0

KVM: emulator: implement fninit, fnstsw, fnstcw · 045a282c

Gleb Natapov authored Dec 20, 2012

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

045a282c

KVM: emulator: drop RPL check from linearize() function · 3a78a4f4

Gleb Natapov authored Dec 20, 2012

According to Intel SDM Vol3 Section 5.5 "Privilege Levels" and 5.6
"Privilege Level Checking When Accessing Data Segments" RPL checking is
done during loading of a segment selector, not during data access. We
already do checking during segment selector loading, so drop the check
during data access. Checking RPL during data access triggers #GP if
after transition from real mode to protected mode RPL bits in a segment
selector are set.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

3a78a4f4

x86: kvm_para: fix typo in hypercall comments · 11393a07

Jesse Larrew authored Dec 10, 2012

Correct a typo in the comment explaining hypercalls.
Signed-off-by: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

11393a07

24 Dec, 2012 1 commit

KVM: move the code that installs new slots array to a separate function. · 7ec4fb44

Gleb Natapov authored Dec 24, 2012

Move repetitive code sequence to a separate function.
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

7ec4fb44

23 Dec, 2012 9 commits

KVM: VMX: remove unneeded temporary variable from vmx_set_segment() · f924d66d
Gleb Natapov authored Dec 12, 2012
```
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
```
f924d66d

KVM: VMX: clean-up vmx_set_segment() · 1ecd50a9

Gleb Natapov authored Dec 12, 2012

Move all vm86_active logic into one place.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

1ecd50a9

KVM: VMX: remove redundant code from vmx_set_segment() · 39dcfb95

Gleb Natapov authored Dec 12, 2012

Segment descriptor's base is fixed by call to fix_rmode_seg(). Not need
to do it twice.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

39dcfb95

KVM: VMX: use fix_rmode_seg() to fix all code/data segments · beb853ff

Gleb Natapov authored Dec 12, 2012

The code for SS and CS does the same thing fix_rmode_seg() is doing.
Use it instead of hand crafted code.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

beb853ff

KVM: VMX: return correct segment limit and flags for CS/SS registers in real mode · c6ad1153

Gleb Natapov authored Dec 12, 2012

VMX without unrestricted mode cannot virtualize real mode, so if
emulate_invalid_guest_state=0 kvm uses vm86 mode to approximate
it. Sometimes, when guest moves from protected mode to real mode, it
leaves segment descriptors in a state not suitable for use by vm86 mode
virtualization, so we keep shadow copy of segment descriptors for internal
use and load fake register to VMCS for guest entry to succeed. Till
now we kept shadow for all segments except SS and CS (for SS and CS we
returned parameters directly from VMCS), but since commit a5625189
emulator enforces segment limits in real mode. This causes #GP during move
from protected mode to real mode when emulator fetches first instruction
after moving to real mode since it uses incorrect CS base and limit to
linearize the %rip. Fix by keeping shadow for SS and CS too.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

c6ad1153

KVM: VMX: relax check for CS register in rmode_segment_valid() · 0647f4aa

Gleb Natapov authored Dec 12, 2012

rmode_segment_valid() checks if segment descriptor can be used to enter
vm86 mode. VMX spec mandates that in vm86 mode CS register will be of
type data, not code. Lets allow guest entry with vm86 mode if the only
problem with CS register is incorrect type. Otherwise entire real mode
will be emulated.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

0647f4aa

KVM: VMX: cleanup rmode_segment_valid() · 07f42f5f

Gleb Natapov authored Dec 12, 2012

Set segment fields explicitly instead of using  binary operations.

No behaviour changes.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

07f42f5f

kvm: Fix memory slot generation updates · 116c14c0

Alex Williamson authored Dec 21, 2012

Previous patch "kvm: Minor memory slot optimization" (b7f69c55)
overlooked the generation field of the memory slots. Re-using the
original memory slots left us with with two slightly different memory
slots with the same generation. To fix this, make update_memslots()
take a new parameter to specify the last generation. This also makes
generation management more explicit to avoid such problems in the future.
Reported-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

116c14c0

KVM: remove a wrong hack of delivery PIT intr to vcpu0 · 871a069d

Yang Zhang authored Dec 12, 2012

This hack is wrong. The pin number of PIT is connected to
2 not 0. This means this hack never takes effect. So it is ok
to remove it.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

871a069d

18 Dec, 2012 4 commits

KVM: s390: Add a channel I/O based virtio transport driver. · 7e64e059

Cornelia Huck authored Dec 14, 2012

Add a driver for kvm guests that matches virtual ccw devices provided
by the host as virtio bridge devices.

These virtio-ccw devices use a special set of channel commands in order
to perform virtio functions.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

7e64e059

s390/ccwdev: Include asm/schid.h. · 0abbe448

Cornelia Huck authored Dec 14, 2012

Get the definition of struct subchannel_id.
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

0abbe448

KVM: s390: Handle hosts not supporting s390-virtio. · 55c171a6

Cornelia Huck authored Dec 14, 2012

Running under a kvm host does not necessarily imply the presence of
a page mapped above the main memory with the virtio information;
however, the code includes a hard coded access to that page.

Instead, check for the presence of the page and exit gracefully
before we hit an addressing exception if it does not exist.
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
cc: stable@vger.kernel.org
Signed-off-by: Gleb Natapov <gleb@redhat.com>

55c171a6

kvm: fix i8254 counter 0 wraparound · d4b06c2d

Nickolai Zeldovich authored Dec 15, 2012

The kvm i8254 emulation for counter 0 (but not for counters 1 and 2)
has at least two bugs in mode 0:

1. The OUT bit, computed by pit_get_out(), is never set high.

2. The counter value, computed by pit_get_count(), wraps back around to
   the initial counter value, rather than wrapping back to 0xFFFF
   (which is the behavior described in the comment in __kpit_elapsed,
   the behavior implemented by qemu, and the behavior observed on AMD
   hardware).

The bug stems from __kpit_elapsed computing the elapsed time mod the
initial counter value (stored as nanoseconds in ps->period).  This is both
unnecessary (none of the callers of kpit_elapsed expect the value to be
at most the initial counter value) and incorrect (it causes pit_get_count
to appear to wrap around to the initial counter value rather than 0xFFFF).
Removing this mod from __kpit_elapsed fixes both of the above bugs.
Signed-off-by: Nickolai Zeldovich <nickolai@csail.mit.edu>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>

d4b06c2d

15 Dec, 2012 1 commit

KVM: remove unused variable. · e11ae1a1

Gleb Natapov authored Dec 14, 2012

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

e11ae1a1

14 Dec, 2012 1 commit

KVM: Increase user memory slots on x86 to 125 · 0f888f5a

Alex Williamson authored Dec 10, 2012

With the 3 private slots, this gives us a nice round 128 slots total.
The primary motivation for this is to support more assigned devices.
Each assigned device can theoretically use up to 8 slots (6 MMIO BARs,
1 ROM BAR, 1 spare for a split MSI-X table mapping) though it's far
more typical for a device to use 3-4 slots.  If we assume a typical VM
uses a dozen slots for non-assigned devices purposes, we should always
be able to support 14 worst case assigned devices or 28 to 37 typical
devices.
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

0f888f5a