Commits · 44583cba9188b29b20ceeefe8ae23ad19e26d9a4 · Kirill Smelkov / linux

11 Jul, 2014 22 commits

KVM: x86: use kvm_read_guest_page for emulator accesses · 44583cba

Paolo Bonzini authored May 13, 2014

Emulator accesses are always done a page at a time, either by the emulator
itself (for fetches) or because we need to query the MMU for address
translations. Speed up these accesses by using kvm_read_guest_page
and, in the case of fetches, by inlining kvm_read_guest_virt_helper and
dropping the loop around kvm_read_guest_page.

This final tweak saves 30-100 more clock cycles (4-10%), bringing the
count (as measured by kvm-unit-tests) down to 720-1100 clock cycles on
a Sandy Bridge Xeon host, compared to 2300-3200 before the whole series
and 925-1700 after the first two low-hanging fruit changes.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
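
The difference between a general guest read and a page-granular one can be sketched in user space as follows. This is only a model: guest_mem, read_page, read_any and read_within_page are hypothetical names, loosely inspired by the split between a looping reader and kvm_read_guest_page, and none of them is the kernel's actual code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SHIFT 12                 /* assumption: 4 KiB pages */
#define PAGE_SIZE  (1u << PAGE_SHIFT)

/* Hypothetical guest memory: two pages backed by one flat array. */
static uint8_t guest_mem[2 * PAGE_SIZE];

/* Page-granular read: refuses any access that leaves the page. */
static int read_page(uint64_t gfn, void *dst, unsigned offset, unsigned len)
{
    if (gfn >= 2 || offset + len > PAGE_SIZE)
        return -1;
    memcpy(dst, &guest_mem[gfn * PAGE_SIZE + offset], len);
    return 0;
}

/* General reader: splits the access at every page boundary. */
static int read_any(uint64_t gpa, void *dst, unsigned len)
{
    uint8_t *out = dst;

    while (len) {
        unsigned offset = (unsigned)(gpa & (PAGE_SIZE - 1));
        unsigned chunk = PAGE_SIZE - offset;

        if (chunk > len)
            chunk = len;
        if (read_page(gpa >> PAGE_SHIFT, out, offset, chunk))
            return -1;
        gpa += chunk;
        out += chunk;
        len -= chunk;
    }
    return 0;
}

/* Caller already knows the access stays inside one page: no loop needed. */
static int read_within_page(uint64_t gpa, void *dst, unsigned len)
{
    return read_page(gpa >> PAGE_SHIFT, dst,
                     (unsigned)(gpa & (PAGE_SIZE - 1)), len);
}
```

In this model, read_within_page does one bounds check and one copy, whereas read_any recomputes the page offset and chunk size on every iteration even when a single iteration would do.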

KVM: x86: ensure emulator fetches do not span multiple pages · 719d5a9b

Paolo Bonzini authored Jun 19, 2014

When the CS base is not page-aligned, the linear address of the code could
get close to the page boundary (e.g. 0x...ffe) even if the EIP value is
not. So we need to first linearize the address, and only then compute
the number of valid bytes that can be fetched.

This happens relatively often when executing real mode code.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
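
A minimal sketch of why the order matters, assuming 4 KiB pages and a simplified linearization (base plus offset, with no limit checks or address-size masking). The helpers linearize and fetch_bytes_in_page are hypothetical and are not the emulator's own functions.

```c
#include <stdint.h>

#define PAGE_SIZE      4096u  /* assumption: 4 KiB pages */
#define MAX_INSN_BYTES 15u    /* x86 instructions are at most 15 bytes long */

/* Hypothetical, simplified linearization: segment base plus offset. */
static inline uint64_t linearize(uint64_t cs_base, uint64_t eip)
{
    return cs_base + eip;
}

/* Number of bytes that can be fetched at `addr` without leaving its page. */
static inline unsigned fetch_bytes_in_page(uint64_t addr)
{
    unsigned to_page_end = PAGE_SIZE - (unsigned)(addr & (PAGE_SIZE - 1));

    return to_page_end < MAX_INSN_BYTES ? to_page_end : MAX_INSN_BYTES;
}
```

With a real-mode CS base of 0xFFF0 and an EIP of 0xE, the linear address is 0xFFFE, so fetch_bytes_in_page(linearize(0xFFF0, 0xE)) yields 2, while applying the same computation to the raw EIP would yield 15 and let the fetch run into the next page.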

KVM: emulate: put pointers in the fetch_cache · 17052f16

Paolo Bonzini authored May 06, 2014

This simplifies the code a bit, especially the overflow checks.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: emulate: avoid per-byte copying in instruction fetches · 9506d57d

Paolo Bonzini authored May 06, 2014

We do not need a memory copying loop anymore in insn_fetch; we
can use a byte-aligned pointer to access instruction fields directly
from the fetch_cache. This eliminates 50-150 cycles (corresponding to
a 5-10% improvement in performance) from each instruction.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
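
The idea can be illustrated with a simplified, hypothetical model of the fetch cache (struct fetch_cache_model and both helpers below are assumptions for illustration, not the kernel's definitions). The portable sketch uses memcpy for the unaligned load, which compilers typically turn into a single load on x86.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the emulator's fetch cache. */
struct fetch_cache_model {
    uint8_t data[15];  /* instruction bytes already fetched */
    unsigned pos;      /* offset of the next unread byte */
};

/* Per-byte style: copy the field one byte at a time, then assemble it. */
static uint32_t fetch_u32_bytewise(struct fetch_cache_model *fc)
{
    uint8_t tmp[4];
    uint32_t v;

    for (unsigned i = 0; i < sizeof(tmp); i++)
        tmp[i] = fc->data[fc->pos++];
    memcpy(&v, tmp, sizeof(v));
    return v;
}

/* Direct style: read the whole field from a byte-aligned position. */
static uint32_t fetch_u32_direct(struct fetch_cache_model *fc)
{
    uint32_t v;

    /* The caller must guarantee that pos + 4 does not exceed the cache. */
    memcpy(&v, &fc->data[fc->pos], sizeof(v));
    fc->pos += sizeof(v);
    return v;
}
```

Both helpers return the same value and advance pos by four; the direct version simply has no loop for the compiler to unroll or the CPU to predict.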

KVM: emulate: avoid repeated calls to do_insn_fetch_bytes · 5cfc7e0f

Paolo Bonzini authored May 06, 2014

do_insn_fetch_bytes will be called at most once within any given
insn_fetch or insn_fetch_arr: it is called at most twice for any
instruction, and the first call is made explicitly in x86_decode_insn.
This observation lets us hoist the call out of the memory copying loop.
It does not buy performance, because most fetches are one byte long
anyway, but it prepares for the next patch.

The overflow check is tricky, but correct. Because do_insn_fetch_bytes
has already been called once, we know that fc->end is at least 15. So
it is okay to subtract the number of bytes we want to read.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
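
The reasoning behind the overflow check can be sketched as below. The function names are hypothetical, and the preconditions (end is at least 15, size is at most 15) are taken from the description above rather than checked in the code.

```c
#include <stdbool.h>

/*
 * Returns true when fewer than `size` bytes are cached at `eip`.
 * Preconditions (assumed, not checked): end >= 15 and size <= 15,
 * so the subtraction `end - size` cannot wrap around.
 */
static bool need_more_bytes(unsigned long eip, unsigned long end,
                            unsigned size)
{
    return eip > end - size;
}

/*
 * Naive form, for comparison only: `eip + size` can wrap around
 * when eip is close to the top of the address space.
 */
static bool need_more_bytes_naive(unsigned long eip, unsigned long end,
                                  unsigned size)
{
    return eip + size > end;
}
```

For ordinary addresses the two forms agree. They differ only near the top of the address space, where the addition in the naive form wraps to a small value and wrongly reports that enough bytes are cached.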

KVM: emulate: speed up do_insn_fetch · 285ca9e9

Paolo Bonzini authored May 06, 2014

Hoist the common case up from do_insn_fetch_byte to do_insn_fetch,
and prime the fetch_cache in x86_decode_insn. This helps the compiler
and the branch predictor a bit, but above all it lays the groundwork
for further changes in the next few patches.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
