Commits · 0d97f884810410b2e0515e29424fe1ec5357176f · Kirill Smelkov / linux

15 Jan, 2015 1 commit

arm/arm64: KVM: add tracing support for arm64 exit handler · 0d97f884

Wei Huang authored Jan 12, 2015

arm64 uses its own copy of exit handler (arm64/kvm/handle_exit.c).
Currently this file doesn't hook up with any trace points. As a result
users might not see certain events (e.g. HVC & WFI) while using ftrace
with arm64 KVM. This patch fixes this issue by adding a new trace file
and defining two trace events (one of which is shared by wfi and wfe)
for arm64. The new trace points are then linked with related functions
in handle_exit.c.
Signed-off-by: Wei Huang <wei@redhat.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

0d97f884

11 Jan, 2015 2 commits

KVM: arm/arm64: vgic: add init entry to VGIC KVM device · 065c0034

Eric Auger authored Dec 15, 2014

Since the advent of VGIC dynamic initialization, this latter is
initialized quite late on the first vcpu run or "on-demand", when
injecting an IRQ or when the guest sets its registers.

This initialization could be initiated explicitly much earlier
by the users-space, as soon as it has provided the requested
dimensioning parameters.

This patch adds a new entry to the VGIC KVM device that allows
the user to manually request the VGIC init:
- a new KVM_DEV_ARM_VGIC_GRP_CTRL group is introduced.
- Its first attribute is KVM_DEV_ARM_VGIC_CTRL_INIT

The rationale behind introducing a group is to be able to add other
controls later on, if needed.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

065c0034

KVM: arm/arm64: vgic: vgic_init returns -ENODEV when no online vcpu · 66b030e4

Eric Auger authored Dec 15, 2014

To be more explicit on vgic initialization failure, -ENODEV is
returned by vgic_init when no online vcpus can be found at init.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

66b030e4

08 Jan, 2015 20 commits

kvm: x86: Remove kvm_make_request from lapic.c · bab5bb39

Nicholas Krause authored Jan 01, 2015

Adds a function kvm_vcpu_set_pending_timer instead of calling
kvm_make_request in lapic.c.
Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

bab5bb39

KVM: x86: Access to LDT/GDT that wraparound is incorrect · edccda7c

Nadav Amit authored Dec 25, 2014

When access to descriptor in LDT/GDT wraparound outside long-mode, the address
of the descriptor should be truncated to 32-bit.  Citing Intel SDM 2.1.1.1
"Global and Local Descriptor Tables in IA-32e Mode": "GDTR and LDTR registers
are expanded to 64-bits wide in both IA-32e sub-modes (64-bit mode and
compatibility mode)."

So in other cases, we need to truncate. Creating new function to return a
pointer to descriptor table to avoid too much code duplication.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
[Wrap 64-bit check with #ifdef CONFIG_X86_64, to avoid a "right shift count
 >= width of type" warning and consequent undefined behavior. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

edccda7c

KVM: x86: Do not set access bit on accessed segments · e2cefa74

Nadav Amit authored Dec 25, 2014

When segment is loaded, the segment access bit is set unconditionally. In
fact, it should be set conditionally, based on whether the segment had the
accessed bit set before. In addition, it can improve performance.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

e2cefa74

KVM: x86: POP [ESP] is not emulated correctly · ab708099

Nadav Amit authored Dec 25, 2014

According to Intel SDM: "If the ESP register is used as a base register for
addressing a destination operand in memory, the POP instruction computes the
effective address of the operand after it increments the ESP register."

The current emulation does not behave so. The fix required to waste another
of the precious instruction flags and to check the flag in decode_modrm.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

ab708099

KVM: x86: em_call_far should return failure result · 80976dbb

Nadav Amit authored Dec 25, 2014

Currently, if em_call_far fails it returns success instead of the resulting
error-code. Fix it.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

80976dbb

KVM: x86: JMP/CALL using call- or task-gate causes exception · 3dc4bc4f

Nadav Amit authored Dec 25, 2014

The KVM emulator does not emulate JMP and CALL that target a call gate or a
task gate.  This patch does not try to implement these scenario as they are
presumably rare; yet it returns X86EMUL_UNHANDLEABLE error in such cases
instead of generating an exception.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

3dc4bc4f

KVM: x86: fnstcw and fnstsw may cause spurious exception · 16bebefe

Nadav Amit authored Dec 25, 2014

Since the operand size of fnstcw and fnstsw is updated during the execution,
the emulation may cause spurious exceptions as it reads the memory beforehand.

Marking these instructions as Mov (since the previous value is ignored) and
DstMem16 to simplify the setting of operand size.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

16bebefe

KVM: x86: pop sreg accesses only 2 bytes · 3313bc4e

Nadav Amit authored Dec 25, 2014

Although pop sreg updates RSP according to the operand size, only 2 bytes are
read. The current behavior may result in incorrect #GP or #PF exceptions.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

3313bc4e

KVM: x86: mmu: replace assertions with MMU_WARN_ON, a conditional WARN_ON · fa4a2c08

Paolo Bonzini authored Oct 02, 2013

This makes the direction of the conditions consistent with code that
is already using WARN_ON.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

fa4a2c08

KVM: x86: mmu: remove ASSERT(vcpu) · 4c1a50de

Paolo Bonzini authored Oct 02, 2013

Because ASSERT is just a printk, these would oops right away.
The assertion thus hardly adds anything.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

4c1a50de

KVM: x86: mmu: remove argument to kvm_init_shadow_mmu and kvm_init_shadow_ept_mmu · ad896af0

Paolo Bonzini authored Oct 02, 2013

The initialization function in mmu.c can always use walk_mmu, which
is known to be vcpu->arch.mmu.  Only init_kvm_nested_mmu is used to
initialize vcpu->arch.nested_mmu.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

ad896af0

KVM: x86: mmu: do not use return to tail-call functions that return void · e0c6db3e
Paolo Bonzini authored Dec 23, 2014
```
This is, pedantically, not valid C.  It also looks weird.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
```
e0c6db3e

KVM: x86: add tracepoint to wait_lapic_expire · 6c19b753

Marcelo Tosatti authored Dec 16, 2014

Add tracepoint to wait_lapic_expire.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
[Remind reader if early or late. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

6c19b753

KVM: x86: add option to advance tscdeadline hrtimer expiration · d0659d94

Marcelo Tosatti authored Dec 16, 2014

For the hrtimer which emulates the tscdeadline timer in the guest,
add an option to advance expiration, and busy spin on VM-entry waiting
for the actual expiration time to elapse.

This allows achieving low latencies in cyclictest (or any scenario
which requires strict timing regarding timer expiration).

Reduces average cyclictest latency from 12us to 8us
on Core i5 desktop.

Note: this option requires tuning to find the appropriate value
for a particular hardware/guest combination. One method is to measure the
average delay between apic_timer_fn and VM-entry.
Another method is to start with 1000ns, and increase the value
in say 500ns increments until avg cyclictest numbers stop decreasing.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

d0659d94

KVM: x86: add method to test PIR bitmap vector · 7c6a98df

Marcelo Tosatti authored Dec 16, 2014

kvm_x86_ops->test_posted_interrupt() returns true/false depending
whether 'vector' is set.

Next patch makes use of this interface.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

7c6a98df

kvm: x86: vmx: NULL out hwapic_isr_update() in case of !enable_apicv · b4eef9b3

Tiejun Chen authored Dec 22, 2014

In most cases calling hwapic_isr_update(), we always check if
kvm_apic_vid_enabled() == 1, but actually,
kvm_apic_vid_enabled()
    -> kvm_x86_ops->vm_has_apicv()
        -> vmx_vm_has_apicv() or '0' in svm case
            -> return enable_apicv && irqchip_in_kernel(kvm)

So its a little cost to recall vmx_vm_has_apicv() inside
hwapic_isr_update(), here just NULL out hwapic_isr_update() in
case of !enable_apicv inside hardware_setup() then make all
related stuffs follow this. Note we don't check this under that
condition of irqchip_in_kernel() since we should make sure
definitely any caller don't work  without in-kernel irqchip.
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

b4eef9b3

KVM: x86: Remove FIXMEs in emulate.c for the function,task_switch_32 · 5ff22e7e

Nicholas Krause authored Dec 18, 2014

Remove FIXME comments about needing fault addresses to be returned.  These
are propaagated from walk_addr_generic to gva_to_gpa and from there to
ops->read_std and ops->write_std.
Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

5ff22e7e

KVM: nVMX: consult PFEC_MASK and PFEC_MATCH when generating #PF VM-exit · 19d5f10b

Eugene Korenevsky authored Dec 16, 2014

When generating #PF VM-exit, check equality:
(PFEC & PFEC_MASK) == PFEC_MATCH
If there is equality, the 14 bit of exception bitmap is used to take decision
about generating #PF VM-exit. If there is inequality, inverted 14 bit is used.
Signed-off-by: Eugene Korenevsky <ekorenevsky@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

19d5f10b

KVM: nVMX: Improve nested msr switch checking · e9ac033e

Eugene Korenevsky authored Dec 11, 2014

This patch improve checks required by Intel Software Developer Manual.
 - SMM MSRs are not allowed.
 - microcode MSRs are not allowed.
 - check x2apic MSRs only when LAPIC is in x2apic mode.
 - MSR switch areas must be aligned to 16 bytes.
 - address of first and last byte in MSR switch areas should not set any bits
   beyond the processor's physical-address width.

Also it adds warning messages on failures during MSR switch. These messages
are useful for people who debug their VMMs in nVMX.
Signed-off-by: Eugene Korenevsky <ekorenevsky@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

e9ac033e

KVM: nVMX: Add nested msr load/restore algorithm · ff651cb6

Wincy Van authored Dec 11, 2014

Several hypervisors need MSR auto load/restore feature.
We read MSRs from VM-entry MSR load area which specified by L1,
and load them via kvm_set_msr in the nested entry.
When nested exit occurs, we get MSRs via kvm_get_msr, writing
them to L1`s MSR store area. After this, we read MSRs from VM-exit
MSR load area, and load them via kvm_set_msr.
Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

ff651cb6

06 Jan, 2015 1 commit
- Linux 3.19-rc3 · b1940cd2
  Linus Torvalds authored Jan 05, 2015
  
  b1940cd2
05 Jan, 2015 3 commits

Merge tag 'powerpc-3.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux · 79b8cb97

Linus Torvalds authored Jan 05, 2015

Pull powerpc fixes from Michael Ellerman:

 - Wire up sys_execveat(). Tested on 32 & 64 bit.

 - Fix for kdump on LE systems with cpus hot unplugged.

 - Revert Anton's fix for "kernel BUG at kernel/smpboot.c:134!", this
   broke other platforms, we'll do a proper fix for 3.20.

* tag 'powerpc-3.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
  Revert "powerpc: Secondary CPUs must set cpu_callin_map after setting active and online"
  powerpc/kdump: Ignore failure in enabling big endian exception during crash
  powerpc: Wire up sys_execveat() syscall

79b8cb97

Merge tag 'please-pull-syscall' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux · f40bde85

Linus Torvalds authored Jan 05, 2015

Pull ia64 fixlet from Tony Luck:
 "Add execveat syscall"

* tag 'please-pull-syscall' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
  [IA64] Enable execveat syscall for ia64

f40bde85

[IA64] Enable execveat syscall for ia64 · b739896d

Tony Luck authored Jan 05, 2015

See commit 51f39a1f
    syscalls: implement execveat() system call
Signed-off-by: Tony Luck <tony.luck@intel.com>

b739896d

04 Jan, 2015 4 commits

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml · 693a30b8

Linus Torvalds authored Jan 04, 2015

Pull UML fixes from Richard Weinberger:
 "Two fixes for UML regressions. Nothing exciting"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  x86, um: actually mark system call tables readonly
  um: Skip futex_atomic_cmpxchg_inatomic() test

693a30b8

Revert "ARM: 7830/1: delay: don't bother reporting bogomips in /proc/cpuinfo" · 4bf9636c

Pavel Machek authored Jan 04, 2015

Commit 9fc2105a ("ARM: 7830/1: delay: don't bother reporting
bogomips in /proc/cpuinfo") breaks audio in python, and probably
elsewhere, with message

  FATAL: cannot locate cpu MHz in /proc/cpuinfo

I'm not the first one to hit it, see for example

  https://theredblacktree.wordpress.com/2014/08/10/fatal-cannot-locate-cpu-mhz-in-proccpuinfo/
  https://devtalk.nvidia.com/default/topic/765800/workaround-for-fatal-cannot-locate-cpu-mhz-in-proc-cpuinf/?offset=1

Reading original changelog, I have to say "Stop breaking working setups.
You know who you are!".
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

4bf9636c

x86, um: actually mark system call tables readonly · b485342b

Daniel Borkmann authored Jan 03, 2015

Commit a074335a ("x86, um: Mark system call tables readonly") was
supposed to mark the sys_call_table in UML as RO by adding the const,
but it doesn't have the desired effect as it's nevertheless being placed
into the data section since __cacheline_aligned enforces sys_call_table
being placed into .data..cacheline_aligned instead. We need to use
the ____cacheline_aligned version instead to fix this issue.

Before:

$ nm -v arch/x86/um/sys_call_table_64.o | grep -1 "sys_call_table"
                 U sys_writev
0000000000000000 D sys_call_table
0000000000000000 D syscall_table_size

After:

$ nm -v arch/x86/um/sys_call_table_64.o | grep -1 "sys_call_table"
                 U sys_writev
0000000000000000 R sys_call_table
0000000000000000 D syscall_table_size

Fixes: a074335a ("x86, um: Mark system call tables readonly")
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Richard Weinberger <richard@nod.at>

b485342b

um: Skip futex_atomic_cmpxchg_inatomic() test · f911d731

Richard Weinberger authored Dec 10, 2014

futex_atomic_cmpxchg_inatomic() does not work on UML because
it triggers a copy_from_user() in kernel context.
On UML copy_from_user() can only be used if the kernel was called
by a real user space process such that UML can use ptrace()
to fetch the value.
Reported-by: Miklos Szeredi <miklos@szeredi.hu>
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
Tested-by: Daniel Walter <d.walter@0x90.at>

f911d731

02 Jan, 2015 3 commits

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · d753856c

Linus Torvalds authored Jan 02, 2015

Pull SCSI fixes from James Bottomley:
 "This is a set of three fixes: one to correct an abort path thinko
  causing failures (and a panic) in USB on device misbehaviour, One to
  fix an out of order issue in the fnic driver and one to match discard
  expectations to qemu which otherwise cause Linux to behave badly as a
  guest"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  SCSI: fix regression in scsi_send_eh_cmnd()
  fnic: IOMMU Fault occurs when IO and abort IO is out of order
  sd: tweak discard heuristics to work around QEMU SCSI issue

d753856c

Merge tag 'sound-3.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 6a4bfa7c

Linus Torvalds authored Jan 02, 2015

Pull sound fixes from Takashi Iwai:
 "Nothing too exciting as a new year's start here: most of fixes are for
  ASoC, a boot crash fix on OMAP for deferred probe, a few driver
  specific fixes (Intel, dwc, rockchip, rt5677), in addition to typo
  fixes in kerneldoc comments for PCM"

* tag 'sound-3.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: pcm: Fix kerneldoc for params_*() functions
  ASoC: rockchip: i2s: fix maxburst of dma data to 4
  ASoC: rockchip: i2s: fix error defination of transmit data level
  ASoC: Intel: correct the fixed free block allocation
  ASoC: rt5677: fixed rt5677_dsp_vad_put rt5677_dsp_vad_get panic
  ASoC: Intel: Fix BYTCR machine driver MODULE_ALIAS
  ASoC: Intel: Fix BYTCR firmware name
  ASoC: dwc: Iterate over all channels
  ASoC: dwc: Ensure FIFOs are flushed to prevent channel swap
  ASoC: Intel: Add I2C dependency to two new machines
  ASoC: dapm: Remove snd_soc_of_parse_audio_routing() due to deferred probe

6a4bfa7c

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · d7e19bd8

Linus Torvalds authored Jan 02, 2015

Pull vhost cleanup and virtio bugfix
 "There's a single change here, fixing a vhost bug where vhost
  initialization fails due to used ring alignment check being too
  strict"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vhost: relax used address alignment
  virtio_ring: document alignment requirements

d7e19bd8

31 Dec, 2014 6 commits

Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit · 5e0f872c

Linus Torvalds authored Dec 31, 2014

Pull audit fix from Paul Moore:
 "One audit patch to resolve a panic/oops when recording filenames in
  the audit log, see the mail archive link below.

  The fix isn't as nice as I would like, as it involves an allocate/copy
  of the filename, but it solves the problem and the overhead should
  only affect users who have configured audit rules involving file
  names.

  We'll revisit this issue with future kernels in an attempt to make
  this suck less, but in the meantime I think this fix should go into
  the next release of v3.19-rcX.

  [ https://marc.info/?t=141986927600001&r=1&w=2 ]"

* 'upstream' of git://git.infradead.org/users/pcmoore/audit:
  audit: create private file name copies when auditing inodes

5e0f872c

Revert "Input: atmel_mxt_ts - use deep sleep mode when stopped" · 7f405483

Linus Torvalds authored Dec 31, 2014

This reverts commit 9d469d03.

It breaks the Chromebook Pixel touchpad (and touchscreen).
Reported-by: Dirk Hohndel <dirk@hohndel.org>
Bisected-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nick Dyer <nick.dyer@itdev.co.uk>
Cc: Benson Leung <bleung@chromium.org>
Cc: Yufeng Shen <miletus@chromium.org>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: stable@vger.kernel.org  # v3.16+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7f405483

Merge tag 'nios2-fixes-v3.19-rc3' of git://git.rocketboards.org/linux-socfpga-next · a5cb2366

Linus Torvalds authored Dec 31, 2014

Pull arch/nios2 fixes from Ley Foon Tan:

 - fix compilation error when enable CONFIG_PREEMPT

 - initialize cpuinfo.mmu variable supplied by the device tree

* tag 'nios2-fixes-v3.19-rc3' of git://git.rocketboards.org/linux-socfpga-next:
  nios2: Use preempt_schedule_irq
  nios2: Initialize cpuinfo.mmu

a5cb2366

Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 6ca793ab

Linus Torvalds authored Dec 31, 2014

Pull crypto fix from Herbert Xu:
 "Fix a use-after-free crash in the user-space crypto API"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: af_alg - fix backlog handling

6ca793ab

nios2: Use preempt_schedule_irq · 1b0f4492

Tobias Klauser authored Dec 31, 2014

Follow aa0d5326 ("ia64: Use preempt_schedule_irq") and use
preempt_schedule_irq instead of enabling/disabling interrupts and
messing around with PREEMPT_ACTIVE in the nios2 low-level preemption
code ourselves. Also get rid of the now needless re-check for
TIF_NEED_RESCHED, preempt_schedule_irq will already take care of
rescheduling.

This also fixes the following build error when building with
CONFIG_PREEMPT:

arch/nios2/kernel/built-in.o: In function `need_resched':
arch/nios2/kernel/entry.S:374: undefined reference to `PREEMPT_ACTIVE'

Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Ley Foon Tan <lftan@altera.com>

1b0f4492

nios2: Initialize cpuinfo.mmu · 6f3d2b00

Walter Goossens authored Dec 31, 2014

This patch initializes the mmu field of the cpuinfo structure to the
value supplied by the devicetree.
Signed-off-by: Walter Goossens <waltergoossens@home.nl>
Acked-by: Ley Foon Tan <lftan@altera.com>

6f3d2b00