Commits · 8b56ee91ffc88ea01400c012e10fe22a9d233265 · Kirill Smelkov / linux

19 Sep, 2018 18 commits

kvm: selftests: Add platform_info_test · 8b56ee91

Drew Schmitt authored Aug 20, 2018

Test guest access to MSR_PLATFORM_INFO when the capability is enabled
or disabled.
Signed-off-by: Drew Schmitt <dasch@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

8b56ee91

KVM: x86: Control guest reads of MSR_PLATFORM_INFO · 6fbbde9a

Drew Schmitt authored Aug 20, 2018

Add KVM_CAP_MSR_PLATFORM_INFO so that userspace can disable guest access
to reads of MSR_PLATFORM_INFO.

Disabling access to reads of this MSR gives userspace the control to "expose"
this platform-dependent information to guests in a clear way. As it exists
today, guests that read this MSR would get unpopulated information if userspace
hadn't already set it (and prior to this patch series, only the CPUID faulting
information could have been populated). This existing interface could be
confusing if guests don't handle the potential for incorrect/incomplete
information gracefully (e.g. zero reported for base frequency).
Signed-off-by: Drew Schmitt <dasch@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

6fbbde9a

KVM: x86: Turbo bits in MSR_PLATFORM_INFO · d84f1cff

Drew Schmitt authored Aug 20, 2018

Allow userspace to set turbo bits in MSR_PLATFORM_INFO. Previously, only
the CPUID faulting bit was settable. But now any bit in
MSR_PLATFORM_INFO would be settable. This can be used, for example, to
convey frequency information about the platform on which the guest is
running.
Signed-off-by: Drew Schmitt <dasch@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

d84f1cff

nVMX x86: Check VPID value on vmentry of L2 guests · ba8e23db

Krish Sadhukhan authored Sep 04, 2018

According to section "Checks on VMX Controls" in Intel SDM vol 3C, the
following check needs to be enforced on vmentry of L2 guests:

    If the 'enable VPID' VM-execution control is 1, the value of the
    of the VPID VM-execution control field must not be 0000H.
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

ba8e23db

nVMX x86: check posted-interrupt descriptor addresss on vmentry of L2 · 6de84e58

Krish Sadhukhan authored Aug 23, 2018

According to section "Checks on VMX Controls" in Intel SDM vol 3C,
the following check needs to be enforced on vmentry of L2 guests:

   - Bits 5:0 of the posted-interrupt descriptor address are all 0.
   - The posted-interrupt descriptor address does not set any bits
     beyond the processor's physical-address width.
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

6de84e58

KVM: nVMX: Wake blocked vCPU in guest-mode if pending interrupt in virtual APICv · e6c67d8c

Liran Alon authored Sep 04, 2018

In case L1 do not intercept L2 HLT or enter L2 in HLT activity-state,
it is possible for a vCPU to be blocked while it is in guest-mode.

According to Intel SDM 26.6.5 Interrupt-Window Exiting and
Virtual-Interrupt Delivery: "These events wake the logical processor
if it just entered the HLT state because of a VM entry".
Therefore, if L1 enters L2 in HLT activity-state and L2 has a pending
deliverable interrupt in vmcs12->guest_intr_status.RVI, then the vCPU
should be waken from the HLT state and injected with the interrupt.

In addition, if while the vCPU is blocked (while it is in guest-mode),
it receives a nested posted-interrupt, then the vCPU should also be
waken and injected with the posted interrupt.

To handle these cases, this patch enhances kvm_vcpu_has_events() to also
check if there is a pending interrupt in L2 virtual APICv provided by
L1. That is, it evaluates if there is a pending virtual interrupt for L2
by checking RVI[7:4] > VPPR[7:4] as specified in Intel SDM 29.2.1
Evaluation of Pending Interrupts.

Note that this also handles the case of nested posted-interrupt by the
fact RVI is updated in vmx_complete_nested_posted_interrupt() which is
called from kvm_vcpu_check_block() -> kvm_arch_vcpu_runnable() ->
kvm_vcpu_running() -> vmx_check_nested_events() ->
vmx_complete_nested_posted_interrupt().
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

e6c67d8c

KVM: VMX: check nested state and CR4.VMXE against SMM · 5bea5123

Paolo Bonzini authored Sep 18, 2018

VMX cannot be enabled under SMM, check it when CR4 is set and when nested
virtualization state is restored.

This should fix some WARNs reported by syzkaller, mostly around
alloc_shadow_vmcs.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

5bea5123

kvm: x86: make kvm_{load|put}_guest_fpu() static · 822f312d

Sebastian Andrzej Siewior authored Sep 12, 2018

The functions
	kvm_load_guest_fpu()
	kvm_put_guest_fpu()

are only used locally, make them static. This requires also that both
functions are moved because they are used before their implementation.
Those functions were exported (via EXPORT_SYMBOL) before commit
e5bb4025 ("KVM: Drop kvm_{load,put}_guest_fpu() exports").
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

822f312d

x86/hyper-v: rename ipi_arg_{ex,non_ex} structures · a1efa9b7

Vitaly Kuznetsov authored Aug 27, 2018

These structures are going to be used from KVM code so let's make
their names reflect their Hyper-V origin.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

a1efa9b7

KVM: VMX: use preemption timer to force immediate VMExit · d264ee0c

Sean Christopherson authored Aug 27, 2018

A VMX preemption timer value of '0' is guaranteed to cause a VMExit
prior to the CPU executing any instructions in the guest.  Use the
preemption timer (if it's supported) to trigger immediate VMExit
in place of the current method of sending a self-IPI.  This ensures
that pending VMExit injection to L1 occurs prior to executing any
instructions in the guest (regardless of nesting level).

When deferring VMExit injection, KVM generates an immediate VMExit
from the (possibly nested) guest by sending itself an IPI.  Because
hardware interrupts are blocked prior to VMEnter and are unblocked
(in hardware) after VMEnter, this results in taking a VMExit(INTR)
before any guest instruction is executed.  But, as this approach
relies on the IPI being received before VMEnter executes, it only
works as intended when KVM is running as L0.  Because there are no
architectural guarantees regarding when IPIs are delivered, when
running nested the INTR may "arrive" long after L2 is running e.g.
L0 KVM doesn't force an immediate switch to L1 to deliver an INTR.

For the most part, this unintended delay is not an issue since the
events being injected to L1 also do not have architectural guarantees
regarding their timing.  The notable exception is the VMX preemption
timer[1], which is architecturally guaranteed to cause a VMExit prior
to executing any instructions in the guest if the timer value is '0'
at VMEnter.  Specifically, the delay in injecting the VMExit causes
the preemption timer KVM unit test to fail when run in a nested guest.

Note: this approach is viable even on CPUs with a broken preemption
timer, as broken in this context only means the timer counts at the
wrong rate.  There are no known errata affecting timer value of '0'.

[1] I/O SMIs also have guarantees on when they arrive, but I have
    no idea if/how those are emulated in KVM.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
[Use a hook for SVM instead of leaving the default in x86.c - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

d264ee0c

KVM: VMX: modify preemption timer bit only when arming timer · f459a707

Sean Christopherson authored Aug 27, 2018

Provide a singular location where the VMX preemption timer bit is
set/cleared so that future usages of the preemption timer can ensure
the VMCS bit is up-to-date without having to modify unrelated code
paths.  For example, the preemption timer can be used to force an
immediate VMExit.  Cache the status of the timer to avoid redundant
VMREAD and VMWRITE, e.g. if the timer stays armed across multiple
VMEnters/VMExits.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

f459a707

KVM: VMX: immediately mark preemption timer expired only for zero value · 4c008127

Sean Christopherson authored Aug 27, 2018

A VMX preemption timer value of '0' at the time of VMEnter is
architecturally guaranteed to cause a VMExit prior to the CPU
executing any instructions in the guest.  This architectural
definition is in place to ensure that a previously expired timer
is correctly recognized by the CPU as it is possible for the timer
to reach zero and not trigger a VMexit due to a higher priority
VMExit being signalled instead, e.g. a pending #DB that morphs into
a VMExit.

Whether by design or coincidence, commit f4124500 ("KVM: nVMX:
Fully emulate preemption timer") special cased timer values of '0'
and '1' to ensure prompt delivery of the VMExit.  Unlike '0', a
timer value of '1' has no has no architectural guarantees regarding
when it is delivered.

Modify the timer emulation to trigger immediate VMExit if and only
if the timer value is '0', and document precisely why '0' is special.
Do this even if calibration of the virtual TSC failed, i.e. VMExit
will occur immediately regardless of the frequency of the timer.
Making only '0' a special case gives KVM leeway to be more aggressive
in ensuring the VMExit is injected prior to executing instructions in
the nested guest, and also eliminates any ambiguity as to why '1' is
a special case, e.g. why wasn't the threshold for a "short timeout"
set to 10, 100, 1000, etc...
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

4c008127

KVM: SVM: Switch to bitmap_zalloc() · a101c9d6

Andy Shevchenko authored Aug 30, 2018

Switch to bitmap_zalloc() to show clearly what we are allocating.
Besides that it returns pointer of bitmap type instead of opaque void *.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

a101c9d6

KVM/MMU: Fix comment in walk_shadow_page_lockless_end() · 9a984586

Tianyu Lan authored Sep 07, 2018

kvm_commit_zap_page() has been renamed to kvm_mmu_commit_zap_page()
This patch is to fix the commit.
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

9a984586

kvm: selftests: use -pthread instead of -lpthread · 6bd317d3

Lei Yang authored Aug 29, 2018

I run into the following error

testing/selftests/kvm/dirty_log_test.c:285: undefined reference to `pthread_create'
testing/selftests/kvm/dirty_log_test.c:297: undefined reference to `pthread_join'
collect2: error: ld returned 1 exit status

my gcc version is gcc version 4.8.4
"-pthread" would work everywhere
Signed-off-by: Lei Yang <Lei.Yang@windriver.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

6bd317d3

KVM: x86: don't reset root in kvm_mmu_setup() · 83b20b28

Wei Yang authored Sep 07, 2018

Here is the code path which shows kvm_mmu_setup() is invoked after
kvm_mmu_create(). Since kvm_mmu_setup() is only invoked in this code path,
this means the root_hpa and prev_roots are guaranteed to be invalid. And
it is not necessary to reset it again.

    kvm_vm_ioctl_create_vcpu()
        kvm_arch_vcpu_create()
            vmx_create_vcpu()
                kvm_vcpu_init()
                    kvm_arch_vcpu_init()
                        kvm_mmu_create()
        kvm_arch_vcpu_setup()
            kvm_mmu_setup()
                kvm_init_mmu()

This patch set reset_roots to false in kmv_mmu_setup().

Fixes: 50c28f21Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

83b20b28

kvm: mmu: Don't read PDPTEs when paging is not enabled · d35b34a9

Junaid Shahid authored Aug 08, 2018

kvm should not attempt to read guest PDPTEs when CR0.PG = 0 and
CR4.PAE = 1.
Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

d35b34a9

x86/kvm/lapic: always disable MMIO interface in x2APIC mode · d1766202

Vitaly Kuznetsov authored Aug 02, 2018

When VMX is used with flexpriority disabled (because of no support or
if disabled with module parameter) MMIO interface to lAPIC is still
available in x2APIC mode while it shouldn't be (kvm-unit-tests):

PASS: apic_disable: Local apic enabled in x2APIC mode
PASS: apic_disable: CPUID.1H:EDX.APIC[bit 9] is set
FAIL: apic_disable: *0xfee00030: 50014

The issue appears because we basically do nothing while switching to
x2APIC mode when APIC access page is not used. apic_mmio_{read,write}
only check if lAPIC is disabled before proceeding to actual write.

When APIC access is virtualized we correctly manipulate with VMX controls
in vmx_set_virtual_apic_mode() and we don't get vmexits from memory writes
in x2APIC mode so there's no issue.

Disabling MMIO interface seems to be easy. The question is: what do we
do with these reads and writes? If we add apic_x2apic_mode() check to
apic_mmio_in_range() and return -EOPNOTSUPP these reads and writes will
go to userspace. When lAPIC is in kernel, Qemu uses this interface to
inject MSIs only (see kvm_apic_mem_write() in hw/i386/kvm/apic.c). This
somehow works with disabled lAPIC but when we're in xAPIC mode we will
get a real injected MSI from every write to lAPIC. Not good.

The simplest solution seems to be to just ignore writes to the region
and return ~0 for all reads when we're in x2APIC mode. This is what this
patch does. However, this approach is inconsistent with what currently
happens when flexpriority is enabled: we allocate APIC access page and
create KVM memory region so in x2APIC modes all reads and writes go to
this pre-allocated page which is, btw, the same for all vCPUs.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

d1766202

18 Sep, 2018 2 commits

Merge tag 'kvm-ppc-fixes-4.19-2' of... · 1795f81f

Paolo Bonzini authored Sep 18, 2018

Merge tag 'kvm-ppc-fixes-4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD

Second set of PPC KVM fixes for 4.19

Two fixes for KVM on POWER machines. Both of these relate to memory
corruption and host crashes seen when transparent huge pages are
enabled. The first fixes a host crash that can occur when a DMA
mapping is removed by the guest and the page mapped was part of a
transparent huge page; the second fixes corruption that could occur
when a hypervisor page fault for a radix guest is being serviced at
the same time that the backing page is being collapsed or split.

1795f81f

Merge tag 'kvm-s390-master-4.19-2' of... · cb5fb87a

Paolo Bonzini authored Sep 18, 2018

Merge tag 'kvm-s390-master-4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Fixes for 4.19

- more fallout from the hugetlbfs enablement
- bugfix for vma handling

cb5fb87a

16 Sep, 2018 2 commits

Linux 4.19-rc4 · 7876320f
Linus Torvalds authored Sep 16, 2018

7876320f

Code of Conduct: Let's revamp it. · 8a104f8b

Greg Kroah-Hartman authored Sep 15, 2018

The Code of Conflict is not achieving its implicit goal of fostering
civility and the spirit of 'be excellent to each other'.  Explicit
guidelines have demonstrated success in other projects and other areas
of the kernel.

Here is a Code of Conduct statement for the wider kernel.  It is based
on the Contributor Covenant as described at www.contributor-covenant.org

From this point forward, we should abide by these rules in order to help
make the kernel community a welcoming environment to participate in.
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Olof Johansson <olof@lxom.net>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8a104f8b

15 Sep, 2018 8 commits

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 27c5a778

Linus Torvalds authored Sep 15, 2018

Pull x86 fixes from Ingol Molnar:
 "Misc fixes:

   - EFI crash fix

   - Xen PV fixes

   - do not allow PTI on 2-level 32-bit kernels for now

   - documentation fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/APM: Fix build warning when PROC_FS is not enabled
  Revert "x86/mm/legacy: Populate the user page-table with user pgd's"
  x86/efi: Load fixmap GDT in efi_call_phys_epilog() before setting %cr3
  x86/xen: Disable CPU0 hotplug for Xen PV
  x86/EISA: Don't probe EISA bus for Xen PV guests
  x86/doc: Fix Documentation/x86/earlyprintk.txt

27c5a778

Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 4314daa5

Linus Torvalds authored Sep 15, 2018

Pull scheduler fixes from Ingo Molnar:
 "Misc fixes: various scheduler metrics corner case fixes, a
  sched_features deadlock fix, and a topology fix for certain NUMA
  systems"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Fix kernel-doc notation warning
  sched/fair: Fix load_balance redo for !imbalance
  sched/fair: Fix scale_rt_capacity() for SMT
  sched/fair: Fix vruntime_normalized() for remote non-migration wakeup
  sched/pelt: Fix update_blocked_averages() for RT and DL classes
  sched/topology: Set correct NUMA topology type
  sched/debug: Fix potential deadlock when writing to sched_features

4314daa5

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c0be92b5

Linus Torvalds authored Sep 15, 2018

Pull perf fixes from Ingo Molnar:
 "Mostly tooling fixes, but also breakpoint and x86 PMU driver fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  perf tools: Fix maps__find_symbol_by_name()
  tools headers uapi: Update tools's copy of linux/if_link.h
  tools headers uapi: Update tools's copy of linux/vhost.h
  tools headers uapi: Update tools's copies of kvm headers
  tools headers uapi: Update tools's copy of drm/drm.h
  tools headers uapi: Update tools's copy of asm-generic/unistd.h
  tools headers uapi: Update tools's copy of linux/perf_event.h
  perf/core: Force USER_DS when recording user stack data
  perf/UAPI: Clearly mark __PERF_SAMPLE_CALLCHAIN_EARLY as internal use
  perf/x86/intel: Add support/quirk for the MISPREDICT bit on Knights Landing CPUs
  perf annotate: Fix parsing aarch64 branch instructions after objdump update
  perf probe powerpc: Ignore SyS symbols irrespective of endianness
  perf event-parse: Use fixed size string for comms
  perf util: Fix bad memory access in trace info.
  perf tools: Streamline bpf examples and headers installation
  perf evsel: Fix potential null pointer dereference in perf_evsel__new_idx()
  perf arm64: Fix include path for asm-generic/unistd.h
  perf/hw_breakpoint: Simplify breakpoint enable in perf_event_modify_breakpoint
  perf/hw_breakpoint: Enable breakpoint in modify_user_hw_breakpoint
  perf/hw_breakpoint: Remove superfluous bp->attr.disabled = 0
  ...

c0be92b5

Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ca062f8d

Linus Torvalds authored Sep 15, 2018

Pull locking fixes from Ingo Molnar:
 "Misc fixes: liblockdep fixes and ww_mutex fixes"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/ww_mutex: Fix spelling mistake "cylic" -> "cyclic"
  locking/lockdep: Delete unnecessary #include
  tools/lib/lockdep: Add dummy task_struct state member
  tools/lib/lockdep: Add empty nmi.h
  tools/lib/lockdep: Update Sasha Levin email to MSFT
  jump_label: Fix typo in warning message
  locking/mutex: Fix mutex debug call and ww_mutex documentation

ca062f8d

x86/APM: Fix build warning when PROC_FS is not enabled · 002b87d2

Randy Dunlap authored Sep 14, 2018

Fix build warning in apm_32.c when CONFIG_PROC_FS is not enabled:

../arch/x86/kernel/apm_32.c:1643:12: warning: 'proc_apm_show' defined but not used [-Wunused-function]
 static int proc_apm_show(struct seq_file *m, void *v)

Fixes: 3f3942ac ("proc: introduce proc_create_single{,_data}")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Jiri Kosina <jikos@kernel.org>
Link: https://lkml.kernel.org/r/be39ac12-44c2-4715-247f-4dcc3c525b8b@infradead.org

002b87d2

Merge tag '4.19-rc3-smb3-cifs' of git://git.samba.org/sfrench/cifs-2.6 · 3a5af36b

Linus Torvalds authored Sep 14, 2018

Pull cifs fixes from Steve French:
 "Fixes for four CIFS/SMB3 potential pointer overflow issues, one minor
  build fix, and a build warning cleanup"

* tag '4.19-rc3-smb3-cifs' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: read overflow in is_valid_oplock_break()
  cifs: integer overflow in in SMB2_ioctl()
  CIFS: fix wrapping bugs in num_entries()
  cifs: prevent integer overflow in nxt_dir_entry()
  fs/cifs: require sha512
  fs/cifs: suppress a string overflow warning

3a5af36b

Merge tag 'nfs-for-4.19-2' of git://git.linux-nfs.org/projects/anna/linux-nfs · 589109df

Linus Torvalds authored Sep 14, 2018

Pull NFS client bugfixes from Anna Schumaker:
 "These are a handful of fixes for problems that Trond found. Patch #1
  and #3 have the same name, a second issue was found after applying the
  first patch.

  Stable bugfixes:
   - v4.17+: Fix tracepoint Oops in initiate_file_draining()
   - v4.11+: Fix an infinite loop on I/O

  Other fixes:
   - Return errors if a waiting layoutget is killed
   - Don't open code clearing of delegation state"

* tag 'nfs-for-4.19-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  NFS: Don't open code clearing of delegation state
  NFSv4.1 fix infinite loop on I/O.
  NFSv4: Fix a tracepoint Oops in initiate_file_draining()
  pNFS: Ensure we return the error if someone kills a waiting layoutget
  NFSv4: Fix a tracepoint Oops in initiate_file_draining()

589109df

Merge tag 'trace-v4.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 5b945fd2

Linus Torvalds authored Sep 14, 2018

Pull tracing fix from Steven Rostedt:
 "This fixes an issue with the build system caused by a change that
  modifies CC_FLAGS_FTRACE. The issue is that it breaks the dependencies
  and causes "make targz-pkg" to rebuild the entire world"

* tag 'trace-v4.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing/Makefile: Fix handling redefinition of CC_FLAGS_FTRACE

5b945fd2

14 Sep, 2018 10 commits

Merge tag 'devicetree-fixes-for-4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · 090b75bc

Linus Torvalds authored Sep 14, 2018

Pull DeviceTree fix from Rob Herring:
 "One regression for a 20 year old PowerMac:

   - Fix a regression on systems having a DT without any phandles which
     happens on a PowerMac G3"

* tag 'devicetree-fixes-for-4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  of: fix phandle cache creation for DTs with no phandles

090b75bc

Merge tag 'for-linus-4.19c-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · d7c02680

Linus Torvalds authored Sep 14, 2018

Pull xen fixes from Juergen Gross:
 "This contains some minor cleanups and fixes:

   - a new knob for controlling scrubbing of pages returned by the Xen
     balloon driver to the Xen hypervisor to address a boot performance
     issue seen in large guests booted pre-ballooned

   - a fix of a regression in the gntdev driver which made it impossible
     to use fully virtualized guests (HVM guests) with a 4.19 based dom0

   - a fix in Xen cpu hotplug functionality which could be triggered by
     wrong admin commands (setting number of active vcpus to 0)

  One further note: the patches have all been under test for several
  days in another branch. This branch has been rebased in order to avoid
  merge conflicts"

* tag 'for-linus-4.19c-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/gntdev: fix up blockable calls to mn_invl_range_start
  xen: fix GCC warning and remove duplicate EVTCHN_ROW/EVTCHN_COL usage
  xen: avoid crash in disable_hotplug_cpu
  xen/balloon: add runtime control for scrubbing ballooned out pages
  xen/manage: don't complain about an empty value in control/sysrq node

d7c02680

Merge tag 'xtensa-20180914' of git://github.com/jcmvbkbc/linux-xtensa · eae4f885

Linus Torvalds authored Sep 14, 2018

Pull Xtensa fixes and cleanups from Max Filippov:

 - don't allocate memory in platform_setup as the memory allocator is
   not initialized at that point yet;

 - remove unnecessary ifeq KBUILD_SRC from arch/xtensa/Makefile;

 - enable SG chaining in arch/xtensa/Kconfig.

* tag 'xtensa-20180914' of git://github.com/jcmvbkbc/linux-xtensa:
  xtensa: enable SG chaining in Kconfig
  xtensa: remove unnecessary KBUILD_SRC ifeq conditional
  xtensa: ISS: don't allocate memory in platform_setup

eae4f885

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 3e153256

Linus Torvalds authored Sep 14, 2018

Pull arm64 fixes from Will Deacon:
 "The trickle of arm64 fixes continues to come in.

  Nothing that's the end of the world, but we've got a fix for PCI IO
  port accesses, an accidental naked "asm goto" and a fix to the
  vmcoreinfo PT_NOTE merged this time around which we'd like to get
  sorted before it becomes ABI.

   - Fix ioport_map() mapping the wrong physical address for some I/O
     BARs

   - Remove direct use of "asm goto", since some compilers don't like
     that

   - Ensure kimage_voffset is always present in vmcoreinfo PT_NOTE"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  asm-generic: io: Fix ioport_map() for !CONFIG_GENERIC_IOMAP && CONFIG_INDIRECT_PIO
  arm64: kernel: arch_crash_save_vmcoreinfo() should depend on CONFIG_CRASH_CORE
  arm64: jump_label.h: use asm_volatile_goto macro instead of "asm goto"

3e153256

NFS: Don't open code clearing of delegation state · 9f0c5124

Trond Myklebust authored Sep 05, 2018

Add a helper for the case when the nfs4 open state has been set to use
a delegation stateid, and we want to revert to using the open stateid.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

9f0c5124

NFSv4.1 fix infinite loop on I/O. · 994b15b9

Trond Myklebust authored Sep 05, 2018

The previous fix broke recovery of delegated stateids because it assumes
that if we did not mark the delegation as suspect, then the delegation has
effectively been revoked, and so it removes that delegation irrespectively
of whether or not it is valid and still in use. While this is "mostly
harmless" for ordinary I/O, we've seen pNFS fail with LAYOUTGET spinning
in an infinite loop while complaining that we're using an invalid stateid
(in this case the all-zero stateid).

What we rather want to do here is ensure that the delegation is always
correctly marked as needing testing when that is the case. So we want
to close the loophole offered by nfs4_schedule_stateid_recovery(),
which marks the state as needing to be reclaimed, but not the
delegation that may be backing it.

Fixes: 0e3d3e5d ("NFSv4.1 fix infinite loop on IO BAD_STATEID error")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.11+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

994b15b9

NFSv4: Fix a tracepoint Oops in initiate_file_draining() · 2edaead6

Trond Myklebust authored Sep 05, 2018

Now that the value of 'ino' can be NULL or an ERR_PTR(), we need to
change the test in the tracepoint.

Fixes: ce5624f7 ("NFSv4: Return NFS4ERR_DELAY when a layout fails...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

2edaead6

pNFS: Ensure we return the error if someone kills a waiting layoutget · d03360aa

Trond Myklebust authored Sep 05, 2018

If someone interrupts a wait on one or more outstanding layoutgets in
pnfs_update_layout() then return the ERESTARTSYS/EINTR error.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

d03360aa

NFSv4: Fix a tracepoint Oops in initiate_file_draining() · 2a534a74

Trond Myklebust authored Aug 23, 2018

Now that the value of 'ino' can be NULL or an ERR_PTR(), we need to
change the test in the tracepoint.

Fixes: ce5624f7 ("NFSv4: Return NFS4ERR_DELAY when a layout fails...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

2a534a74

Merge tag 'dmaengine-fix-4.19-rc4' of git://git.infradead.org/users/vkoul/slave-dma · f3c0b8ce

Linus Torvalds authored Sep 14, 2018

Pull dmaengine fix from Vinod Koul:
 "Fix the mic_x100_dma driver to use devm_kzalloc for driver memory, so
  that it is freed properly when it unregisters from dmaengine using
  managed API"

* tag 'dmaengine-fix-4.19-rc4' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: mic_x100_dma: use devm_kzalloc to fix an issue

f3c0b8ce