1. 25 Jul, 2019 3 commits
    • Merge tag 'riscv/for-v5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · a51edf75
      Linus Torvalds authored
      Pull RISC-V updates from Paul Walmsley:
       "Four minor RISC-V-related changes:
      
         - Add support for the new clone3 syscall for RV64, relying on the
           generic support
      
         - Add DT data for the gigabit Ethernet controller on the SiFive FU540
           and the HiFive Unleashed board
      
         - Update MAINTAINERS to add me to the arch/riscv maintainers' list
      
         - Add support for PCIe message-signaled interrupts by reusing the
           generic header file"
      
      * tag 'riscv/for-v5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: dts: Add DT node for SiFive FU540 Ethernet controller driver
        riscv: include generic support for MSI irqdomains
        MAINTAINERS: Add Paul as a RISC-V maintainer
        riscv: enable sys_clone3 syscall for rv64
    • Merge tag 'ktest-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest · da3cc2e6
      Linus Torvalds authored
      Pull ktest fixlets from Steven Rostedt:
       "This contains only simple spelling fixes"
      
      * tag 'ktest-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
        ktest: Fix some typos in config-bisect.pl
    • Merge branch 'access-creds' · a29a0a46
      Linus Torvalds authored
      The access() (and faccessat()) credentials change can cause an
      unnecessary load on the RCU machinery because every access() call ends
      up freeing the temporary access credential using RCU.
      
      This isn't really noticeable on small machines, but if you have hundreds
      of cores you can cause huge slowdowns due to RCU storms.
      
      It's easy to avoid: the temporary access credentials aren't actually
      normally accessed using RCU at all, so we can avoid the whole issue by
      just marking them as such.
      
      * access-creds:
        access: avoid the RCU grace period for the temporary subjective credentials
  2. 24 Jul, 2019 7 commits
    • Masanari Iida's avatar
      aecea57f
    • access: avoid the RCU grace period for the temporary subjective credentials · d7852fbd
      Linus Torvalds authored
      It turns out that 'access()' (and 'faccessat()') can cause a lot of RCU
      work because it installs a temporary credential that gets allocated and
      freed for each system call.
      
      The allocation and freeing overhead is mostly benign, but because
      credentials can be accessed under the RCU read lock, the freeing
      involves an RCU grace period.
      
      Which is not a huge deal normally, but if you have a lot of access()
      calls, this causes a fair amount of secondary damage: instead of having a
      nice alloc/free pattern that hits in hot per-CPU slab caches, you have
      all those delayed frees, and on big machines with hundreds of cores,
      the RCU overhead can end up being enormous.
      
      But it turns out that all of this is entirely unnecessary.  Exactly
      because access() only installs the credential as the thread-local
      subjective credential, the temporary cred pointer doesn't actually need
      to be RCU free'd at all.  Once we're done using it, we can just free it
      synchronously and avoid all the RCU overhead.
      
      So add a 'non_rcu' flag to 'struct cred', which can be set by users that
      know they only use it in non-RCU context (there are other potential
      users for this).  We can make it a union with the rcu freeing list head
      that we need for the RCU case, so this doesn't need any extra storage.
      
      Note that this also makes 'get_current_cred()' clear the new non_rcu
      flag, in case we have filesystems that take a long-term reference to the
      cred and then expect the RCU delayed freeing afterwards.  It's not
      entirely clear that this is required, but it makes for clear semantics:
      the subjective cred remains non-RCU as long as you only access it
      synchronously using the thread-local accessors, but you _can_ use it as
      a generic cred if you want to.
      
      It is possible that we should just remove the whole RCU markings for
      ->cred entirely.  Only ->real_cred is really supposed to be accessed
      through RCU, and the long-term cred copies that nfs uses might want to
      explicitly re-enable RCU freeing if required, rather than have
      get_current_cred() do it implicitly.
      
      But this is a "minimal semantic changes" change for the immediate
      problem.
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Paul E. McKenney <paulmck@linux.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Jan Glauber <jglauber@marvell.com>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Jayachandran Chandrasekharan Nair <jnair@marvell.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
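
The 'non_rcu' idea from the commit message above can be illustrated with a small userspace sketch. This is not the kernel's code: `struct cred_sketch`, `struct rcu_head_sketch`, `cred_alloc()`, `cred_put()`, `cred_get_generic()` and the two counters are hypothetical names, the reference count is a plain int rather than an atomic, and the "deferred" path only counts what would have waited for a grace period instead of actually deferring anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's RCU callback head. */
struct rcu_head_sketch {
	struct rcu_head_sketch *next;
	void (*func)(struct rcu_head_sketch *head);
};

struct cred_sketch {
	int usage;                          /* reference count (not atomic here) */
	union {
		int non_rcu;                /* nonzero: only used thread-locally */
		struct rcu_head_sketch rcu; /* needed only once freeing is queued */
	};                                  /* sharing storage costs no extra space */
};

static int immediate_frees; /* frees done synchronously */
static int deferred_frees;  /* frees that would have waited for a grace period */

/* Allocate a credential holding one reference. */
static struct cred_sketch *cred_alloc(bool non_rcu)
{
	struct cred_sketch *c = calloc(1, sizeof(*c));

	assert(c != NULL);
	c->usage = 1;
	c->non_rcu = non_rcu;
	return c;
}

/* Take a long-term reference; clears the flag so the cred becomes generic. */
static struct cred_sketch *cred_get_generic(struct cred_sketch *c)
{
	c->non_rcu = 0;
	c->usage++;
	return c;
}

/* Drop a reference; the last put picks the free path from the flag. */
static void cred_put(struct cred_sketch *c)
{
	if (--c->usage > 0)
		return;
	if (c->non_rcu)
		immediate_frees++;  /* no grace period needed */
	else
		deferred_frees++;   /* the kernel would defer this free via RCU */
	free(c);                    /* the sketch frees right away in both cases */
}
```

In this model a temporary credential that is allocated, used, and dropped without ever being handed out takes the immediate path, while one that passed through `cred_get_generic()` falls back to the deferred path, which mirrors the semantics the message describes for 'get_current_cred()'.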
    • Merge tag 'powerpc-5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · bed38c3e
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "An assortment of non-regression fixes that have accumulated since the
        start of the merge window.
      
         - A fix for a user-triggerable oops on machines where transactional
           memory is disabled, e.g. Power9 bare metal, Power8 with TM disabled
           on the command line, or all Power7 or earlier machines.
      
         - Three fixes for handling of PMU and power saving registers when
           running nested KVM on Power9.
      
         - Two fixes for bugs found while stress testing the XIVE interrupt
           controller code, also on Power9.
      
         - A fix to allow guests to boot under Qemu/KVM on Power9 using
           the Hash MMU with >= 1TB of memory.
      
         - Two fixes for bugs in the recent DMA cleanup, one of which could
           lead to checkstops.
      
         - And finally three fixes for the PAPR SCM nvdimm driver.
      
        Thanks to: Alexey Kardashevskiy, Andrea Arcangeli, Cédric Le Goater,
        Christoph Hellwig, David Gibson, Gautham R. Shenoy, Michael Neuling,
        Oliver O'Halloran, Satheesh Rajendran, Shawn Anastasio, Suraj Jitindar
        Singh, Vaibhav Jain"
      
      * tag 'powerpc-5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/papr_scm: Force a scm-unbind if initial scm-bind fails
        powerpc/papr_scm: Update drc_pmem_unbind() to use H_SCM_UNBIND_ALL
        powerpc/pseries: Update SCM hcall op-codes in hvcall.h
        powerpc/tm: Fix oops on sigreturn on systems without TM
        powerpc/dma: Fix invalid DMA mmap behavior
        KVM: PPC: Book3S HV: XIVE: fix rollback when kvmppc_xive_create fails
        powerpc/xive: Fix loop exit-condition in xive_find_target_in_mask()
        powerpc: fix off by one in max_zone_pfn initialization for ZONE_DMA
        KVM: PPC: Book3S HV: Save and restore guest visible PSSCR bits on pseries
        powerpc/pmu: Set pmcregs_in_use in paca when running as LPAR
        KVM: PPC: Book3S HV: Always save guest pmu for guest capable of nesting
        powerpc/mm: Limit rma_size to 1TB when running without HV mode
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 76260774
      Linus Torvalds authored
      Pull KVM fixes from Paolo Bonzini:
       "Bugfixes, a pvspinlock optimization, and documentation moving"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: X86: Boost queue head vCPU to mitigate lock waiter preemption
        Documentation: move Documentation/virtual to Documentation/virt
        KVM: nVMX: Set cached_vmcs12 and cached_shadow_vmcs12 NULL after free
        KVM: X86: Dynamically allocate user_fpu
        KVM: X86: Fix fpu state crash in kvm guest
        Revert "kvm: x86: Use task structs fpu field for user"
        KVM: nVMX: Clear pending KVM_REQ_GET_VMCS12_PAGES when leaving nested
    • Merge tag 'dma-mapping-5.3-2' of git://git.infradead.org/users/hch/dma-mapping · c2626876
      Linus Torvalds authored
      Pull dma-mapping regression fix from Christoph Hellwig:
       "Ensure that dma_addressing_limited doesn't crash on devices without a
        dma mask (Eric Auger)"
      
      * tag 'dma-mapping-5.3-2' of git://git.infradead.org/users/hch/dma-mapping:
        dma-mapping: use dma_get_mask in dma_addressing_limited
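
The kind of fix described in this pull request (reading the DMA mask through a helper that tolerates a missing mask, instead of dereferencing the pointer directly) can be sketched in userspace as follows. Everything here is a hypothetical stand-in: `struct device_sketch`, `bit_mask()`, `get_mask_or_default()` and `addressing_limited()` are not kernel APIs, and the 32-bit fallback is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in; not the kernel's struct device. */
struct device_sketch {
	const uint64_t *dma_mask; /* NULL when the device never set a mask */
	uint64_t required_mask;   /* mask needed to reach all of memory */
};

/* Build a mask with the low n bits set (valid for 1 <= n <= 64). */
static uint64_t bit_mask(unsigned int n)
{
	return n >= 64 ? UINT64_MAX : ((UINT64_C(1) << n) - 1);
}

/* Null-safe accessor: fall back to a default instead of dereferencing NULL. */
static uint64_t get_mask_or_default(const struct device_sketch *dev)
{
	if (dev->dma_mask != NULL && *dev->dma_mask != 0)
		return *dev->dma_mask;
	return bit_mask(32); /* assumed fallback for this sketch */
}

/* True when the device cannot address everything it may be asked to reach. */
static bool addressing_limited(const struct device_sketch *dev)
{
	return get_mask_or_default(dev) < dev->required_mask;
}
```

With the accessor in place, a device whose `dma_mask` pointer is NULL is simply treated as having the fallback mask, so the comparison is well defined rather than a NULL dereference.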
    • KVM: X86: Boost queue head vCPU to mitigate lock waiter preemption · 266e85a5
      Wanpeng Li authored
      Commit 11752adb (locking/pvqspinlock: Implement hybrid PV queued/unfair locks)
      introduces hybrid PV queued/unfair locks
       - queued mode (no starvation)
       - unfair mode (good performance on locks that are not heavily contended)
      The lock waiter goes into unfair mode especially in VMs with over-committed
      vCPUs, since increasing over-commitment increases the likelihood that the queue
      head vCPU has been preempted and is not actively spinning.
      
      However, promptly rescheduling the queue head vCPU so that it can acquire
      the lock still performs better than relying on lock stealing alone in
      over-subscribed scenarios.
      
      Testing on 80 HT 2 socket Xeon Skylake server, with 80 vCPUs VM 80GB RAM:
      ebizzy -M
                   vanilla     boosting    improved
       1VM          23520        25040         6%
       2VM           8000        13600        70%
       3VM           3100         5400        74%
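
The "improved" column in the table above appears to be the relative gain of the boosting figure over the vanilla figure, rounded to a whole percent. A minimal check under that assumption (the helper name `improvement_pct()` is hypothetical):

```c
#include <assert.h>

/* Relative gain of `boosted` over `vanilla`, in percent. */
static double improvement_pct(double vanilla, double boosted)
{
	assert(vanilla > 0.0);
	return (boosted - vanilla) / vanilla * 100.0;
}
```

Applied to the three rows, `improvement_pct(23520, 25040)`, `improvement_pct(8000, 13600)` and `improvement_pct(3100, 5400)` give roughly 6.5, 70 and 74.2, which agrees with the reported 6%, 70% and 74%.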
      
      On unlock, the lock holder vCPU yields to the queue head vCPU. This boosts
      a queue head vCPU that was either involuntarily preempted or voluntarily
      halted after failing to acquire the lock following a short spin in the guest.
      
      Cc: Waiman Long <longman@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • Documentation: move Documentation/virtual to Documentation/virt · 2f5947df
      Christoph Hellwig authored
      Renaming docs seems to be en vogue at the moment, so fix one of the
      grossly misnamed directories.  We usually never use "virtual" as
      a shortcut for virtualization in the kernel, but always virt,
      as seen in the virt/ top-level directory.  Fix up the documentation
      to match that.
      
      Fixes: ed16648e ("Move kvm, uml, and lguest subdirectories under a common "virtual" directory, I.E:")
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  3. 23 Jul, 2019 2 commits
  4. 22 Jul, 2019 21 commits
  5. 21 Jul, 2019 7 commits