1. 20 Jul, 2019 26 commits
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c6dd78fc
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "A set of x86 specific fixes and updates:
      
         - The CR2 corruption fixes which store CR2 early in the entry code
           and hand the stored address to the fault handlers.
      
         - Revert a forgotten leftover of the dropped FSGSBASE series.
      
         - Plug a memory leak in the boot code.
      
         - Make the Hyper-V assist functionality robust by zeroing the shadow
           page.
      
         - Remove a useless check for dead processes with LDT
      
         - Update paravirt and VMware maintainers entries.
      
         - A few cleanup patches addressing various compiler warnings"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/entry/64: Prevent clobbering of saved CR2 value
        x86/hyper-v: Zero out the VP ASSIST PAGE on allocation
        x86, boot: Remove multiple copy of static function sanitize_boot_params()
        x86/boot/compressed/64: Remove unused variable
        x86/boot/efi: Remove unused variables
        x86/mm, tracing: Fix CR2 corruption
        x86/entry/64: Update comments and sanity tests for create_gap
        x86/entry/64: Simplify idtentry a little
        x86/entry/32: Simplify common_exception
        x86/paravirt: Make read_cr2() CALLEE_SAVE
        MAINTAINERS: Update PARAVIRT_OPS_INTERFACE and VMWARE_HYPERVISOR_INTERFACE
        x86/process: Delete useless check for dead process with LDT
        x86: math-emu: Hide clang warnings for 16-bit overflow
        x86/e820: Use proper booleans instead of 0/1
        x86/apic: Silence -Wtype-limits compiler warnings
        x86/mm: Free sme_early_buffer after init
        x86/boot: Fix memory leak in default_get_smp_config()
        Revert "x86/ptrace: Prevent ptrace from clearing the FS/GS selector" and fix the test
    • Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 46f5c0cc
      Linus Torvalds authored
      Pull perf tooling updates from Thomas Gleixner:
       "A set of perf improvements and fixes:
      
        perf db-export:
         - Improvements in how COMM details are exported to databases for post
           processing and use in the sql-viewer.py UI.
      
         - Export switch events to the database.
      
        BPF:
         - Bump rlimit(MEMLOCK) for 'perf test bpf' and 'perf trace', just
           like selftests/bpf/bpf_rlimit.h does, which makes errors due to
           exhaustion of this limit, which are kinda cryptic (EPERM sometimes),
           less frequent.
      
        perf version:
         - Fix segfault due to missing OPT_END(), noticed on PowerPC.
      
        perf vendor events:
         - Add JSON files for IBM s/390 machine type 8561.
      
        perf cs-etm (ARM):
         - Fix two cases of error returns not being done properly: invalid
           ERR_PTR() use and loss of error code propagation"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits)
        perf version: Fix segfault due to missing OPT_END()
        perf vendor events s390: Add JSON files for machine type 8561
        perf cs-etm: Return errcode in cs_etm__process_auxtrace_info()
        perf cs-etm: Remove errnoeous ERR_PTR() usage in cs_etm__process_auxtrace_info
        perf scripts python: export-to-postgresql.py: Export switch events
        perf scripts python: export-to-sqlite.py: Export switch events
        perf db-export: Export switch events
        perf db-export: Factor out db_export__threads()
        perf script: Add scripting operation process_switch()
        perf scripts python: exported-sql-viewer.py: Use new 'has_calls' column
        perf scripts python: exported-sql-viewer.py: Remove redundant semi-colons
        perf scripts python: export-to-postgresql.py: Add has_calls column to comms table
        perf scripts python: export-to-sqlite.py: Add has_calls column to comms table
        perf db-export: Also export thread's current comm
        perf db-export: Factor out db_export__comm()
        perf scripts python: export-to-postgresql.py: Export comm details
        perf scripts python: export-to-sqlite.py: Export comm details
        perf db-export: Export comm details
        perf db-export: Fix a white space issue in db_export__sample()
        perf db-export: Move export__comm_thread into db_export__sample()
        ...
    • Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e6023adc
      Linus Torvalds authored
      Pull core fixes from Thomas Gleixner:
      
       - A collection of objtool fixes which address recent fallout partially
         exposed by newer toolchains, clang, BPF and general code changes.
      
       - Force USER_DS for user stack traces
      
      [ Note: the "objtool fixes" are not all to objtool itself, but for
        kernel code that triggers objtool warnings.
      
        Things like missing function size annotations, or code that confuses
        the unwinder etc.   - Linus]
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
        objtool: Support conditional retpolines
        objtool: Convert insn type to enum
        objtool: Fix seg fault on bad switch table entry
        objtool: Support repeated uses of the same C jump table
        objtool: Refactor jump table code
        objtool: Refactor sibling call detection logic
        objtool: Do frame pointer check before dead end check
        objtool: Change dead_end_function() to return boolean
        objtool: Warn on zero-length functions
        objtool: Refactor function alias logic
        objtool: Track original function across branches
        objtool: Add mcsafe_handle_tail() to the uaccess safe list
        bpf: Disable GCC -fgcse optimization for ___bpf_prog_run()
        x86/uaccess: Remove redundant CLACs in getuser/putuser error paths
        x86/uaccess: Don't leak AC flag into fentry from mcsafe_handle_tail()
        x86/uaccess: Remove ELF function annotation from copy_user_handle_tail()
        x86/head/64: Annotate start_cpu0() as non-callable
        x86/entry: Fix thunk function ELF sizes
        x86/kvm: Don't call kvm_spurious_fault() from .fixup
        x86/kvm: Replace vmx_vmenter()'s call to kvm_spurious_fault() with UD2
        ...
    • Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 4b01f5a4
      Linus Torvalds authored
      Pull smp fix from Thomas Gleixner:
       "Add warnings to the smp function calls so callers from wrong contexts
        get detected"
      
      * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        smp: Warn on function calls from softirq context
    • Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 70e6e1b9
      Linus Torvalds authored
      Pull CONFIG_PREEMPT_RT stub config from Thomas Gleixner:
       "The real-time preemption patch set has existed for almost 15 years now
        and while the vast majority of infrastructure and enhancements have
        found their way into the mainline kernel, the final integration of RT
        is still missing.
      
        Over the course of the last few years, we have worked on reducing the
        intrusiveness of the RT patches by refactoring kernel infrastructure
        to be more real-time friendly. Almost all of these changes were
        beneficial to the mainline kernel on their own, so there was no
        objection to integrating them.
      
        Though except for the still ongoing printk refactoring, the remaining
        changes which are required to make RT a first class mainline citizen
        are no longer arguable as immediately beneficial for the mainline
        kernel. Most of them are either reordering code flows or adding RT
        specific functionality.
      
        But this now has hit a wall and turned into a classic chicken-and-egg
        problem:

           Maintainers are rightfully wary of these changes, as they only make
           sense if the final integration of RT into the mainline kernel takes
           place.
      
        Adding CONFIG_PREEMPT_RT aims to solve this as a clear sign that RT
        will be fully integrated into the mainline kernel. The final
        integration of the missing bits and pieces will be of course done with
        the same careful approach as we have used in the past.
      
        While I'm aware that you are not entirely enthusiastic about that, I
        think that RT should receive the same treatment as any other widely
        used out of tree functionality, which we have accepted into mainline
        over the years.
      
        RT has become the de-facto standard real-time enhancement and is
        shipped by enterprise, embedded and community distros. It's in use
        throughout a wide range of industries: telecommunications, industrial
        automation, professional audio, medical devices, data acquisition,
        automotive - just to name a few major use cases.
      
        RT development is backed by a Linux Foundation project which is
        supported by major stakeholders of this technology. The funding will
        continue over the actual inclusion into mainline to make sure that the
        functionality neither introduces regressions, regresses itself, nor
        becomes subject to bitrot. There is also a lively user community
        around RT, so contrary to the grim situation 5 years ago, it's
        a healthy project.
      
        As RT is still a good vehicle to exercise rarely used code paths and
        to detect hard to trigger issues, you could at least view it as a QA
        tool if nothing else"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/rt, Kconfig: Introduce CONFIG_PREEMPT_RT
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 07ab9d5b
      Linus Torvalds authored
      Pull more KVM updates from Paolo Bonzini:
       "Mostly bugfixes, but also:
      
         - s390 support for KVM selftests
      
         - LAPIC timer offloading to housekeeping CPUs
      
         - Extend an s390 optimization for overcommitted hosts to all
           architectures
      
         - Debugging cleanups and improvements"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits)
        KVM: x86: Add fixed counters to PMU filter
        KVM: nVMX: do not use dangling shadow VMCS after guest reset
        KVM: VMX: dump VMCS on failed entry
        KVM: x86/vPMU: refine kvm_pmu err msg when event creation failed
        KVM: s390: Use kvm_vcpu_wake_up in kvm_s390_vcpu_wakeup
        KVM: Boost vCPUs that are delivering interrupts
        KVM: selftests: Remove superfluous define from vmx.c
        KVM: SVM: Fix detection of AMD Errata 1096
        KVM: LAPIC: Inject timer interrupt via posted interrupt
        KVM: LAPIC: Make lapic timer unpinned
        KVM: x86/vPMU: reset pmc->counter to 0 for pmu fixed_counters
        KVM: nVMX: Ignore segment base for VMX memory operand when segment not FS or GS
        kvm: x86: ioapic and apic debug macros cleanup
        kvm: x86: some tsc debug cleanup
        kvm: vmx: fix coccinelle warnings
        x86: kvm: avoid constant-conversion warning
        x86: kvm: avoid -Wsometimes-uninitized warning
        KVM: x86: expose AVX512_BF16 feature to guest
        KVM: selftests: enable pgste option for the linker on s390
        KVM: selftests: Move kvm_create_max_vcpus test to generic code
        ...
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · f65420df
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "This is the final round of mostly small fixes in our initial submit.
      
        It's mostly minor fixes and driver updates. The only change of note is
        adding a virt_boundary_mask to the SCSI host and host template to
        parametrise this for NVMe devices instead of having them do a call in
        slave_alloc. It's a fairly straightforward conversion except in the
        two NVMe handling drivers that didn't set it, which now have a virtual
        infinity parameter added"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (24 commits)
        scsi: megaraid_sas: set an unlimited max_segment_size
        scsi: mpt3sas: set an unlimited max_segment_size for SAS 3.0 HBAs
        scsi: IB/srp: set virt_boundary_mask in the scsi host
        scsi: IB/iser: set virt_boundary_mask in the scsi host
        scsi: storvsc: set virt_boundary_mask in the scsi host template
        scsi: ufshcd: set max_segment_size in the scsi host template
        scsi: core: take the DMA max mapping size into account
        scsi: core: add a host / host template field for the virt boundary
        scsi: core: Fix race on creating sense cache
        scsi: sd_zbc: Fix compilation warning
        scsi: libfc: fix null pointer dereference on a null lport
        scsi: zfcp: fix GCC compiler warning emitted with -Wmaybe-uninitialized
        scsi: zfcp: fix request object use-after-free in send path causing wrong traces
        scsi: zfcp: fix request object use-after-free in send path causing seqno errors
        scsi: megaraid_sas: Update driver version to 07.710.50.00
        scsi: megaraid_sas: Add module parameter for FW Async event logging
        scsi: megaraid_sas: Enable msix_load_balance for Invader and later controllers
        scsi: megaraid_sas: Fix calculation of target ID
        scsi: lpfc: reduce stack size with CONFIG_GCC_PLUGIN_STRUCTLEAK_VERBOSE
        scsi: devinfo: BLIST_TRY_VPD_PAGES for SanDisk Cruzer Blade
        ...
    • Merge tag 'kbuild-v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild · 168c7997
      Linus Torvalds authored
      Pull more Kbuild updates from Masahiro Yamada:
      
       - match the directory structure of the linux-libc-dev package to that
         of Debian-based distributions
      
       - fix incorrect include/config/auto.conf generation when Kconfig
         creates it along with the .config file
      
       - remove misleading $(AS) from documents
      
       - clean up precious tag files by distclean instead of mrproper
      
       - add a new coccinelle patch for devm_platform_ioremap_resource
         migration
      
       - refactor module-related scripts to read modules.order instead of
         $(MODVERDIR)/*.mod files to get the list of created modules
      
       - remove MODVERDIR
      
       - update list of header compile-test
      
       - add -fcf-protection=none flag to avoid conflict with the retpoline
         flags when CONFIG_RETPOLINE=y
      
       - misc cleanups
      
      * tag 'kbuild-v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (25 commits)
        kbuild: add -fcf-protection=none when using retpoline flags
        kbuild: update compile-test header list for v5.3-rc1
        kbuild: split out *.mod out of {single,multi}-used-m rules
        kbuild: remove 'prepare1' target
        kbuild: remove the first line of *.mod files
        kbuild: create *.mod with full directory path and remove MODVERDIR
        kbuild: export_report: read modules.order instead of .tmp_versions/*.mod
        kbuild: modpost: read modules.order instead of $(MODVERDIR)/*.mod
        kbuild: modsign: read modules.order instead of $(MODVERDIR)/*.mod
        kbuild: modinst: read modules.order instead of $(MODVERDIR)/*.mod
        scsi: remove pointless $(MODVERDIR)/$(obj)/53c700.ver
        kbuild: remove duplication from modules.order in sub-directories
        kbuild: get rid of kernel/ prefix from in-tree modules.{order,builtin}
        kbuild: do not create empty modules.order in the prepare stage
        coccinelle: api: add devm_platform_ioremap_resource script
        kbuild: compile-test headers listed in header-test-m as well
        kbuild: remove unused hostcc-option
        kbuild: remove tag files by distclean instead of mrproper
        kbuild: add --hash-style= and --build-id unconditionally
        kbuild: get rid of misleading $(AS) from documents
        ...
    • Merge branch 'work.dcache2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 18253e03
      Linus Torvalds authored
      Pull dcache and mountpoint updates from Al Viro:
       "Saner handling of refcounts to mountpoints.
      
        Transfer the counting reference from struct mount ->mnt_mountpoint
        over to struct mountpoint ->m_dentry. That allows us to get rid of the
        convoluted games with ordering of mount shutdowns.
      
        The cost is in teaching shrink_dcache_{parent,for_umount} to cope with
        mixed-filesystem shrink lists, which we'll also need for the Slab
        Movable Objects patchset"
      
      * 'work.dcache2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        switch the remnants of releasing the mountpoint away from fs_pin
        get rid of detach_mnt()
        make struct mountpoint bear the dentry reference to mountpoint, not struct mount
        Teach shrink_dcache_parent() to cope with mixed-filesystem shrink lists
        fs/namespace.c: shift put_mountpoint() to callers of unhash_mnt()
        __detach_mounts(): lookup_mountpoint() can't return ERR_PTR() anymore
        nfs: dget_parent() never returns NULL
        ceph: don't open-code the check for dead lockref
    • x86/entry/64: Prevent clobbering of saved CR2 value · 6879298b
      Thomas Gleixner authored
      The recent fix for CR2 corruption introduced a new way to reliably corrupt
      the saved CR2 value.
      
      CR2 is saved early in the entry code in RDX, which is the third argument to
      the fault handling functions. But it missed that between saving and
      invoking the fault handler enter_from_user_mode() can be called. RDX is a
      caller saved register so the invoked function can freely clobber it with
      the obvious consequences.
      
      The TRACE_IRQS_OFF call is safe as it calls through the thunk which
      preserves RDX, but TRACE_IRQS_OFF_DEBUG is not because it also calls into
      C-code outside of the thunk.
      
      Store CR2 in R12 instead which is a callee saved register and move R12 to
      RDX just before calling the fault handler.
      
      Fixes: a0d14b89 ("x86/mm, tracing: Fix CR2 corruption")
      Reported-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1907201020540.1782@nanos.tec.linutronix.de
    • smp: Warn on function calls from softirq context · 19dbdcb8
      Peter Zijlstra authored
      It's clearly documented that smp function calls cannot be invoked from
      softirq handling context. Unfortunately nothing enforces that or emits a
      warning.
      
      A single function call can be invoked from softirq context only via
      smp_call_function_single_async().
      
      The only legit context is task context, so add a warning to that effect.
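
The intent of the check described above can be modeled in user space. This is only a sketch under stated assumptions: the `exec_context` enum, `warn_on_once()` and `smp_call_context_ok()` are hypothetical names made up for illustration, and the kernel derives the current context from its own state instead of taking it as an argument.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of the contexts a caller can be in. */
enum exec_context { CTX_TASK, CTX_SOFTIRQ, CTX_HARDIRQ };

/* Number of warnings printed so far (for demonstration only). */
static int warn_count;

/*
 * Loosely mimics a warn-once helper: print on the first violation only,
 * but always return the condition so the caller can react to it.
 * Assumption: a single call site, so one static flag is enough here.
 */
static bool warn_on_once(bool cond, const char *what)
{
    static bool warned;

    if (cond && !warned) {
        warned = true;
        warn_count++;
        fprintf(stderr, "WARNING: %s\n", what);
    }
    return cond;
}

/* Returns true only when the call is made from task context. */
static bool smp_call_context_ok(enum exec_context ctx)
{
    return !warn_on_once(ctx != CTX_TASK,
                         "smp function call outside task context");
}
```

In this model, repeated calls from a wrong context are all rejected, but the warning text is emitted just once, which keeps a misbehaving caller from flooding the log.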
      Reported-by: luferry <luferry@163.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/20190718160601.GP3402@hirez.programming.kicks-ass.net
    • KVM: x86: Add fixed counters to PMU filter · 30cd8604
      Eric Hankland authored
      Updates KVM_CAP_PMU_EVENT_FILTER so it can also whitelist or blacklist
      fixed counters.
      Signed-off-by: Eric Hankland <ehankland@google.com>
      [No need to check padding fields for zero. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nVMX: do not use dangling shadow VMCS after guest reset · 88dddc11
      Paolo Bonzini authored
      If a KVM guest is reset while running a nested guest, free_nested will
      disable the shadow VMCS execution control in the vmcs01.  However,
      on the next KVM_RUN vmx_vcpu_run would nevertheless try to sync
      the VMCS12 to the shadow VMCS which has since been freed.
      
      This causes a vmptrld of a NULL pointer on my machine, but Jan reports
      that the host hangs altogether.  Let's see how much this trivial patch fixes.
      Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
      Cc: Liran Alon <liran.alon@oracle.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: VMX: dump VMCS on failed entry · 3b20e03a
      Paolo Bonzini authored
      This is useful for debugging, and is ratelimited nowadays.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86/vPMU: refine kvm_pmu err msg when event creation failed · 6fc3977c
      Like Xu authored
      If perf_event creation fails for any reason in the host perf
      subsystem, there is no chance to log the corresponding event for the
      guest, which may cause abnormal sampling data in the guest's results.
      In debug mode, this message helps in understanding the state of the
      vPMC; the number of occurrences need not be limited, as long as the
      log is not spammed.
      Suggested-by: Joe Perches <joe@perches.com>
      Signed-off-by: Like Xu <like.xu@linux.intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: s390: Use kvm_vcpu_wake_up in kvm_s390_vcpu_wakeup · d9847409
      Wanpeng Li authored
      Use kvm_vcpu_wake_up() in kvm_s390_vcpu_wakeup().
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: Boost vCPUs that are delivering interrupts · d73eb57b
      Wanpeng Li authored
      Inspired by commit 9cac38dd (KVM/s390: Set preempted flag during
      vcpu wakeup and interrupt delivery), we want to boost not just
      lock holders but also vCPUs that are delivering interrupts. Most
      smp_call_function_many calls are synchronous, so the IPI target vCPUs
      are also good yield candidates.  This patch introduces vcpu->ready to
      boost vCPUs during wakeup and interrupt delivery time; unlike s390 we do
      not reuse vcpu->preempted so that voluntarily preempted vCPUs are taken
      into account by kvm_vcpu_on_spin, but vmx_vcpu_pi_put is not affected
      (VT-d PI handles voluntary preemption separately, in pi_pre_block).
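
The role of the separate `ready` flag can be illustrated with a small user-space model. This is a simplified sketch, not the KVM code: `struct vcpu_model` and the `model_*` helpers are hypothetical, and the selection loop here simply takes the first candidate, whereas the real candidate selection applies further conditions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for the per-vCPU state discussed above. */
struct vcpu_model {
    bool preempted; /* involuntarily scheduled out */
    bool ready;     /* runnable and worth boosting */
};

/* Wakeup or interrupt delivery: mark the target as a yield candidate
 * without touching the preempted flag. */
static void model_wake_up(struct vcpu_model *v)
{
    v->ready = true;
}

/* Involuntary preemption: both flags are set in this model. */
static void model_preempt(struct vcpu_model *v)
{
    v->preempted = true;
    v->ready = true;
}

/* The vCPU runs again, so it is no longer a candidate. */
static void model_sched_in(struct vcpu_model *v)
{
    v->preempted = false;
    v->ready = false;
}

/* Return the index of the first vCPU other than `self` whose ready flag
 * is set, or -1 if there is no candidate. */
static int model_pick_yield_target(const struct vcpu_model *vcpus, size_t n,
                                   size_t self)
{
    for (size_t i = 0; i < n; i++) {
        if (i == self)
            continue;
        if (vcpus[i].ready)
            return (int)i;
    }
    return -1;
}
```

Keeping `ready` apart from `preempted` means a vCPU that was merely woken up for an interrupt becomes a boost candidate, while code that only cares about involuntary preemption can keep looking at `preempted` alone.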
      
      Testing on 80 HT 2 socket Xeon Skylake server, with 80 vCPUs VM 80GB RAM:
      ebizzy -M
      
                  vanilla     boosting    improved
      1VM          21443       23520         9%
      2VM           2800        8000       180%
      3VM           1800        3100        72%
      
      Testing on my Haswell desktop 8 HT, with 8 vCPUs VM 8GB RAM, two VMs,
      one running ebizzy -M, the other running 'stress --cpu 2':
      
      w/ boosting + w/o pv sched yield(vanilla)
      
                  vanilla     boosting   improved
                    1570         4000      155%
      
      w/ boosting + w/ pv sched yield(vanilla)
      
                  vanilla     boosting   improved
                    1844         5157      179%
      
      w/o boosting, perf top in VM:
      
       72.33%  [kernel]       [k] smp_call_function_many
        4.22%  [kernel]       [k] call_function_i
        3.71%  [kernel]       [k] async_page_fault
      
      w/ boosting, perf top in VM:
      
       38.43%  [kernel]       [k] smp_call_function_many
        6.31%  [kernel]       [k] async_page_fault
        6.13%  libc-2.23.so   [.] __memcpy_avx_unaligned
        4.88%  [kernel]       [k] call_function_interrupt
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Paul Mackerras <paulus@ozlabs.org>
      Cc: Marc Zyngier <maz@kernel.org>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: selftests: Remove superfluous define from vmx.c · 2417c870
      Thomas Huth authored
      The code in vmx.c does not use "program_invocation_name", so there
      is no need to "#define _GNU_SOURCE" here.
      Signed-off-by: Thomas Huth <thuth@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: SVM: Fix detection of AMD Errata 1096 · 118154bd
      Liran Alon authored
      When the CPU raises #NPF on a guest data access and guest CR4.SMAP=1, it
      is possible that the CPU microcode implementing DecodeAssist will fail
      to read the bytes of the instruction which caused #NPF. This is AMD errata
      1096 and it happens because the CPU microcode reading instruction bytes
      incorrectly attempts to read code as implicit supervisor-mode data
      accesses (that is, just like it would read e.g. a TSS), which are
      susceptible to SMAP faults. The microcode reads CS:RIP and if it is
      a user-mode address according to the page tables, the processor
      gives up and returns no instruction bytes.  In this case, the
      GuestIntrBytes field of the VMCB on a VMEXIT will incorrectly
      return 0 instead of the correct guest instruction bytes.
      
      Current KVM code attempts to detect and work around this errata, but it
      has multiple issues:
      
      1) It mistakenly checks if guest CR4.SMAP=0 instead of guest CR4.SMAP=1,
      which is required for encountering a SMAP fault.
      
      2) It assumes SMAP faults can only occur when guest CPL==3.
      However, in case guest CR4.SMEP=0, the guest can execute an instruction
      which resides in a user-accessible page with CPL<3 privilege. If this
      instruction raises a #NPF on its data access, then the CPU DecodeAssist
      microcode will still encounter a SMAP violation.  Even though no sane
      OS will do so (as it's an obvious privilege escalation vulnerability),
      we still need to handle this semantically correctly on the KVM side.
      
      Note that (2) *is* a useful optimization, because CR4.SMAP=1 is an easily
      triggerable condition and guests usually enable SMAP together with SMEP.
      If the vCPU has CR4.SMEP=1, the errata could indeed be encountered only
      at guest CPL==3; otherwise, the CPU would raise a SMEP fault to the guest
      instead of #NPF.  We keep this condition to avoid false positives in
      the detection of the errata.
      
      In addition, to avoid future confusion and improve code readability,
      include details of the errata in the code and not just in the commit message.
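
The detection condition that follows from points (1) and (2) can be written down as a small predicate. This is a sketch based only on the reasoning in this message: `erratum_1096_possible()` and its parameters are hypothetical names, and the surrounding handling of a zero instruction length is not shown.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true when the erratum could explain a #NPF
 * exit that reported zero instruction bytes.
 *
 *   cr4_smap: guest CR4.SMAP is set
 *   cr4_smep: guest CR4.SMEP is set
 *   cpl:      guest current privilege level (0..3)
 */
static bool erratum_1096_possible(bool cr4_smap, bool cr4_smep, int cpl)
{
    bool is_user = (cpl == 3);

    /* Issue (1): a SMAP fault requires CR4.SMAP=1, not CR4.SMAP=0. */
    if (!cr4_smap)
        return false;

    /*
     * Issue (2): with SMEP clear, code in a user-accessible page can run
     * at CPL<3, so any CPL qualifies. With SMEP set, only CPL 3 can
     * execute from such a page, which avoids false positives.
     */
    return !cr4_smep || is_user;
}
```

For example, a guest with SMAP and SMEP both enabled matches only at CPL 3, while a guest with SMAP enabled and SMEP disabled matches at every privilege level.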
      
      Fixes: 05d5a486 ("KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation)")
      Cc: Singh Brijesh <brijesh.singh@amd.com>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Liran Alon <liran.alon@oracle.com>
      Reviewed-by: Brijesh Singh <brijesh.singh@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: LAPIC: Inject timer interrupt via posted interrupt · 0c5f81da
      Wanpeng Li authored
      Dedicated instances are currently disturbed by unnecessary jitter due
      to the emulated lapic timers firing on the same pCPUs where the
      vCPUs reside.  Unlike ARM, Intel has no hardware virtual timer for the
      guest, so both programming the timer in the guest and the firing of the
      emulated timer incur vmexits.  This patch tries to avoid the vmexit when
      the emulated timer fires, at least in the dedicated instance scenario
      when nohz_full is enabled.

      In that case, the emulated timers can be offloaded to the nearest busy
      housekeeping cpus since APICv has been available in server processors
      for several years. The guest timer interrupt can then be injected via
      posted interrupts, which are delivered by the housekeeping cpu once the
      emulated timer fires.
      
      The host should be tuned so that vCPUs are placed on isolated physical
      processors, with several surplus pCPUs for busy housekeeping.
      If mwait/hlt/pause vmexits are disabled to keep the vCPUs in non-root
      mode, a ~3% redis performance benefit can be observed on a Skylake
      server, and the number of external interrupt vmexits drops
      substantially.  Without the patch:

                 VM-EXIT  Samples  Samples%   Time%  Min Time  Max Time  Avg time
      EXTERNAL_INTERRUPT    42916    49.43%  39.30%    0.47us  106.09us    0.71us ( +-   1.09% )

      With the patch:

                 VM-EXIT  Samples  Samples%   Time%  Min Time  Max Time  Avg time
      EXTERNAL_INTERRUPT     6871     9.29%   2.96%    0.44us   57.88us    0.72us ( +-   4.02% )
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • kbuild: add -fcf-protection=none when using retpoline flags · 29be86d7
      Seth Forshee authored
      The gcc -fcf-protection=branch option is not compatible with
      -mindirect-branch=thunk-extern. The latter is used when
      CONFIG_RETPOLINE is selected, and this will fail to build with
      a gcc which has -fcf-protection=branch enabled by default. Adding
      -fcf-protection=none when building with retpoline enabled
      prevents such build failures.
      Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
    • kbuild: update compile-test header list for v5.3-rc1 · 67bf4745
      Masahiro Yamada authored
       - Some headers graduated from the blacklist
      
       - hyperv_timer.h joined the header-test when CONFIG_X86=y
      
       - nf_tables*.h joined the header-test when CONFIG_NF_TABLES is
         enabled.
      
       - The entry for nf_tables_offload.h was added to fix build error for
         the combination of CONFIG_NF_TABLES=n and CONFIG_KERNEL_HEADER_TEST=y.
      
       - The entry for iomap.h was added because this header is supposed to
         be included only when CONFIG_BLOCK=y
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      67bf4745
    • Linus Torvalds's avatar
      Merge tag 'armsoc-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc · abdfd52a
      Linus Torvalds authored
      Pull ARM SoC defconfig updates from Olof Johansson:
       "We keep this in a separate branch to avoid cross-branch conflicts, but
        most of the material here is fairly boring -- some new drivers turned
        on for hardware since they were merged, and some refreshed files due
        to time having moved a lot of entries around"
      
      * tag 'armsoc-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (47 commits)
        ARM: configs: multi_v5: Remove duplicate ASPEED options
        arm64: defconfig: Enable CONFIG_KEYBOARD_SNVS_PWRKEY as module
        ARM: imx_v6_v7_defconfig: Enable CONFIG_ARM_IMX_CPUFREQ_DT
        defconfig: arm64: enable i.MX8 SCU octop driver
        arm64: defconfig: Add i.MX SCU SoC info driver
        arm64: defconfig: Enable CONFIG_QORIQ_THERMAL
        ARM: imx_v6_v7_defconfig: Select CONFIG_NVMEM_SNVS_LPGPR
        arm64: defconfig: ARM_IMX_CPUFREQ_DT=m
        ARM: imx_v6_v7_defconfig: Add TPM PWM support by default
        ARM: imx_v6_v7_defconfig: Enable the OV2680 camera driver
        ARM: imx_v6_v7_defconfig: Enable CONFIG_THERMAL_STATISTICS
        arm64: defconfig: NVMEM_IMX_OCOTP=y for imx8m
        ARM: multi_v7_defconfig: enable STMFX pinctrl support
        arm64 defconfig: enable LVM support
        ARM: configs: multi_v5: Add more ASPEED devices
        arm64: defconfig: Add Tegra194 PCIe driver
        ARM: configs: aspeed: Add new drivers
        ARM: exynos_defconfig: Enable Panfrost and Lima drivers
        ARM: multi_v7_defconfig: Enable Panfrost and Lima drivers
        arm64 defconfig: enable Mellanox cards
        ...
      abdfd52a
    • Linus Torvalds's avatar
      Merge tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc · af6af87d
      Linus Torvalds authored
      Pull ARM Devicetree updates from Olof Johansson:
       "We continue to see a lot of new material. I've highlighted some of it
        below, but there's been more beyond that as well.
      
        One of the sweeping changes is that many boards have seen their ARM
        Mali GPU devices added to device trees, since the DRM drivers have now
        been merged.
      
        So, with the caveat that I have surely missed several great
        contributions, here's a collection of the material this time around:
      
        New SoCs:
      
         - Mediatek mt8183 (4x Cortex-A73 + 4x Cortex-A53)
      
         - TI J721E (2x Cortex-A72 + 3x Cortex-R5F + 3 DSPs + MMA)
      
         - Amlogic G12B (4x Cortex-A73 + 2x Cortex-A53)
      
        New Boards / platforms:
      
         - Aspeed BMC support for a number of new server platforms
      
         - Kontron SMARC SoM (several i.MX6 versions)
      
         - Novtech's Meerkat96 (i.MX7)
      
         - ST Micro Avenger96 board
      
         - Hardkernel ODROID-N2 (Amlogic G12B)
      
         - Purism Librem5 devkit (i.MX8MQ)
      
         - Google Cheza (Qualcomm SDM845)
      
         - Qualcomm Dragonboard 845c (Qualcomm SDM845)
      
         - Hugsun X99 TV Box (Rockchip RK3399)
      
         - Khadas Edge/Edge-V/Captain (Rockchip RK3399)
      
        Updated / expanded boards and platforms:
      
         - Renesas r7s9210 has a lot of new peripherals added
      
         - Fixes and polish for Rockchip-based Chromebooks
      
         - Amlogic G12A has a lot of peripherals added
      
         - Nvidia Jetson Nano sees various fixes and improvements, and is now
           at feature parity with TX1"
      
      * tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (586 commits)
        ARM: dts: gemini: Set DIR-685 SPI CS as active low
        ARM: dts: exynos: Adjust buck[78] regulators to supported values on Arndale Octa
        ARM: dts: exynos: Adjust buck[78] regulators to supported values on Odroid XU3 family
        ARM: dts: exynos: Move Mali400 GPU node to "/soc"
        ARM: dts: exynos: Fix imprecise abort on Mali GPU probe on Exynos4210
        arm64: dts: qcom: qcs404: Add missing space for cooling-cells property
        arm64: dts: rockchip: Fix USB3 Type-C on rk3399-sapphire
        arm64: dts: rockchip: Update DWC3 modules on RK3399 SoCs
        arm64: dts: rockchip: enable rk3328 watchdog clock
        ARM: dts: rockchip: add display nodes for rk322x
        ARM: dts: rockchip: fix vop iommu-cells on rk322x
        arm64: dts: rockchip: Add support for Hugsun X99 TV Box
        arm64: dts: rockchip: Define values for the IPA governor for rock960
        arm64: dts: rockchip: Fix multiple thermal zones conflict in rk3399.dtsi
        arm64: dts: rockchip: add core dtsi file for RK3399Pro SoCs
        arm64: dts: rockchip: improve rk3328-roc-cc rgmii performance.
        Revert "ARM: dts: rockchip: set PWM delay backlight settings for Minnie"
        ARM: dts: rockchip: Configure BT_DEV_WAKE in on rk3288-veyron
        arm64: dts: qcom: sdm845-cheza: add initial cheza dt
        ARM: dts: msm8974-FP2: Add vibration motor
        ...
      af6af87d
    • Linus Torvalds's avatar
      Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc · 8362fd64
      Linus Torvalds authored
      Pull ARM SoC-related driver updates from Olof Johansson:
       "Various driver updates for platforms and a couple of the small driver
        subsystems we merge through our tree:
      
         - A driver for SCU (system control) on NXP i.MX8QXP
      
         - Qualcomm Always-on Subsystem messaging driver (AOSS QMP)
      
         - Qualcomm PM support for MSM8998
      
         - Support for a newer version of DRAM PHY driver for Broadcom (DPFE)
      
         - Reset controller support for Bitmain BM1880
      
         - TI SCI (System Control Interface) support for CPU control on AM654
           processors
      
         - More TI sysc refactoring and rework"
      
      * tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (84 commits)
        reset: remove redundant null check on pointer dev
        soc: rockchip: work around clang warning
        dt-bindings: reset: imx7: Fix the spelling of 'indices'
        soc: imx: Add i.MX8MN SoC driver support
        soc: aspeed: lpc-ctrl: Fix probe error handling
        soc: qcom: geni: Add support for ACPI
        firmware: ti_sci: Fix gcc unused-but-set-variable warning
        firmware: ti_sci: Use the correct style for SPDX License Identifier
        soc: imx8: Use existing of_root directly
        soc: imx8: Fix potential kernel dump in error path
        firmware/psci: psci_checker: Park kthreads before stopping them
        memory: move jedec_ddr.h from include/memory to drivers/memory/
        memory: move jedec_ddr_data.c from lib/ to drivers/memory/
        MAINTAINERS: Remove myself as qcom maintainer
        soc: aspeed: lpc-ctrl: make parameter optional
        soc: qcom: apr: Don't use reg for domain id
        soc: qcom: fix QCOM_AOSS_QMP dependency and build errors
        memory: tegra: Fix -Wunused-const-variable
        firmware: tegra: Early resume BPMP
        soc/tegra: Select pinctrl for Tegra194
        ...
      8362fd64
    • Linus Torvalds's avatar
      Merge tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc · 24e44913
      Linus Torvalds authored
      Pull ARM SoC platform updates from Olof Johansson:
       "SoC platform changes. Main theme this merge window:
      
         - The Netx platform (Netx 100/500) is removed by Linus Walleij --
           the SoC doesn't have active maintainers with hardware, and in
           discussions with the vendor the agreement was that it's OK to
           remove.
      
         - Russell King has a series of patches that cleans up and refactors
           SA1101 and RiscPC support"
      
      * tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (47 commits)
        ARM: stm32: use "depends on" instead of "if" after prompt
        ARM: sa1100: convert to common clock framework
        ARM: exynos: Cleanup cppcheck shifting warning
        ARM: pxa/lubbock: remove lubbock_set_misc_wr() from global view
        ARM: exynos: Only build MCPM support if used
        arm: add missing include platform-data/atmel.h
        ARM: davinci: Use GPIO lookup table for DA850 LEDs
        ARM: OMAP2: drop explicit assembler architecture
        ARM: use arch_extension directive instead of arch argument
        ARM: imx: Switch imx7d to imx-cpufreq-dt for speed-grading
        ARM: bcm: Enable PINCTRL for ARCH_BRCMSTB
        ARM: bcm: Enable ARCH_HAS_RESET_CONTROLLER for ARCH_BRCMSTB
        ARM: riscpc: enable chained scatterlist support
        ARM: riscpc: reduce IRQ handling code
        ARM: riscpc: move RiscPC assembly files from arch/arm/lib to mach-rpc
        ARM: riscpc: parse video information from tagged list
        ARM: riscpc: add ecard quirk for Atomwide 3port serial card
        MAINTAINERS: mvebu: Add git entry
        soc: ti: pm33xx: Add a print while entering RTC only mode with DDR in self-refresh
        ARM: OMAP2+: Make some variables static
        ...
      24e44913
  2. 19 Jul, 2019 14 commits
    • Linus Torvalds's avatar
      Merge tag 'drm-next-2019-07-19' of git://anongit.freedesktop.org/drm/drm · 31cc088a
      Linus Torvalds authored
      Pull drm fixes from Daniel Vetter:
       "Dave is back in shape, but now family got it so I'm doing the pull.
        Two things worthy of note:
      
         - nouveau feature pull was way too late, Dave&me decided to not take
           that, so Ben spun up a pull with just the fixes.
      
         - after some chatting with the arm display maintainers we decided to
           change a bit how that's maintained, for more oversight/review and
           cross vendor collab.
      
        More details below:
      
        nouveau:
         - bugfixes
         - TU116 enabling (minor iteration)
      
        amdgpu:
         - large pile of fixes for new hw support this release (navi, vega20)
         - audio hotplug fix
         - bunch of corner cases and small fixes all over for amdgpu/kfd
      
        komeda:
         - back out some new properties (from this merge window) that need
           more pondering.
      
        bochs:
         - fb pitch setup
      
        core:
         - a new panel quirk
         - misc fixes"
      
      * tag 'drm-next-2019-07-19' of git://anongit.freedesktop.org/drm/drm: (73 commits)
        drm/nouveau/secboot/gp102-: remove WAR for SEC2 RTOS start bug
        drm/nouveau/flcn/gp102-: improve implementation of bind_context() on SEC2/GSP
        drm/nouveau: fix memory leak in nouveau_conn_reset()
        drm/nouveau/dmem: missing mutex_lock in error path
        drm/nouveau/hwmon: return EINVAL if the GPU is powered down for sensors reads
        drm/nouveau: fix bogus GPL-2 license header
        drm/nouveau: fix bogus GPL-2 license header
        drm/nouveau/i2c: Enable i2c pads & busses during preinit
        drm/nouveau/disp/tu102-: wire up scdc parameter setter
        drm/nouveau/core: recognise TU116 chipset
        drm/nouveau/kms: disallow dual-link harder if hdmi connection detected
        drm/nouveau/disp/nv50-: fix center/aspect-corrected scaling
        drm/nouveau/disp/nv50-: force scaler for any non-default LVDS/eDP modes
        drm/nouveau/mcp89/mmu: Use mcp77_mmu_new instead of g84_mmu_new on MCP89.
        drm/amd/display: init res_pool dccg_ref, dchub_ref with xtalin_freq
        drm/amdgpu/pm: remove check for pp funcs in freq sysfs handlers
        drm/amd/display: Force uclk to max for every state
        drm/amdkfd: Remove GWS from process during uninit
        drm/amd/amdgpu: Fix offset for vmid selection in debugfs interface
        drm/amd/powerplay: update vega20 driver if to fit latest SMU firmware
        ...
      31cc088a
    • Linus Torvalds's avatar
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · dd4542d2
      Linus Torvalds authored
      Pull crypto fixes from Herbert Xu:
      
       - Fix missed wake-up race in padata
      
       - Use crypto_memneq in ccp
      
       - Fix version check in ccp
      
       - Fix fuzz test failure in ccp
      
       - Fix potential double free in crypto4xx
      
       - Fix compile warning in stm32
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        padata: use smp_mb in padata_reorder to avoid orphaned padata jobs
        crypto: ccp - Fix SEV_VERSION_GREATER_OR_EQUAL
        crypto: ccp/gcm - use const time tag comparison.
        crypto: ccp - memset structure fields to zero before reuse
        crypto: crypto4xx - fix a potential double free in ppc4xx_trng_probe
        crypto: stm32/hash - Fix incorrect printk modifier for size_t
      dd4542d2
    • Dave Jones's avatar
      Remove references to dead website. · 40ef768a
      Dave Jones authored
      This fell into disrepair a while ago, and the majority of hits to the
      snapshots were from bots, so it's more trouble to keep running than it's worth.
      Signed-off-by: Dave Jones <davej@codemonkey.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      40ef768a
    • Linus Torvalds's avatar
      Merge tag 'trace-v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 41ba485e
      Linus Torvalds authored
      Pull tracing fix from Steven Rostedt:
       "Eiichi Tsukata found a small bug from the fixup of the stack code
      
        Removing ULONG_MAX as the marker for the user stack trace end left
        the tracing code not knowing where the end is. The end is now marked
        with a zero (NULL) pointer. Eiichi fixed this in the tracing code"
      
      * tag 'trace-v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Fix user stack trace "??" output
      41ba485e
    • Linus Torvalds's avatar
      Merge tag 'csky-for-linus-5.3-rc1' of git://github.com/c-sky/csky-linux · a84d2d29
      Linus Torvalds authored
      Pull arch/csky updates from Guo Ren:
       "This round of csky subsystem gives two features (ASID algorithm
        update, Perf pmu record support) and some fixups.
      
        ASID updates:
         - Revert mmu ASID mechanism
         - Add new asid lib code from arm
         - Use generic asid algorithm to implement switch_mm
         - Improve tlb operation with help of asid
      
        Perf pmu record support:
         - Init pmu as a device
         - Add count-width property for csky pmu
         - Add pmu interrupt support
         - Fix perf record in kernel/user space
         - dt-bindings: Add csky PMU bindings
      
        Fixes:
         - Fixup no panic in kernel for some traps
         - Fixup some error count in 810 & 860.
         - Fixup abiv1 memset error"
      
      * tag 'csky-for-linus-5.3-rc1' of git://github.com/c-sky/csky-linux:
        csky: Fixup abiv1 memset error
        csky: Improve tlb operation with help of asid
        csky: Use generic asid algorithm to implement switch_mm
        csky: Add new asid lib code from arm
        csky: Revert mmu ASID mechanism
        dt-bindings: csky: Add csky PMU bindings
        dt-bindings: interrupt-controller: Update csky mpintc
        csky: Fixup some error count in 810 & 860.
        csky: Fix perf record in kernel/user space
        csky: Add pmu interrupt support
        csky: Add count-width property for csky pmu
        csky: Init pmu as a device
        csky: Fixup no panic in kernel for some traps
        csky: Select intc & timer drivers
      a84d2d29
    • Linus Torvalds's avatar
      Merge tag 'for-linus-5.3a-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · b5d72dda
      Linus Torvalds authored
      Pull xen updates from Juergen Gross:
       "Fixes and features:
      
         - A series to introduce a common command line parameter for disabling
           paravirtual extensions when running as a guest in virtualized
           environment
      
         - A fix for int3 handling in Xen pv guests
      
         - Removal of the Xen-specific tmem driver as support of tmem in Xen
           has been dropped (and it was experimental only)
      
         - A security fix for running as Xen dom0 (XSA-300)
      
         - A fix for IRQ handling when offlining cpus in Xen guests
      
         - Some small cleanups"
      
      * tag 'for-linus-5.3a-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen: let alloc_xenballooned_pages() fail if not enough memory free
        xen/pv: Fix a boot up hang revealed by int3 self test
        x86/xen: Add "nopv" support for HVM guest
        x86/paravirt: Remove const mark from x86_hyper_xen_hvm variable
        xen: Map "xen_nopv" parameter to "nopv" and mark it obsolete
        x86: Add "nopv" parameter to disable PV extensions
        x86/xen: Mark xen_hvm_need_lapic() and xen_x2apic_para_available() as __init
        xen: remove tmem driver
        Revert "x86/paravirt: Set up the virt_spin_lock_key after static keys get initialized"
        xen/events: fix binding user event channels to cpus
      b5d72dda
    • Linus Torvalds's avatar
      Merge tag 'iomap-5.3-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 26473f83
      Linus Torvalds authored
      Pull iomap split/cleanup from Darrick Wong:
       "As promised, here's the second part of the iomap merge for 5.3, in
        which we break up iomap.c into smaller files grouped by functional
        area so that it'll be easier in the long run to maintain cohesiveness
        of code units and to review incoming patches. There are no functional
        changes and fs/iomap.c split cleanly.
      
        Summary:
      
         - Regroup the fs/iomap.c code by major functional area so that we can
           start development for 5.4 from a more stable base"
      
      * tag 'iomap-5.3-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        iomap: move internal declarations into fs/iomap/
        iomap: move the main iteration code into a separate file
        iomap: move the buffered IO code into a separate file
        iomap: move the direct IO code into a separate file
        iomap: move the SEEK_HOLE code into a separate file
        iomap: move the file mapping reporting code into a separate file
        iomap: move the swapfile code into a separate file
        iomap: start moving code to fs/iomap/
      26473f83
    • Linus Torvalds's avatar
      Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 4f5ed131
      Linus Torvalds authored
      Pull misc vfs updates from Al Viro:
       "Assorted stuff"
      
      * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        perf_event_get(): don't bother with fget_raw()
        vfs: update d_make_root() description
      4f5ed131
    • Linus Torvalds's avatar
      Merge branch 'work.adfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · d2fbf4b6
      Linus Torvalds authored
      Pull adfs updates from Al Viro:
       "More ADFS patches from Russell King"
      
      * 'work.adfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fs/adfs: add time stamp and file type helpers
        fs/adfs: super: limit idlen according to directory type
        fs/adfs: super: fix use-after-free bug
        fs/adfs: super: safely update options on remount
        fs/adfs: super: correct superblock flags
        fs/adfs: clean up indirect disc addresses and fragment IDs
        fs/adfs: clean up error message printing
        fs/adfs: use %pV for error messages
        fs/adfs: use format_version from disc_record
        fs/adfs: add helper to get filesystem size
        fs/adfs: add helper to get discrecord from map
        fs/adfs: correct disc record structure
      d2fbf4b6
    • Linus Torvalds's avatar
      Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 933a90bf
      Linus Torvalds authored
      Pull vfs mount updates from Al Viro:
       "The first part of mount updates.
      
        Convert filesystems to use the new mount API"
      
      * 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
        mnt_init(): call shmem_init() unconditionally
        constify ksys_mount() string arguments
        don't bother with registering rootfs
        init_rootfs(): don't bother with init_ramfs_fs()
        vfs: Convert smackfs to use the new mount API
        vfs: Convert selinuxfs to use the new mount API
        vfs: Convert securityfs to use the new mount API
        vfs: Convert apparmorfs to use the new mount API
        vfs: Convert openpromfs to use the new mount API
        vfs: Convert xenfs to use the new mount API
        vfs: Convert gadgetfs to use the new mount API
        vfs: Convert oprofilefs to use the new mount API
        vfs: Convert ibmasmfs to use the new mount API
        vfs: Convert qib_fs/ipathfs to use the new mount API
        vfs: Convert efivarfs to use the new mount API
        vfs: Convert configfs to use the new mount API
        vfs: Convert binfmt_misc to use the new mount API
        convenience helper: get_tree_single()
        convenience helper get_tree_nodev()
        vfs: Kill sget_userns()
        ...
      933a90bf
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 5f4fc6d4
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix AF_XDP cq entry leak, from Ilya Maximets.
      
       2) Fix handling of PHY power-down on RTL8411B, from Heiner Kallweit.
      
       3) Add some new PCI IDs to iwlwifi, from Ihab Zhaika.
      
       4) Fix handling of neigh timers wrt. entries added by userspace, from
          Lorenzo Bianconi.
      
       5) Various cases of missing of_node_put(), from Nishka Dasgupta.
      
       6) The new NET_ACT_CT needs to depend upon NF_NAT, from Yue Haibing.
      
       7) Various RDS layer fixes, from Gerd Rausch.
      
       8) Fix some more fallout from TCQ_F_CAN_BYPASS generalization, from
          Cong Wang.
      
       9) Fix FIB source validation checks over loopback, also from Cong Wang.
      
      10) Use promisc for unsupported number of filters, from Justin Chen.
      
      11) Missing sibling route unlink on failure in ipv6, from Ido Schimmel.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (90 commits)
        tcp: fix tcp_set_congestion_control() use from bpf hook
        ag71xx: fix return value check in ag71xx_probe()
        ag71xx: fix error return code in ag71xx_probe()
        usb: qmi_wwan: add D-Link DWM-222 A2 device ID
        bnxt_en: Fix VNIC accounting when enabling aRFS on 57500 chips.
        net: dsa: sja1105: Fix missing unlock on error in sk_buff()
        gve: replace kfree with kvfree
        selftests/bpf: fix test_xdp_noinline on s390
        selftests/bpf: fix "valid read map access into a read-only array 1" on s390
        net/mlx5: Replace kfree with kvfree
        MAINTAINERS: update netsec driver
        ipv6: Unlink sibling route in case of failure
        liquidio: Replace vmalloc + memset with vzalloc
        udp: Fix typo in net/ipv4/udp.c
        net: bcmgenet: use promisc for unsupported filters
        ipv6: rt6_check should return NULL if 'from' is NULL
        tipc: initialize 'validated' field of received packets
        selftests: add a test case for rp_filter
        fib: relax source validation check for loopback packets
        mlxsw: spectrum: Do not process learned records with a dummy FID
        ...
      5f4fc6d4
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · 249be851
      Linus Torvalds authored
      Merge yet more updates from Andrew Morton:
       "The rest of MM and a kernel-wide procfs cleanup.
      
        Summary of the more significant patches:
      
         - Patch series "mm/memory_hotplug: Factor out memory block
           device handling", v3. David Hildenbrand.
      
           Some spring-cleaning of the memory hotplug code, notably in
           drivers/base/memory.c
      
         - "mm: thp: fix false negative of shmem vma's THP eligibility". Yang
           Shi.
      
           Fix /proc/pid/smaps output for THP pages used in shmem.
      
         - "resource: fix locking in find_next_iomem_res()" + 1. Nadav Amit.
      
           Bugfix and speedup for kernel/resource.c
      
         - Patch series "mm: Further memory block device cleanups", David
           Hildenbrand.
      
           More spring-cleaning of the memory hotplug code.
      
         - Patch series "mm: Sub-section memory hotplug support". Dan
           Williams.
      
           Generalise the memory hotplug code so that pmem can use it more
           completely. Then remove the hacks from the libnvdimm code which
           were there to work around the memory-hotplug code's constraints.
      
         - "proc/sysctl: add shared variables for range check", Matteo Croce.
      
           We have about 250 instances of
      
                int zero;
                ...
                        .extra1 = &zero,
      
           in the tree. This is a tree-wide sweep to make all those private
           "zero"s and "one"s use global variables.
      
           Alas, it isn't practical to make those two global integers const"
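
The sysctl sweep described above can be pictured with a minimal user-space sketch. Every name here (`demo_ctl_entry`, `demo_shared_bounds`, `DEMO_ZERO`, `DEMO_ONE`, `demo_store`) is hypothetical and only stands in for the kernel's table entries and range-checking handlers; it is not the interface the patch adds. The point is that each table entry refers to one shared pair of bounds instead of carrying its own private `zero` and `one`.

```c
#include <stddef.h>

/* Hypothetical stand-in for a sysctl table entry; only the fields used here. */
struct demo_ctl_entry {
	const char *name;
	int *data;
	void *extra1;	/* lower bound for the range check */
	void *extra2;	/* upper bound for the range check */
};

/* One shared pair of bounds instead of a private "zero"/"one" per file. */
int demo_shared_bounds[] = { 0, 1 };
#define DEMO_ZERO ((void *)&demo_shared_bounds[0])
#define DEMO_ONE  ((void *)&demo_shared_bounds[1])

int demo_flag;

struct demo_ctl_entry demo_entry = {
	.name   = "demo_flag",
	.data   = &demo_flag,
	.extra1 = DEMO_ZERO,
	.extra2 = DEMO_ONE,
};

/* Range-checked store: returns 0 on success, -1 if val is out of bounds. */
int demo_store(struct demo_ctl_entry *e, int val)
{
	int min = *(int *)e->extra1;
	int max = *(int *)e->extra2;

	if (val < min || val > max)
		return -1;
	*e->data = val;
	return 0;
}
```

In this sketch, `demo_store(&demo_entry, 1)` succeeds, while `demo_store(&demo_entry, 2)` is rejected and leaves `demo_flag` unchanged.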
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (38 commits)
        proc/sysctl: add shared variables for range check
        mm: migrate: remove unused mode argument
        mm/sparsemem: cleanup 'section number' data types
        libnvdimm/pfn: stop padding pmem namespaces to section alignment
        libnvdimm/pfn: fix fsdax-mode namespace info-block zero-fields
        mm/devm_memremap_pages: enable sub-section remap
        mm: document ZONE_DEVICE memory-model implications
        mm/sparsemem: support sub-section hotplug
        mm/sparsemem: prepare for sub-section ranges
        mm: kill is_dev_zone() helper
        mm/hotplug: kill is_dev_zone() usage in __remove_pages()
        mm/sparsemem: convert kmalloc_section_memmap() to populate_section_memmap()
        mm/hotplug: prepare shrink_{zone, pgdat}_span for sub-section removal
        mm/sparsemem: add helpers track active portions of a section at boot
        mm/sparsemem: introduce a SECTION_IS_EARLY flag
        mm/sparsemem: introduce struct mem_section_usage
        drivers/base/memory.c: get rid of find_memory_block_hinted()
        mm/memory_hotplug: move and simplify walk_memory_blocks()
        mm/memory_hotplug: rename walk_memory_range() and pass start+size instead of pfns
        mm: make register_mem_sect_under_node() static
        ...
      249be851
    • Eiichi Tsukata's avatar
      tracing: Fix user stack trace "??" output · 6d54ceb5
      Eiichi Tsukata authored
      Commit c5c27a0a ("x86/stacktrace: Remove the pointless ULONG_MAX
      marker") removes the ULONG_MAX marker from user stack trace entries, but
      trace_user_stack_print() still looks for the marker and so outputs
      unnecessary "??" entries.
      
      For example:
      
                  less-1911  [001] d..2    34.758944: <user stack trace>
         =>  <00007f16f2295910>
         => ??
         => ??
         => ??
         => ??
         => ??
         => ??
         => ??
      
      The user stack trace code zeroes the storage before saving the stack, so if
      the trace is shorter than the maximum number of entries it can terminate
      the print loop if a zero entry is detected.
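
The termination rule described above can be sketched in plain user-space C. This is an illustration under assumptions, not the kernel's trace_user_stack_print(): the function name `demo_print_user_stack` and the output format are hypothetical. The loop relies on the storage having been zeroed before the stack was saved, so the first zero entry marks the end of the trace.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Print the entries of a zero-terminated user stack trace.
 * Stops at the first zero entry or after max entries, whichever comes
 * first, and returns the number of entries printed.
 */
size_t demo_print_user_stack(const unsigned long *entries, size_t max)
{
	size_t i;

	for (i = 0; i < max; i++) {
		if (entries[i] == 0)	/* zero entry: end of the recorded trace */
			break;
		printf(" => <%016lx>\n", entries[i]);
	}
	return i;
}
```

With an eight-slot buffer that holds a single return address followed by zeroes, the sketch prints one line and no trailing "??" placeholders.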
      
      Link: http://lkml.kernel.org/r/20190630085438.25545-1-devel@etsukata.com
      
      Cc: stable@vger.kernel.org
      Fixes: 4285f2fc ("tracing: Remove the ULONG_MAX stack trace hackery")
      Signed-off-by: Eiichi Tsukata <devel@etsukata.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      6d54ceb5
    • Dexuan Cui's avatar
      x86/hyper-v: Zero out the VP ASSIST PAGE on allocation · e320ab3c
      Dexuan Cui authored
      The VP ASSIST PAGE is an "overlay" page (see Hyper-V TLFS's Section
      5.2.1 "GPA Overlay Pages" for the details) and here is an excerpt:
      
      "The hypervisor defines several special pages that "overlay" the guest's
       Guest Physical Addresses (GPA) space. Overlays are addressed GPA but are
       not included in the normal GPA map maintained internally by the hypervisor.
       Conceptually, they exist in a separate map that overlays the GPA map.
      
       If a page within the GPA space is overlaid, any SPA page mapped to the
       GPA page is effectively "obscured" and generally unreachable by the
       virtual processor through processor memory accesses.
      
       If an overlay page is disabled, the underlying GPA page is "uncovered",
       and an existing mapping becomes accessible to the guest."
      
      SPA = System Physical Address = the final real physical address.
      
      When a CPU (e.g. CPU1) is onlined, hv_cpu_init() allocates the VP ASSIST
      PAGE and enables the EOI optimization for this CPU by writing the MSR
      HV_X64_MSR_VP_ASSIST_PAGE. From now on, hvp->apic_assist belongs to the
      special SPA page, and this CPU *always* uses hvp->apic_assist (which is
      shared with the hypervisor) to decide if it needs to write the EOI MSR.
      
      When a CPU is offlined then on the outgoing CPU:
      1. hv_cpu_die() disables the EOI optimization for this CPU, and from
         now on hvp->apic_assist belongs to the original "normal" SPA page;
      2. the remaining work of stopping this CPU is done;
      3. this CPU is completely stopped.
      
      Between 1 and 3, this CPU can still receive interrupts (e.g. reschedule
      IPIs from CPU0, and Local APIC timer interrupts), and this CPU *must* write
      the EOI MSR for every interrupt received, otherwise the hypervisor may not
      deliver further interrupts, which may be needed to completely stop the CPU.
      
      So, after the EOI optimization is disabled in hv_cpu_die(), it's required
      that the hvp->apic_assist's bit0 is zero, which is not guaranteed by the
      current allocation mode because it lacks __GFP_ZERO. As a consequence the
      bit might be set and interrupt handling would not write the EOI MSR causing
      interrupt delivery to become stuck.
      
      Add the missing __GFP_ZERO to the allocation.
      
      Note 1: after the "normal" SPA page is allocated and zeroed out, neither the
      hypervisor nor the guest writes into the page, so the page remains with
      zeros.
      
      Note 2: see Section 10.3.5 "EOI Assist" for the details of the EOI
      optimization. When the optimization is enabled, the guest can still write
      the EOI MSR register irrespective of the "No EOI required" value, but
      that's slower than the optimized assist based variant.
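
The bit-0 reasoning above can be modelled with a small user-space sketch. It is a simplification under assumptions, not the kernel code: `demo_assist_page`, `demo_must_write_eoi` and `demo_alloc_assist_page` are hypothetical names, and calloc() merely plays the role of an allocation with __GFP_ZERO. It shows why a zeroed page is the safe state once the overlay is disabled: with bit 0 clear, the guest always writes the EOI MSR.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model of the assist page: only the first 32-bit word matters. */
struct demo_assist_page {
	uint32_t apic_assist;
};

/*
 * Bit 0 is the "No EOI required" flag. The guest must write the EOI MSR
 * whenever that bit is clear.
 */
bool demo_must_write_eoi(const struct demo_assist_page *p)
{
	return (p->apic_assist & 1u) == 0;
}

/* Zeroing allocation; calloc() stands in for a __GFP_ZERO page allocation. */
struct demo_assist_page *demo_alloc_assist_page(void)
{
	return calloc(1, sizeof(struct demo_assist_page));
}
```

A page obtained without zeroing could hold stale data with bit 0 set, in which case this model would skip the EOI write, which is the stuck-interrupt scenario the commit message describes.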
      
      Fixes: ba696429 ("x86/hyper-v: Implement EOI assist")
      Signed-off-by: Dexuan Cui <decui@microsoft.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/PU1P153MB0169B716A637FABF07433C04BFCB0@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM
      e320ab3c