1. 30 Apr, 2020 2 commits
    • Barret Rhoden's avatar
      perf: Add cond_resched() to task_function_call() · 2ed6edd3
      Barret Rhoden authored
      Under rare circumstances, task_function_call() can repeatedly fail and
      cause a soft lockup.
      
      There is a slight race where the process is no longer running on the cpu
      we targeted by the time remote_function() runs.  The code will simply
      try again.  If we are very unlucky, this will continue to fail, until a
      watchdog fires.  This can happen in a heavily loaded, multi-core virtual
      machine.
      
      Reported-by: syzbot+bb4935a5c09b5ff79940@syzkaller.appspotmail.com
      Signed-off-by: default avatarBarret Rhoden <brho@google.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20200414222920.121401-1-brho@google.com
      2ed6edd3
    • CodyYao-oc's avatar
      x86/perf: Add hardware performance events support for Zhaoxin CPU. · 3a4ac121
      CodyYao-oc authored
      Zhaoxin CPU has provided facilities for monitoring performance
      via PMU (Performance Monitor Unit), but the functionality is unused so far.
      Therefore, add support for zhaoxin pmu to make performance related
      hardware events available.
      
      The PMU is mostly an Intel Architectural PerfMon-v2 with a novel
      errata for the ZXC line. It supports the following events:
      
        -----------------------------------------------------------------------------------------------------------------------------------
        Event                      | Event  | Umask |          Description
      			     | Select |       |
        -----------------------------------------------------------------------------------------------------------------------------------
        cpu-cycles                 |  82h   |  00h  | unhalt core clock
        instructions               |  00h   |  00h  | number of instructions at retirement.
        cache-references           |  15h   |  05h  | number of fillq pushs at the current cycle.
        cache-misses               |  1ah   |  05h  | number of l2 miss pushed by fillq.
        branch-instructions        |  28h   |  00h  | counts the number of branch instructions retired.
        branch-misses              |  29h   |  00h  | mispredicted branch instructions at retirement.
        bus-cycles                 |  83h   |  00h  | unhalt bus clock
        stalled-cycles-frontend    |  01h   |  01h  | Increments each cycle the # of Uops issued by the RAT to RS.
        stalled-cycles-backend     |  0fh   |  04h  | RS0/1/2/3/45 empty
        L1-dcache-loads            |  68h   |  05h  | number of retire/commit load.
        L1-dcache-load-misses      |  4bh   |  05h  | retired load uops whose data source followed an L1 miss.
        L1-dcache-stores           |  69h   |  06h  | number of retire/commit Store,no LEA
        L1-dcache-store-misses     |  62h   |  05h  | cache lines in M state evicted out of L1D due to Snoop HitM or dirty line replacement.
        L1-icache-loads            |  00h   |  03h  | number of l1i cache access for valid normal fetch,including un-cacheable access.
        L1-icache-load-misses      |  01h   |  03h  | number of l1i cache miss for valid normal fetch,including un-cacheable miss.
        L1-icache-prefetches       |  0ah   |  03h  | number of prefetch.
        L1-icache-prefetch-misses  |  0bh   |  03h  | number of prefetch miss.
        dTLB-loads                 |  68h   |  05h  | number of retire/commit load
        dTLB-load-misses           |  2ch   |  05h  | number of load operations miss all level tlbs and cause a tablewalk.
        dTLB-stores                |  69h   |  06h  | number of retire/commit Store,no LEA
        dTLB-store-misses          |  30h   |  05h  | number of store operations miss all level tlbs and cause a tablewalk.
        dTLB-prefetches            |  64h   |  05h  | number of hardware pte prefetch requests dispatched out of the prefetch FIFO.
        dTLB-prefetch-misses       |  65h   |  05h  | number of hardware pte prefetch requests miss the l1d data cache.
        iTLB-load                  |  00h   |  00h  | actually counter instructions.
        iTLB-load-misses           |  34h   |  05h  | number of code operations miss all level tlbs and cause a tablewalk.
        -----------------------------------------------------------------------------------------------------------------------------------
      Reported-by: default avatarkbuild test robot <lkp@intel.com>
      Signed-off-by: default avatarCodyYao-oc <CodyYao-oc@zhaoxin.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/1586747669-4827-1-git-send-email-CodyYao-oc@zhaoxin.com
      3a4ac121
  2. 22 Apr, 2020 1 commit
    • Ingo Molnar's avatar
      Merge tag 'perf-core-for-mingo-5.8-20200420' of... · 87cfeb19
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo-5.8-20200420' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core fixes and improvements from Arnaldo Carvalho de Melo:
      
      kernel + tools/perf:
      
        Alexey Budankov:
      
        - Introduce CAP_PERFMON to kernel and user space.
      
      callchains:
      
        Adrian Hunter:
      
        - Allow using Intel PT to synthesize callchains for regular events.
      
        Kan Liang:
      
        - Stitch LBR records from multiple samples to get deeper backtraces,
          there are caveats, see the csets for details.
      
      perf script:
      
        Andreas Gerstmayr:
      
        - Add flamegraph.py script
      
      BPF:
      
        Jiri Olsa:
      
        - Synthesize bpf_trampoline/dispatcher ksymbol events.
      
      perf stat:
      
        Arnaldo Carvalho de Melo:
      
        - Honour --timeout for forked workloads.
      
        Stephane Eranian:
      
        - Force error in fallback on :k events, to avoid counting nothing when
          the user asks for kernel events but is not allowed to.
      
      perf bench:
      
        Ian Rogers:
      
        - Add event synthesis benchmark.
      
      tools api fs:
      
        Stephane Eranian:
      
       - Make xxx__mountpoint() more scalable
      
      libtraceevent:
      
        He Zhe:
      
        - Handle return value of asprintf.
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      87cfeb19
  3. 21 Apr, 2020 22 commits
  4. 20 Apr, 2020 9 commits
  5. 19 Apr, 2020 6 commits
    • Linus Torvalds's avatar
      Linux 5.7-rc2 · ae83d0b4
      Linus Torvalds authored
      ae83d0b4
    • Brian Geffon's avatar
      mm: Fix MREMAP_DONTUNMAP accounting on VMA merge · dadbd85f
      Brian Geffon authored
      When remapping a mapping where a portion of a VMA is remapped
      into another portion of the VMA it can cause the VMA to become
      split. During the copy_vma operation the VMA can actually
      be remerged if it's an anonymous VMA whose pages have not yet
      been faulted. This isn't normally a problem because at the end
      of the remap the original portion is unmapped causing it to
      become split again.
      
      However, MREMAP_DONTUNMAP leaves that original portion in place which
      means that the VMA which was split and then remerged is not actually
      split at the end of the mremap. This patch fixes a bug where
      we don't detect that the VMAs got remerged and we end up
      putting back VM_ACCOUNT on the next mapping which is completely
      unreleated. When that next mapping is unmapped it results in
      incorrectly unaccounting for the memory which was never accounted,
      and eventually we will underflow on the memory comittment.
      
      There is also another issue which is similar, we're currently
      accouting for the number of pages in the new_vma but that's wrong.
      We need to account for the length of the remap operation as that's
      all that is being added. If there was a mapping already at that
      location its comittment would have been adjusted as part of
      the munmap at the start of the mremap.
      
      A really simple repro can be seen in:
      https://gist.github.com/bgaff/e101ce99da7d9a8c60acc641d07f312c
      
      Fixes: e346b381 ("mm/mremap: add MREMAP_DONTUNMAP to mremap()")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarBrian Geffon <bgeffon@google.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      dadbd85f
    • Linus Torvalds's avatar
      Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 86cc3398
      Linus Torvalds authored
      Pull clk fixes from Stephen Boyd:
       "Two build fixes for a couple clk drivers and a fix for the Unisoc
        serial clk where we want to keep it on for earlycon"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: sprd: don't gate uart console clock
        clk: mmp2: fix link error without mmp2
        clk: asm9260: fix __clk_hw_register_fixed_rate_with_accuracy typo
      86cc3398
    • Linus Torvalds's avatar
      Merge tag 'x86-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 0fe5f9ca
      Linus Torvalds authored
      Pull x86 and objtool fixes from Thomas Gleixner:
       "A set of fixes for x86 and objtool:
      
        objtool:
      
         - Ignore the double UD2 which is emitted in BUG() when
           CONFIG_UBSAN_TRAP is enabled.
      
         - Support clang non-section symbols in objtool ORC dump
      
         - Fix switch table detection in .text.unlikely
      
         - Make the BP scratch register warning more robust.
      
        x86:
      
         - Increase microcode maximum patch size for AMD to cope with new CPUs
           which have a larger patch size.
      
         - Fix a crash in the resource control filesystem when the removal of
           the default resource group is attempted.
      
         - Preserve Code and Data Prioritization enabled state accross CPU
           hotplug.
      
         - Update split lock cpu matching to use the new X86_MATCH macros.
      
         - Change the split lock enumeration as Intel finaly decided that the
           IA32_CORE_CAPABILITIES bits are not architectural contrary to what
           the SDM claims. !@#%$^!
      
         - Add Tremont CPU models to the split lock detection cpu match.
      
         - Add a missing static attribute to make sparse happy"
      
      * tag 'x86-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/split_lock: Add Tremont family CPU models
        x86/split_lock: Bits in IA32_CORE_CAPABILITIES are not architectural
        x86/resctrl: Preserve CDP enable over CPU hotplug
        x86/resctrl: Fix invalid attempt at removing the default resource group
        x86/split_lock: Update to use X86_MATCH_INTEL_FAM6_MODEL()
        x86/umip: Make umip_insns static
        x86/microcode/AMD: Increase microcode PATCH_MAX_SIZE
        objtool: Make BP scratch register warning more robust
        objtool: Fix switch table detection in .text.unlikely
        objtool: Support Clang non-section symbols in ORC generation
        objtool: Support Clang non-section symbols in ORC dump
        objtool: Fix CONFIG_UBSAN_TRAP unreachable warnings
      0fe5f9ca
    • Linus Torvalds's avatar
      Merge tag 'timers-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3e0dea57
      Linus Torvalds authored
      Pull time namespace fix from Thomas Gleixner:
       "An update for the proc interface of time namespaces: Use symbolic
        names instead of clockid numbers. The usability nuisance of numbers
        was noticed by Michael when polishing the man page"
      
      * tag 'timers-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        proc, time/namespace: Show clock symbolic names in /proc/pid/timens_offsets
      3e0dea57
    • Linus Torvalds's avatar
      Merge tag 'perf-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b7374586
      Linus Torvalds authored
      Pull perf tooling fixes and updates from Thomas Gleixner:
      
       - Fix the header line of perf stat output for '--metric-only --per-socket'
      
       - Fix the python build with clang
      
       - The usual tools UAPI header synchronization
      
      * tag 'perf-urgent-2020-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        tools headers: Synchronize linux/bits.h with the kernel sources
        tools headers: Adopt verbatim copy of compiletime_assert() from kernel sources
        tools headers: Update x86's syscall_64.tbl with the kernel sources
        tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
        tools headers UAPI: Update tools's copy of drm.h headers
        tools headers kvm: Sync linux/kvm.h with the kernel sources
        tools headers UAPI: Sync linux/fscrypt.h with the kernel sources
        tools include UAPI: Sync linux/vhost.h with the kernel sources
        tools arch x86: Sync asm/cpufeatures.h with the kernel sources
        tools headers UAPI: Sync linux/mman.h with the kernel
        tools headers UAPI: Sync sched.h with the kernel
        tools headers: Update linux/vdso.h and grab a copy of vdso/const.h
        perf stat: Fix no metric header if --per-socket and --metric-only set
        perf python: Check if clang supports -fno-semantic-interposition
        tools arch x86: Sync the msr-index.h copy with the kernel sources
      b7374586