1. 22 Dec, 2022 7 commits
    • Merge tag 'perf-tools-for-v6.2-2-2022-12-22' of... · d1ac1a2b
      Linus Torvalds authored
      Merge tag 'perf-tools-for-v6.2-2-2022-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull more perf tools updates from Arnaldo Carvalho de Melo:
       "perf tools fixes and improvements:
      
         - Don't stop building perf if python setuptools isn't installed, just
           disable the affected perf feature.
      
         - Remove explicit reference to python 2.x devel files, that warning
           is about python-devel, no matter what version, being unavailable
           and thus disabling the linking with libpython.
      
         - Don't use -Werror=switch-enum when building the python support that
           handles libtraceevent enumerations, as there is no good way to test
           if some specific enum entry is available with the libtraceevent
           installed on the system.
      
         - Introduce 'perf lock contention' --type-filter and --lock-filter,
           to filter by lock type and lock name:
      
              $ sudo ./perf lock record -a -- ./perf bench sched messaging
      
              $ sudo ./perf lock contention -E 5 -Y spinlock
               contended  total wait   max wait  avg wait      type  caller
      
                     802     1.26 ms   11.73 us   1.58 us  spinlock  __wake_up_common_lock+0x62
                      13   787.16 us  105.44 us  60.55 us  spinlock  remove_wait_queue+0x14
                      12   612.96 us   78.70 us  51.08 us  spinlock  prepare_to_wait+0x27
                     114   340.68 us   12.61 us   2.99 us  spinlock  try_to_wake_up+0x1f5
                      83   226.38 us    9.15 us   2.73 us  spinlock  folio_lruvec_lock_irqsave+0x5e
      
              $ sudo ./perf lock contention -l
               contended  total wait  max wait  avg wait           address  symbol
      
                      57     1.11 ms  42.83 us  19.54 us  ffff9f4140059000
                      15   280.88 us  23.51 us  18.73 us  ffffffff9d007a40  jiffies_lock
                       1    20.49 us  20.49 us  20.49 us  ffffffff9d0d50c0  rcu_state
                       1     9.02 us   9.02 us   9.02 us  ffff9f41759e9ba0
      
              $ sudo ./perf lock contention -L jiffies_lock,rcu_state
               contended  total wait  max wait  avg wait      type  caller
      
                      15   280.88 us  23.51 us  18.73 us  spinlock  tick_sched_do_timer+0x93
                       1    20.49 us  20.49 us  20.49 us  spinlock  __softirqentry_text_start+0xeb
      
              $ sudo ./perf lock contention -L ffff9f4140059000
               contended  total wait  max wait  avg wait      type  caller
      
                      38   779.40 us  42.83 us  20.51 us  spinlock  worker_thread+0x50
                      11   216.30 us  39.87 us  19.66 us  spinlock  queue_work_on+0x39
                       8   118.13 us  20.51 us  14.77 us  spinlock  kthread+0xe5
      
         - Fix splitting CC into compiler and options when checking if an
           option is present in clang to build the python binding, needed in
           systems such as yocto that set CC to, e.g.: "gcc --sysroot=/a/b/c".
      
         - Refresh metrics and events for Intel systems: alderlake,
           alderlake-n, bonnell, broadwell, broadwellde, broadwellx,
           cascadelakex, elkhartlake, goldmont, goldmontplus, haswell,
           haswellx, icelake, icelakex, ivybridge, ivytown, jaketown,
           knightslanding, meteorlake, nehalemep, nehalemex, sandybridge,
           sapphirerapids, silvermont, skylake, skylakex, snowridgex,
           tigerlake, westmereep-dp, westmereep-sp, westmereex.
      
         - Add vendor events files (JSON) for AMD Zen 4, from sections
           2.1.15.4 "Core Performance Monitor Counters", 2.1.15.5 "L3 Cache
           Performance Monitor Counters" and Section 7.1 "Fabric Performance
           Monitor Counter (PMC) Events" in the Processor Programming
           Reference (PPR) for AMD Family 19h Model 11h Revision B1
           processors.
      
           This constitutes events which capture op dispatch, execution and
           retirement, branch prediction, L1 and L2 cache activity, TLB
           activity, L3 cache activity and data bandwidth for various links
           and interfaces in the Data Fabric.
      
         - Also, from the same PPR are metrics taken from Section 2.1.15.2
           "Performance Measurement", including pipeline utilization, which
           are new to Zen 4 processors and useful for finding performance
           bottlenecks by analyzing activity at different stages of the
           pipeline.
      
         - Greatly improve the performance of the 'srcline', 'srcline_from',
           'srcline_to' and 'srcfile' sort keys by postponing the call to the
           external addr2line utility to the collapse phase of histogram
           bucketing.
      
         - Fix 'perf test' "all PMU test" to skip parametrized events, which
           require setup and are not supported by this test.
      
         - Update tools/ copies of kernel headers: features,
           disabled-features, fscrypt.h, i915_drm.h, msr-index.h, powerpc
           syscall table and kvm.h.
      
         - Add .DELETE_ON_ERROR special Makefile target to clean up partially
           updated files on error.
      
         - Simplify the mksyscalltbl script for arm64 by no longer running
           the host compiler to create the syscall table; do it all with
           just the shell script.
      
         - Further fixes to honour quiet mode (-q)"
      
      * tag 'perf-tools-for-v6.2-2-2022-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (67 commits)
        perf python: Fix splitting CC into compiler and options
        perf scripting python: Don't be strict at handling libtraceevent enumerations
        perf arm64: Simplify mksyscalltbl
        perf build: Remove explicit reference to python 2.x devel files
        perf vendor events amd: Add Zen 4 mapping
        perf vendor events amd: Add Zen 4 metrics
        perf vendor events amd: Add Zen 4 uncore events
        perf vendor events amd: Add Zen 4 core events
        perf vendor events intel: Refresh westmereex events
        perf vendor events intel: Refresh westmereep-sp events
        perf vendor events intel: Refresh westmereep-dp events
        perf vendor events intel: Refresh tigerlake metrics and events
        perf vendor events intel: Refresh snowridgex events
        perf vendor events intel: Refresh skylakex metrics and events
        perf vendor events intel: Refresh skylake metrics and events
        perf vendor events intel: Refresh silvermont events
        perf vendor events intel: Refresh sapphirerapids metrics and events
        perf vendor events intel: Refresh sandybridge metrics and events
        perf vendor events intel: Refresh nehalemex events
        perf vendor events intel: Refresh nehalemep events
        ...
    • perf python: Fix splitting CC into compiler and options · 09e6f9f9
      Arnaldo Carvalho de Melo authored
      Noticed this build failure on archlinux:base when building with clang:
      
        clang-14: error: optimization flag '-ffat-lto-objects' is not supported [-Werror,-Wignored-optimization-argument]
      
      In tools/perf/util/setup.py we check if clang supports that option, but
      since commit 3cad53a6 ("perf python: Account for multiple words
      in CC") this check has been broken in the common case where CC="clang":
      
        >>> cc="clang"
        >>> print(cc.split()[0])
        clang
        >>> option="-ffat-lto-objects"
        >>> print(str(cc.split()[1:]) + option)
        []-ffat-lto-objects
        >>>
      
      The Popen then calls clang with that bogus option name, which does not
      produce the b"unknown argument" or b"is not supported" output that
      this function uses to detect that an option is not available, and thus
      later on clang is called with an unknown/unsupported option.
      
      Fix it by checking whether there really are options in the provided CC
      variable, and if so override 'cc' with the first token and append the
      options to the 'option' variable.
      
      Fixes: 3cad53a6 ("perf python: Account for multiple words in CC")
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Fangrui Song <maskray@google.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Keeping <john@metanate.com>
      Cc: Khem Raj <raj.khem@gmail.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Sedat Dilek <sedat.dilek@gmail.com>
      Link: http://lore.kernel.org/lkml/Y6Rq5F5NI0v1QQHM@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
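
As a rough illustration of the fix described in the commit message above, the sketch below splits a CC value into the compiler and its extra options before building the command line used to probe for an option. It is not the code from tools/perf/util/setup.py: the helper names `split_cc`, `probe_argv` and `buggy_option`, and the `test-hello.c` default, are hypothetical and exist only for this example.

```python
def split_cc(cc_value):
    """Split a CC string into (compiler, list_of_extra_options).

    Hypothetical helper: uses plain whitespace splitting, as in the
    interactive session quoted in the commit message.
    """
    tokens = cc_value.split()
    if not tokens:
        raise ValueError("CC is empty")
    return tokens[0], tokens[1:]


def probe_argv(cc_value, option, source="test-hello.c"):
    """Build the argv for a compiler probe (hypothetical helper).

    Each extra option from CC becomes its own argv element, so the
    option being tested is never glued onto a stringified list.
    """
    compiler, extra_options = split_cc(cc_value)
    return [compiler, *extra_options, option, source]


def buggy_option(cc_value, option):
    """Reproduce the broken concatenation shown in the commit message."""
    return str(cc_value.split()[1:]) + option


if __name__ == "__main__":
    print(buggy_option("clang", "-ffat-lto-objects"))
    print(probe_argv("clang", "-ffat-lto-objects"))
    print(probe_argv("gcc --sysroot=/a/b/c", "-ffat-lto-objects"))
```

With CC="clang" the extra-options list is empty, so the probed option reaches the compiler unchanged; with a yocto-style CC such as "gcc --sysroot=/a/b/c" the sysroot flag is passed as a separate argument ahead of it.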
    • Merge tag 'trace-v6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · 9d2f6060
      Linus Torvalds authored
      Pull tracing fix from Steven Rostedt:
       "I missed this minor hardening of the kernel in the first pull.
      
         - Make monitor structures read only"
      
      * tag 'trace-v6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        rv/monitors: Move monitor structure in rodata
    • Merge tag 'trace-probes-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · af9b3fa1
      Linus Torvalds authored
      Pull trace probes updates from Steven Rostedt:
      
       - New "symstr" type for dynamic events that writes the name of the
         function+offset into the ring buffer and not just the address
      
       - Prevent kernel symbol processing on addresses in user space probes
         (uprobes).
      
       - And minor fixes and clean ups
      
      * tag 'trace-probes-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        tracing/probes: Reject symbol/symstr type for uprobe
        tracing/probes: Add symstr type for dynamic events
        kprobes: kretprobe events missing on 2-core KVM guest
        kprobes: Fix check for probe enabled in kill_kprobe()
        test_kprobes: Fix implicit declaration error of test_kprobes
        tracing: Fix race where eprobes can be called before the event
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 7a5189c5
      Linus Torvalds authored
      Pull RISC-V kvm updates from Paolo Bonzini:
      
       - Allow unloading KVM module
      
       - Allow KVM user-space to set mvendorid, marchid, and mimpid
      
       - Several fixes and cleanups
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        RISC-V: KVM: Add ONE_REG interface for mvendorid, marchid, and mimpid
        RISC-V: KVM: Save mvendorid, marchid, and mimpid when creating VCPU
        RISC-V: Export sbi_get_mvendorid() and friends
        RISC-V: KVM: Move sbi related struct and functions to kvm_vcpu_sbi.h
        RISC-V: KVM: Use switch-case in kvm_riscv_vcpu_set/get_reg()
        RISC-V: KVM: Remove redundant includes of asm/csr.h
        RISC-V: KVM: Remove redundant includes of asm/kvm_vcpu_timer.h
        RISC-V: KVM: Fix reg_val check in kvm_riscv_vcpu_set_reg_config()
        RISC-V: KVM: Simplify kvm_arch_prepare_memory_region()
        RISC-V: KVM: Exit run-loop immediately if xfer_to_guest fails
        RISC-V: KVM: use vma_lookup() instead of find_vma_intersection()
        RISC-V: KVM: Add exit logic to main.c
    • Merge tag 'block-6.2-2022-12-19' of git://git.kernel.dk/linux · 569c3a28
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - Various fixes for BFQ (Yu, Yuwei)
      
       - Fix for loop command line parsing (Isaac)
      
       - No need to specifically clear REQ_ALLOC_CACHE on IOPOLL downgrade
         anymore (me)
      
       - blk-iocost enum fix for newer gcc (Jiri)
      
       - UAF fix for queue release (Ming)
      
       - blk-iolatency error handling memory leak fix (Tejun)
      
      * tag 'block-6.2-2022-12-19' of git://git.kernel.dk/linux:
        block: don't clear REQ_ALLOC_CACHE for non-polled requests
        block: fix use-after-free of q->q_usage_counter
        block, bfq: only do counting of pending-request for BFQ_GROUP_IOSCHED
        blk-iolatency: Fix memory leak on add_disk() failures
        loop: Fix the max_loop commandline argument treatment when it is set to 0
        block/blk-iocost (gcc13): keep large values in a new enum
        block, bfq: replace 0/1 with false/true in bic apis
        block, bfq: don't return bfqg from __bfq_bic_change_cgroup()
        block, bfq: fix possible uaf for 'bfqq->bic'
    • Merge tag 'io_uring-6.2-2022-12-19' of git://git.kernel.dk/linux · 5d4740fc
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
      
       - Improve the locking for timeouts. This was originally queued up for
         the initial pull, but I messed up and it got missed. (Pavel)
      
       - Fix an issue with running task_work from the wait path, causing some
         inefficiencies (me)
      
       - Add a clear of ->free_iov upfront in the 32-bit compat data
         importing, so we ensure that it's always sane at completion time (me)
      
       - Use call_rcu_hurry() for the eventfd signaling (Dylan)
      
       - Ordering fix for multishot recv completions (Pavel)
      
       - Add the io_uring trace header to the MAINTAINERS entry (Ammar)
      
      * tag 'io_uring-6.2-2022-12-19' of git://git.kernel.dk/linux:
        MAINTAINERS: io_uring: Add include/trace/events/io_uring.h
        io_uring/net: fix cleanup after recycle
        io_uring/net: ensure compat import handlers clear free_iov
        io_uring: include task_work run after scheduling in wait for events
        io_uring: don't use TIF_NOTIFY_SIGNAL to test for availability of task_work
        io_uring: use call_rcu_hurry if signaling an eventfd
        io_uring: fix overflow handling regression
        io_uring: ease timeout flush locking requirements
        io_uring: revise completion_lock locking
        io_uring: protect cq_timeouts with timeout_lock
  2. 21 Dec, 2022 33 commits