1. 16 Nov, 2020 1 commit
    • perf data: Allow to use stdio functions for pipe mode · 60136667
      Namhyung Kim authored
      When perf data is in a pipe, perf reads each event separately using
      the read(2) syscall.  This is a huge performance bottleneck when
      processing large data, as in perf inject.  perf inject also needs to
      use the write(2) syscall for its output.
      
      So convert it to use the buffered I/O functions in the stdio library
      for pipe data.  This makes the inject-build-id bench time drop from
      20ms to 8ms.
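
The effect of stdio buffering on pipe input can be illustrated with a minimal sketch. This is not the perf implementation: `struct demo_header`, `write_events_buffered()` and `count_events_buffered()` are hypothetical names invented for this example, and the record layout is an assumption. The point is only that fread() and fwrite() move data through a user-space buffer, so many small records share one read(2) or write(2) call.

```c
#define _POSIX_C_SOURCE 200809L	/* for fdopen() */
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical record header for this sketch; not perf's real event layout. */
struct demo_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;	/* total record size, header included */
};

/*
 * Write 'count' zero-filled records to fd through a stdio buffer.
 * Takes ownership of fd (it is closed by fclose()).  Returns 0 or -1.
 */
static int write_events_buffered(int fd, int count, uint16_t payload_len)
{
	struct demo_header hdr = { .type = 1, .misc = 0, .size = 0 };
	char payload[65536] = { 0 };
	FILE *fp;
	int i, ret = 0;

	/* The 16-bit size field must be able to hold header + payload. */
	if (payload_len > UINT16_MAX - sizeof(hdr))
		return -1;
	hdr.size = (uint16_t)(sizeof(hdr) + payload_len);

	fp = fdopen(fd, "w");
	if (!fp)
		return -1;

	for (i = 0; i < count; i++) {
		if (fwrite(&hdr, sizeof(hdr), 1, fp) != 1 ||
		    (payload_len &&
		     fwrite(payload, 1, payload_len, fp) != payload_len)) {
			ret = -1;
			break;
		}
	}
	/* fclose() flushes the buffer, typically with a single write(2). */
	if (fclose(fp) != 0)
		ret = -1;
	return ret;
}

/*
 * Count the records arriving on fd.  fread() refills its buffer with
 * large read(2) calls, so small records do not cost one syscall each.
 * Takes ownership of fd.  Returns the record count, or -1 on error.
 */
static long count_events_buffered(int fd)
{
	struct demo_header hdr;
	char payload[65536];
	long n = 0;
	FILE *fp = fdopen(fd, "r");

	if (!fp)
		return -1;

	while (fread(&hdr, sizeof(hdr), 1, fp) == 1) {
		size_t rest;

		if (hdr.size < sizeof(hdr)) {	/* corrupt record */
			n = -1;
			break;
		}
		rest = hdr.size - sizeof(hdr);
		if (rest && fread(payload, 1, rest, fp) != rest) {
			n = -1;		/* truncated record */
			break;
		}
		n++;
	}
	fclose(fp);
	return n;
}
```

To try it, create a pipe with pipe(2), pass the write end to `write_events_buffered()` and the read end to `count_events_buffered()`. Running the reader under strace should show far fewer read(2) calls than records.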
      
        $ perf bench internals inject-build-id
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.074 msec (+- 0.013 msec)
          Average time per event: 0.792 usec (+- 0.001 usec)
          Average memory usage: 8328 KB (+- 0 KB)
          Average build-id-all injection took: 5.490 msec (+- 0.008 msec)
          Average time per event: 0.538 usec (+- 0.001 usec)
          Average memory usage: 7563 KB (+- 0 KB)
      
      This patch enables it just for perf inject when used with a pipe
      (which is the default behavior).  Maybe we could do it for perf record
      and/or report later.
      
      Committer testing:
      
      Before:
      
        $ perf stat -r 5 perf bench internals inject-build-id
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 13.605 msec (+- 0.064 msec)
          Average time per event: 1.334 usec (+- 0.006 usec)
          Average memory usage: 12220 KB (+- 7 KB)
          Average build-id-all injection took: 11.458 msec (+- 0.058 msec)
          Average time per event: 1.123 usec (+- 0.006 usec)
          Average memory usage: 11546 KB (+- 8 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 13.673 msec (+- 0.057 msec)
          Average time per event: 1.341 usec (+- 0.006 usec)
          Average memory usage: 12508 KB (+- 8 KB)
          Average build-id-all injection took: 11.437 msec (+- 0.046 msec)
          Average time per event: 1.121 usec (+- 0.004 usec)
          Average memory usage: 11812 KB (+- 7 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 13.641 msec (+- 0.069 msec)
          Average time per event: 1.337 usec (+- 0.007 usec)
          Average memory usage: 12302 KB (+- 8 KB)
          Average build-id-all injection took: 10.820 msec (+- 0.106 msec)
          Average time per event: 1.061 usec (+- 0.010 usec)
          Average memory usage: 11616 KB (+- 7 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 13.379 msec (+- 0.074 msec)
          Average time per event: 1.312 usec (+- 0.007 usec)
          Average memory usage: 12334 KB (+- 8 KB)
          Average build-id-all injection took: 11.288 msec (+- 0.071 msec)
          Average time per event: 1.107 usec (+- 0.007 usec)
          Average memory usage: 11657 KB (+- 8 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 13.534 msec (+- 0.058 msec)
          Average time per event: 1.327 usec (+- 0.006 usec)
          Average memory usage: 12264 KB (+- 8 KB)
          Average build-id-all injection took: 11.557 msec (+- 0.076 msec)
          Average time per event: 1.133 usec (+- 0.007 usec)
          Average memory usage: 11593 KB (+- 8 KB)
      
         Performance counter stats for 'perf bench internals inject-build-id' (5 runs):
      
                  4,060.05 msec task-clock:u              #    1.566 CPUs utilized            ( +-  0.65% )
                         0      context-switches:u        #    0.000 K/sec
                         0      cpu-migrations:u          #    0.000 K/sec
                   101,888      page-faults:u             #    0.025 M/sec                    ( +-  0.12% )
             3,745,833,163      cycles:u                  #    0.923 GHz                      ( +-  0.10% )  (83.22%)
               194,346,613      stalled-cycles-frontend:u #    5.19% frontend cycles idle     ( +-  0.57% )  (83.30%)
               708,495,034      stalled-cycles-backend:u  #   18.91% backend cycles idle      ( +-  0.48% )  (83.48%)
             5,629,328,628      instructions:u            #    1.50  insn per cycle
                                                          #    0.13  stalled cycles per insn  ( +-  0.21% )  (83.57%)
             1,236,697,927      branches:u                #  304.602 M/sec                    ( +-  0.16% )  (83.44%)
                17,564,877      branch-misses:u           #    1.42% of all branches          ( +-  0.23% )  (82.99%)
      
                    2.5934 +- 0.0128 seconds time elapsed  ( +-  0.49% )
      
        $
      
      After:
      
        $ perf stat -r 5 perf bench internals inject-build-id
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.560 msec (+- 0.125 msec)
          Average time per event: 0.839 usec (+- 0.012 usec)
          Average memory usage: 12520 KB (+- 8 KB)
          Average build-id-all injection took: 5.789 msec (+- 0.054 msec)
          Average time per event: 0.568 usec (+- 0.005 usec)
          Average memory usage: 11919 KB (+- 9 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.639 msec (+- 0.111 msec)
          Average time per event: 0.847 usec (+- 0.011 usec)
          Average memory usage: 12732 KB (+- 8 KB)
          Average build-id-all injection took: 5.647 msec (+- 0.069 msec)
          Average time per event: 0.554 usec (+- 0.007 usec)
          Average memory usage: 12093 KB (+- 7 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.551 msec (+- 0.096 msec)
          Average time per event: 0.838 usec (+- 0.009 usec)
          Average memory usage: 12739 KB (+- 8 KB)
          Average build-id-all injection took: 5.617 msec (+- 0.061 msec)
          Average time per event: 0.551 usec (+- 0.006 usec)
          Average memory usage: 12105 KB (+- 7 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.403 msec (+- 0.097 msec)
          Average time per event: 0.824 usec (+- 0.010 usec)
          Average memory usage: 12770 KB (+- 8 KB)
          Average build-id-all injection took: 5.611 msec (+- 0.085 msec)
          Average time per event: 0.550 usec (+- 0.008 usec)
          Average memory usage: 12134 KB (+- 8 KB)
        # Running 'internals/inject-build-id' benchmark:
          Average build-id injection took: 8.518 msec (+- 0.102 msec)
          Average time per event: 0.835 usec (+- 0.010 usec)
          Average memory usage: 12518 KB (+- 10 KB)
          Average build-id-all injection took: 5.503 msec (+- 0.073 msec)
          Average time per event: 0.540 usec (+- 0.007 usec)
          Average memory usage: 11882 KB (+- 8 KB)
      
         Performance counter stats for 'perf bench internals inject-build-id' (5 runs):
      
                  2,394.88 msec task-clock:u              #    1.577 CPUs utilized            ( +-  0.83% )
                         0      context-switches:u        #    0.000 K/sec
                         0      cpu-migrations:u          #    0.000 K/sec
                   103,181      page-faults:u             #    0.043 M/sec                    ( +-  0.11% )
             3,548,172,030      cycles:u                  #    1.482 GHz                      ( +-  0.30% )  (83.26%)
                81,537,700      stalled-cycles-frontend:u #    2.30% frontend cycles idle     ( +-  1.54% )  (83.24%)
               876,631,544      stalled-cycles-backend:u  #   24.71% backend cycles idle      ( +-  1.14% )  (83.45%)
             5,960,361,707      instructions:u            #    1.68  insn per cycle
                                                          #    0.15  stalled cycles per insn  ( +-  0.27% )  (83.26%)
             1,269,413,491      branches:u                #  530.054 M/sec                    ( +-  0.10% )  (83.48%)
                11,372,453      branch-misses:u           #    0.90% of all branches          ( +-  0.52% )  (83.31%)
      
                   1.51874 +- 0.00642 seconds time elapsed  ( +-  0.42% )
      
        $
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20201030054742.87740-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 11 Nov, 2020 14 commits
  3. 04 Nov, 2020 19 commits
  4. 03 Nov, 2020 6 commits
    • Merge tag 'perf-tools-for-v5.10-2020-11-03' of... · 4ef8451b
      Linus Torvalds authored
      Merge tag 'perf-tools-for-v5.10-2020-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull perf tools fixes from Arnaldo Carvalho de Melo:
       "Only fixes and a sync of the headers so that the perf build is silent:
      
         - Fix visibility attribute in python module init code with newer gcc
      
         - Fix DRAM_BW_Use 0 issue for CLX/SKX in intel JSON vendor event
           files
      
         - Fix the build on new fedora by removing LTO compiler options when
           building perl support
      
         - Remove broken __no_tail_call attribute
      
         - Fix segfault when trying to trace events by cgroup
      
         - Fix crash with non-jited BPF progs
      
         - Increase buffer size in TUI browser, fixing format truncation
      
         - Fix printing of build-id for objects lacking one
      
         - Fix byte swapping for ino_generation field in MMAP2 perf.data
           records
      
         - Fix byte swapping for CGROUP perf.data records, for cross arch
           analysis of perf.data files
      
         - Fix the fast path of feature detection
      
         - Update kernel header copies"
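
The two byte-swapping fixes above concern perf.data files analysed on a machine whose endianness differs from the recording machine: every multi-byte field of a record has to be swapped, and a field that is skipped is read back as garbage. A minimal sketch of that idea follows. `struct demo_mmap_record` and the `demo_swap*()` helpers are hypothetical and do not reproduce perf's record layouts or swap routines.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t demo_swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Reverse the byte order of a 64-bit value using two 32-bit swaps. */
static uint64_t demo_swap64(uint64_t v)
{
	return ((uint64_t)demo_swap32((uint32_t)v) << 32) |
	       demo_swap32((uint32_t)(v >> 32));
}

/* Hypothetical record; the field set is an assumption for this sketch. */
struct demo_mmap_record {
	uint32_t pid;
	uint32_t tid;
	uint64_t ino;
	uint64_t ino_generation;
};

/*
 * Convert a record read from a foreign-endian file.  Each field must be
 * listed here; leaving one out is the class of bug the fixes describe.
 */
static void demo_swap_record(struct demo_mmap_record *r)
{
	r->pid = demo_swap32(r->pid);
	r->tid = demo_swap32(r->tid);
	r->ino = demo_swap64(r->ino);
	r->ino_generation = demo_swap64(r->ino_generation);
}
```

Swapping a record twice restores the original values, which makes a round trip a convenient sanity check for a swap routine.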
      
      * tag 'perf-tools-for-v5.10-2020-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (23 commits)
        tools feature: Fixup fast path feature detection
        perf tools: Add missing swap for cgroup events
        perf tools: Add missing swap for ino_generation
        perf tools: Initialize output buffer in build_id__sprintf
        perf hists browser: Increase size of 'buf' in perf_evsel__hists_browse()
        tools include UAPI: Update linux/mount.h copy
        tools headers UAPI: Update tools's copy of linux/perf_event.h
        tools kvm headers: Update KVM headers from the kernel sources
        tools UAPI: Update copy of linux/mman.h from the kernel sources
        tools arch x86: Sync the msr-index.h copy with the kernel sources
        tools x86 headers: Update required-features.h header from the kernel
        tools x86 headers: Update cpufeatures.h headers copies
        tools headers UAPI: Update fscrypt.h copy
        tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
        tools headers UAPI: Sync prctl.h with the kernel sources
        perf scripting python: Avoid declaring function pointers with a visibility attribute
        perf tools: Remove broken __no_tail_call attribute
        perf vendor events: Fix DRAM_BW_Use 0 issue for CLX/SKX
        perf trace: Fix segfault when trying to trace events by cgroup
        perf tools: Fix crash with non-jited bpf progs
        ...
    • Merge tag 'docs-5.10-warnings' of git://git.lwn.net/linux · e6b0bd61
      Linus Torvalds authored
      Pull documentation build warning fixes from Jonathan Corbet:
       "This contains a series of warning fixes from Mauro; once applied, the
        number of warnings from the once-noisy docs build process is nearly
        zero.
      
        Getting to this point has required a lot of work; once there,
        hopefully we can keep things that way.
      
        I have packaged this as a separate pull because it does a fair amount
        of reaching outside of Documentation/. The changes are all in comments
        and in code placement. It's all been in linux-next since last week"
      
      * tag 'docs-5.10-warnings' of git://git.lwn.net/linux: (24 commits)
        docs: SafeSetID: fix a warning
        amdgpu: fix a few kernel-doc markup issues
        selftests: kselftest_harness.h: fix kernel-doc markups
        drm: amdgpu_dm: fix a typo
        gpu: docs: amdgpu.rst: get rid of wrong kernel-doc markups
        drm: amdgpu: kernel-doc: update some adev parameters
        docs: fs: api-summary.rst: get rid of kernel-doc include
        IB/srpt: docs: add a description for cq_size member
        locking/refcount: move kernel-doc markups to the proper place
        docs: lockdep-design: fix some warning issues
        MAINTAINERS: fix broken doc refs due to yaml conversion
        ice: docs fix a devlink info that broke a table
        crypto: sun8x-ce*: update entries to its documentation
        net: phy: remove kernel-doc duplication
        mm: pagemap.h: fix two kernel-doc markups
        blk-mq: docs: add kernel-doc description for a new struct member
        docs: userspace-api: add iommu.rst to the index file
        docs: hwmon: mp2975.rst: address some html build warnings
        docs: net: statistics.rst: remove a duplicated kernel-doc
        docs: kasan.rst: add two missing blank lines
        ...
    • Merge tag 'docs-5.10-3' of git://git.lwn.net/linux · ce2e33ba
      Linus Torvalds authored
      Pull documentation fixes from Jonathan Corbet:
       "A small number of fixes, plus a build tweak to respect the desire for
        silence in V=0 builds"
      
      * tag 'docs-5.10-3' of git://git.lwn.net/linux:
        docs: fix automarkup regression on Python 2
        documentation: arm: sunxi: add Allwinner H6 documents
        scripts: kernel-doc: split typedef complex regex
        scripts: kernel-doc: fix typedef parsing
        docs: Makefile: honor V=0 for docs building
    • Merge tag 'x86_seves_for_v5.10_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 43c83418
      Linus Torvalds authored
      Pull x86 SEV-ES fixes from Borislav Petkov:
       "A couple of changes to the SEV-ES code to perform more stringent
        hypervisor checks before enabling encryption (Joerg Roedel)"
      
      * tag 'x86_seves_for_v5.10_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/sev-es: Do not support MMIO to/from encrypted memory
        x86/head/64: Check SEV encryption before switching to kernel page-table
        x86/boot/compressed/64: Check SEV encryption in 64-bit boot-path
        x86/boot/compressed/64: Sanity-check CPUID results in the early #VC handler
        x86/boot/compressed/64: Introduce sev_status
    • afs: Fix incorrect freeing of the ACL passed to the YFS ACL store op · f4c79144
      David Howells authored
      The cleanup for the yfs_store_opaque_acl2_operation calls the wrong
      function to destroy the ACL content buffer.  It's an afs_acl struct, not
      a yfs_acl struct - and the free function for the latter may pass invalid
      pointers to kfree().
      
      Fix this by using the afs_acl_put() function.  The yfs_acl_put()
      function is then no longer used and can be removed.
      
      	general protection fault, probably for non-canonical address 0x7ebde00000000: 0000 [#1] SMP PTI
      	...
      	RIP: 0010:compound_head+0x0/0x11
      	...
      	Call Trace:
      	 virt_to_cache+0x8/0x51
      	 kfree+0x5d/0x79
      	 yfs_free_opaque_acl+0x16/0x29
      	 afs_put_operation+0x60/0x114
      	 __vfs_setxattr+0x67/0x72
      	 __vfs_setxattr_noperm+0x66/0xe9
      	 vfs_setxattr+0x67/0xce
      	 setxattr+0x14e/0x184
      	 __do_sys_fsetxattr+0x66/0x8f
      	 do_syscall_64+0x2d/0x3a
      	 entry_SYSCALL_64_after_hwframe+0x44/0xa9
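
The failure mode can be modelled in user space. The sketch below is not the kernel code: the `demo_*` types and functions are hypothetical stand-ins, and their field layouts are assumptions. It shows why a destructor written for a wrapper struct must not be applied to the plain buffer struct: the wrapper's destructor treats the first bytes of the object as pointers and frees them.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical plain ACL buffer: a refcount plus the opaque ACL bytes. */
struct demo_acl {
	int refs;
	size_t size;
	unsigned char data[];
};

/* Hypothetical wrapper type that merely points at ACL buffers. */
struct demo_wrapped_acl {
	struct demo_acl *acl;
	struct demo_acl *vol_acl;
};

static int demo_live_acls;	/* allocation counter, for the sketch only */

static struct demo_acl *demo_acl_new(const void *buf, size_t size)
{
	struct demo_acl *acl = malloc(sizeof(*acl) + size);

	if (!acl)
		return NULL;
	acl->refs = 1;
	acl->size = size;
	memcpy(acl->data, buf, size);
	demo_live_acls++;
	return acl;
}

/* Matching destructor for struct demo_acl. */
static void demo_acl_put(struct demo_acl *acl)
{
	if (acl && --acl->refs == 0) {
		demo_live_acls--;
		free(acl);
	}
}

/* Destructor for the wrapper: releases what it points at, then itself. */
static void demo_wrapped_acl_put(struct demo_wrapped_acl *w)
{
	if (!w)
		return;
	demo_acl_put(w->acl);
	demo_acl_put(w->vol_acl);
	free(w);
}

struct demo_op {
	struct demo_acl *acl;	/* a demo_acl, not a demo_wrapped_acl */
};

static void demo_op_cleanup(struct demo_op *op)
{
	/*
	 * Correct: release with the destructor that matches the type.
	 * Passing op->acl to demo_wrapped_acl_put() would interpret the
	 * refcount and size fields as pointers and hand them to free().
	 */
	demo_acl_put(op->acl);
	op->acl = NULL;
}
```

Keeping one put function per type, and removing a destructor once it has no remaining users, leaves fewer chances to pair an object with the wrong cleanup routine.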
      
      Fixes: e49c7b2f ("afs: Build an abstraction around an "operation" concept")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • afs: Fix warning due to unadvanced marshalling pointer · c80afa1d
      David Howells authored
      When using the afs.yfs.acl xattr to change an AuriStor ACL, a warning
      can be generated when the request is marshalled because the buffer
      pointer isn't increased after adding the last element, thereby
      triggering the check at the end if the ACL wasn't empty.  This just
      causes something like the following warning, but doesn't stop the call
      from happening successfully:
      
          kAFS: YFS.StoreOpaqueACL2: Request buffer underflow (36<108)
      
      Fix this simply by increasing the count prior to the check.
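
The pattern behind this warning can be sketched as follows. It is an illustrative model only: `demo_put_blob()` and `demo_marshal()` are hypothetical names, and the wire format (32-bit words, length-prefixed blob padded to a 4-byte boundary, host byte order) is an assumption. The marshalling pointer has to be advanced past every element, including the last one, before the final size check compares the bytes written with the expected request size.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Append a length-prefixed blob, padded to a 4-byte boundary, at bp.
 * Returns the pointer advanced past the blob.
 */
static uint32_t *demo_put_blob(uint32_t *bp, const void *data, uint32_t len)
{
	uint32_t words = (len + 3) / 4;

	*bp++ = len;	/* a real protocol would use network byte order */
	if (words) {
		bp[words - 1] = 0;	/* zero the padding bytes */
		memcpy(bp, data, len);
	}
	return bp + words;
}

/*
 * Marshal an opcode and an ACL blob into buf, then verify the size.
 * Returns 0 if exactly expected_bytes were written, -1 otherwise.
 */
static int demo_marshal(uint32_t *buf, size_t expected_bytes,
			uint32_t opcode, const void *acl, uint32_t acl_len)
{
	uint32_t *bp = buf;

	*bp++ = opcode;
	/*
	 * Keep the returned pointer even for the last element.  Dropping
	 * it leaves bp short of the real end, and the check below then
	 * reports an underflow although the buffer was filled correctly.
	 */
	bp = demo_put_blob(bp, acl, acl_len);

	return (size_t)((char *)bp - (char *)buf) == expected_bytes ? 0 : -1;
}
```

For a 5-byte ACL the expected size is 16 bytes: 4 for the opcode, 4 for the length word and 8 for the padded data.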
      
      Fixes: f5e45463 ("afs: Implement YFS ACL setting")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>