1. 01 Jul, 2021 23 commits
    • perf dlfilter: Add resolve_address() to perf_dlfilter_fns · f645744c
      Adrian Hunter authored
      Add a function, for use by dlfilters, to resolve addresses from branch
      stacks or callchains.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210627131818.810-7-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf build: Install perf_dlfilter.h · 0beb2183
      Adrian Hunter authored
      Users of the --dlfilter option need to include perf_dlfilter.h
      in their filters. Install it to the include path.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210627131818.810-6-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script: Add option to pass arguments to dlfilters · 3d032a25
      Adrian Hunter authored
      Add option --dlarg to pass arguments to dlfilters. The --dlarg option can
      be repeated to pass more than 1 argument.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210627131818.810-5-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script: Add option to list dlfilters · 638e2b99
      Adrian Hunter authored
      Add option --list-dlfilters to list dlfilters in the current directory or
      the exec-path e.g. ~/libexec/perf-core/dlfilters. Use with option -v (must
      come before option --list-dlfilters) to show long descriptions.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210627131818.810-4-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script: Add dlfilter__filter_event_early() · 9bde93a7
      Adrian Hunter authored
      filter_event_early() can be more than 30% faster than filter_event()
      because it is called before internal filtering. In other respects it
      is the same as filter_event(), except that it will be passed events
      that have yet to be filtered out.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210627131818.810-3-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script: Add API for filtering via dynamically loaded shared object · 291961fc
      Adrian Hunter authored
      In some cases, users want to filter very large amounts of data (e.g.
      from AUX area tracing like Intel PT) looking for something specific.
      While scripting such as Python can be used, Python is 10 to 20 times
      slower than C. So define a C API so that custom filters can be written
      and loaded.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210627131818.810-2-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
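As an illustration of the kind of filter this API is meant for, here is a minimal sketch of a filter_event() callback that keeps only samples whose instruction pointer falls inside one address range. It is not taken from the patch: struct sketch_sample is a stand-in with two assumed fields so the example builds on its own (a real filter would include perf_dlfilter.h and use its sample type), the address range is made up, and the return convention (0 keeps the event, 1 drops it) is assumed from the perf-dlfilter documentation.

```c
#include <stdint.h>

/*
 * ASSUMPTION: stand-in for the sample type so this sketch compiles by
 * itself. A real filter includes perf_dlfilter.h instead.
 */
struct sketch_sample {
	uint64_t ip;	/* instruction pointer of the sample */
	uint64_t addr;	/* target address, e.g. of a branch */
};

/* Hypothetical address range of interest */
static const uint64_t range_start = 0x400000;
static const uint64_t range_end   = 0x500000;

/*
 * Assumed convention: return 0 to keep the event, 1 to filter it out.
 * 'data' and 'ctx' are unused in this sketch.
 */
int filter_event(void *data, const struct sketch_sample *sample, void *ctx)
{
	(void)data;
	(void)ctx;

	if (sample->ip >= range_start && sample->ip < range_end)
		return 0;	/* inside the range: keep */
	return 1;		/* outside the range: drop */
}
```

Because the comparison runs in compiled C for every sample, this is the case where the commit expects a large gain over a Python script.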
    • perf llvm: Return -ENOMEM when asprintf() fails · c435c166
      Arnaldo Carvalho de Melo authored
      Zhihao sent a patch but it made llvm__compile_bpf() return what
      asprintf() returns on error, which is just -1, but since this function
      returns -errno, fix it by returning -ENOMEM for this case instead.
      
      Fixes: cb763714 ("perf llvm: Allow passing options to llc ...")
      Fixes: 5eab5a7e ("perf llvm: Display eBPF compiling command ...")
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Reported-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andrii Nakryiko <andrii@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Yu Kuai <yukuai3@huawei.com>
      Cc: clang-built-linux@googlegroups.com
      Link: http://lore.kernel.org/lkml/20210609115945.2193194-1-chengzhihao1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
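The reasoning in the commit above can be sketched as follows. build_cmd() is a hypothetical helper, not the perf function. It only shows why a function that returns 0 or -errno should map an asprintf() failure to -ENOMEM: a caller that interprets the raw -1 as -errno would read it as -EPERM.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper following the "return 0 or -errno" convention.
 * On success *cmd holds a malloc()ed string that the caller must free().
 */
int build_cmd(char **cmd, const char *compiler, const char *src)
{
	/* asprintf() returns -1 on failure, which is not a useful -errno */
	if (asprintf(cmd, "%s -c %s", compiler, src) < 0) {
		*cmd = NULL;
		return -ENOMEM;
	}
	return 0;
}
```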
    • perf cs-etm: Delay decode of non-timeless data until cs_etm__flush_events() · 0323dea3
      James Clark authored
      Currently, timeless mode starts the decode on PERF_RECORD_EXIT, and
      non-timeless mode starts decoding on the first PERF_RECORD_AUX record.
      
      This can cause the "data has no samples!" error if the first
      PERF_RECORD_AUX record comes before the first (or any relevant)
      PERF_RECORD_MMAP2 record because the mmaps are required by the decoder
      to access the binary data.
      
      This change pushes the start of non-timeless decoding to the very end of
      parsing the file. The PERF_RECORD_EXIT event can't be used because it
      might not exist in system-wide or snapshot modes.
      
      I have not been able to find the exact cause for the events to be
      intermittently in the wrong order in the basic scenario:
      
      	perf record -e cs_etm/@tmc_etr0/u top
      
      But it can be made to happen every time with the --delay option. This is
      because "enable_on_exec" is disabled, which causes tracing to start
      before the process to be launched is exec'd. For example:
      
      	perf record -e cs_etm/@tmc_etr0/u --delay=1 top
      	perf report -D | grep 'AUX\|MAP'
      
      	0 16714475632740 0x520 [0x40]: PERF_RECORD_AUX offset: 0 size: 0x30 flags: 0 []
      	0 16714476494960 0x5d0 [0x40]: PERF_RECORD_AUX offset: 0x30 size: 0x30 flags: 0 []
      	0 16714478208900 0x660 [0x40]: PERF_RECORD_AUX offset: 0x60 size: 0x30 flags: 0 []
      	4294967295 16714478293340 0x700 [0x70]: PERF_RECORD_MMAP2 8712/8712: [0x557a460000(0x54000) @ 0 00:17 5329258 0]: r-xp /usr/bin/top
      	4294967295 16714478353020 0x770 [0x88]: PERF_RECORD_MMAP2 8712/8712: [0x7f86f72000(0x34000) @ 0 00:17 5214354 0]: r-xp /usr/lib/aarch64-linux-gnu/ld-2.31.so
      
      Another scenario in which decoding from the first aux record fails is a
      workload that forks. Although the aux record comes after 'bash', it
      comes before 'top', which is what we are interested in. For example:
      
      	perf record -e cs_etm/@tmc_etr0/u -- bash -c top
      	perf report -D | grep 'AUX\|MAP'
      
      	4294967295 16853946421300 0x510 [0x70]: PERF_RECORD_MMAP2 8723/8723: [0x558f280000(0x142000) @ 0 00:17 5213953 0]: r-xp /usr/bin/bash
      	4294967295 16853946543560 0x580 [0x88]: PERF_RECORD_MMAP2 8723/8723: [0x7fbba6e000(0x34000) @ 0 00:17 5214354 0]: r-xp /usr/lib/aarch64-linux-gnu/ld-2.31.so
      	4294967295 16853946628420 0x608 [0x68]: PERF_RECORD_MMAP2 8723/8723: [0x7fbba9e000(0x1000) @ 0 00:00 0 0]: r-xp [vdso]
      	0 16853947067300 0x690 [0x40]: PERF_RECORD_AUX offset: 0 size: 0x3a60 flags: 0 []
      	...
      	0 16853966602580 0x1758 [0x40]: PERF_RECORD_AUX offset: 0xc2470 size: 0x30 flags: 0 []
      	4294967295 16853967119860 0x1818 [0x70]: PERF_RECORD_MMAP2 8723/8723: [0x5559e70000(0x54000) @ 0 00:17 5329258 0]: r-xp /usr/bin/top
      	4294967295 16853967181620 0x1888 [0x88]: PERF_RECORD_MMAP2 8723/8723: [0x7f9ed06000(0x34000) @ 0 00:17 5214354 0]: r-xp /usr/lib/aarch64-linux-gnu/ld-2.31.so
      	4294967295 16853967237180 0x1910 [0x68]: PERF_RECORD_MMAP2 8723/8723: [0x7f9ed36000(0x1000) @ 0 00:00 0 0]: r-xp [vdso]
      
      A third scenario is when the majority of time is spent in a shared
      library that is not loaded at startup, for example a dynamically loaded
      plugin.
      
      Testing
      =======
      
      Testing was done by checking if any samples that are present in the
      old output are missing from the new output. Timestamps must be
      stripped out with awk because now they are set to the last AUX sample,
      rather than the first:
      
      	./perf script $4 | awk '!($4="")' > new.script
      	./perf-default script $4 | awk '!($4="")' > default.script
      	comm -13 <(sort -u new.script) <(sort -u default.script)
      
      Testing showed that the new output is a superset of the old. When lines
      appear in the comm output, it is not because they are missing but
      because [unknown] is now resolved to sensible locations. For example,
      the last putp branch here now resolves to libtinfo, so it's not missing
      from the output, but is actually improved:
      
      Old:
      	top 305 [001]  1 branches:uH: 402830 _init+0x30 (/usr/bin/top.procps) => 404a1c [unknown] (/usr/bin/top.procps)
      	top 305 [001]  1 branches:uH: 404a20 [unknown] (/usr/bin/top.procps) => 402970 putp@plt+0x0 (/usr/bin/top.procps)
      	top 305 [001]  1 branches:uH: 40297c putp@plt+0xc (/usr/bin/top.procps) => 0 [unknown] ([unknown])
      New:
      	top 305 [001]  1 branches:uH: 402830 _init+0x30 (/usr/bin/top.procps) => 404a1c [unknown] (/usr/bin/top.procps)
      	top 305 [001]  1 branches:uH: 404a20 [unknown] (/usr/bin/top.procps) => 402970 putp@plt+0x0 (/usr/bin/top.procps)
      	top 305 [001]  1 branches:uH: 40297c putp@plt+0xc (/usr/bin/top.procps) => 7f8ab39208 putp+0x0 (/lib/libtinfo.so.5.9)
      
      In the following two modes, decoding now works and the "data has no
      samples!" error is not displayed any more:
      
      	perf record -e cs_etm/@tmc_etr0/u -- bash -c top
      	perf record -e cs_etm/@tmc_etr0/u --delay=1 top
      
      In snapshot mode, there is also an improvement to decoding. Previously
      samples for the 'kill' process that was used to send SIGUSR2 were
      completely missing, because the process hadn't started yet. But now
      there are additional samples present:
      
      	perf record -e cs_etm/@tmc_etr0/u --snapshot -a
      	perf script
      
      		stress 19380 [003] 161627.938153:    1000000    instructions:uH:      aaaabb612fb4 [unknown] (/usr/bin/stress)
      		  kill 19644 [000] 161627.938153:    1000000    instructions:uH:      ffffae0ef210 [unknown] (/lib/aarch64-linux-gnu/ld-2.27.so)
      		stress 19380 [003] 161627.938153:    1000000    instructions:uH:      ffff9e754d40 random_r+0x20 (/lib/aarch64-linux-gnu/libc-2.27.so)
      
      Also tested was the round trip of 'perf inject' followed by 'perf
      report' which has the same differences and improvements.
      Signed-off-by: James Clark <james.clark@arm.com>
      Reviewed-by: Leo Yan <leo.yan@linaro.org>
      Tested-by: Leo Yan <leo.yan@linaro.org>
      Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Al Grant <al.grant@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Branislav Rankov <branislav.rankov@arm.com>
      Cc: Denis Nikitin <denik@chromium.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lore.kernel.org/lkml/20210609130421.13934-1-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools headers UAPI: Synch KVM's svm.h header with the kernel · f88bb1cb
      Arnaldo Carvalho de Melo authored
      To pick up the changes from:
      
        59d21d67 ("KVM: SVM: Software reserved fields")
      
      Picking the new SVM_EXIT_SW exit reasons.
      
      Addressing this perf build warning:
      
        Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/svm.h' differs from latest version at 'arch/x86/include/uapi/asm/svm.h'
        diff -u tools/arch/x86/include/uapi/asm/svm.h arch/x86/include/uapi/asm/svm.h
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Vineeth Pillai <viremana@linux.microsoft.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools kvm headers arm64: Update KVM headers from the kernel sources · 795c4ab8
      Arnaldo Carvalho de Melo authored
      To pick the changes from:
      
        f0376edb ("KVM: arm64: Add ioctl to fetch/store tags in a guest")
      
      That doesn't cause any changes in tooling (when built on x86), it only
      addresses this perf build warning:
      
        Warning: Kernel ABI header at 'tools/arch/arm64/include/uapi/asm/kvm.h' differs from latest version at 'arch/arm64/include/uapi/asm/kvm.h'
        diff -u tools/arch/arm64/include/uapi/asm/kvm.h arch/arm64/include/uapi/asm/kvm.h
      
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Steven Price <steven.price@arm.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools headers UAPI: Sync linux/kvm.h with the kernel sources · e48f62ae
      Arnaldo Carvalho de Melo authored
      To pick the changes in:
      
        19238e75 ("kvm: x86: Allow userspace to handle emulation errors")
        cb082bfa ("KVM: stats: Add fd-based API to read binary stats data")
        b87cc116 ("KVM: PPC: Book3S HV: Add KVM_CAP_PPC_RPT_INVALIDATE capability")
        f0376edb ("KVM: arm64: Add ioctl to fetch/store tags in a guest")
        0dbb1123 ("KVM: X86: Introduce KVM_HC_MAP_GPA_RANGE hypercall")
        6dba9403 ("KVM: x86: Introduce KVM_GET_SREGS2 / KVM_SET_SREGS2")
        644f7067 ("KVM: x86: hyper-v: Introduce KVM_CAP_HYPERV_ENFORCE_CPUID")
      
      That automatically adds support for these new ioctls:
      
        $ tools/perf/trace/beauty/kvm_ioctl.sh > before
        $ cp include/uapi/linux/kvm.h tools/include/uapi/linux/kvm.h
        $ tools/perf/trace/beauty/kvm_ioctl.sh > after
        $ diff -u before after
        --- before	2021-07-01 13:42:07.006387354 -0300
        +++ after	2021-07-01 13:45:16.051649301 -0300
        @@ -95,6 +95,9 @@
         	[0xc9] = "XEN_HVM_SET_ATTR",
         	[0xca] = "XEN_VCPU_GET_ATTR",
         	[0xcb] = "XEN_VCPU_SET_ATTR",
        +	[0xcc] = "GET_SREGS2",
        +	[0xcd] = "SET_SREGS2",
        +	[0xce] = "GET_STATS_FD",
         	[0xe0] = "CREATE_DEVICE",
         	[0xe1] = "SET_DEVICE_ATTR",
         	[0xe2] = "GET_DEVICE_ATTR",
        $
      
      This silences these perf build warnings:
      
        Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h'
        diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h
        Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
        diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
      
      Cc: Aaron Lewis <aaronlewis@google.com>
      Cc: Ashish Kalra <ashish.kalra@amd.com>
      Cc: Bharata B Rao <bharata@linux.ibm.com>
      Cc: Jing Zhang <jingzhangos@google.com>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Maxim Levitsky <mlevitsk@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Steven Price <steven.price@arm.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools headers cpufeatures: Sync with the kernel sources · cc200a7d
      Arnaldo Carvalho de Melo authored
      To pick the changes from:
      
        1348924b ("x86/msr: Define new bits in TSX_FORCE_ABORT MSR")
        cbcddaa3 ("perf/x86/rapl: Use CPUID bit on AMD and Hygon parts")
      
      This only causes these perf files to be rebuilt:
      
        CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
        CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o
      
      And addresses this perf build warning:
      
        Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
        diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
      
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools include UAPI: Update linux/mount.h copy · 14c6ef2b
      Arnaldo Carvalho de Melo authored
      To pick the changes from:
      
        dd8b477f ("mount: Support "nosymfollow" in new mount api")
      
      That ends up adding support for the new MOUNT_ATTR_NOSYMFOLLOW mount
      attribute:
      
        $ tools/perf/trace/beauty/fsmount.sh > before
        $ cp include/uapi/linux/mount.h tools/include/uapi/linux/mount.h
        $ tools/perf/trace/beauty/fsmount.sh > after
        $ diff -u before after
        --- before	2021-07-01 13:34:04.542517355 -0300
        +++ after	2021-07-01 13:34:12.423694537 -0300
        @@ -7,4 +7,5 @@
         	[ilog2(0x00000020) + 1] = "STRICTATIME",
         	[ilog2(0x00000080) + 1] = "NODIRATIME",
         	[ilog2(0x00100000) + 1] = "IDMAP",
        +	[ilog2(0x00200000) + 1] = "NOSYMFOLLOW",
         };
        $
      
      So now one can use it in --filter expressions for tracepoints.
      
      This silences this perf build warning:
      
        Warning: Kernel ABI header at 'tools/include/uapi/linux/mount.h' differs from latest version at 'include/uapi/linux/mount.h'
        diff -u tools/include/uapi/linux/mount.h include/uapi/linux/mount.h
      
      Cc: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
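The table in the diff above is indexed by ilog2(flag) + 1. A minimal sketch of how such a table can turn a flag mask into names follows. mount_attr_scnprintf() is a hypothetical helper written for this note, not the perf trace implementation, and only the four entries visible in the diff are included.

```c
#include <stddef.h>
#include <stdio.h>

/* Index is ilog2(flag) + 1, mirroring the generated table in the diff */
static const char *const mount_attr_names[] = {
	[5 + 1]  = "STRICTATIME",	/* ilog2(0x00000020) = 5 */
	[7 + 1]  = "NODIRATIME",	/* ilog2(0x00000080) = 7 */
	[20 + 1] = "IDMAP",		/* ilog2(0x00100000) = 20 */
	[21 + 1] = "NOSYMFOLLOW",	/* ilog2(0x00200000) = 21 */
};

/* Hypothetical helper: write "A|B" for every known bit set in mask */
size_t mount_attr_scnprintf(unsigned long mask, char *buf, size_t size)
{
	size_t n = sizeof(mount_attr_names) / sizeof(mount_attr_names[0]);
	size_t used = 0;

	if (size == 0)
		return 0;
	buf[0] = '\0';

	for (size_t bit = 0; bit + 1 < n; bit++) {
		const char *name = mount_attr_names[bit + 1];
		int w;

		if (!(mask & (1UL << bit)) || name == NULL)
			continue;

		w = snprintf(buf + used, size - used, "%s%s",
			     used ? "|" : "", name);
		if (w < 0 || (size_t)w >= size - used) {
			buf[used] = '\0';	/* drop the truncated name */
			break;
		}
		used += (size_t)w;
	}
	return used;
}
```

Bits without a table entry are skipped silently in this sketch.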
    • tools arch x86: Sync the msr-index.h copy with the kernel sources · 04df0dc1
      Arnaldo Carvalho de Melo authored
      To pick up the changes from this cset:

        1348924b ("x86/msr: Define new bits in TSX_FORCE_ABORT MSR")

      That causes no changes to tooling:
      
        $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
        $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
        $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
        $ diff -u before after
        $
      
      Just silences this perf build warning:
      
        Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
        diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
      
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf arm-spe: Don't wait for PERF_RECORD_EXIT event · 8941ba50
      Leo Yan authored
      When decoding Arm SPE trace, perf waits for the PERF_RECORD_EXIT event
      (the last perf event) before processing trace data. This is needless and
      might even cause logic errors, e.g. it might fail to correlate perf
      events with Arm SPE events correctly.

      So this patch removes the condition check for the PERF_RECORD_EXIT event.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: James Clark <james.clark@arm.com>
      Tested-by: James Clark <james.clark@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Al Grant <Al.Grant@arm.com>
      Cc: Dave Martin <Dave.Martin@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20210519071939.1598923-6-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf arm-spe: Bail out if the trace is later than perf event · afb5e9e4
      Leo Yan authored
      A record in the Arm SPE trace can be later than a perf event, and vice
      versa.  The perf events and the Arm SPE synthesized events therefore
      need to be correlated so that they are processed in the correct time
      order.

      To achieve the time ordering, this patch reverses the flow: it first
      calls arm_spe_sample() and then calls arm_spe_decode().  When the
      timestamp comparison shows that the perf event comes earlier than the
      Arm SPE trace data, it bails out of the decoding loop; the last record
      is pushed into the auxtrace stack and its sample is generated later.  To
      track the timestamp, it is updated every time for the latest record.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: James Clark <james.clark@arm.com>
      Tested-by: James Clark <james.clark@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Al Grant <Al.Grant@arm.com>
      Cc: Dave Martin <Dave.Martin@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20210519071939.1598923-5-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf arm-spe: Assign kernel time to synthesized event · 85498f75
      Leo Yan authored
      The current code assigns the arch timer counter to the samples
      synthesized from Arm SPE trace, so the samples contain only the raw
      counter value rather than the kernel time.

      To fix the issue, this patch converts the timer counter to kernel time
      and assigns it to the sample timestamp.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: James Clark <james.clark@arm.com>
      Tested-by: James Clark <james.clark@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Al Grant <Al.Grant@arm.com>
      Cc: Dave Martin <Dave.Martin@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20210519071939.1598923-4-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf arm-spe: Convert event kernel time to counter value · 63051901
      Leo Yan authored
      When handling a perf event, the Arm SPE decoder needs to decide whether
      this perf event is earlier or later than the samples from the Arm SPE
      trace data; to do the comparison, it needs to use the same unit for the
      time.

      This patch converts the event kernel time to the arch timer's counter
      value, so that it can be compared with the counter value contained in
      the Arm SPE Timestamp packet.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: James Clark <james.clark@arm.com>
      Tested-by: James Clark <james.clark@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Al Grant <Al.Grant@arm.com>
      Cc: Dave Martin <Dave.Martin@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20210519071939.1598923-3-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
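The conversion between kernel time and the arch timer counter mentioned above can be sketched as below. This is a self-contained illustration, not the perf code: struct tsc_conv and both function names are stand-ins, and the arithmetic assumes the time_zero/time_mult/time_shift scheme documented for perf's user-space clock parameters, ignoring any additional fields a real conversion may need.

```c
#include <stdint.h>

/* Stand-in for the clock parameters carried by the TIME_CONV event */
struct tsc_conv {
	uint16_t time_shift;
	uint32_t time_mult;
	uint64_t time_zero;
};

/* Counter value -> kernel time in nanoseconds */
uint64_t counter_to_time(uint64_t cyc, const struct tsc_conv *tc)
{
	uint64_t quot = cyc >> tc->time_shift;
	uint64_t rem  = cyc & (((uint64_t)1 << tc->time_shift) - 1);

	return tc->time_zero + quot * tc->time_mult +
	       ((rem * tc->time_mult) >> tc->time_shift);
}

/* Kernel time in nanoseconds -> counter value (inverse of the above) */
uint64_t time_to_counter(uint64_t ns, const struct tsc_conv *tc)
{
	uint64_t t    = ns - tc->time_zero;
	uint64_t quot = t / tc->time_mult;
	uint64_t rem  = t % tc->time_mult;

	return (quot << tc->time_shift) +
	       (rem << tc->time_shift) / tc->time_mult;
}
```

Splitting the value into quotient and remainder before multiplying keeps the intermediate products from overflowing 64 bits for realistic parameters.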
    • perf arm-spe: Save clock parameters from TIME_CONV event · c210c306
      Leo Yan authored
      During the recording phase, the "perf record" tool synthesizes the
      PERF_RECORD_TIME_CONV event for the hardware clock parameters and saves
      the event into the data file.

      Afterwards, when processing the data file, the TIME_CONV event is
      processed very early and is stored in the session context.

      This patch extracts these parameters from the session context and saves
      them into the structure "spe->tc" of type perf_tsc_conversion, so that
      the parameters are ready for conversion between clock counter and
      timestamp.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: James Clark <james.clark@arm.com>
      Tested-by: James Clark <james.clark@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Al Grant <Al.Grant@arm.com>
      Cc: Dave Martin <Dave.Martin@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20210519071939.1598923-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cs-etm: Remove callback cs_etm_find_snapshot() · 2f01c200
      Leo Yan authored
      The callback cs_etm_find_snapshot() is invoked in snapshot mode.  Its
      main purpose is to find the correct AUX trace data and return "head"
      and "old" (we can call "old" the "old head") to the caller; the caller
      __auxtrace_mmap__read() uses these two pointers to decide the AUX trace
      data size.

      This patch removes cs_etm_find_snapshot() for the reasons below:

      - The first thing cs_etm_find_snapshot() does is check whether the head
        has wrapped around; if it has not, it bails out directly.  The check
        is pointless because the "head" and "old" pointers are both
        monotonically increasing, so they never wrap around.

      - cs_etm_find_snapshot() adjusts the "head" and "old" pointers and
        assumes the AUX ring buffer is fully filled with hardware trace
        data, so it always subtracts the difference "mm->len" from "head" to
        get "old".  If the snapshot is taken after a very short interval, the
        tracers fill only a small chunk of the AUX ring buffer; in this case,
        it is wrong to copy the whole AUX ring buffer to the perf file.

      - As the "head" and "old" pointers are monotonically increasing, the
        function __auxtrace_mmap__read() handles these two pointers properly.
        It calculates the remainders for these two pointers, and the size is
        clamped to be never more than "snapshot_size".  We can simply rely on
        the function __auxtrace_mmap__read() to calculate the correct result
        for data copying; it is not necessary to add an Arm CoreSight
        specific callback.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: James Clark <james.clark@arm.com>
      Tested-by: James Clark <james.clark@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Daniel Kiss <daniel.kiss@arm.com>
      Cc: Denis Nikitin <denik@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: coresight@lists.linaro.org
      Link: http://lore.kernel.org/lkml/20210701093537.90759-3-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Namhyung Kim's avatar
      perf bpf_counter: Move common functions to bpf_counter.h · d6a735ef
      Namhyung Kim authored
      Some helper functions will be used for cgroup counting too.  Move them
      to a header file for sharing.
      
      Committer notes:
      
      Fix the build on older systems with:
      
        -       struct bpf_map_info map_info = {0};
        +       struct bpf_map_info map_info = { .id = 0, };
      
      This wasn't breaking the build on such systems as bpf_counter.c isn't
      built due to:
      
      tools/perf/util/Build:
      
        perf-$(CONFIG_PERF_BPF_SKEL) += bpf_counter.o
      
      The bpf_counter.h file on the other hand is included from places that
      are built everywhere.
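
For illustration, the designated-initializer form still zeroes the whole structure, because C zero-initializes every member that is not named in the initializer. The struct below is a hypothetical stand-in with made-up fields, not the real struct bpf_map_info.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct bpf_map_info; the fields are illustrative. */
struct map_info_example {
    uint32_t type;
    uint32_t id;
    uint32_t key_size;
    uint32_t value_size;
};

/* Returns a structure initialized the same way as in the committer note. */
struct map_info_example make_zeroed_map_info(void)
{
    /* Only .id is named; all remaining members are implicitly set to zero. */
    struct map_info_example info = { .id = 0, };

    return info;
}
```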
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Song Liu <songliubraving@fb.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20210625071826.608504-4-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d6a735ef
    • Namhyung Kim's avatar
      perf tools: Add cgroup_is_v2() helper · 21bcc726
      Namhyung Kim authored
      The cgroup_is_v2() helper checks whether the given subsystem is
      mounted on cgroup v2.  It'll be used by BPF cgroup code later.
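
One way such a check could be written is with statfs(2), comparing the filesystem type against CGROUP2_SUPER_MAGIC. This is a Linux-only sketch under that assumption: the helper name path_is_cgroup_v2() is hypothetical, it takes a mount point path rather than a subsystem name, and it is not claimed to match the code in this commit.

```c
#include <stdbool.h>
#include <sys/vfs.h>

/* Magic number of the cgroup v2 filesystem, in case the headers lack it. */
#ifndef CGROUP2_SUPER_MAGIC
#define CGROUP2_SUPER_MAGIC 0x63677270
#endif

/*
 * Hypothetical helper: returns true if the filesystem mounted at
 * "mountpoint" is cgroup v2, false otherwise or if statfs() fails.
 */
bool path_is_cgroup_v2(const char *mountpoint)
{
    struct statfs st;

    if (statfs(mountpoint, &st) < 0)
        return false;

    return (unsigned long)st.f_type == CGROUP2_SUPER_MAGIC;
}
```

On a system using the unified hierarchy, calling it with "/sys/fs/cgroup" would be expected to return true; a path that does not exist always yields false.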
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20210625071826.608504-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      21bcc726
    • Namhyung Kim's avatar
      perf tools: Add read_cgroup_id() function · 69e874db
      Namhyung Kim authored
      The read_cgroup_id() function reads a cgroup id from a file handle
      using name_to_handle_at(2) for the given cgroup.  It'll be used by
      bperf cgroup stat later.
      
      Committer notes:
      
        -int read_cgroup_id(struct cgroup *cgrp)
        +static inline int read_cgroup_id(struct cgroup *cgrp __maybe_unused)
      
      To fix the build when HAVE_FILE_HANDLE is not defined.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20210625071826.608504-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      69e874db
  2. 30 Jun, 2021 8 commits
    • Alexey Bayduraev's avatar
      tools lib: Adopt bitmap_intersects() operation from the kernel sources · f20510d5
      Alexey Bayduraev authored
      Adopt the bitmap_intersects() routine that tests whether bitmaps bitmap1
      and bitmap2 intersect. This routine will be used during thread mask
      initialization.
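
The intersection test can be sketched in plain C as below. This is an illustrative userspace version, not the adopted kernel routine: the name bitmaps_intersect() and the EX_BITS_PER_LONG macro are hypothetical, and the bitmaps are assumed to be arrays of unsigned long with bit 0 in the least significant position of the first word.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define EX_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper: returns true if any of the first nbits bits is set
 * in both bitmaps a and b.
 */
bool bitmaps_intersect(const unsigned long *a, const unsigned long *b,
                       size_t nbits)
{
    size_t full_words = nbits / EX_BITS_PER_LONG;
    size_t tail_bits = nbits % EX_BITS_PER_LONG;

    /* Whole words: any common bit means the bitmaps intersect. */
    for (size_t k = 0; k < full_words; k++) {
        if (a[k] & b[k])
            return true;
    }

    /* Partial last word: ignore bits beyond nbits. */
    if (tail_bits) {
        unsigned long mask = (1UL << tail_bits) - 1;

        if (a[full_words] & b[full_words] & mask)
            return true;
    }

    return false;
}
```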
      Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Acked-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Namhyung Kim <namhyung@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Budankov <abudankov@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Link: http://lore.kernel.org/lkml/f75aa738d8ff8f9cffd7532d671f3ef3deb97a7c.1625065643.git.alexey.v.bayduraev@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f20510d5
    • Arnaldo Carvalho de Melo's avatar
      857286e4
    • Linus Torvalds's avatar
      Merge tag 'dlm-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm · 007b350a
      Linus Torvalds authored
      Pull dlm updates from David Teigland:
       "This is a major dlm networking enhancement that adds message
        retransmission so that the dlm can reliably continue operating when
        network connections fail and nodes reconnect.
      
        Previously, this would result in lost messages which could only be
        handled as a node failure"
      
      * tag 'dlm-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: (26 commits)
        fs: dlm: invalid buffer access in lookup error
        fs: dlm: fix race in mhandle deletion
        fs: dlm: rename socket and app buffer defines
        fs: dlm: introduce proto values
        fs: dlm: move dlm allow conn
        fs: dlm: use alloc_ordered_workqueue
        fs: dlm: fix memory leak when fenced
        fs: dlm: fix lowcomms_start error case
        fs: dlm: Fix spelling mistake "stucked" -> "stuck"
        fs: dlm: Fix memory leak of object mh
        fs: dlm: don't allow half transmitted messages
        fs: dlm: add midcomms debugfs functionality
        fs: dlm: add reliable connection if reconnect
        fs: dlm: add union in dlm header for lockspace id
        fs: dlm: move out some hash functionality
        fs: dlm: add functionality to re-transmit a message
        fs: dlm: make buffer handling per msg
        fs: dlm: add more midcomms hooks
        fs: dlm: public header in out utility
        fs: dlm: fix connection tcp EOF handling
        ...
      007b350a
    • Linus Torvalds's avatar
      Merge tag 'gfs2-v5.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 · 8418dabd
      Linus Torvalds authored
      Pull gfs2 updates from Andreas Gruenbacher:
       "Various minor gfs2 cleanups and fixes"
      
      * tag 'gfs2-v5.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
        gfs2: Clean up gfs2_unstuff_dinode
        gfs2: Unstuff before locking page in gfs2_page_mkwrite
        gfs2: Clean up the error handling in gfs2_page_mkwrite
        gfs2: Fix error handling in init_statfs
        gfs2: Fix underflow in gfs2_page_mkwrite
        gfs2: Use list_move_tail instead of list_del/list_add_tail
        gfs2: Fix do_gfs2_set_flags description
      8418dabd
    • Linus Torvalds's avatar
      Merge tag '5.14-rc-smb3-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6 · bbd91626
      Linus Torvalds authored
      Pull cifs updates from Steve French:
      
       - improve fallocate emulation
      
       - DFS fixes
      
       - minor multichannel fixes
      
       - various cleanup patches, many to address Coverity warnings
      
      * tag '5.14-rc-smb3-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6: (38 commits)
        smb3: prevent races updating CurrentMid
        cifs: fix missing spinlock around update to ses->status
        cifs: missing null pointer check in cifs_mount
        smb3: fix possible access to uninitialized pointer to DACL
        cifs: missing null check for newinode pointer
        cifs: remove two cases where rc is set unnecessarily in sid_to_id
        SMB3: Add new info level for query directory
        cifs: fix NULL dereference in smb2_check_message()
        smbdirect: missing rc checks while waiting for rdma events
        cifs: Avoid field over-reading memcpy()
        smb311: remove dead code for non compounded posix query info
        cifs: fix SMB1 error path in cifs_get_file_info_unix
        smb3: fix uninitialized value for port in witness protocol move
        cifs: fix unneeded null check
        cifs: use SPDX-Licence-Identifier
        cifs: convert list_for_each to entry variant in cifs_debug.c
        cifs: convert list_for_each to entry variant in smb2misc.c
        cifs: avoid extra calls in posix_info_parse
        cifs: retry lookup and readdir when EAGAIN is returned.
        cifs: fix check of dfs interlinks
        ...
      bbd91626
    • Linus Torvalds's avatar
      Merge tag 'fs.openat2.unknown_flags.v5.14' of... · b97902b6
      Linus Torvalds authored
      Merge tag 'fs.openat2.unknown_flags.v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
      
      Pull openat2 fixes from Christian Brauner:
      
       - Remove the unused VALID_UPGRADE_FLAGS define we carried from an
         extension to openat2() that we haven't merged. Aleksa might be
         getting back to it at some point but just not right now.
      
       - openat2() used to accidentally ignore unknown flag values in the upper
         32 bits.
      
         The new openat2() syscall verifies that no unknown O-flag values are
         set and returns an error to userspace if they are while the older
         open syscalls like open() and openat() simply ignore unknown flag
         values:
      
            #define O_FLAG_CURRENTLY_INVALID (1 << 31)
            struct open_how how = {
                  .flags = O_RDONLY | O_FLAG_CURRENTLY_INVALID,
                  .resolve = 0,
            };
      
            /* fails */
            fd = openat2(-EBADF, "/dev/null", &how, sizeof(how));
      
            /* succeeds */
            fd = openat(-EBADF, "/dev/null", O_RDONLY | O_FLAG_CURRENTLY_INVALID);
      
         However, openat2() silently truncates the upper 32 bits meaning:
      
            #define O_FLAG_CURRENTLY_INVALID_LOWER32 (1 << 31)
            #define O_FLAG_CURRENTLY_INVALID_UPPER32 (1ULL << 40)
      
            struct open_how how_lower32 = {
                  .flags = O_RDONLY | O_FLAG_CURRENTLY_INVALID_LOWER32,
            };
      
            struct open_how how_upper32 = {
                  .flags = O_RDONLY | O_FLAG_CURRENTLY_INVALID_UPPER32,
            };
      
            /* fails */
            fd = openat2(-EBADF, "/dev/null", &how_lower32, sizeof(how_lower32));
      
            /* succeeds */
            fd = openat2(-EBADF, "/dev/null", &how_upper32, sizeof(how_upper32));
      
         Fix this by preventing the immediate truncation in build_open_flags()
         and add a compile-time check to catch when we add flags in the upper
         32 bit range.
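
The idea of validating the full 64-bit value before truncating, together with a compile-time guard, can be sketched in userspace C as follows. This is only an illustration of the technique: EX_VALID_FLAGS and ex_check_flags() are hypothetical names and values, not the kernel's VALID_OPEN_FLAGS or build_open_flags().

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical set of accepted flag bits; the value is made up. */
#define EX_VALID_FLAGS UINT64_C(0x0000000000ffffff)

/* Compile-time guard: fails the build if a valid flag lands above bit 31. */
_Static_assert((EX_VALID_FLAGS >> 32) == 0,
               "valid flags must fit in the lower 32 bits");

/*
 * Check the full 64-bit value first, and only then narrow it to int.
 * Returns 0 and stores the narrowed flags in *out, or -EINVAL if any
 * unknown bit (including one in the upper 32 bits) is set.
 */
int ex_check_flags(uint64_t flags, int *out)
{
    if (flags & ~EX_VALID_FLAGS)
        return -EINVAL;

    *out = (int)flags;
    return 0;
}
```

Narrowing before the check would drop bit 40 and let the call succeed, which is the behaviour the fix removes.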
      
      * tag 'fs.openat2.unknown_flags.v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        test: add openat2() test for invalid upper 32 bit flag value
        open: don't silently ignore unknown O-flags in openat2()
        fcntl: remove unused VALID_UPGRADE_FLAGS
      b97902b6
    • Linus Torvalds's avatar
      Merge tag 'fs.mount_setattr.nosymfollow.v5.14' of... · 30d1a556
      Linus Torvalds authored
      Merge tag 'fs.mount_setattr.nosymfollow.v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
      
      Pull mount_setattr updates from Christian Brauner:
       "A few releases ago the old mount API gained support for a mount
        option which prevents following symlinks on a given mount. This adds
        support for it in the new mount api through the MOUNT_ATTR_NOSYMFOLLOW
        flag via mount_setattr() and fsmount(). With mount_setattr() that flag
        can even be applied recursively.
      
        There's an additional ack from Ross Zwisler who originally authored
        the nosymfollow patch. As I've already had the patches in my for-next
        I didn't add his ack explicitly"
      
      * tag 'fs.mount_setattr.nosymfollow.v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        tests: test MOUNT_ATTR_NOSYMFOLLOW with mount_setattr()
        mount: Support "nosymfollow" in new mount api
      30d1a556
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · 65090f30
      Linus Torvalds authored
      Merge misc updates from Andrew Morton:
       "191 patches.
      
        Subsystems affected by this patch series: kthread, ia64, scripts,
        ntfs, squashfs, ocfs2, kernel/watchdog, and mm (gup, pagealloc, slab,
        slub, kmemleak, dax, debug, pagecache, gup, swap, memcg, pagemap,
        mprotect, bootmem, dma, tracing, vmalloc, kasan, initialization,
        pagealloc, and memory-failure)"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (191 commits)
        mm,hwpoison: make get_hwpoison_page() call get_any_page()
        mm,hwpoison: send SIGBUS with error virutal address
        mm/page_alloc: split pcp->high across all online CPUs for cpuless nodes
        mm/page_alloc: allow high-order pages to be stored on the per-cpu lists
        mm: replace CONFIG_FLAT_NODE_MEM_MAP with CONFIG_FLATMEM
        mm: replace CONFIG_NEED_MULTIPLE_NODES with CONFIG_NUMA
        docs: remove description of DISCONTIGMEM
        arch, mm: remove stale mentions of DISCONIGMEM
        mm: remove CONFIG_DISCONTIGMEM
        m68k: remove support for DISCONTIGMEM
        arc: remove support for DISCONTIGMEM
        arc: update comment about HIGHMEM implementation
        alpha: remove DISCONTIGMEM and NUMA
        mm/page_alloc: move free_the_page
        mm/page_alloc: fix counting of managed_pages
        mm/page_alloc: improve memmap_pages dbg msg
        mm: drop SECTION_SHIFT in code comments
        mm/page_alloc: introduce vm.percpu_pagelist_high_fraction
        mm/page_alloc: limit the number of pages on PCP lists when reclaim is active
        mm/page_alloc: scale the number of pages that are batch freed
        ...
      65090f30
  3. 29 Jun, 2021 9 commits
    • Linus Torvalds's avatar
      Merge tag 'devprop-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 349a2d52
      Linus Torvalds authored
      Pull device properties framework updates from Rafael Wysocki:
       "These unify device properties access in some pieces of code and make
        related changes.
      
        Specifics:
      
         - Handle device properties with software node API in the ACPI IORT
           table parsing code (Heikki Krogerus).
      
         - Unify of_node access in the common device properties code, constify
           the acpi_dma_supported() argument pointer and fix up CONFIG_ACPI=n
           stubs of some functions related to device properties (Andy
           Shevchenko)"
      
      * tag 'devprop-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        device property: Unify access to of_node
        ACPI: scan: Constify acpi_dma_supported() helper function
        ACPI: property: Constify stubs for CONFIG_ACPI=n case
        ACPI: IORT: Handle device properties with software node API
        device property: Retrieve fwnode from of_node via accessor
      349a2d52
    • Linus Torvalds's avatar
      Merge tag 'pnp-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 72ad9f9d
      Linus Torvalds authored
      Pull PNP updates from Rafael Wysocki:
       "These get rid of unnecessary local variables and function, reduce code
        duplication and clean up message printing.
      
        Specifics:
      
         - Remove unnecessary local variables from isapnp_proc_attach_device()
           (Anupama K Patil).
      
         - Make the callers of pnp_alloc() use kzalloc() directly and drop the
           former (Heiner Kallweit).
      
         - Make two pieces of code use dev_dbg() instead of dev_printk() with
           the KERN_DEBUG message level (Heiner Kallweit).
      
         - Use DEVICE_ATTR_RO() instead of full DEVICE_ATTR() in some places
           in card.c (Zhen Lei).
      
         - Use list_for_each_entry() instead of list_for_each() in
           insert_device() (Zou Wei)"
      
      * tag 'pnp-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        PNP: pnpbios: Use list_for_each_entry() instead of list_for_each()
        PNP: use DEVICE_ATTR_RO macro
        PNP: Switch over to dev_dbg()
        PNP: Remove pnp_alloc()
        drivers: pnp: isapnp: proc.c: Remove unnecessary local variables
      72ad9f9d
    • Linus Torvalds's avatar
      Merge tag 'acpi-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 5e692824
      Linus Torvalds authored
      Pull ACPI updates from Rafael Wysocki:
       "These update the ACPICA code in the kernel to the 20210604 upstream
        revision, add preliminary support for the Platform Runtime Mechanism
        (PRM), address issues related to the handling of device dependencies
        in the ACPI device enumeration code, improve the tracking of ACPI
        power resource states, improve the ACPI support for suspend-to-idle on
        AMD systems, continue the unification of message printing in the ACPI
        code, address assorted issues and clean up the code in a number of
        places.
      
        Specifics:
      
         - Update ACPICA code in the kernel to upstream revision 20210604
           including the following changes:
      
            - Add defines for the CXL Host Bridge Structure and add the
              CFMWS structure definition to CEDT (Alison Schofield).
            - iASL: Finish support for the IVRS ACPI table (Bob Moore).
            - iASL: Add support for the SVKL table (Bob Moore).
            - iASL: Add full support for RGRT ACPI table (Bob Moore).
            - iASL: Add support for the BDAT ACPI table (Bob Moore).
            - iASL: add disassembler support for PRMT (Erik Kaneda).
            - Fix memory leak caused by _CID repair function (Erik Kaneda).
            - Add support for PlatformRtMechanism OpRegion (Erik Kaneda).
            - Add PRMT module header to facilitate parsing (Erik Kaneda).
            - Add _PLD panel positions (Fabian Wüthrich).
            - MADT: add Multiprocessor Wakeup Mailbox Structure and the SVKL
              table headers (Kuppuswamy Sathyanarayanan).
            - Use ACPI_FALLTHROUGH (Wei Ming Chen).
      
         - Add preliminary support for the Platform Runtime Mechanism (PRM) to
           allow the AML interpreter to call PRM functions (Erik Kaneda).
      
         - Address some issues related to the handling of device dependencies
           reported by _DEP in the ACPI device enumeration code and clean up
           some related pieces of it (Rafael Wysocki).
      
         - Improve the tracking of states of ACPI power resources (Rafael
           Wysocki).
      
         - Improve ACPI support for suspend-to-idle on AMD systems (Alex
           Deucher, Mario Limonciello, Pratik Vishwakarma).
      
         - Continue the unification and cleanup of message printing in the
           ACPI code (Hanjun Guo, Heiner Kallweit).
      
         - Fix possible buffer overrun issue with the description_show() sysfs
           attribute method (Krzysztof Wilczyński).
      
         - Improve the acpi_mask_gpe kernel command line parameter handling
           and clean up the core ACPI code related to sysfs (Andy Shevchenko,
           Baokun Li, Clayton Casciato).
      
         - Postpone bringing devices in the general ACPI PM domain to D0
           during resume from system-wide suspend until they are really needed
           (Dmitry Torokhov).
      
         - Make the ACPI processor driver fix up C-state latency if not
           ordered (Mario Limonciello).
      
         - Add support for identifying devices depending on the given one that
           are not its direct descendants with the help of _DEP (Daniel
           Scally).
      
         - Extend the checks related to ACPI IRQ overrides on x86 in order to
           avoid false-positives (Hui Wang).
      
         - Add battery DPTF participant for Intel SoCs (Sumeet Pawnikar).
      
         - Rearrange the ACPI fan driver and device power management code to
           use a common list of device IDs (Rafael Wysocki).
      
         - Fix clang CFI violation in the ACPI BGRT table parsing code and
           clean it up (Nathan Chancellor).
      
         - Add GPE-related quirks for some laptops to the EC driver (Chris
           Chiu, Zhang Rui).
      
         - Make the ACPI PPTT table parsing code populate the cache-id value
           if present in the firmware (James Morse).
      
         - Remove redundant clearing of context->ret.pointer from
           acpi_run_osc() (Hans de Goede).
      
         - Add missing acpi_put_table() in acpi_init_fpdt() (Jing Xiangfeng).
      
         - Make ACPI APEI handle ARM Processor Error CPER records like Memory
           Error ones to avoid user space task lockups (Xiaofei Tan).
      
         - Stop warning about disabled ACPI in APEI (Jon Hunter).
      
         - Fix fall-through warning for Clang in the SBSHC driver (Gustavo A.
           R. Silva).
      
         - Add custom DSDT file as Makefile prerequisite (Richard Fitzgerald).
      
         - Initialize local variable to avoid garbage being returned (Colin
           Ian King).
      
         - Simplify assorted pieces of code, address assorted coding style and
           documentation issues and comment typos (Baokun Li, Christophe
           JAILLET, Clayton Casciato, Liu Shixin, Shaokun Zhang, Wei Yongjun,
           Yang Li, Zhen Lei)"
      
      * tag 'acpi-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (97 commits)
        ACPI: PM: postpone bringing devices to D0 unless we need them
        ACPI: tables: Add custom DSDT file as makefile prerequisite
        ACPI: bgrt: Use sysfs_emit
        ACPI: bgrt: Fix CFI violation
        ACPI: EC: trust DSDT GPE for certain HP laptop
        ACPI: scan: Simplify acpi_table_events_fn()
        ACPI: PM: Adjust behavior for field problems on AMD systems
        ACPI: PM: s2idle: Add support for new Microsoft UUID
        ACPI: PM: s2idle: Add support for multiple func mask
        ACPI: PM: s2idle: Refactor common code
        ACPI: PM: s2idle: Use correct revision id
        ACPI: sysfs: Remove tailing return statement in void function
        ACPI: sysfs: Use __ATTR_RO() and __ATTR_RW() macros
        ACPI: sysfs: Sort headers alphabetically
        ACPI: sysfs: Refactor param_get_trace_state() to drop dead code
        ACPI: sysfs: Unify pattern of memory allocations
        ACPI: sysfs: Allow bitmap list to be supplied to acpi_mask_gpe
        ACPI: sysfs: Make sparse happy about address space in use
        ACPI: scan: Fix race related to dropping dependencies
        ACPI: scan: Reorganize acpi_device_add()
        ...
      5e692824
    • Linus Torvalds's avatar
      Merge tag 'pm-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 3563f55c
      Linus Torvalds authored
      Pull power management updates from Rafael Wysocki:
       "These add hybrid processors support to the intel_pstate driver and
        make it work with more processor models when HWP is disabled, make the
        intel_idle driver use special C6 idle state parameters when package
        C-states are disabled, add cooling support to the tegra30 devfreq
        driver, rework the TEO (timer events oriented) cpuidle governor,
        extend the OPP (operating performance points) framework to use the
        required-opps DT property in more cases, fix some issues and clean up
        a number of assorted pieces of code.
      
        Specifics:
      
         - Make intel_pstate support hybrid processors using abstract
           performance units in the HWP interface (Rafael Wysocki).
      
         - Add Icelake servers and Cometlake support in no-HWP mode to
           intel_pstate (Giovanni Gherdovich).
      
         - Make cpufreq_online() error path be consistent with the CPU device
           removal path in cpufreq (Rafael Wysocki).
      
         - Clean up 3 cpufreq drivers and the statistics code (Hailong Liu,
           Randy Dunlap, Shaokun Zhang).
      
         - Make intel_idle use special idle state parameters for C6 when
           package C-states are disabled (Chen Yu).
      
         - Rework the TEO (timer events oriented) cpuidle governor to address
           some theoretical shortcomings in it (Rafael Wysocki).
      
         - Drop unneeded semicolon from the TEO governor (Wan Jiabing).
      
         - Modify the runtime PM framework to accept unassigned suspend and
           resume callback pointers (Ulf Hansson).
      
         - Improve pm_runtime_get_sync() documentation (Krzysztof Kozlowski).
      
         - Improve device performance states support in the generic power
           domains (genpd) framework (Ulf Hansson).
      
         - Fix some documentation issues in genpd (Yang Yingliang).
      
         - Make the operating performance points (OPP) framework use the
           required-opps DT property in use cases that are not related to
           genpd (Hsin-Yi Wang).
      
         - Make lazy_link_required_opp_table() use list_del_init instead of
           list_del/INIT_LIST_HEAD (Yang Yingliang).
      
         - Simplify wake IRQs handling in the core system-wide sleep support
           code and clean up some coding style inconsistencies in it (Tian
           Tao, Zhen Lei).
      
         - Add cooling support to the tegra30 devfreq driver and improve its
           DT bindings (Dmitry Osipenko).
      
         - Fix some assorted issues in the devfreq core and drivers (Chanwoo
           Choi, Dong Aisheng, YueHaibing)"
      
      * tag 'pm-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (39 commits)
        PM / devfreq: passive: Fix get_target_freq when not using required-opp
        cpufreq: Make cpufreq_online() call driver->offline() on errors
        opp: Allow required-opps to be used for non genpd use cases
        cpuidle: teo: remove unneeded semicolon in teo_select()
        dt-bindings: devfreq: tegra30-actmon: Add cooling-cells
        dt-bindings: devfreq: tegra30-actmon: Convert to schema
        PM / devfreq: userspace: Use DEVICE_ATTR_RW macro
        PM: runtime: Clarify documentation when callbacks are unassigned
        PM: runtime: Allow unassigned ->runtime_suspend|resume callbacks
        PM: runtime: Improve path in rpm_idle() when no callback
        PM: hibernate: remove leading spaces before tabs
        PM: sleep: remove trailing spaces and tabs
        PM: domains: Drop/restore performance state votes for devices at runtime PM
        PM: domains: Return early if perf state is already set for the device
        PM: domains: Split code in dev_pm_genpd_set_performance_state()
        cpuidle: teo: Use kerneldoc documentation in admin-guide
        cpuidle: teo: Rework most recent idle duration values treatment
        cpuidle: teo: Change the main idle state selection logic
        cpuidle: teo: Cosmetic modification of teo_select()
        cpuidle: teo: Cosmetic modifications of teo_update()
        ...
      3563f55c
    • Linus Torvalds's avatar
      Merge tag 'x86-entry-2021-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1dfb0f47
      Linus Torvalds authored
      Pull x86 entry code related updates from Thomas Gleixner:
      
       - Consolidate the macros for .byte ... opcode sequences
      
       - Deduplicate register offset defines in include files
      
       - Simplify the ia32,x32 compat handling of the related syscall tables
         to get rid of #ifdeffery.
      
       - Clear all EFLAGS which are not required for syscall handling
      
       - Consolidate the syscall tables and switch the generation over to the
         generic shell script and remove the CFLAGS tweaks which are no
         longer required.
      
       - Use 'int' type for system call numbers to match the generic code.
      
       - Add more selftests for syscalls
      
      * tag 'x86-entry-2021-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/syscalls: Don't adjust CFLAGS for syscall tables
        x86/syscalls: Remove -Wno-override-init for syscall tables
        x86/uml/syscalls: Remove array index from syscall initializers
        x86/syscalls: Clear 'offset' and 'prefix' in case they are set in env
        x86/entry: Use int everywhere for system call numbers
        x86/entry: Treat out of range and gap system calls the same
        x86/entry/64: Sign-extend system calls on entry to int
        selftests/x86/syscall: Add tests under ptrace to syscall_numbering_64
        selftests/x86/syscall: Simplify message reporting in syscall_numbering
        selftests/x86/syscall: Update and extend syscall_numbering_64
        x86/syscalls: Switch to generic syscallhdr.sh
        x86/syscalls: Use __NR_syscalls instead of __NR_syscall_max
        x86/unistd: Define X32_NR_syscalls only for 64-bit kernel
        x86/syscalls: Stop filling syscall arrays with *_sys_ni_syscall
        x86/syscalls: Switch to generic syscalltbl.sh
        x86/entry/x32: Rename __x32_compat_sys_* to __x64_compat_sys_*
      1dfb0f47
    • Linus Torvalds's avatar
      Merge tag 'x86-irq-2021-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a22c3f61
      Linus Torvalds authored
      Pull x86 interrupt related updates from Thomas Gleixner:
      
       - Consolidate the VECTOR defines and the usage sites.
      
       - Clean up GDT/IDT related code and replace open coded ASM with proper
         native helper functions.
      
      * tag 'x86-irq-2021-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/kexec: Set_[gi]dt() -> native_[gi]dt_invalidate() in machine_kexec_*.c
        x86: Add native_[ig]dt_invalidate()
        x86/idt: Remove address argument from idt_invalidate()
        x86/irq: Add and use NR_EXTERNAL_VECTORS and NR_SYSTEM_VECTORS
        x86/irq: Remove unused vectors defines
      a22c3f61
    • Linus Torvalds's avatar
      Merge tag 'timers-core-2021-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a941a034
      Linus Torvalds authored
      Pull timer updates from Thomas Gleixner:
       "Time and clocksource/clockevent related updates:
      
        Core changes:
      
         - Infrastructure to support per CPU "broadcast" devices for per CPU
           clockevent devices which stop in deep idle states. This allows us
           to utilize the more efficient architected timer on certain ARM SoCs
           for normal operation instead of permanently using the slow to
           access SoC specific clockevent device.
      
         - Print the name of the broadcast/wakeup device in /proc/timer_list
      
         - Make the clocksource watchdog more robust against delays between
           reading the current active clocksource and the watchdog
           clocksource. Such delays can be caused by NMIs, SMIs and vCPU
           preemption.
      
           Handle this by reading the watchdog clocksource twice, i.e. before
           and after reading the current active clocksource. If the two
           watchdog reads show an excessive time delta, the read sequence
           is repeated up to 3 times.
      
         - Improve the debug output and add a test module for the watchdog
           mechanism.
      
         - Reimplementation of the venerable time64_to_tm() function with a
           faster and significantly smaller version. Straight from the source,
           i.e. the author of the related research paper contributed this!
      
        Driver changes:
      
         - No new drivers, not even new device tree bindings!
      
         - Fixes, improvements and cleanups all over the place"
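
The double watchdog read with retries described in the core changes can be sketched as below. This is a userspace illustration under stated assumptions, not the kernel implementation: every name here (ex_read_pair(), struct ex_sample, the toy struct ex_step_clock) is hypothetical, and clocks are modelled as callbacks returning nanoseconds.

```c
#include <stdbool.h>
#include <stdint.h>

/* A clock is modelled as a callback returning nanoseconds. */
typedef uint64_t (*ex_clock_read_fn)(void *ctx);

struct ex_sample {
    uint64_t watchdog_ns;
    uint64_t clocksource_ns;
};

/* Toy deterministic clock for experiments: advances by step_ns per read. */
struct ex_step_clock {
    uint64_t now_ns;
    uint64_t step_ns;
};

uint64_t ex_step_clock_read(void *ctx)
{
    struct ex_step_clock *clk = ctx;
    uint64_t t = clk->now_ns;

    clk->now_ns += clk->step_ns;
    return t;
}

/*
 * Read the watchdog before and after the clocksource. If the two watchdog
 * reads are further apart than max_delay_ns, the sample is considered
 * disturbed (for example by an interrupt) and the sequence is retried,
 * up to max_retries additional times. Returns true on a clean sample.
 */
bool ex_read_pair(ex_clock_read_fn read_wd, void *wd_ctx,
                  ex_clock_read_fn read_cs, void *cs_ctx,
                  uint64_t max_delay_ns, unsigned int max_retries,
                  struct ex_sample *out)
{
    for (unsigned int attempt = 0; attempt <= max_retries; attempt++) {
        uint64_t wd_before = read_wd(wd_ctx);
        uint64_t cs_now = read_cs(cs_ctx);
        uint64_t wd_after = read_wd(wd_ctx);

        if (wd_after - wd_before <= max_delay_ns) {
            out->watchdog_ns = wd_before;
            out->clocksource_ns = cs_now;
            return true;
        }
    }

    return false;   /* every attempt saw an excessive delay */
}
```

Passing max_retries = 3 matches the "repeated up to 3 times" wording; the threshold value is left to the caller because the text above does not state one.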
      
      * tag 'timers-core-2021-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
        time/kunit: Add missing MODULE_LICENSE()
        time: Improve performance of time64_to_tm()
        clockevents: Use list_move() instead of list_del()/list_add()
        clocksource: Print deviation in nanoseconds when a clocksource becomes unstable
        clocksource: Provide kernel module to test clocksource watchdog
        clocksource: Reduce clocksource-skew threshold
        clocksource: Limit number of CPUs checked for clock synchronization
        clocksource: Check per-CPU clock synchronization when marked unstable
        clocksource: Retry clock read if long delays detected
        clockevents: Add missing parameter documentation
        clocksource/drivers/timer-ti-dm: Drop unnecessary restore
        clocksource/arm_arch_timer: Improve Allwinner A64 timer workaround
        clocksource/drivers/arm_global_timer: Remove duplicated argument in arm_global_timer
        clocksource/drivers/arm_global_timer: Make symbol 'gt_clk_rate_change_nb' static
        arm: zynq: don't disable CONFIG_ARM_GLOBAL_TIMER due to CONFIG_CPU_FREQ anymore
        clocksource/drivers/arm_global_timer: Implement rate compensation whenever source clock changes
        clocksource/drivers/ingenic: Rename unreasonable array names
        clocksource/drivers/timer-ti-dm: Save and restore timer TIOCP_CFG
        clocksource/drivers/mediatek: Ack and disable interrupts on suspend
        clocksource/drivers/samsung_pwm: Constify source IO memory
        ...
      a941a034
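The watchdog change described above (read the watchdog clocksource before and after the active clocksource, and repeat the sequence when the two watchdog reads are too far apart) can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel code: `wd_read_pair()`, `struct wd_sample`, `struct fake_clock`, `fake_read()` and the `max_delay` parameter are hypothetical names, and reading "repeated up to 3 times" as one initial attempt plus three repeats is an interpretation of the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a clocksource read callback. */
typedef uint64_t (*read_fn)(void *ctx);

/* The message says the sequence is "repeated up to 3 times". */
enum { WD_MAX_RETRIES = 3 };

struct wd_sample {
	uint64_t wd_before;	/* watchdog read taken before the clocksource read */
	uint64_t cs_now;	/* read of the clocksource being checked */
	uint64_t wd_after;	/* watchdog read taken after the clocksource read */
	int attempts;		/* how many read sequences were performed */
	bool valid;		/* false if every attempt was delayed too long */
};

/*
 * Read watchdog, clocksource, watchdog. If the two watchdog reads are
 * further apart than max_delay, something (NMI, SMI, vCPU preemption)
 * interrupted the sequence, so the sample is discarded and re-read.
 */
static struct wd_sample wd_read_pair(read_fn read_wd, void *wd_ctx,
				     read_fn read_cs, void *cs_ctx,
				     uint64_t max_delay)
{
	struct wd_sample s = { 0 };
	int attempt;

	assert(read_wd && read_cs);

	for (attempt = 1; attempt <= 1 + WD_MAX_RETRIES; attempt++) {
		s.wd_before = read_wd(wd_ctx);
		s.cs_now = read_cs(cs_ctx);
		s.wd_after = read_wd(wd_ctx);
		s.attempts = attempt;

		if (s.wd_after - s.wd_before <= max_delay) {
			s.valid = true;
			return s;
		}
	}

	s.valid = false;
	return s;
}

/* Scripted fake clock for demonstration: replays values from an array. */
struct fake_clock {
	const uint64_t *vals;
	size_t n;
	size_t pos;
};

static uint64_t fake_read(void *ctx)
{
	struct fake_clock *c = ctx;

	assert(c->pos < c->n);
	return c->vals[c->pos++];
}
```

A caller would compare `cs_now` against the watchdog interval only when `valid` is set; what to do when all attempts are delayed is a policy decision that this sketch leaves open.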
    • Merge tag 'irq-core-2021-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 21edf509
      Linus Torvalds authored
      Pull irq updates from Thomas Gleixner:
       "Updates for the interrupt subsystem:
      
        Core changes:
      
         - Cleanup and simplification of common code to invoke the low level
           interrupt flow handlers when this invocation requires irqdomain
           resolution. Add the necessary core infrastructure.
      
         - Provide a proper interface for modular PMU drivers to set the
           interrupt affinity.
      
         - Add a request flag which allows interrupts to be excluded from
           spurious interrupt detection. This is especially useful for IPI
           handlers, which always return IRQ_HANDLED and thereby turn the
           spurious interrupt detection into a pointless waste of CPU cycles.
      
        Driver changes:
      
         - Bulk convert interrupt chip drivers to the new irqdomain low level
           flow handler invocation mechanism.
      
         - Add device tree bindings for the Renesas R-Car M3-W+ SoC
      
         - Enable modular build of the Qualcomm PDC driver
      
         - The usual small fixes and improvements"
      
      * tag 'irq-core-2021-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
        dt-bindings: interrupt-controller: arm,gic-v3: Describe GICv3 optional properties
        irqchip: gic-pm: Remove redundant error log of clock bulk
        irqchip/sun4i: Remove unnecessary oom message
        irqchip/irq-imx-gpcv2: Remove unnecessary oom message
        irqchip/imgpdc: Remove unnecessary oom message
        irqchip/gic-v3-its: Remove unnecessary oom message
        irqchip/gic-v2m: Remove unnecessary oom message
        irqchip/exynos-combiner: Remove unnecessary oom message
        irqchip: Bulk conversion to generic_handle_domain_irq()
        genirq: Move non-irqdomain handle_domain_irq() handling into ARM's handle_IRQ()
        genirq: Add generic_handle_domain_irq() helper
        irqchip/nvic: Convert from handle_IRQ() to handle_domain_irq()
        irqdesc: Fix __handle_domain_irq() comment
        genirq: Use irq_resolve_mapping() to implement __handle_domain_irq() and co
        irqdomain: Introduce irq_resolve_mapping()
        irqdomain: Protect the linear revmap with RCU
        irqdomain: Cache irq_data instead of a virq number in the revmap
        irqdomain: Use struct_size() helper when allocating irqdomain
        irqdomain: Make normal and nomap irqdomains exclusive
        powerpc: Move the use of irq_domain_add_nomap() behind a config option
        ...
      21edf509
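The core change above, invoking a low level flow handler after resolving a hardware interrupt number through an irqdomain, can be illustrated with a toy user-space model. The shortlog names `irq_resolve_mapping()` and `generic_handle_domain_irq()`, but their real signatures are not shown here, so everything below (`toy_irq_domain`, `toy_irq_desc`, `toy_resolve_mapping()`, `toy_handle_domain_irq()`, `toy_count_handler()`) is a hypothetical stand-in and not the kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flow handler type: called with the resolved interrupt number. */
typedef void (*toy_flow_handler_t)(unsigned int virq, void *data);

/* Toy descriptor: what a hardware interrupt number resolves to. */
struct toy_irq_desc {
	unsigned int virq;
	toy_flow_handler_t handle;
	void *data;
};

/* Toy domain with a linear reverse map indexed by hardware irq number. */
struct toy_irq_domain {
	size_t size;
	struct toy_irq_desc **revmap;
};

/* Resolve hwirq to its descriptor, or NULL if it is out of range or unmapped. */
static struct toy_irq_desc *toy_resolve_mapping(const struct toy_irq_domain *d,
						unsigned int hwirq)
{
	if (!d || hwirq >= d->size)
		return NULL;
	return d->revmap[hwirq];
}

/* One helper does both steps: resolve the mapping, then run the handler. */
static int toy_handle_domain_irq(const struct toy_irq_domain *d,
				 unsigned int hwirq)
{
	struct toy_irq_desc *desc = toy_resolve_mapping(d, hwirq);

	if (!desc || !desc->handle)
		return -EINVAL;

	desc->handle(desc->virq, desc->data);
	return 0;
}

/* Demonstration handler: counts how often it was invoked. */
static void toy_count_handler(unsigned int virq, void *data)
{
	(void)virq;
	(*(int *)data)++;
}
```

Folding lookup and invocation into a single helper means a driver in this model no longer performs the lookup itself before calling the handler, which is the kind of simplification the message describes.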
    • Merge tag 'smp-urgent-2021-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 62180152
      Linus Torvalds authored
      Pull CPU hotplug fix from Thomas Gleixner:
       "A fix for the CPU hotplug and cpusets interaction:
      
        cpusets delegate the hotplug work to a workqueue to prevent a lock
        order inversion vs. the CPU hotplug lock. The work is not flushed
        before the hotplug operation returns, which creates inconsistent
        user-visible state. Prevent this by flushing the work after dropping
        the CPU hotplug lock and before releasing the outer mutex which
        serializes the CPU hotplug related sysfs interface operations"
      
      * tag 'smp-urgent-2021-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        cpu/hotplug: Cure the cpusets trainwreck
      62180152