1. 09 Mar, 2019 4 commits
    • Merge tag 'perf-core-for-mingo-5.1-20190307' of... · b339da48
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo-5.1-20190307' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/core changes from Arnaldo Carvalho de Melo:
      
      perf bpf:
      
        Arnaldo Carvalho de Melo:
      
        - Automatically add BTF ELF markers to 'perf trace' BPF programs, so that
          tools such as 'bpftool map dump' can pretty print map keys and values.
      
      perf c2c:
      
        Jiri Olsa:
      
        - Fix report for empty NUMA node.
      
      perf diff:
      
        Jin Yao:
      
        - Support --time, --cpu, --pid and --tid filter options.
      
      perf probe:
      
        Arnaldo Carvalho de Melo:
      
        - Clarify error message about not finding kernel modules debuginfo.
      
      perf record:
      
        Jiri Olsa:
      
        - Fixup probing for max attr.precise_ip.
      
      perf trace:
      
        Arnaldo Carvalho de Melo:
      
        - Add missing %s lost in the 'msg_flags' recvmmsg arg when adding prefix suppression logic.
      
      perf annotate:
      
        Arnaldo Carvalho de Melo:
      
        - Calculate the max instruction name, align column to that, removing the
          hardcoded max 6 chars and cope with instructions with names longer than that,
          such as vpmovmskb, vpcmpeqb, etc.
      
      kernel:
      
        Song Liu:
      
        - Consider events with attr.bpf_event set as side-band.
      
        Gustavo A. R. Silva:
      
        - Mark expected switch fall-through in perf_event_parse_addr_filter().
      
      Libraries:
      
        Jiri Olsa:
      
        - Fix leaks and double frees on error paths.
      
      libtraceevent:
      
        Tony Jones:
      
        - Fix buffer overflow in arg_eval().
      
      python scripting:
      
        Tony Jones:
      
        - More python3 fixes.
      
      Trivial:
      
        Yang Wei:
      
        - Remove needless extra semicolon in clang C++ glue code.
      
      Intel PT/BTS:
      
        Adrian Hunter:
      
        - Improve auxtrace address filter error message when there is no DSO.
      
        - Fix divide by zero when TSC is not available.
      
        - Further improvements to the export to sqlite/postgresql python scripts
          and to the GUI sqlviewer, exporting 'parent_id' to enable the
          creation of call trees.
      
        Andi Kleen:
      
        - Generalize function to copy from thread addr space from intel-bts code.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/core: Mark expected switch fall-through · 43aa378b
      Gustavo A. R. Silva authored
      In preparation for enabling -Wimplicit-fallthrough, mark switch cases
      where we are expecting to fall through.
      
      This patch fixes the following warning:
      
        kernel/events/core.c: In function ‘perf_event_parse_addr_filter’:
        kernel/events/core.c:9154:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
            kernel = 1;
            ~~~~~~~^~~
        kernel/events/core.c:9156:3: note: here
           case IF_SRC_FILEADDR:
           ^~~~
      
      Warning level 3 was used: -Wimplicit-fallthrough=3
      
      This patch is part of the ongoing efforts to enable -Wimplicit-fallthrough.
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: https://lkml.kernel.org/r/20190212205430.GA8446@embeddedor
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86/intel/uncore: Fix client IMC events return huge result · 8041ffd3
      Kan Liang authored
      The client IMC bandwidth events currently return very large values:
      
        $ perf stat -e uncore_imc/data_reads/ -e uncore_imc/data_writes/ -I 10000 -a
      
        10.000117222 34,788.76 MiB uncore_imc/data_reads/
        10.000117222 8.26 MiB uncore_imc/data_writes/
        20.000374584 34,842.89 MiB uncore_imc/data_reads/
        20.000374584 10.45 MiB uncore_imc/data_writes/
        30.000633299 37,965.29 MiB uncore_imc/data_reads/
        30.000633299 323.62 MiB uncore_imc/data_writes/
        40.000891548 41,012.88 MiB uncore_imc/data_reads/
        40.000891548 6.98 MiB uncore_imc/data_writes/
        50.001142480 1,125,899,906,621,494.75 MiB uncore_imc/data_reads/
        50.001142480 6.97 MiB uncore_imc/data_writes/
      
      The client IMC events are freerunning counters. They still use the
      old event encoding format (0x1 for data_reads and 0x2 for data_writes).
      The counter bit width is calculated by common code, which assumes that
      the standard encoding format is used for the freerunning counters.
      As a result, the wrong bit width is calculated.

      This patch converts the old client IMC event encoding to the
      standard encoding format.

      The current common code uses event->attr.config, which is copied directly
      from user space. We should not implicitly modify it for a converted event,
      so event->hw.config replaces event->attr.config in common code.

      For client IMC events, event->attr.config is used to calculate a
      converted event with the standard encoding format in the custom
      event_init(). The converted event is stored in event->hw.config.
      Other freerunning counter events already use the standard
      encoding format, so the same value as event->attr.config is assigned to
      event->hw.config in the common event_init().
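To illustrate how a wrong bit width can turn a counter wrap into a huge delta like the one in the last interval of the output above, here is a minimal Python sketch. It is not the kernel code: the 32-bit counter width, the 64 bytes per count, and the names `counter_delta` and `to_mib` are assumptions made for illustration only.

```python
MASK64 = (1 << 64) - 1


def counter_delta(prev, now, bits):
    """Delta between two raw reads of a free-running counter `bits` wide."""
    shift = 64 - bits
    # Shift both values up so the subtraction wraps at the counter width,
    # then shift the result back down.
    diff = ((now << shift) - (prev << shift)) & MASK64
    return diff >> shift


def to_mib(count, bytes_per_count=64):
    """Convert a count to MiB (64 bytes per count is an assumed scale)."""
    return count * bytes_per_count / float(1 << 20)


# A hypothetical 32-bit counter that wrapped between two reads.
prev, now = 0xFFFFF000, 0x00000100

right = counter_delta(prev, now, 32)  # wrap handled: 0x1100 counts
wrong = counter_delta(prev, now, 64)  # wrap missed: close to 2**64 counts
```

With the correct width the wrap yields a small delta. With a width that is too large the same two reads yield a value on the order of 2**50 MiB, the same magnitude as the bogus data_reads figure.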
      Reported-by: Jin Yao <yao.jin@linux.intel.com>
      Tested-by: Jin Yao <yao.jin@linux.intel.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: stable@kernel.org # v4.18+
      Fixes: 9aae1780 ("perf/x86/intel/uncore: Clean up client IMC uncore")
      Link: https://lkml.kernel.org/r/20190227165729.1861-1-kan.liang@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/ring_buffer: Use high order allocations for AUX buffers optimistically · 5768402f
      Alexander Shishkin authored
      Currently, the AUX buffer allocator will use high-order allocations
      for PMUs that don't support hardware scatter-gather chaining to ensure
      large contiguous blocks of pages, and always use an array of single
      pages otherwise.
      
      There is, however, a tangible performance benefit in using larger chunks
      of contiguous memory even in the latter case, that comes from not having
      to fetch the next page's address at every page boundary. In particular,
      a task running under Intel PT on an Atom CPU shows 1.5%-2% less runtime
      penalty with a single multi-page output region in snapshot mode (no PMI)
      than with multiple single-page output regions, from ~6% down to ~4%. For
      the snapshot mode it does make a difference as it is intended to run over
      long periods of time.
      
      For this reason, change the allocation policy to always optimistically
      start with the highest possible order when allocating pages for the AUX
      buffer, descending until the allocation succeeds or the order-zero
      allocation fails.
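The policy described above can be sketched roughly as follows. This is a Python model, not the kernel allocator: `alloc_chunks`, the `try_alloc` callback, and the `max_order` parameter are hypothetical names standing in for the real page allocation calls.

```python
def alloc_chunks(nr_pages, max_order, try_alloc):
    """Cover nr_pages with chunks of 2**order pages, largest order first.

    try_alloc(order) is a hypothetical callback returning True when a
    contiguous chunk of 2**order pages could be allocated.
    Returns the list of orders used, or None if an order-0 allocation fails.
    """
    chunks = []
    remaining = nr_pages
    while remaining > 0:
        # Start optimistically with the highest order that still fits.
        order = min(max_order, remaining.bit_length() - 1)
        while not try_alloc(order):
            if order == 0:
                return None  # even a single page could not be allocated
            order -= 1       # descend and retry with a smaller chunk
        chunks.append(order)
        remaining -= 1 << order
    return chunks
```

For example, if only order-2 and smaller chunks are available, a 16-page buffer ends up as four 4-page chunks instead of sixteen single pages.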
      Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: https://lkml.kernel.org/r/20190215114727.62648-2-alexander.shishkin@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 06 Mar, 2019 24 commits
    • perf data: Force perf_data__open|close zero data->file.path · b8f7d86b
      Jiri Olsa authored
      Make sure data->file.path is zeroed on the perf_data__open error path
      and in perf_data__close, so we don't double free it in case someone calls
      it twice.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
      Cc: Nageswara R Sastry <nasastry@in.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Link: http://lkml.kernel.org/r/20190305152536.21035-9-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf session: Fix double free in perf_data__close · befa09b6
      Jiri Olsa authored
      We can't call perf_data__close and subsequently perf_session__delete,
      because it will call perf_data__close again and cause double free for
      data->file.path.
      
        $ perf report -i .
        incompatible file format (rerun with -v to learn more)
        free(): double free detected in tcache 2
        Aborted (core dumped)
      
      In fact we don't need to call perf_data__close at all, because by the
      time the goto out_close is reached, session->data is already initialized,
      so the perf_data__close call will be triggered from
      perf_session__delete.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
      Cc: Nageswara R Sastry <nasastry@in.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Fixes: 2d4f2799 ("perf data: Add global path holder")
      Link: http://lkml.kernel.org/r/20190305152536.21035-8-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evsel: Probe for precise_ip with simple attr · 5b61adb1
      Jiri Olsa authored
      Currently we probe for precise_ip with user specified perf_event_attr,
      which might fail because of unsupported kernel features, which would get
      disabled during the open time anyway.
      
      Switching the probe to take place on simple hw cycles, so the following
      record sets proper precise_ip:
      
        # perf record -e cycles:P ls
        # perf evlist -v
        cycles:P: size: 112, ... precise_ip: 3, ...
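The idea of probing with a simple attribute can be sketched as below. This is a Python illustration, not the perf implementation: `probe_max_precise_ip`, the dictionary-based attribute, and the `try_open` callback (standing in for an attempt to open the event) are hypothetical, and the starting level of 3 is taken from the example output above.

```python
def probe_max_precise_ip(try_open, max_level=3):
    """Return the highest precise_ip level that try_open accepts, or 0.

    try_open(attr) is a hypothetical stand-in for trying to open an event
    with the given attributes; it returns True on success.
    """
    for level in range(max_level, 0, -1):
        # Probe with a minimal attribute set (plain hardware cycles), so the
        # result does not depend on unrelated user-specified features.
        simple_attr = {
            "type": "hardware",
            "config": "cpu-cycles",
            "precise_ip": level,
        }
        if try_open(simple_attr):
            return level
    return 0
```

Because the probe attribute carries nothing but the event and the precise level, a failure can only mean that the level itself is unsupported.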
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
      Cc: Nageswara R Sastry <nasastry@in.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Link: http://lkml.kernel.org/r/20190305152536.21035-7-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Read and store caps/max_precise in perf_pmu · 90a86bde
      Jiri Olsa authored
      Read the caps/max_precise value and store it in struct perf_pmu to be
      used when setting the maximum precise_ip field in the following patch.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
      Cc: Nageswara R Sastry <nasastry@in.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Link: http://lkml.kernel.org/r/20190305152536.21035-5-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf hist: Fix memory leak of srcline · 26349585
      Jiri Olsa authored
      We can't allocate he->srcline unconditionally, only when a new hist_entry
      is created. Move the he->srcline allocation into the hist_entry__init
      function.
      Original-patch-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
      Suggested-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Nageswara R Sastry <nasastry@in.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Link: http://lkml.kernel.org/r/20190305152536.21035-4-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf hist: Add error path into hist_entry__init · c5758910
      Jiri Olsa authored
      Adding error path into hist_entry__init to unify error handling, so
      every new member does not need to free everything else.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: nageswara r sastry <nasastry@in.ibm.com>
      Link: http://lkml.kernel.org/r/20190305152536.21035-3-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf c2c: Fix c2c report for empty numa node · e34c9402
      Jiri Olsa authored
      Ravi Bangoria reported that we fail with an empty NUMA node with the
      following message:
      
        $ lscpu
        NUMA node0 CPU(s):
        NUMA node1 CPU(s):   0-4
      
        $ sudo ./perf c2c report
        node/cpu topology bugFailed setup nodes
      
      Fix this by detecting the empty node and keeping its CPU set empty.
      Reported-by: Nageswara R Sastry <nasastry@in.ibm.com>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20190305152536.21035-2-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script python: Add Python3 support to intel-pt-events.py · fdf2460c
      Tony Jones authored
      Support both Python2 and Python3 in the intel-pt-events.py script
      
      There may be differences in the ordering of output lines due to
      differences in dictionary ordering etc.  However the format within lines
      should be unchanged.
      
      The use of 'from __future__' implies the minimum supported Python2 version
      is now v2.6
      Signed-off-by: Tony Jones <tonyj@suse.de>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Link: http://lkml.kernel.org/r/fd26acf9-0c0f-717f-9664-a3c33043ce19@suse.de
      Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script python: Add Python3 support to event_analyzing_sample.py · c253c72e
      Tony Jones authored
      Support both Python2 and Python3 in the event_analyzing_sample.py script
      
      There may be differences in the ordering of output lines due to
      differences in dictionary ordering etc.  However the format within lines
      should be unchanged.
      
      The use of 'from __future__' implies the minimum supported Python2 version
      is now v2.6
      Signed-off-by: Tony Jones <tonyj@suse.de>
      Cc: Feng Tang <feng.tang@intel.com>
      Link: http://lkml.kernel.org/r/20190302011903.2416-5-tonyj@suse.de
      Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script python: Add Python3 support to check-perf-trace.py · 57e604b1
      Tony Jones authored
      Support both Python 2 and Python 3 in the check-perf-trace.py script.
      
      There may be differences in the ordering of output lines due to
      differences in dictionary ordering etc.  However the format within lines
      should be unchanged.
      
      The use of from __future__ implies the minimum supported version of
      Python2 is now v2.6
      Signed-off-by: Tony Jones <tonyj@suse.de>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Link: http://lkml.kernel.org/r/20190302011903.2416-4-tonyj@suse.de
      Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script python: Add Python3 support to futex-contention.py · de2ec16b
      Tony Jones authored
      Support both Python2 and Python3 in the futex-contention.py script
      
      There may be differences in the ordering of output lines due to
      differences in dictionary ordering etc.  However the format within lines
      should be unchanged.
      
      The use of 'from __future__' implies the minimum supported Python2 version
      is now v2.6
      Signed-off-by: Tony Jones <tonyj@suse.de>
      Link: http://lkml.kernel.org/r/20190302011903.2416-3-tonyj@suse.de
      Signed-off-by: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script python: Remove mixed indentation · b504d7f6
      Tony Jones authored
      Remove mixed indentation in Python scripts.  Revert to either all tabs
      (most common form) or all spaces (4 or 8) depending on what was the
      intent of the original commit.  This is necessary to complete Python3
      support as it will flag an error if it encounters mixed indentation.
      Signed-off-by: Tony Jones <tonyj@suse.de>
      Link: http://lkml.kernel.org/r/20190302011903.2416-2-tonyj@suse.de
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf diff: Support --pid/--tid filter options · c1d3e633
      Jin Yao authored
      Using the existing symbol_conf.pid_list_str and symbol_conf.tid_list_str
      logic.
      
      For example:
      
        perf diff --tid 13965
      
      It'll only diff the samples for thread 13965.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1551791143-10334-4-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf diff: Support --cpu filter option · daca23b2
      Jin Yao authored
      To improve 'perf diff', implement a --cpu filter option.
      
      Multiple CPUs can be provided as a comma-separated list with no space:
      0,1.  Ranges of CPUs are specified with -: 0-2. Default is to report
      samples on all CPUs.
      
      For example,
      
        perf diff --cpu 0,1
      
      It only diffs the samples for CPU0 and CPU1.
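As a rough illustration of the list syntax described above (comma-separated CPUs, ranges written with '-'), here is a small Python sketch. It is not perf's own parser; the function name `parse_cpu_list` is made up for this example.

```python
def parse_cpu_list(spec):
    """Expand a CPU list such as '0,1' or '0-2,5' into sorted CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            # A range such as '0-2' includes both end points.
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)
```

Under these assumptions, `parse_cpu_list("0,1")` selects CPU0 and CPU1, matching the example command above.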
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1551791143-10334-3-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf diff: Support --time filter option · 4802138d
      Jin Yao authored
      To improve 'perf diff', implement a --time filter option to diff the
      samples within given time window.
      
      It supports time percent with multiple time ranges. The time string
      format is 'a%/n,b%/m,...' or 'a%-b%,c%-d%,...'.
      
      For example:
      
      Select the second 10% time slice to diff:
      
        perf diff --time 10%/2
      
      Select from 0% to 10% time slice to diff:
      
        perf diff --time 0%-10%
      
      Select the first and the second 10% time slices to diff:
      
        perf diff --time 10%/1,10%/2
      
      Select from 0% to 10% and 30% to 40% slices to diff:
      
        perf diff --time 0%-10%,30%-40%
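The percent formats shown above can be modelled with a short Python sketch. It is only an interpretation of the examples in this message, not perf's parser: `percent_windows` and its `first`/`last` arguments (the first and last sample timestamps of a perf.data file) are hypothetical.

```python
def percent_windows(spec, first, last):
    """Turn 'a%/n,...' or 'a%-b%,...' into (start, end) time windows."""
    total = last - first
    windows = []
    for item in spec.split(","):
        if "/" in item:
            # 'a%/n': the n-th slice (counting from 1) of size a% of the total.
            pct, index = item.split("/")
            size = float(pct.rstrip("%")) * total / 100.0
            start = first + (int(index) - 1) * size
            windows.append((start, start + size))
        else:
            # 'a%-b%': from a% to b% of the total time.
            low, high = item.split("-")
            start = first + float(low.rstrip("%")) * total / 100.0
            end = first + float(high.rstrip("%")) * total / 100.0
            windows.append((start, end))
    return windows
```

With a file spanning timestamps 0 to 100, '10%/2' maps to the window from 10 to 20, which is the "second 10% time slice" of the first example.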
      
      It also supports analysing samples within a given time window
      <start>,<stop>.
      
      Times have the format seconds.microseconds.
      
      If 'start' is not given (i.e., time string is ',x.y') then analysis starts at
      the beginning of the file.
      
      If the stop time is not given (i.e., time string is 'x.y,') then analysis
      goes to end of file.
      
      Time string is 'a1.b1,c1.d1:a2.b2,c2.d2'. Use ':' to separate timestamps for
      different perf.data files.
      
      For example, we get the timestamp information from perf script.
      
        perf script -i perf.data.old
      
          mgen 13940 [000]  3946.361400: ...
      
        perf script -i perf.data
      
          mgen 13940 [000]  3971.150589 ...
      
        perf diff --time 3946.361400,:3971.150589,
      
      It analyzes the perf.data.old from the timestamp 3946.361400 to the end of
      perf.data.old and analyzes the perf.data from the timestamp 3971.150589 to the
      end of perf.data.
      
       v4:
       ---
       Update abstime_str_dup(), let it return error if strdup
       fails, and update __cmd_diff() accordingly.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1551791143-10334-2-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf thread: Generalize function to copy from thread addr space from intel-bts code · 15325938
      Andi Kleen authored
      Add a utility function to fetch executable code. Convert one
      user over to it. There are more places doing that, but they
      do significantly different actions, so they are not
      easy to fit into a single library function.
      
      Committer changes:
      
      . No need to cast around, make 'buf' be a void pointer.
      
      . Rename it to thread__memcpy() to reflect the fact it is about copying
        a chunk of memory from a thread, i.e. from its address space.
      
      . No need to have it in a separate object file, move it to thread.[ch]
      
      . Check the return of map__load(), the original code didn't do it, but
        since we're moving this around, check that as well.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/r/20190305144758.12397-2-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Calculate the max instruction name, align column to that · bc3bb795
      Arnaldo Carvalho de Melo authored
      We were hardcoding '6' as the max instruction name, and we have lots
      that are longer than that, see the diff from two 'P' printed TUI
      annotations for a libc function that uses instructions with long names,
      such as 'vpmovmskb' with its 9 chars:
      
        --- __strcmp_avx2.annotation.before	2019-03-06 16:31:39.368020425 -0300
        +++ __strcmp_avx2.annotation	2019-03-06 16:32:12.079450508 -0300
        @@ -2,284 +2,284 @@
         Event: cycles:ppp
      
         Percent        endbr64
        -  0.10         mov    %edi,%eax
        +  0.10         mov        %edi,%eax
        -               xor    %edx,%edx
        +               xor        %edx,%edx
        -  3.54         vpxor  %ymm7,%ymm7,%ymm7
        +  3.54         vpxor      %ymm7,%ymm7,%ymm7
        -               or     %esi,%eax
        +               or         %esi,%eax
        -               and    $0xfff,%eax
        +               and        $0xfff,%eax
        -               cmp    $0xf80,%eax
        +               cmp        $0xf80,%eax
        -             ↓ jg     370
        +             ↓ jg         370
        - 27.07         vmovdqu (%rdi),%ymm1
        + 27.07         vmovdqu    (%rdi),%ymm1
        -  7.97         vpcmpeqb (%rsi),%ymm1,%ymm0
        +  7.97         vpcmpeqb   (%rsi),%ymm1,%ymm0
        -  2.15         vpminub %ymm1,%ymm0,%ymm0
        +  2.15         vpminub    %ymm1,%ymm0,%ymm0
        -  4.09         vpcmpeqb %ymm7,%ymm0,%ymm0
        +  4.09         vpcmpeqb   %ymm7,%ymm0,%ymm0
        -  0.43         vpmovmskb %ymm0,%ecx
        +  0.43         vpmovmskb  %ymm0,%ecx
        -  1.53         test   %ecx,%ecx
        +  1.53         test       %ecx,%ecx
        -             ↓ je     b0
        +             ↓ je         b0
        -  5.26         tzcnt  %ecx,%edx
        +  5.26         tzcnt      %ecx,%edx
        - 18.40         movzbl (%rdi,%rdx,1),%eax
        + 18.40         movzbl     (%rdi,%rdx,1),%eax
        -  7.09         movzbl (%rsi,%rdx,1),%edx
        +  7.09         movzbl     (%rsi,%rdx,1),%edx
        -  3.34         sub    %edx,%eax
        +  3.34         sub        %edx,%eax
           2.37         vzeroupper
                      ← retq
                        nop
        -         50:   tzcnt  %ecx,%edx
        +         50:   tzcnt      %ecx,%edx
        -               movzbl 0x20(%rdi,%rdx,1),%eax
        +               movzbl     0x20(%rdi,%rdx,1),%eax
        -               movzbl 0x20(%rsi,%rdx,1),%edx
        +               movzbl     0x20(%rsi,%rdx,1),%edx
        -               sub    %edx,%eax
        +               sub        %edx,%eax
                        vzeroupper
                      ← retq
        -               data16 nopw %cs:0x0(%rax,%rax,1)
        +               data16     nopw %cs:0x0(%rax,%rax,1)
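The alignment change shown in the diff above amounts to computing the column width from the longest instruction name instead of a fixed 6. A minimal Python sketch of that idea follows; it is not the perf annotate code, and `align_instructions` and its `min_width` parameter are names invented for this example.

```python
def align_instructions(lines, min_width=6):
    """Pad instruction names to the longest name seen (at least min_width).

    lines is a list of (name, operands) tuples.
    """
    width = max([min_width] + [len(name) for name, _ in lines])
    return [name.ljust(width) + " " + operands for name, operands in lines]


listing = align_instructions([
    ("mov", "%edi,%eax"),
    ("vpcmpeqb", "(%rsi),%ymm1,%ymm0"),
    ("vpmovmskb", "%ymm0,%ecx"),
])
```

Here every operand column starts at the same offset because the width is derived from 'vpmovmskb' (9 characters) instead of the old hardcoded value.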
      Reported-by: Travis Downs <travis.downs@gmail.com>
      LPU-Reference: CAOBGo4z1KfmWeOm6Et0cnX5Z6DWsG2PQbAvRn1MhVPJmXHrc5g@mail.gmail.com
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-89wsdd9h9g6bvq52sgp6d0u4@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Merge branch 'x86-alternatives-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6ea98b4b
      Linus Torvalds authored
      Pull x86 alternative instruction updates from Ingo Molnar:
       "Small RDTSCP optimization, enabled by the newly added ALTERNATIVE_3(),
        and other small improvements"
      
      * 'x86-alternatives-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/TSC: Use RDTSCP
        x86/alternatives: Add an ALTERNATIVE_3() macro
        x86/alternatives: Print containing function
        x86/alternatives: Add macro comments
    • Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 45802da0
      Linus Torvalds authored
      Pull scheduler updates from Ingo Molnar:
       "The main changes in this cycle were:
      
         - refcount conversions
      
         - Solve the rq->leaf_cfs_rq_list can of worms for real.
      
         - improve power-aware scheduling
      
         - add sysctl knob for Energy Aware Scheduling
      
         - documentation updates
      
         - misc other changes"
      
      * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
        kthread: Do not use TIMER_IRQSAFE
        kthread: Convert worker lock to raw spinlock
        sched/fair: Use non-atomic cpumask_{set,clear}_cpu()
        sched/fair: Remove unused 'sd' parameter from select_idle_smt()
        sched/wait: Use freezable_schedule() when possible
        sched/fair: Prune, fix and simplify the nohz_balancer_kick() comment block
        sched/fair: Explain LLC nohz kick condition
        sched/fair: Simplify nohz_balancer_kick()
        sched/topology: Fix percpu data types in struct sd_data & struct s_data
        sched/fair: Simplify post_init_entity_util_avg() by calling it with a task_struct pointer argument
        sched/fair: Fix O(nr_cgroups) in the load balancing path
        sched/fair: Optimize update_blocked_averages()
        sched/fair: Fix insertion in rq->leaf_cfs_rq_list
        sched/fair: Add tmp_alone_branch assertion
        sched/core: Use READ_ONCE()/WRITE_ONCE() in move_queued_task()/task_rq_lock()
        sched/debug: Initialize sd_sysctl_cpus if !CONFIG_CPUMASK_OFFSTACK
        sched/pelt: Skip updating util_est when utilization is higher than CPU's capacity
        sched/fair: Update scale invariance of PELT
        sched/fair: Move the rq_of() helper function
        sched/core: Convert task_struct.stack_refcount to refcount_t
        ...
      45802da0
    • Linus Torvalds's avatar
      Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 203b6609
      Linus Torvalds authored
      Pull perf updates from Ingo Molnar:
       "Lots of tooling updates - too many to list, here's a few highlights:
      
         - Various subcommand updates to 'perf trace', 'perf report', 'perf
           record', 'perf annotate', 'perf script', 'perf test', etc.
      
         - CPU and NUMA topology and affinity handling improvements,
      
         - HW tracing and HW support updates:
            - Intel PT updates
            - ARM CoreSight updates
            - vendor HW event updates
      
         - BPF updates
      
         - Tons of infrastructure updates, both on the build system and the
           library support side
      
         - Documentation updates.
      
         - ... and lots of other changes, see the changelog for details.
      
        Kernel side updates:
      
         - Tighten up kprobes blacklist handling, reduce the number of places
           where developers can install a kprobe and hang/crash the system.
      
         - Fix/enhance vma address filter handling.
      
         - Various PMU driver updates, small fixes and additions.
      
         - refcount_t conversions
      
         - BPF updates
      
         - error code propagation enhancements
      
         - misc other changes"
      
      * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (238 commits)
        perf script python: Add Python3 support to syscall-counts-by-pid.py
        perf script python: Add Python3 support to syscall-counts.py
        perf script python: Add Python3 support to stat-cpi.py
        perf script python: Add Python3 support to stackcollapse.py
        perf script python: Add Python3 support to sctop.py
        perf script python: Add Python3 support to powerpc-hcalls.py
        perf script python: Add Python3 support to net_dropmonitor.py
        perf script python: Add Python3 support to mem-phys-addr.py
        perf script python: Add Python3 support to failed-syscalls-by-pid.py
        perf script python: Add Python3 support to netdev-times.py
        perf tools: Add perf_exe() helper to find perf binary
        perf script: Handle missing fields with -F +..
        perf data: Add perf_data__open_dir_data function
        perf data: Add perf_data__(create_dir|close_dir) functions
        perf data: Fail check_backup in case of error
        perf data: Make check_backup work over directories
        perf tools: Add rm_rf_perf_data function
        perf tools: Add pattern name checking to rm_rf
        perf tools: Add depth checking to rm_rf
        perf data: Add global path holder
        ...
      203b6609
    • Linus Torvalds's avatar
      Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3478588b
      Linus Torvalds authored
      Pull locking updates from Ingo Molnar:
       "The biggest part of this tree is the new auto-generated atomics API
        wrappers by Mark Rutland.
      
        The primary motivation was to allow instrumentation without uglifying
        the primary source code.
      
        The linecount increase comes from adding the auto-generated files to
        the Git space as well:
      
          include/asm-generic/atomic-instrumented.h     | 1689 ++++++++++++++++--
          include/asm-generic/atomic-long.h             | 1174 ++++++++++---
          include/linux/atomic-fallback.h               | 2295 +++++++++++++++++++++++++
          include/linux/atomic.h                        | 1241 +------------
      
        I preferred this approach, so that the full call stack of the (already
        complex) locking APIs is still fully visible in 'git grep'.
      
        But if this is excessive we could certainly hide them.
      
        There's a separate build-time mechanism to determine whether the
        headers are out of date (they should never be stale if we do our job
        right).
      
        Anyway, nothing from this should be visible to regular kernel
        developers.
      
        Other changes:
      
         - Add support for dynamic keys, which removes a source of false
           positives in the workqueue code, among other things (Bart Van
           Assche)
      
         - Updates to tools/memory-model (Andrea Parri, Paul E. McKenney)
      
         - qspinlock, wake_q and lockdep micro-optimizations (Waiman Long)
      
         - misc other updates and enhancements"
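
      The wrapper idea behind the atomics change can be sketched in userspace
      C11 as follows. This is only an illustration of the pattern, an
      instrumentation hook followed by the raw operation; the sketch_ names
      and the counter-based hook are assumptions and do not reflect the
      generated headers:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical instrumentation state: counts how many accesses were checked. */
static unsigned long sketch_checked_accesses;

/* Stand-in for a sanitizer hook that would validate the accessed range. */
static void sketch_check_write(const volatile void *addr, size_t size)
{
	(void)addr;
	(void)size;
	sketch_checked_accesses++;
}

/* The "arch" level operation, free of any instrumentation. */
static void sketch_arch_atomic_inc(atomic_int *v)
{
	atomic_fetch_add_explicit(v, 1, memory_order_relaxed);
}

/* The wrapper callers use: instrument first, then defer to the arch op. */
static void sketch_atomic_inc(atomic_int *v)
{
	sketch_check_write(v, sizeof(*v));
	sketch_arch_atomic_inc(v);
}
```

      Callers only ever see the wrapper, so the instrumentation can change
      without touching the code that uses the atomic operation.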
      
      * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits)
        locking/lockdep: Shrink struct lock_class_key
        locking/lockdep: Add module_param to enable consistency checks
        lockdep/lib/tests: Test dynamic key registration
        lockdep/lib/tests: Fix run_tests.sh
        kernel/workqueue: Use dynamic lockdep keys for workqueues
        locking/lockdep: Add support for dynamic keys
        locking/lockdep: Verify whether lock objects are small enough to be used as class keys
        locking/lockdep: Check data structure consistency
        locking/lockdep: Reuse lock chains that have been freed
        locking/lockdep: Fix a comment in add_chain_cache()
        locking/lockdep: Introduce lockdep_next_lockchain() and lock_chain_count()
        locking/lockdep: Reuse list entries that are no longer in use
        locking/lockdep: Free lock classes that are no longer in use
        locking/lockdep: Update two outdated comments
        locking/lockdep: Make it easy to detect whether or not inside a selftest
        locking/lockdep: Split lockdep_free_key_range() and lockdep_reset_lock()
        locking/lockdep: Initialize the locks_before and locks_after lists earlier
        locking/lockdep: Make zap_class() remove all matching lock order entries
        locking/lockdep: Reorder struct lock_class members
        locking/lockdep: Avoid that add_chain_cache() adds an invalid chain to the cache
        ...
      3478588b
    • Linus Torvalds's avatar
      Merge branch 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c8f5ed6e
      Linus Torvalds authored
      Pull EFI updates from Ingo Molnar:
       "The main EFI changes in this cycle were:
      
         - Use 32-bit alignment for efi_guid_t
      
         - Allow the SetVirtualAddressMap() call to be omitted
      
         - Implement earlycon=efifb based on existing earlyprintk code
      
         - Various minor fixes and code cleanups from Sai, Ard and me"
      
      * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        efi: Fix build error due to enum collision between efi.h and ima.h
        efi/x86: Convert x86 EFI earlyprintk into generic earlycon implementation
        x86: Make ARCH_USE_MEMREMAP_PROT a generic Kconfig symbol
        efi/arm/arm64: Allow SetVirtualAddressMap() to be omitted
        efi: Replace GPL license boilerplate with SPDX headers
        efi/fdt: Apply more cleanups
        efi: Use 32-bit alignment for efi_guid_t
        efi/memattr: Don't bail on zero VA if it equals the region's PA
        x86/efi: Mark can_free_region() as an __init function
      c8f5ed6e
    • Yang Wei's avatar
      perf clang: Remove needless extra semicolon · a53837a5
      Yang Wei authored
      Delete a superfluous semicolon in getBPFObjectFromModule().
      Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Yang Wei <albin_yang@163.com>
      Link: http://lkml.kernel.org/r/1551710174-3349-1-git-send-email-albin_yang@163.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a53837a5
    • Arnaldo Carvalho de Melo's avatar
      perf bpf: Automatically add BTF ELF markers · 3163613c
      Arnaldo Carvalho de Melo authored
      The libbpf loader expects that some __btf_map_<MAP_NAME> structs be in
      place with the keys and values types of maps so that one can store the
      struct definitions and have them sent to the kernel via sys_bpf(fd, cmd
      = BPF_BTF_LOAD) and then later be retrievable via sys_bpf(fd, cmd =
      BPF_OBJ_GET_INFO_BY_FD) for use by tools such as 'bpftool map dump id
      MAP_ID'.
      
      Since we already have this for defining maps in 'perf trace' BPF events:
      
         bpf_map(name, _type, type_key, type_val, _max_entries)
      
      As used in the tools/perf/examples/bpf/augmented_raw_syscalls.c:
      
       --- 8< ---
      
      struct syscall {
              bool    enabled;
      };
      
      bpf_map(syscalls, ARRAY, int, struct syscall, 512);
      
       --- 8< ---
      
      All we need is to get all that already available info, piggyback on the
      'bpf_map' define in tools/perf/include/bpf/bpf.h, that is included by
      'perf trace' BPF programs and do that without requiring changes to the
      BPF programs already defining maps using 'bpf_map()'.
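
      A minimal userspace sketch of that piggybacking idea, assuming the
      macro simply emits a second struct named ____btf_map_<name> holding one
      key and one value member. The stand-in 'struct bpf_map', the
      SKETCH_MAP_TYPE_ constants and the omission of ELF section placement
      are simplifications of this sketch, not the actual header:

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the map definition struct; the field set is an assumption. */
struct bpf_map {
	unsigned int type;
	unsigned int key_size;
	unsigned int value_size;
	unsigned int max_entries;
};

/* Hypothetical type ids, only here so the sketch builds outside the tree. */
enum { SKETCH_MAP_TYPE_HASH = 1, SKETCH_MAP_TYPE_ARRAY = 2 };

/*
 * Define the map and, next to it, a marker struct whose only purpose is to
 * carry the key and value types into the debug info (and thus into BTF).
 * A real BPF object would also place both in dedicated ELF sections.
 */
#define bpf_map(name, _type, type_key, type_val, _max_entries)	\
struct bpf_map name = {						\
	.type	     = SKETCH_MAP_TYPE_##_type,			\
	.key_size    = sizeof(type_key),			\
	.value_size  = sizeof(type_val),			\
	.max_entries = _max_entries,				\
};								\
struct ____btf_map_##name {					\
	type_key key;						\
	type_val value;						\
};								\
struct ____btf_map_##name ____btf_map_##name = { 0 }

struct syscall {
	bool	enabled;
};

/* Same invocation as in augmented_raw_syscalls.c; no source change needed. */
bpf_map(syscalls, ARRAY, int, struct syscall, 512);
```

      On a typical target with a 4-byte int the marker struct is 8 bytes
      with the value at offset 4, which is the layout pahole reports for
      ____btf_map_syscalls once the patch is applied.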
      
      So this is what we have before this patch:
      
      1) With this in ~/.perfconfig to dump .c events as .o, aka save a copy
         so that we can use the .o later as a pre-compiled BPF bytecode:
      
        # grep '\[llvm\]' -A2 ~/.perfconfig
        [llvm]
      	dump-obj = true
      	clang-opt = -g
      
        #
        # clang --version
        clang version 9.0.0 (https://git.llvm.org/git/clang.git/ 7906282d3afec5dfdc2b27943fd6c0309086c507) (https://git.llvm.org/git/llvm.git/ a1b5de1ff8ae8bc79dc8e86e1f82565229bd0500)
        Target: x86_64-unknown-linux-gnu
        Thread model: posix
        InstalledDir: /opt/llvm/bin
      
      2) Note the -g there so that we get clang to generate debuginfo, and
         since the target is 'bpf' it will generate the BTF info in this
         clang version (9.0).
      
      3) Run a simple 'perf record' specifying as an event the augmented_raw_syscalls.c
         source code:
      
        # perf record -e /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c sleep 1
        LLVM: dumping /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.025 MB perf.data ]
      
        # file /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
        /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o: ELF 64-bit LSB relocatable, eBPF, version 1 (SYSV), with debug_info, not stripped
      
      4) Look at the BTF structs encoded in it:
      
        # pahole -F btf --sizes /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
        syscall_enter_args	64	0
        augmented_filename	264	0
        syscall	1	0
        syscall_exit_args	24	0
        bpf_map	28	0
        #
        # pahole -F btf -C syscalls /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
        # pahole -F btf -C syscall /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
        struct syscall {
      	  bool                       enabled;              /*     0     1 */
      
      	  /* size: 1, cachelines: 1, members: 1 */
      	  /* last cacheline: 1 bytes */
        };
        #
      
      5) Ok, with just this we don't have the markers expected by the libbpf
         loader, and 'perf trace' runs with this BPF bytecode because we have:
      
        # grep '\[trace\]' -A1 ~/.perfconfig
        [trace]
      	add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
        #
      
      6) Let's do a 'perf trace' system wide session using this BPF program:
      
         # perf trace -e *mmsg,open*
        Cache2 I/O/6885 openat(AT_FDCWD, "/home/acme/.cache/mozilla/firefox/ina67tev.default/cache2/entries/BA220AB2914006A7AE96D27BE6EA13DD77519FCA", O_RDWR|O_CREAT|O_TRUNC, S_IRUSR|S_IWUSR) = 106
        Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
        Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
        Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
        Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
        DNS Res~ver #3/23340 openat(AT_FDCWD, "/etc/hosts", O_RDONLY|O_CLOEXEC) = 106
        DNS Res~ver #3/23340 sendmmsg(106<socket:[3482690]>, 0x7f252f1fcaf0, 2, MSG_NOSIGNAL) = 2
        Cache2 I/O/6885 openat(AT_FDCWD, "/home/acme/.cache/mozilla/firefox/ina67tev.default/cache2/entries/BA220AB2914006A7AE96D27BE6EA13DD77519FCA", O_RDWR) = 106
        lighttpd/18915 openat(AT_FDCWD, "/proc/loadavg", O_RDONLY) = 12
      
      7) While it runs, let's see the maps that 'perf trace' + libbpf's BPF
        loader loaded into the kernel via sys_bpf(fd, BPF_BTF_LOAD, ...):
      
        # bpftool map list | tail -6
        149: perf_event_array  name __augmented_sys  flags 0x0
      	  key 4B  value 4B  max_entries 8  memlock 4096B
        150: array  name syscalls  flags 0x0
      	  key 4B  value 1B  max_entries 512  memlock 8192B
        151: hash  name pids_filtered  flags 0x0
      	  key 4B  value 1B  max_entries 64  memlock 8192B
        #
      
      8) Dump the "pids_filtered" map, which will have one entry per PID that
         'perf trace' wants filtered, which includes its own, to avoid a
         tracing feedback loop (perf trace shows the syscalls it does which
         generates more syscalls that it has to show that...), it also
         auto-filters the 'gnome-terminal' and 'sshd' parent PIDs, for the
         same reason:
      
        # bpftool map dump id 151
        key: a5 0c 00 00  value: 01
        key: 14 63 00 00  value: 01
        Found 2 elements
        #
      
      9) Since there is no BTF info available, it does a generic hex dump :-\
      
      10) Now, with this patch applied, we'll do steps 3 to 6 again and look
          with pahole if there are extra structs encoded in BTF:
      
        # pahole -F btf --sizes /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
        syscall_enter_args	64	0
        augmented_filename	264	0
        syscall	1	0
        syscall_exit_args	24	0
        bpf_map	28	0
        ____btf_map___augmented_syscalls__	8	0
        ____btf_map_syscalls	8	0
        ____btf_map_pids_filtered	8	0
        #
      
      11) Yes, those __btf_map_ + the map names, let's see what they look like:
      
        # pahole -F btf -C ____btf_map_syscalls /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
        struct ____btf_map_syscalls {
      	  int                        key;                  /*     0     4 */
      	  struct syscall             value;                /*     4     1 */
      
      	  /* size: 8, cachelines: 1, members: 2 */
      	  /* padding: 3 */
      	  /* last cacheline: 8 bytes */
        };
        #
      
      12) Let's repeat step 7 to get the new map ids:
      
        # bpftool map list | tail -6
        155: perf_event_array  name __augmented_sys  flags 0x0
      	  key 4B  value 4B  max_entries 8  memlock 4096B
        156: array  name syscalls  flags 0x0
      	  key 4B  value 1B  max_entries 512  memlock 8192B
        157: hash  name pids_filtered  flags 0x0
      	  key 4B  value 1B  max_entries 64  memlock 8192B
        #
      
      13) And finally let's dump the 'pids_filtered':
      
        # bpftool map dump id 157
        [{
              "key": 3237,
              "value": true
          },{
              "key": 26435,
              "value": true
          }
        ]
        #
      
      Looks much better! BTF info was used to interpret the key as an integer
      and the value as a struct with just one boolean member, so to make it
      more compact, show just the 'true' value where we saw '01'.
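
      To make the difference concrete, here is a small sketch of the
      interpretation step, assuming a little-endian 4-byte key and a 1-byte
      boolean value as in 'pids_filtered'. The helper names are made up for
      this illustration and are not part of bpftool:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assemble a 32-bit little-endian integer from the raw key bytes. */
static uint32_t sketch_key_from_le_bytes(const uint8_t raw[4])
{
	return (uint32_t)raw[0] |
	       ((uint32_t)raw[1] << 8) |
	       ((uint32_t)raw[2] << 16) |
	       ((uint32_t)raw[3] << 24);
}

/* A struct with a single bool member can be shown as just that bool. */
static bool sketch_value_from_byte(uint8_t raw)
{
	return raw != 0;
}
```

      Feeding it the first key from step 8, 'a5 0c 00 00', gives 3237, the
      same PID that shows up in the BTF-aware dump.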
      
      Now to make 'perf trace --dump-map' use BTF!
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Yonghong Song <yhs@fb.com>
      Link: https://lkml.kernel.org/n/tip-ybuf9wpkm30xk28iq7jbwb40@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3163613c
  3. 05 Mar, 2019 12 commits
    • Linus Torvalds's avatar
      Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3717f613
      Linus Torvalds authored
      Pull RCU updates from Ingo Molnar:
       "The main RCU related changes in this cycle were:
      
         - Additional cleanups after RCU flavor consolidation
      
         - Grace-period forward-progress cleanups and improvements
      
         - Documentation updates
      
         - Miscellaneous fixes
      
         - spin_is_locked() conversions to lockdep
      
         - SPDX changes to RCU source and header files
      
         - SRCU updates
      
         - Torture-test updates, including nolibc updates and moving nolibc to
           tools/include"
      
      * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits)
        locking/locktorture: Convert to SPDX license identifier
        linux/torture: Convert to SPDX license identifier
        torture: Convert to SPDX license identifier
        linux/srcu: Convert to SPDX license identifier
        linux/rcutree: Convert to SPDX license identifier
        linux/rcutiny: Convert to SPDX license identifier
        linux/rcu_sync: Convert to SPDX license identifier
        linux/rcu_segcblist: Convert to SPDX license identifier
        linux/rcupdate: Convert to SPDX license identifier
        linux/rcu_node_tree: Convert to SPDX license identifier
        rcu/update: Convert to SPDX license identifier
        rcu/tree: Convert to SPDX license identifier
        rcu/tiny: Convert to SPDX license identifier
        rcu/sync: Convert to SPDX license identifier
        rcu/srcu: Convert to SPDX license identifier
        rcu/rcutorture: Convert to SPDX license identifier
        rcu/rcu_segcblist: Convert to SPDX license identifier
        rcu/rcuperf: Convert to SPDX license identifier
        rcu/rcu.h: Convert to SPDX license identifier
        RCU/torture.txt: Remove section MODULE PARAMETERS
        ...
      3717f613
    • Linus Torvalds's avatar
      Merge branch 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b1b988a6
      Linus Torvalds authored
      Pull year 2038 updates from Thomas Gleixner:
       "Another round of changes to make the kernel ready for 2038. After lots
        of preparatory work this is the first set of syscalls which are 2038
        safe:
      
          403 clock_gettime64
          404 clock_settime64
          405 clock_adjtime64
          406 clock_getres_time64
          407 clock_nanosleep_time64
          408 timer_gettime64
          409 timer_settime64
          410 timerfd_gettime64
          411 timerfd_settime64
          412 utimensat_time64
          413 pselect6_time64
          414 ppoll_time64
          416 io_pgetevents_time64
          417 recvmmsg_time64
          418 mq_timedsend_time64
          419 mq_timedreceive_time64
          420 semtimedop_time64
          421 rt_sigtimedwait_time64
          422 futex_time64
          423 sched_rr_get_interval_time64
      
        The syscall numbers are identical all over the architectures"
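
      Why the older 32-bit time interfaces are not 2038 safe can be shown
      with a short sketch. It only checks whether a seconds count fits in a
      signed 32-bit field; the helper name is hypothetical and nothing here
      calls the new syscalls:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * A signed 32-bit seconds counter tops out at INT32_MAX, which is
 * 2038-01-19 03:14:07 UTC when counted from the Unix epoch.
 */
static bool sketch_fits_in_time32(int64_t seconds_since_epoch)
{
	return seconds_since_epoch >= INT32_MIN &&
	       seconds_since_epoch <= INT32_MAX;
}
```

      The first value past that limit, 2147483648, no longer fits, which is
      the case the *_time64 variants with 64-bit seconds are meant to cover.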
      
      * 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
        riscv: Use latest system call ABI
        checksyscalls: fix up mq_timedreceive and stat exceptions
        unicore32: Fix __ARCH_WANT_STAT64 definition
        asm-generic: Make time32 syscall numbers optional
        asm-generic: Drop getrlimit and setrlimit syscalls from default list
        32-bit userspace ABI: introduce ARCH_32BIT_OFF_T config option
        compat ABI: use non-compat openat and open_by_handle_at variants
        y2038: add 64-bit time_t syscalls to all 32-bit architectures
        y2038: rename old time and utime syscalls
        y2038: remove struct definition redirects
        y2038: use time32 syscall names on 32-bit
        syscalls: remove obsolete __IGNORE_ macros
        y2038: syscalls: rename y2038 compat syscalls
        x86/x32: use time64 versions of sigtimedwait and recvmmsg
        timex: change syscalls to use struct __kernel_timex
        timex: use __kernel_timex internally
        sparc64: add custom adjtimex/clock_adjtime functions
        time: fix sys_timer_settime prototype
        time: Add struct __kernel_timex
        time: make adjtime compat handling available for 32 bit
        ...
      b1b988a6
    • Linus Torvalds's avatar
      Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · edaed168
      Linus Torvalds authored
      Pull x86/pti update from Thomas Gleixner:
       "Just a single change from the anti-performance department:
      
         - Add a new PR_SPEC_DISABLE_NOEXEC option which allows applying the
           speculation protections to a process without inheriting the state
           on exec.
      
           This remedies a situation where a Java-launcher has speculation
           protections enabled because that's the default for JVMs which
           causes the launched regular harmless processes to inherit the
           protection state which results in unintended performance
           degradation"
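
      From userspace this could look roughly like the sketch below. It
      assumes the option is used with the speculative store bypass control
      via prctl(PR_SET_SPECULATION_CTRL); the fallback #defines mirror the
      values in the kernel's uapi header for toolchains whose headers predate
      this change, and the wrapper name is made up:

```c
#include <sys/prctl.h>

#ifndef PR_SET_SPECULATION_CTRL
#define PR_SET_SPECULATION_CTRL	53
#endif
#ifndef PR_SPEC_STORE_BYPASS
#define PR_SPEC_STORE_BYPASS	0
#endif
#ifndef PR_SPEC_DISABLE_NOEXEC
#define PR_SPEC_DISABLE_NOEXEC	(1UL << 4)
#endif

/*
 * Ask for the mitigation on the calling task only, without passing the
 * state on across execve(). Returns 0 on success, -1 with errno set if
 * the kernel or CPU does not support the request.
 */
static int sketch_spec_disable_noexec(void)
{
	return prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS,
		     PR_SPEC_DISABLE_NOEXEC, 0, 0);
}
```

      A launcher would call this early on itself, so that the programs it
      later starts via exec do not carry the protection state along.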
      
      * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/speculation: Add PR_SPEC_DISABLE_NOEXEC
      edaed168
    • Linus Torvalds's avatar
      Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 78f86013
      Linus Torvalds authored
      Pull irq updates from Thomas Gleixner:
       "The interrupt department delivers this time:
      
         - New infrastructure to manage NMIs on platforms which have a sane
           NMI delivery, i.e. identifiable NMI vectors instead of a single
           lump.
      
         - Simplification of the interrupt affinity management so drivers
           don't have to implement ugly loops around the PCI/MSI enablement.
      
         - Speedup for interrupt statistics in /proc/stat
      
         - Provide a function to retrieve the default irq domain
      
         - A new interrupt controller for the Loongson LS1X platform
      
         - Affinity support for the SiFive PLIC
      
         - Better support for the iMX irqsteer driver
      
         - NUMA aware memory allocations for GICv3
      
         - The usual small fixes, improvements and cleanups all over the
           place"
      
      * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
        irqchip/imx-irqsteer: Add multi output interrupts support
        irqchip/imx-irqsteer: Change to use reg_num instead of irq_group
        dt-bindings: irq: imx-irqsteer: Add multi output interrupts support
        dt-binding: irq: imx-irqsteer: Use irq number instead of group number
        irqchip/brcmstb-l2: Use _irqsave locking variants in non-interrupt code
        irqchip/gicv3-its: Use NUMA aware memory allocation for ITS tables
        irqdomain: Allow the default irq domain to be retrieved
        irqchip/sifive-plic: Implement irq_set_affinity() for SMP host
        irqchip/sifive-plic: Differentiate between PLIC handler and context
        irqchip/sifive-plic: Add warning in plic_init() if handler already present
        irqchip/sifive-plic: Pre-compute context hart base and enable base
        PCI/MSI: Remove obsolete sanity checks for multiple interrupt sets
        genirq/affinity: Remove the leftovers of the original set support
        nvme-pci: Simplify interrupt allocation
        genirq/affinity: Add new callback for (re)calculating interrupt sets
        genirq/affinity: Store interrupt sets size in struct irq_affinity
        genirq/affinity: Code consolidation
        irqchip/irq-sifive-plic: Check and continue in case of an invalid cpuid.
        irqchip/i8259: Fix shutdown order by moving syscore_ops registration
        dt-bindings: interrupt-controller: loongson ls1x intc
        ...
      78f86013
    • Linus Torvalds's avatar
      Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 18483190
      Linus Torvalds authored
      Pull timer and clockevent updates from Thomas Gleixner:
       "The time(r) core and clockevent updates are mostly boring this time:
      
         - A new driver for the Tegra210 timer
      
         - Small fixes and improvements all over the place
      
         - Documentation updates and cleanups"
      
      * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
        soc/tegra: default select TEGRA_TIMER for Tegra210
        clocksource/drivers/tegra: Add Tegra210 timer support
        dt-bindings: timer: add Tegra210 timer
        clocksource/drivers/timer-cs5535: Rename the file for consistency
        clocksource/drivers/timer-pxa: Rename the file for consistency
        clocksource/drivers/tango-xtal: Rename the file for consistency
        dt-bindings: timer: gpt: update binding doc
        clocksource/drivers/exynos_mct: Remove unused header includes
        dt-bindings: timer: mediatek: update bindings for MT7629 SoC
        clocksource/drivers/exynos_mct: Fix error path in timer resources initialization
        clocksource/drivers/exynos_mct: Remove dead code
        clocksource/drivers/riscv: Add required checks during clock source init
        dt-bindings: timer: renesas: tmu: Document r8a774c0 bindings
        dt-bindings: timer: renesas, cmt: Document r8a774c0 CMT support
        clocksource/drivers/exynos_mct: Clear timer interrupt when shutdown
        clocksource/drivers/exynos_mct: Move one-shot check from tick clear to ISR
        clocksource/drivers/arch_timer: Workaround for Allwinner A64 timer instability
        clocksource/drivers/sun5i: Fail gracefully when clock rate is unavailable
        timers: Mark expected switch fall-throughs
        timekeeping/debug: No need to check return value of debugfs_create functions
        ...
      18483190
    • Linus Torvalds's avatar
      Merge tag 'mips_5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · d9862cfb
      Linus Torvalds authored
      Pull MIPS updates from Paul Burton:
      
       - Support for the MIPSr6 MemoryMapID register & Global INValidate TLB
         (GINVT) instructions, allowing for more efficient TLB maintenance
         when running on a CPU such as the I6500 that supports these.
      
       - Enable huge page support for MIPS64r6.
      
       - Optimize post-DMA cache sync by removing that code entirely for
         kernel configurations in which we know it won't be needed.
      
       - The number of pages allocated for interrupt stacks is now calculated
         correctly, where before we would wastefully allocate too much memory
         in some configurations.
      
       - The ath79 platform migrates to devicetree.
      
       - The bcm47xx platform sees fixes for the Buffalo WHR-G54S board.
      
       - The ingenic/jz4740 platform gains support for appended devicetrees.
      
       - The cavium_octeon, lantiq, loongson32 & sgi-ip27 platforms all see
         cleanups as do various pieces of core architecture code.
      
      * tag 'mips_5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (66 commits)
        MIPS: lantiq: Remove separate GPHY Firmware loader
        MIPS: ingenic: Add support for appended devicetree
        MIPS: SGI-IP27: rework HUB interrupts
        MIPS: SGI-IP27: do boot CPU init later
        MIPS: SGI-IP27: do xtalk scanning later
        MIPS: SGI-IP27: use pr_info/pr_emerg and pr_cont to fix output
        MIPS: SGI-IP27: clean up bridge access and header files
        MIPS: SGI-IP27: get rid of volatile and hubreg_t
        MIPS: irq: Allocate accurate order pages for irq stack
        MIPS: dma-noncoherent: Remove bogus condition in dma_sync_phys()
        MIPS: eBPF: Remove REG_32BIT_ZERO_EX
        MIPS: eBPF: Always return sign extended 32b values
        MIPS: CM: Fix indentation
        MIPS: BCM47XX: Fix/improve Buffalo WHR-G54S support
        MIPS: OCTEON: program rx/tx-delay always from DT
        MIPS: OCTEON: delete board-specific link status
        MIPS: OCTEON: don't lie about interface type of CN3005 board
        MIPS: OCTEON: warn if deprecated link status is being used
        MIPS: OCTEON: add fixed-link nodes to in-kernel device tree
        MIPS: Delete unused flush_cache_sigtramp()
        ...
      d9862cfb
    • Linus Torvalds's avatar
      Merge branch 'parisc-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 8feed3ef
      Linus Torvalds authored
      Pull parisc updates from Helge Deller:
       "The most important changes in this patch set are:
      
         - DMA-related cleanups for parisc with the aim to move anything not
           required by drivers out of <asm/dma-mapping.h>, by Christoph
           Hellwig
      
         - Switch to memblock_alloc(), by Mike Rapoport
      
         - Makefile cleanups by Masahiro Yamada
      
         - Switch to bust_spinlocks(), by Sergey Senozhatsky
      
         - Improved initial SMP affinity selection for IRQs
      
         - Added IPI- and rescheduling interrupts in /proc/interrupts output"
      
      * 'parisc-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: (21 commits)
        parisc: use memblock_alloc() instead of custom get_memblock()
        parisc: Add constants for various PDC firmware calls
        parisc: Add constant for PDC_PAT_COMPLEX firmware call
        parisc: Show machine product number during boot
        parisc: Add constants for PDC_RELOCATE PDC call
        parisc: Add PDC_CRASH_PREP PDC function number
        parisc: Use F_EXTEND() macro in iosapic code
        parisc: remove the HBA_DATA macro
        parisc/lba_pci: use container_of in LBA_DEV
        parisc/dino: use container_of in DINO_DEV
        parisc: properly type the return value of parisc_walk_tree
        parisc: properly type the iommu field in struct pci_hba_data
        parisc: turn GET_IOC into an inline function
        parisc: move internal implementation details out of <asm/dma-mapping.h>
        parisc: don't include <asm/cacheflush.h> in <asm/dma-mapping.h>
        parisc: remove meaningless ccflags-y in arch/parisc/boot/Makefile
        parisc: replace oops_in_progress manipulation with bust_spinlocks()
        parisc: Improve initial IRQ to CPU assignment
        parisc: Count IPI function call interrupts
        parisc: Show rescheduling interrupts on SMP machines only
        ...
      8feed3ef
    • Linus Torvalds's avatar
      Merge tag 's390-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 3591b195
      Linus Torvalds authored
      Pull s390 updates from Martin Schwidefsky:
      
       - A copy of Arnd's compat wrapper generation series
      
       - Pass information about the KVM guest to the host in the form of the
         control program code and the control program version code
      
       - Map IOV resources to support PCI physical functions on s390
      
       - Add vector load and store alignment hints to improve performance
      
       - Use the "jdd" constraint with gcc 9 to make jump labels work again
      
       - Remove amode workaround for old z/VM releases from the DCSS code
      
       - Add support for in-kernel performance measurements using the CPU
         measurement counter facility
      
       - Introduce a new PMU device cpum_cf_diag to capture counters and store
         them as event raw data.
      
       - Bug fixes and cleanups
      
      * tag 's390-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (54 commits)
        Revert "s390/cpum_cf: Add kernel message exaplanations"
        s390/dasd: fix read device characteristic with CONFIG_VMAP_STACK=y
        s390/suspend: fix prefix register reset in swsusp_arch_resume
        s390: warn about clearing als implied facilities
        s390: allow overriding facilities via command line
        s390: clean up redundant facilities list setup
        s390/als: remove duplicated in-place implementation of stfle
        s390/cio: Use cpa range elsewhere within vfio-ccw
        s390/cio: Fix vfio-ccw handling of recursive TICs
        s390: vfio_ap: link the vfio_ap devices to the vfio_ap bus subsystem
        s390/cpum_cf: Handle EBUSY return code from CPU counter facility reservation
        s390/cpum_cf: Add kernel message exaplanations
        s390/cpum_cf_diag: Add support for s390 counter facility diagnostic trace
        s390/cpum_cf: add ctr_stcctm() function
        s390/cpum_cf: move common functions into a separate file
        s390/cpum_cf: introduce kernel_cpumcf_avail() function
        s390/cpu_mf: replace stcctm5() with the stcctm() function
        s390/cpu_mf: add store cpu counter multiple instruction support
        s390/cpum_cf: Add minimal in-kernel interface for counter measurements
        s390/cpum_cf: introduce kernel_cpumcf_alert() to obtain measurement alerts
        ...
      3591b195
    • Linus Torvalds's avatar
      Merge tag 'm68k-for-v5.1-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k · 45f5532a
      Linus Torvalds authored
      Pull m68k updates from Geert Uytterhoeven:
      
       - VLA removal
      
       - gcc-8.x build fixes
      
       - small improvements and cleanups
      
       - defconfig updates
      
      * tag 'm68k-for-v5.1-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
        m68k: Add -ffreestanding to CFLAGS
        m68k/apollo: Fix comment in Makefile
        dio: Fix buffer overflow in case of unknown board
        m68k/defconfig: Update defconfigs for v5.0-rc1
        m68k/atari: Avoid VLA use in atari_switches_setup()
        m68k: Avoid VLA use in mangle_kernel_stack()
        m68k/mac: Use '030 reset method on SE/30
        m68k/mac: Remove obsolete comment
        m68k/mac: Skip VIA port setup unless RTC is connected
        m68k/mac: Clean up unused timer definitions
        m68k/defconfig: Drop NET_VENDOR_<FOO>=n
      45f5532a
    • Borislav Petkov's avatar
      x86: Deprecate a.out support · eac61655
      Borislav Petkov authored
      Linux has supported ELF binaries for ~25 years now.  a.out coredumping has
      bitrotted quite significantly and would need some fixing to get it into
      shape again, but considering that even the toolchains cannot create a.out
      executables in their default configuration, let's deprecate a.out support
      and remove it a couple of releases later instead.
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Richard Weinberger <richard@nod.at>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: <linux-api@vger.kernel.org>
      Cc: <linux-fsdevel@vger.kernel.org>
      Cc: lkml <linux-kernel@vger.kernel.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <x86@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      eac61655
    • Linus Torvalds's avatar
      a.out: remove core dumping support · 08300f44
      Linus Torvalds authored
      We're (finally) phasing out a.out support for good.  As Borislav Petkov
      points out, we've supported ELF binaries for about 25 years by now, and
      coredumping in particular has bitrotted over the years.
      
      None of the tool chains even support generating a.out binaries any more,
      and the plan is to deprecate a.out support entirely for the kernel.  But
      I want to start with just removing the core dumping code, because I can
      still imagine that somebody actually might want to support a.out as a
      simpler binary format.
      
      Particularly if you generate some random binaries on the fly, ELF is a
      much more complicated format (admittedly ELF also does have a lot of
      toolchain support, mitigating that complexity a lot and you really
      should have moved over in the last 25 years).
      
      So it's at least somewhat possible that somebody out there has some
      workflow that still involves generating and running a.out executables.
      
      In contrast, it's very unlikely that anybody depends on debugging any
      legacy a.out core files.  But regardless, I want this phase-out to be
      done in two steps, so that we can resurrect a.out support (if needed)
      without having to resurrect the core file dumping that is almost
      certainly not needed.
      
      Jann Horn pointed to the <asm/a.out-core.h> file that my first trivial
      cut at this had missed.
      
      And Alan Cox points out that the a.out binary loader _could_ be done in
      user space if somebody wants to, but we might keep just the loader in
      the kernel if somebody really wants it, since the loader isn't that big
      and has no really odd special cases like the core dumping does.
      Acked-by: Borislav Petkov <bp@alien8.de>
      Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
      Cc: Jann Horn <jannh@google.com>
      Cc: Richard Weinberger <richard@nod.at>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      08300f44
    • Linus Torvalds's avatar
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 63bdf428
      Linus Torvalds authored
      Pull crypto update from Herbert Xu:
       "API:
         - Add helper for simple skcipher modes.
         - Add helper to register multiple templates.
         - Set CRYPTO_TFM_NEED_KEY when setkey fails.
         - Require neither or both of export/import in shash.
         - AEAD decryption test vectors are now generated from encryption
           ones.
         - New option CONFIG_CRYPTO_MANAGER_EXTRA_TESTS that includes random
           fuzzing.
      
        Algorithms:
         - Conversions to skcipher and helper for many templates.
         - Add more test vectors for nhpoly1305 and adiantum.
      
        Drivers:
         - Add crypto4xx prng support.
         - Add xcbc/cmac/ecb support in caam.
         - Add AES support for Exynos5433 in s5p.
         - Remove sha384/sha512 from artpec7 as hardware cannot do partial
           hash"
      
      [ There is a merge of the Freescale SoC tree in order to pull in changes
        required by patches to the caam/qi2 driver. ]
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (174 commits)
        crypto: s5p - add AES support for Exynos5433
        dt-bindings: crypto: document Exynos5433 SlimSSS
        crypto: crypto4xx - add missing of_node_put after of_device_is_available
        crypto: cavium/zip - fix collision with generic cra_driver_name
        crypto: af_alg - use struct_size() in sock_kfree_s()
        crypto: caam - remove redundant likely/unlikely annotation
        crypto: s5p - update iv after AES-CBC op end
        crypto: x86/poly1305 - Clear key material from stack in SSE2 variant
        crypto: caam - generate hash keys in-place
        crypto: caam - fix DMA mapping xcbc key twice
        crypto: caam - fix hash context DMA unmap size
        hwrng: bcm2835 - fix probe as platform device
        crypto: s5p-sss - Use AES_BLOCK_SIZE define instead of number
        crypto: stm32 - drop pointless static qualifier in stm32_hash_remove()
        crypto: chelsio - Fixed Traffic Stall
        crypto: marvell - Remove set but not used variable 'ivsize'
        crypto: ccp - Update driver messages to remove some confusion
        crypto: adiantum - add 1536 and 4096-byte test vectors
        crypto: nhpoly1305 - add a test vector with len % 16 != 0
        crypto: arm/aes-ce - update IV after partial final CTR block
        ...
      63bdf428