1. 05 May, 2020 8 commits
  2. 30 Apr, 2020 13 commits
    • perf vendor events power9: Add hv_24x7 socket/chip level metric events · 354575c0
      Kajol Jain authored
      The hv_24x7 feature in IBM® POWER9 processor-based servers provides the
      facility to continuously collect large numbers of hardware performance
      metrics efficiently and accurately.
      
      This patch adds the hv_24x7 metric file for different socket/chip
      resources.
      
      Result:
      
      power9 platform:
      
        command:# ./perf stat --metric-only -M Memory_RD_BW_Chip -C 0 -I 1000
      
           1.000096188          0.9           0.3
           2.000285720          0.5           0.1
           3.000424990          0.4           0.1
      
        command:# ./perf stat --metric-only -M PowerBUS_Frequency -C 0 -I 1000
      
           1.000097981          2.3           2.3
           2.000291713          2.3           2.3
           3.000421719          2.3           2.3
           4.000550912          2.3           2.3
      Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Joe Mario <jmario@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@ozlabs.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lore.kernel.org/lkml/20200401203340.31402-8-kjain@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Enable Hz/hz printing for --metric-only option · 3351c6da
      Kajol Jain authored
      Commit 54b50916 ("perf stat: Implement --metric-only mode") added the
      function 'valid_only_metric()', which drops "Hz" or "hz" if it is part
      of "ScaleUnit". This patch enables printing of these units, since
      hv_24x7 supports a couple of frequency events.
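
As a rough illustration of the unit filtering described above, the following Python sketch models a check that decides whether a metric's unit may appear in --metric-only output. It is not the perf code, which is the C function valid_only_metric(). The reject lists and the `rejected` parameter are assumptions made for illustration; only "Hz" and "hz" are named by this commit message.

```python
# Hypothetical reject lists, for illustration only. Only "Hz"/"hz" come from
# the commit message; "/sec" is an assumed extra entry to show a unit that
# stays filtered after the change.
REJECTED_BEFORE = ("/sec", "Hz", "hz")
REJECTED_AFTER = ("/sec",)


def valid_only_metric(unit, rejected):
    """Return True if a metric with this unit may be shown in metric-only mode."""
    if not unit:
        return False
    return not any(token in unit for token in rejected)


for unit in ("GHz", "MB/sec", "insn per cycle"):
    print(unit,
          valid_only_metric(unit, REJECTED_BEFORE),
          valid_only_metric(unit, REJECTED_AFTER))
```

With the frequency tokens removed from the reject list, a unit such as "GHz" passes the check, which is the behaviour the hv_24x7 frequency metrics rely on.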
      Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Joe Mario <jmario@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@ozlabs.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lore.kernel.org/lkml/20200401203340.31402-7-kjain@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tests expr: Added test for runtime param in metric expression · 9022608e
      Kajol Jain authored
      Added a test case for parsing "?" in a metric expression.
      Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Joe Mario <jmario@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@ozlabs.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lore.kernel.org/lkml/20200401203340.31402-6-kjain@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf metricgroups: Enhance JSON/metric infrastructure to handle "?" · 1e1a873d
      Kajol Jain authored
      This patch enhances the current metric infrastructure to handle "?" in
      the metric expression. The "?" can be used for parameters whose value is
      not known while creating metric events and which can be replaced later,
      at runtime, with the proper value. It also adds the flexibility to create
      multiple events out of a single metric event added in the JSON file.
      
      The patch adds the function 'arch_get_runtimeparam', an arch-specific
      function that returns the count of metric events that need to be created.
      By default it returns 1.
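
A minimal sketch of what such a hook might look like, written in Python for brevity (the real function is C). It assumes the count is read from the sysfs "sockets" file that this commit message names for hv_24x7, and that any read or parse failure falls back to the default of 1. The function name `get_runtime_param` and that fallback policy are illustrative assumptions, not the perf implementation.

```python
import os
import tempfile

# Path taken from the commit message; only present on hv_24x7 systems.
DEFAULT_SOCKETS_PATH = "/sys/devices/hv_24x7/interface/sockets"


def get_runtime_param(path=DEFAULT_SOCKETS_PATH, default=1):
    """Return the socket count stored in `path`, or `default` if unavailable."""
    try:
        with open(path) as f:
            count = int(f.read().strip())
    except (OSError, ValueError):
        # Assumed policy: behave like the generic default when sysfs is absent.
        return default
    return count if count > 0 else default


# Demonstration against a temporary file standing in for the sysfs entry.
with tempfile.TemporaryDirectory() as tmp:
    fake_sockets = os.path.join(tmp, "sockets")
    with open(fake_sockets, "w") as f:
        f.write("2\n")
    print(get_runtime_param(fake_sockets))                  # file present
    print(get_runtime_param(os.path.join(tmp, "missing")))  # file absent
```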
      
      This infrastructure is needed for hv_24x7 socket/chip level events.
      "hv_24x7" chip level events need the specific chip-id for which the data
      is requested. The function 'arch_get_runtimeparam' is implemented in
      header.c and extracts the number of sockets from the sysfs file "sockets"
      under "/sys/devices/hv_24x7/interface/".
      
      With this patch we create as many metric events as defined by
      runtime_param.
      
      For that, a loop is added to the function 'metricgroup__add_metric', which
      creates multiple events at run time depending on the return value of
      'arch_get_runtimeparam' and merges those events into 'group_list'.
      
      To achieve that, we pass this parameter value to the `expr__find_other`
      function and replace the "?" present in the metric expression with this
      value.
      
      The JSON file contains a single metric event, out of which we create
      multiple events.
      
      To show which data count belongs to which parameter value, we also print
      the param value in the generic_metric function.
      
      For example,
      
        command:# ./perf stat  -M PowerBUS_Frequency -C 0 -I 1000
          1.000101867  9,356,933  hv_24x7/pm_pb_cyc,chip=0/ #  2.3 GHz  PowerBUS_Frequency_0
          1.000101867  9,366,134  hv_24x7/pm_pb_cyc,chip=1/ #  2.3 GHz  PowerBUS_Frequency_1
          2.000314878  9,365,868  hv_24x7/pm_pb_cyc,chip=0/ #  2.3 GHz  PowerBUS_Frequency_0
          2.000314878  9,366,092  hv_24x7/pm_pb_cyc,chip=1/ #  2.3 GHz  PowerBUS_Frequency_1
      
      So here the _0 and _1 suffixes after PowerBUS_Frequency specify the
      parameter value.
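
The expansion described in this commit message can be sketched as follows, in Python for brevity. This is only a model of the idea (one JSON metric, several generated metrics), not the perf code: the function name `expand_metric`, the expression string, and the choice to leave metrics without "?" unsuffixed are all assumptions made for illustration.

```python
def expand_metric(name, expr, runtime_param):
    """Create one (name, expression) pair per parameter value.

    Each "?" in `expr` is replaced by a value in 0..runtime_param-1, and the
    same value is appended to the metric name so the output rows can be told
    apart.
    """
    if "?" not in expr:
        # Assumed behaviour: nothing to substitute, keep the metric as-is.
        return [(name, expr)]
    return [(f"{name}_{i}", expr.replace("?", str(i)))
            for i in range(runtime_param)]


# Hypothetical expression modelled on the event names in the example output.
for metric in expand_metric("PowerBUS_Frequency", "hv_24x7/pm_pb_cyc,chip=?/", 2):
    print(metric)
```

With a runtime parameter of 2 this yields PowerBUS_Frequency_0 and PowerBUS_Frequency_1, matching the suffixes in the example output.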
      Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Joe Mario <jmario@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@ozlabs.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lore.kernel.org/lkml/20200401203340.31402-5-kjain@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf pmu: Fix function name in comment, it's get_cpuid_str(), not get_cpustr() · 454a8be0
      Shaokun Zhang authored
      get_cpuid_str() is used in tools/perf/arch/xxx/util/header.c,
      fix the name in comment.
      Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Link: http://lore.kernel.org/lkml/1588141992-48382-1-git-send-email-zhangshaokun@hisilicon.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf report: Fix warning assignment of 0/1 to bool variable · 6fa9c3e7
      Zou Wei authored
      Fixes coccicheck warning:
      
        tools/perf/builtin-report.c:1403:2-34: WARNING: Assignment of 0/1 to bool variable
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Zou Wei <zou_wei@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/1587904683-3510-1-git-send-email-zou_wei@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Remove unneeded semicolons · 8284bbea
      Zou Wei authored
      Fixes coccicheck warnings:
      
        tools/perf/builtin-diff.c:1565:2-3: Unneeded semicolon
        tools/perf/builtin-lock.c:778:2-3: Unneeded semicolon
        tools/perf/builtin-mem.c:126:2-3: Unneeded semicolon
        tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c:555:2-3: Unneeded semicolon
        tools/perf/util/ordered-events.c:317:2-3: Unneeded semicolon
        tools/perf/util/synthetic-events.c:1131:2-3: Unneeded semicolon
        tools/perf/util/trace-event-read.c:78:2-3: Unneeded semicolon
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Zou Wei <zou_wei@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/1588065523-71423-1-git-send-email-zou_wei@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf c2c: Remove unneeded semicolon · 2cca512a
      Zou Wei authored
      Fixes coccicheck warnings:
      
       tools/perf/builtin-c2c.c:1712:2-3: Unneeded semicolon
       tools/perf/builtin-c2c.c:1928:2-3: Unneeded semicolon
       tools/perf/builtin-c2c.c:2962:2-3: Unneeded semicolon
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Zou Wei <zou_wei@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/1588064336-70456-1-git-send-email-zou_wei@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • libtraceevent: Remove unneeded semicolon · eebe80c9
      Zou Wei authored
      Fixes coccicheck warning:
      
       tools/lib/traceevent/kbuffer-parse.c:441:2-3: Unneeded semicolon
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Zou Wei <zou_wei@huawei.com>
      Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Link: http://lore.kernel.org/lkml/1588065121-71236-1-git-send-email-zou_wei@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script: Remove extraneous newline in perf_sample__fprintf_regs() · fad1f1e7
      Stephane Eranian authored
      When printing iregs, a double newline was printed because
      perf_sample__fprintf_regs() printed its own newline and then, at the end
      of all fields, perf script added another. This caused blank lines in
      the output:
      
      Before:
      
        $ perf script -Fip,iregs
                   401b8d ABI:2    DX:0x100    SI:0x4a8340    DI:0x4a9340
      
                   401b8d ABI:2    DX:0x100    SI:0x4a9340    DI:0x4a8340
      
                   401b8d ABI:2    DX:0x100    SI:0x4a8340    DI:0x4a9340
      
                   401b8d ABI:2    DX:0x100    SI:0x4a9340    DI:0x4a8340
      
      After:
      
        $ perf script -Fip,iregs
                   401b8d ABI:2    DX:0x100    SI:0x4a8340    DI:0x4a9340
                   401b8d ABI:2    DX:0x100    SI:0x4a9340    DI:0x4a8340
                   401b8d ABI:2    DX:0x100    SI:0x4a8340    DI:0x4a9340
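
The cause of the blank lines can be modelled with a small Python sketch: a per-field formatter that terminates its own output, combined with a caller that terminates the whole record. All function names here are hypothetical stand-ins for illustration, not the perf functions themselves.

```python
def format_regs_before(regs):
    """Old behaviour (modelled): the field formatter appends its own newline."""
    return " ".join(f"{name}:0x{value:x}" for name, value in regs) + "\n"


def format_regs_after(regs):
    """New behaviour (modelled): the formatter leaves line termination to the caller."""
    return " ".join(f"{name}:0x{value:x}" for name, value in regs)


def format_sample(ip, regs, format_regs):
    """The caller prints every requested field, then ends the record once."""
    return f"{ip:x} {format_regs(regs)}\n"


regs = [("DX", 0x100), ("SI", 0x4A8340)]
print(repr(format_sample(0x401B8D, regs, format_regs_before)))  # ends in two newlines
print(repr(format_sample(0x401B8D, regs, format_regs_after)))   # ends in one newline
```

Leaving termination to the caller keeps each sample on exactly one line regardless of which fields are selected.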
      
      Committer testing:
      
      First we need to figure out how to request that registers be recorded,
      so we use:
      
        # perf record -h reg
      
         Usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
            -I, --intr-regs[=<any register>]
                                  sample selected machine registers on interrupt, use '-I?' to list register names
                --buildid-all     Record build-id of all DSOs regardless of hits
                --user-regs[=<any register>]
                                  sample selected machine registers on interrupt, use '--user-regs=?' to list register names
      
        #
      
      Ok, now let's ask for them all:
      
        # perf record -a --intr-regs --user-regs sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 4.105 MB perf.data (2760 samples) ]
        #
      
      Let's look at the first 6 output lines:
      
        # perf script -Fip,iregs | head -6
         ffffffff8a06f2f4 ABI:2    AX:0xffffd168fee0a980    BX:0xffff8a23b087f000    CX:0xfffeb69aaeb25d73    DX:0xffff8a253e8310f0    SI:0xfffffff9bafe7359    DI:0xffffb1690204fb10    BP:0xffffd168fee0a950    SP:0xffffb1690204fb88    IP:0xffffffff8a06f2f4 FLAGS:0x4e    CS:0x10    SS:0x18    R8:0x1495f0a91129a    R9:0xffff8a23b087f000   R10:0x1   R11:0xffffffff   R12:0x0   R13:0xffff8a253e827e00   R14:0xffffd168fee0aa5c   R15:0xffffd168fee0a980
      
         ffffffff8a06f2f4 ABI:2    AX:0x0    BX:0xffffd168fee0a950    CX:0x5684cc1118491900    DX:0x0    SI:0xffffd168fee0a9d0    DI:0x202    BP:0xffffb1690204fd70    SP:0xffffb1690204fd20    IP:0xffffffff8a06f2f4 FLAGS:0x24e    CS:0x10    SS:0x18    R8:0x0    R9:0xffffd168fee0a9d0   R10:0x1   R11:0xffffffff   R12:0xffffffff8a23e480   R13:0xffff8a23b087f240   R14:0xffff8a23b087f000   R15:0xffffd168fee0a950
      
         ffffffff8a06f2f4 ABI:2    AX:0x0    BX:0x0    CX:0x7f25f334335b    DX:0x0    SI:0x2400    DI:0x4    BP:0x7fff5f264570    SP:0x7fff5f264538    IP:0xffffffff8a06f2f4 FLAGS:0x24e    CS:0x10    SS:0x2b    R8:0x0    R9:0x2312d20   R10:0x0   R11:0x246   R12:0x22cc0e0   R13:0x0   R14:0x0   R15:0x22d0780
      
        #
      
      Reproduced, apply the patch and:
      
      [root@five ~]# perf script -Fip,iregs | head -6
       ffffffff8a06f2f4 ABI:2    AX:0xffffd168fee0a980    BX:0xffff8a23b087f000    CX:0xfffeb69aaeb25d73    DX:0xffff8a253e8310f0    SI:0xfffffff9bafe7359    DI:0xffffb1690204fb10    BP:0xffffd168fee0a950    SP:0xffffb1690204fb88    IP:0xffffffff8a06f2f4 FLAGS:0x4e    CS:0x10    SS:0x18    R8:0x1495f0a91129a    R9:0xffff8a23b087f000   R10:0x1   R11:0xffffffff   R12:0x0   R13:0xffff8a253e827e00   R14:0xffffd168fee0aa5c   R15:0xffffd168fee0a980
       ffffffff8a06f2f4 ABI:2    AX:0x0    BX:0xffffd168fee0a950    CX:0x5684cc1118491900    DX:0x0    SI:0xffffd168fee0a9d0    DI:0x202    BP:0xffffb1690204fd70    SP:0xffffb1690204fd20    IP:0xffffffff8a06f2f4 FLAGS:0x24e    CS:0x10    SS:0x18    R8:0x0    R9:0xffffd168fee0a9d0   R10:0x1   R11:0xffffffff   R12:0xffffffff8a23e480   R13:0xffff8a23b087f240   R14:0xffff8a23b087f000   R15:0xffffd168fee0a950
       ffffffff8a06f2f4 ABI:2    AX:0x0    BX:0x0    CX:0x7f25f334335b    DX:0x0    SI:0x2400    DI:0x4    BP:0x7fff5f264570    SP:0x7fff5f264538    IP:0xffffffff8a06f2f4 FLAGS:0x24e    CS:0x10    SS:0x2b    R8:0x0    R9:0x2312d20   R10:0x0   R11:0x246   R12:0x22cc0e0   R13:0x0   R14:0x0   R15:0x22d0780
       ffffffff8a24074b ABI:2    AX:0xcb    BX:0xcb    CX:0x0    DX:0x0    SI:0xffffb1690204ff58    DI:0xcb    BP:0xffffb1690204ff58    SP:0xffffb1690204ff40    IP:0xffffffff8a24074b FLAGS:0x24e    CS:0x10    SS:0x18    R8:0x0    R9:0x0   R10:0x0   R11:0x0   R12:0x0   R13:0x0   R14:0x0   R15:0x0
       ffffffff8a310600 ABI:2    AX:0x0    BX:0xffffffff8b8c39a0    CX:0x0    DX:0xffff8a2503890300    SI:0xffffb1690204ff20    DI:0xffff8a23e4080000    BP:0xffff8a23e4080000    SP:0xffffb1690204fec0    IP:0xffffffff8a310600 FLAGS:0x28e    CS:0x10    SS:0x18    R8:0x0    R9:0x0   R10:0x0   R11:0x0   R12:0xffffffffffffffea   R13:0xffff8a23e4080020   R14:0x0   R15:0x0
       ffffffff8a11b688 ABI:2    AX:0x0    BX:0xffff8a237b7c8800    CX:0xffffb1690204fae0    DX:0x78    SI:0xffff8a237b7c8800    DI:0xffffb1690204fa10    BP:0xffffb1690204fb00    SP:0xffffb1690204fa00    IP:0xffffffff8a11b688 FLAGS:0x8a    CS:0x10    SS:0x18    R8:0x1495f0a917eba    R9:0xffffd168fde19a48   R10:0xffffb1690204fd98   R11:0xffff8a253e82afb0   R12:0xffff8a237b7c8800   R13:0xffffb1690204fb00   R14:0x0   R15:0xffff8a237b7c8800
      [root@five ~]#
      
      To see it more clearly, let's get just two of those registers per sample:
      
        # perf record -a --intr-regs=ax,bx --user-regs=cx,dx sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 3.502 MB perf.data (1653 samples) ]
        #
      
      Extra info, let's see what gets set up in that 'struct perf_event_attr':
      
        # perf evlist -v
        cycles: size: 120, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|REGS_USER|REGS_INTR, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 2, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, sample_regs_user: 0xc, sample_regs_intr: 0x3
        #
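
The two masks in the output above follow from the registers requested on the command line: one bit is set per register. The Python sketch below assumes the x86 perf register numbering AX=0, BX=1, CX=2, DX=3; the table and the helper name `reg_mask` are illustrative.

```python
# Assumed x86 register bit positions (AX=0, BX=1, CX=2, DX=3).
X86_REG_BITS = {"ax": 0, "bx": 1, "cx": 2, "dx": 3}


def reg_mask(names):
    """Turn a comma-separated register list into a bit mask."""
    mask = 0
    for name in names.split(","):
        mask |= 1 << X86_REG_BITS[name]
    return mask


print(hex(reg_mask("ax,bx")))  # --intr-regs=ax,bx -> sample_regs_intr: 0x3
print(hex(reg_mask("cx,dx")))  # --user-regs=cx,dx -> sample_regs_user: 0xc
```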
      
      Cool, the PERF_SAMPLE_REGS_USER|PERF_SAMPLE_REGS_INTR sample types are
      set, along with the attr.sample_regs_user and attr.sample_regs_intr
      register masks. Now let's see, in a more compact fashion, if those
      newlines are gone:
      
        # perf script -Fip,iregs,uregs
         ffffffff8a56df78 ABI:2    AX:0xffff8a25137b6028    BX:0xffff8a2502f18000  ABI:2    CX:0x7f204460e49b    DX:0xf42920
         ffffffff8a56df78 ABI:2    AX:0xffff8a25137b6028    BX:0xffff8a2502f18000  ABI:2    CX:0x7f204460e49b    DX:0xf42920
         ffffffff8a56df78 ABI:2    AX:0xffff8a25137b6028    BX:0xffff8a2502f18000  ABI:2    CX:0x7f204460e49b    DX:0xf42920
         ffffffff8a56df78 ABI:2    AX:0xffff8a25137b6028    BX:0xffff8a2502f18000  ABI:2    CX:0x7f204460e49b    DX:0xf42920
         ffffffff8a56df78 ABI:2    AX:0xffff8a25137b6028    BX:0xffff8a2502f18000  ABI:2    CX:0x7f204460e49b    DX:0xf42920
         ffffffff8a56df78 ABI:2    AX:0xffff8a25137b6028    BX:0xffff8a2502f18000  ABI:2    CX:0x7f204460e49b    DX:0xf42920
         ffffffff8a29b78d ABI:2    AX:0x2a20ffcd6000    BX:0x2ec7d9000  ABI:2    CX:0x7f204460e49b    DX:0xf42920
        #
      
      And where was that?
      
        # perf script -Fip,iregs,uregs,sym,dso
         ffffffff8a56df78 strrchr (/lib/modules/5.7.0-rc2/build/vmlinux) ABI:2    AX:0xffff8a25137b6028    BX:0xffff8a2502f18000  ABI:2    CX:0x7f204460e49b    DX:0xf42920
         ffffffff8a56df78 strrchr (/lib/modules/5.7.0-rc2/build/vmlinux) ABI:2    AX:0xffff8a25137b6028    BX:0xffff8a2502f18000  ABI:2    CX:0x7f204460e49b    DX:0xf42920
         ffffffff8a56df78 strrchr (/lib/modules/5.7.0-rc2/build/vmlinux) ABI:2    AX:0xffff8a25137b6028    BX:0xffff8a2502f18000  ABI:2    CX:0x7f204460e49b    DX:0xf42920
         ffffffff8a56df78 strrchr (/lib/modules/5.7.0-rc2/build/vmlinux) ABI:2    AX:0xffff8a25137b6028    BX:0xffff8a2502f18000  ABI:2    CX:0x7f204460e49b    DX:0xf42920
         ffffffff8a56df78 strrchr (/lib/modules/5.7.0-rc2/build/vmlinux) ABI:2    AX:0xffff8a25137b6028    BX:0xffff8a2502f18000  ABI:2    CX:0x7f204460e49b    DX:0xf42920
         ffffffff8a56df78 strrchr (/lib/modules/5.7.0-rc2/build/vmlinux) ABI:2    AX:0xffff8a25137b6028    BX:0xffff8a2502f18000  ABI:2    CX:0x7f204460e49b    DX:0xf42920
         ffffffff8a29b78d __vma_link_rb (/lib/modules/5.7.0-rc2/build/vmlinux) ABI:2    AX:0x2a20ffcd6000    BX:0x2ec7d9000  ABI:2    CX:0x7f204460e49b    DX:0xf42920
        #
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200418231908.152212-1-eranian@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf synthetic events: Remove use of sscanf from /proc reading · 2069425e
      Ian Rogers authored
      The synthesize benchmark, run on a single process and thread, shows
      perf_event__synthesize_mmap_events as the hottest function with fgets
      and sscanf taking the majority of execution time.
      
      fscanf performs similarly well. Replace the scanf call with manual
      reading of each field of the /proc/pid/maps line, and remove some
      unnecessary buffering.
      
      This change also addresses potential, but unlikely, buffer overruns for
      the string values read by scanf.
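
To make "manual reading of each field" concrete, here is a Python sketch that walks a /proc/pid/maps line field by field instead of using a scanf-style format string. It only illustrates the approach and is not the code from this patch; the helper names, the returned dictionary, and the sample lines are assumptions.

```python
def parse_maps_line(line):
    """Parse one /proc/<pid>/maps line by consuming its fields in order."""
    pos = 0

    def read_number(base, terminators):
        # Accumulate digits until a terminator (or end of line) is reached.
        nonlocal pos
        value = 0
        while pos < len(line) and line[pos] not in terminators:
            value = value * base + int(line[pos], base)
            pos += 1
        pos += 1  # step over the terminator
        return value

    start = read_number(16, "-")
    end = read_number(16, " ")
    perms = line[pos:pos + 4]
    pos += 5  # four permission characters plus the following space
    offset = read_number(16, " ")
    major = read_number(16, ":")
    minor = read_number(16, " ")
    inode = read_number(10, " \n")
    pathname = line[pos:].strip()  # empty for anonymous mappings
    return {"start": start, "end": end, "perms": perms, "offset": offset,
            "major": major, "minor": minor, "inode": inode,
            "pathname": pathname}


# Hypothetical sample line in the documented maps layout.
print(parse_maps_line("00400000-0040b000 r-xp 00000000 fd:01 1234    /usr/bin/cat\n"))
```

Because each field is bounded by its terminator and the pathname is sliced rather than copied into a fixed-size buffer, this style of parsing has no fixed-length string destination to overrun.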
      
      Performance before is:
      
        $ sudo perf bench internals synthesize -m 16 -M 16 -s -t
        # Running 'internals/synthesize' benchmark:
        Computing performance of single threaded perf event synthesis by
        synthesizing events on the perf process itself:
          Average synthesis took: 102.810 usec (+- 0.027 usec)
          Average num. events: 17.000 (+- 0.000)
          Average time per event 6.048 usec
          Average data synthesis took: 106.325 usec (+- 0.018 usec)
          Average num. events: 89.000 (+- 0.000)
          Average time per event 1.195 usec
        Computing performance of multi threaded perf event synthesis by
        synthesizing events on CPU 0:
          Number of synthesis threads: 16
            Average synthesis took: 68103.100 usec (+- 441.234 usec)
            Average num. events: 30703.000 (+- 0.730)
            Average time per event 2.218 usec
      
      And after is:
      
        $ sudo perf bench internals synthesize -m 16 -M 16 -s -t
        # Running 'internals/synthesize' benchmark:
        Computing performance of single threaded perf event synthesis by
        synthesizing events on the perf process itself:
          Average synthesis took: 50.388 usec (+- 0.031 usec)
          Average num. events: 17.000 (+- 0.000)
          Average time per event 2.964 usec
          Average data synthesis took: 52.693 usec (+- 0.020 usec)
          Average num. events: 89.000 (+- 0.000)
          Average time per event 0.592 usec
        Computing performance of multi threaded perf event synthesis by
        synthesizing events on CPU 0:
          Number of synthesis threads: 16
            Average synthesis took: 45022.400 usec (+- 552.740 usec)
            Average num. events: 30624.200 (+- 10.037)
            Average time per event 1.470 usec
      
      On an Intel Xeon 6154 compiling with Debian gcc 9.2.1.
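
As a quick check of the figures above, the "time per event" lines are the average synthesis time divided by the average number of events, and the speedup is the ratio of the before and after times. The short Python sketch below redoes that arithmetic with the numbers reported for the Xeon run; the helper name is illustrative.

```python
def time_per_event(total_usec, num_events):
    """Average cost of one synthesized event, in microseconds."""
    return total_usec / num_events


# Single-threaded synthesis, before and after the change.
before = time_per_event(102.810, 17.0)
after = time_per_event(50.388, 17.0)
print(f"{before:.3f} usec -> {after:.3f} usec per event")

# Speedups derived from the reported totals.
single_speedup = 102.810 / 50.388
multi_speedup = 68103.100 / 45022.400
print(f"single-threaded: {single_speedup:.2f}x, multi-threaded: {multi_speedup:.2f}x")
```

So the single-threaded path is roughly twice as fast and the 16-thread run about one and a half times as fast on this machine.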
      
      Committer testing:
      
      On an AMD Ryzen 5 3600X 6-Core Processor:
      
      Before:
      
        # perf bench internals synthesize --min-threads 12 --max-threads 12 --st --mt
        # Running 'internals/synthesize' benchmark:
        Computing performance of single threaded perf event synthesis by
        synthesizing events on the perf process itself:
          Average synthesis took: 267.491 usec (+- 0.176 usec)
          Average num. events: 56.000 (+- 0.000)
          Average time per event 4.777 usec
          Average data synthesis took: 277.257 usec (+- 0.169 usec)
          Average num. events: 287.000 (+- 0.000)
          Average time per event 0.966 usec
        Computing performance of multi threaded perf event synthesis by
        synthesizing events on CPU 0:
          Number of synthesis threads: 12
            Average synthesis took: 81599.500 usec (+- 346.315 usec)
            Average num. events: 36096.100 (+- 2.523)
            Average time per event 2.261 usec
        #
      
      After:
      
        # perf bench internals synthesize --min-threads 12 --max-threads 12 --st --mt
        # Running 'internals/synthesize' benchmark:
        Computing performance of single threaded perf event synthesis by
        synthesizing events on the perf process itself:
          Average synthesis took: 110.125 usec (+- 0.080 usec)
          Average num. events: 56.000 (+- 0.000)
          Average time per event 1.967 usec
          Average data synthesis took: 118.518 usec (+- 0.057 usec)
          Average num. events: 287.000 (+- 0.000)
          Average time per event 0.413 usec
        Computing performance of multi threaded perf event synthesis by
        synthesizing events on CPU 0:
          Number of synthesis threads: 12
            Average synthesis took: 43490.700 usec (+- 284.527 usec)
            Average num. events: 37028.500 (+- 0.563)
            Average time per event 1.175 usec
        #
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andrey Zhizhikin <andrey.z@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lore.kernel.org/lkml/20200415054050.31645-4-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools api: Add a lightweight buffered reading api · e95770af
      Ian Rogers authored
      The synthesize benchmark shows the majority of execution time going to
      fgets and sscanf, necessary to parse /proc/pid/maps. Add a new buffered
      reading library that will be used to replace these calls in a follow-up
      CL. Add tests for the library to perf test.
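
The idea of a lightweight buffered reader can be sketched in Python as below: refill a small buffer from a file descriptor only when it runs dry, and build numbers directly from the characters as they are consumed. The class and method names are hypothetical, and the library added by this commit is C, so treat this as a model of the technique rather than its API.

```python
import os


class BufferedReader:
    """Minimal buffered reader over a file descriptor (illustrative only)."""

    def __init__(self, fd, buf_size=128):
        self.fd = fd
        self.buf_size = buf_size
        self.buf = b""
        self.pos = 0
        self.eof = False

    def get_char(self):
        """Return the next byte as an int, or -1 at end of file."""
        if self.pos == len(self.buf):
            if self.eof:
                return -1
            self.buf = os.read(self.fd, self.buf_size)  # refill the buffer
            self.pos = 0
            if not self.buf:
                self.eof = True
                return -1
        ch = self.buf[self.pos]
        self.pos += 1
        return ch

    def get_hex(self):
        """Read hex digits; return (value, terminating character or -1)."""
        value = 0
        while True:
            ch = self.get_char()
            if ch == -1 or chr(ch) not in "0123456789abcdefABCDEF":
                return value, ch
            value = value * 16 + int(chr(ch), 16)


# Demonstration on a pipe, with a tiny buffer to force several refills.
read_fd, write_fd = os.pipe()
os.write(write_fd, b"7f25-1a ")
os.close(write_fd)
reader = BufferedReader(read_fd, buf_size=4)
print(reader.get_hex())   # value 0x7f25, terminated by "-"
print(reader.get_hex())   # value 0x1a, terminated by " "
print(reader.get_char())  # -1 at end of input
os.close(read_fd)
```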
      
      Committer tests:
      
        $ perf test api
        63: Test api io                                           : Ok
        $
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andrey Zhizhikin <andrey.z@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lore.kernel.org/lkml/20200415054050.31645-3-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf bench: Add a multi-threaded synthesize benchmark · 13edc237
      Ian Rogers authored
      By default this isn't run as it reads /proc and may not have access.
      For consistency, modify the single threaded benchmark to compute an
      average time per event.
      
      Committer testing:
      
        $ grep -m1 "model name" /proc/cpuinfo
        model name	: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
        $ grep "model name" /proc/cpuinfo  | wc -l
        8
        $
        $ perf bench internals synthesize -h
        # Running 'internals/synthesize' benchmark:
      
         Usage: perf bench internals synthesize <options>
      
            -I, --multi-iterations <n>
                                  Number of iterations used to compute multi-threaded average
            -i, --single-iterations <n>
                                  Number of iterations used to compute single-threaded average
            -M, --max-threads <n>
                                  Maximum number of threads in multithreaded bench
            -m, --min-threads <n>
                                  Minimum number of threads in multithreaded bench
            -s, --st              Run single threaded benchmark
            -t, --mt              Run multi-threaded benchmark
      
        $
        $ perf bench internals synthesize -t
        # Running 'internals/synthesize' benchmark:
        Computing performance of multi threaded perf event synthesis by
        synthesizing events on CPU 0:
          Number of synthesis threads: 1
            Average synthesis took: 65449.000 usec (+- 586.442 usec)
            Average num. events: 9405.400 (+- 0.306)
            Average time per event 6.959 usec
          Number of synthesis threads: 2
            Average synthesis took: 37838.300 usec (+- 130.259 usec)
            Average num. events: 9501.800 (+- 20.469)
            Average time per event 3.982 usec
          Number of synthesis threads: 3
            Average synthesis took: 48551.400 usec (+- 225.686 usec)
            Average num. events: 9544.000 (+- 0.000)
            Average time per event 5.087 usec
          Number of synthesis threads: 4
            Average synthesis took: 29632.500 usec (+- 50.808 usec)
            Average num. events: 9544.000 (+- 0.000)
            Average time per event 3.105 usec
          Number of synthesis threads: 5
            Average synthesis took: 33920.400 usec (+- 284.509 usec)
            Average num. events: 9544.000 (+- 0.000)
            Average time per event 3.554 usec
          Number of synthesis threads: 6
            Average synthesis took: 27604.100 usec (+- 72.344 usec)
            Average num. events: 9548.000 (+- 0.000)
            Average time per event 2.891 usec
          Number of synthesis threads: 7
            Average synthesis took: 25406.300 usec (+- 933.371 usec)
            Average num. events: 9545.500 (+- 0.167)
            Average time per event 2.662 usec
          Number of synthesis threads: 8
            Average synthesis took: 24110.400 usec (+- 73.229 usec)
            Average num. events: 9551.000 (+- 0.000)
            Average time per event 2.524 usec
        $
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andrey Zhizhikin <andrey.z@gmail.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lore.kernel.org/lkml/20200415054050.31645-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      13edc237
  3. 23 Apr, 2020 3 commits
    • Stephane Eranian's avatar
      perf record: Add num-synthesize-threads option · d99c22ea
      Stephane Eranian authored
      Add an option to control the degree of parallelism of the
      synthesize_mmap() code, which scans /proc/PID/task/PID/maps and can be
      time consuming. Mimic the way perf top handles the option.
      If not specified, it defaults to 1 thread, i.e. the behavior before
      this option.
      
      On a desktop computer the processing of /proc/PID/task/PID/maps isn't
      slow enough to warrant parallel processing and the thread creation has
      some cost - hence the default of 1. On a loaded server with
      >100 cores it is possible to see synthesis times in the order of
      seconds and in this case having the option is desirable.
      
      As the processing is a synchronization point, it is legitimate to worry if
      Amdahl's law will apply to this patch. Profiling with this patch in
      place:
      https://lore.kernel.org/lkml/20200415054050.31645-4-irogers@google.com/
      shows:
      ...
            - 32.59% __perf_event__synthesize_threads
               - 32.54% __event__synthesize_thread
                  + 22.13% perf_event__synthesize_mmap_events
                  + 6.68% perf_event__get_comm_ids.constprop.0
                  + 1.49% process_synthesized_event
                  + 1.29% __GI___readdir64
                  + 0.60% __opendir
      ...
      That is, the processing is 1.49% of execution time, and there is plenty
      left to make parallel. This is shown in the benchmark in this patch:
      
      https://lore.kernel.org/lkml/20200415054050.31645-2-irogers@google.com/
      
        Computing performance of multi threaded perf event synthesis by
        synthesizing events on CPU 0:
         Number of synthesis threads: 1
           Average synthesis took: 127729.000 usec (+- 3372.880 usec)
           Average num. events: 21548.600 (+- 0.306)
           Average time per event 5.927 usec
         Number of synthesis threads: 2
           Average synthesis took: 88863.500 usec (+- 385.168 usec)
           Average num. events: 21552.800 (+- 0.327)
           Average time per event 4.123 usec
         Number of synthesis threads: 3
           Average synthesis took: 83257.400 usec (+- 348.617 usec)
           Average num. events: 21553.200 (+- 0.327)
           Average time per event 3.863 usec
         Number of synthesis threads: 4
           Average synthesis took: 75093.000 usec (+- 422.978 usec)
           Average num. events: 21554.200 (+- 0.200)
           Average time per event 3.484 usec
         Number of synthesis threads: 5
           Average synthesis took: 64896.600 usec (+- 353.348 usec)
           Average num. events: 21558.000 (+- 0.000)
           Average time per event 3.010 usec
         Number of synthesis threads: 6
           Average synthesis took: 59210.200 usec (+- 342.890 usec)
           Average num. events: 21560.000 (+- 0.000)
           Average time per event 2.746 usec
         Number of synthesis threads: 7
           Average synthesis took: 54093.900 usec (+- 306.247 usec)
           Average num. events: 21562.000 (+- 0.000)
           Average time per event 2.509 usec
         Number of synthesis threads: 8
           Average synthesis took: 48938.700 usec (+- 341.732 usec)
           Average num. events: 21564.000 (+- 0.000)
           Average time per event 2.269 usec
      
      The average time per synthesized event goes from 5.927 usec with 1
      thread to 2.269 usec with 8. This isn't a linear speedup, as not all of
      the synthesize code has been made parallel. If the synthesis time were
      about 10 seconds, using 8 threads might bring it down to less than 4
      seconds.
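The speedup figures quoted above can be checked with a small sketch. It is illustrative only: it assumes the benchmark's per-event times are a fair proxy for total synthesis time, and it fits a simple Amdahl's-law model, which the commit mentions but does not itself compute. The function names are hypothetical.

```python
# Illustrative sketch: relate the 1-thread and 8-thread per-event times
# from the benchmark output to Amdahl's law. The model fit is an
# assumption, not something the patch itself computes.

def speedup(t_single, t_parallel):
    """Observed speedup of the parallel run over the single-threaded run."""
    return t_single / t_parallel


def amdahl_parallel_fraction(s, n_threads):
    """Solve S = 1 / ((1 - p) + p / N) for the parallel fraction p."""
    return (1.0 - 1.0 / s) / (1.0 - 1.0 / n_threads)


def amdahl_speedup(p, n_threads):
    """Speedup predicted by Amdahl's law for parallel fraction p."""
    return 1.0 / ((1.0 - p) + p / n_threads)


T1_USEC = 5.927  # average time per event, 1 synthesis thread
T8_USEC = 2.269  # average time per event, 8 synthesis threads

s8 = speedup(T1_USEC, T8_USEC)
p = amdahl_parallel_fraction(s8, 8)
projected_seconds = 10.0 / s8  # a 10 second synthesis scaled by the same speedup

print(f"observed speedup with 8 threads: {s8:.2f}x")
print(f"implied parallel fraction:       {p:.2f}")
print(f"10 s of synthesis would take:    {projected_seconds:.2f} s")
```

Under this model roughly 70% of the measured work behaves as parallel, which is consistent with the statement that a 10 second synthesis could drop below 4 seconds with 8 threads.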
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tony Jones <tonyj@suse.de>
      Cc: yuzhoujian <yuzhoujian@didichuxing.com>
      Link: http://lore.kernel.org/lkml/20200422155038.9380-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d99c22ea
    • Tommi Rantala's avatar
      perf test session topology: Fix data path · dbd660e6
      Tommi Rantala authored
      Commit 2d4f2799 ("perf data: Add global path holder") missed path
      conversion in tests/topology.c, causing the "Session topology" testcase
      to "hang" (waits forever for input from stdin) when doing "ssh $VM perf
      test".
      
      Can be reproduced by running "cat | perf test topo", and crashed by
      replacing cat with true:
      
        $ true | perf test -v topo
        40: Session topology                                      :
        --- start ---
        test child forked, pid 3638
        templ file: /tmp/perf-test-QPvAch
        incompatible file format
        incompatible file format (rerun with -v to learn more)
        free(): invalid pointer
        test child interrupted
        ---- end ----
        Session topology: FAILED!
      
      Committer testing:
      
      Reproduced the above result before the patch; after the patch the test
      works again:
      
        # true | perf test -v topo
        41: Session topology                                      :
        --- start ---
        test child forked, pid 19374
        templ file: /tmp/perf-test-YOTEQg
        CPU 0, core 0, socket 0
        CPU 1, core 1, socket 0
        CPU 2, core 2, socket 0
        CPU 3, core 3, socket 0
        CPU 4, core 0, socket 0
        CPU 5, core 1, socket 0
        CPU 6, core 2, socket 0
        CPU 7, core 3, socket 0
        test child finished with 0
        ---- end ----
        Session topology: Ok
        #
      
      Fixes: 2d4f2799 ("perf data: Add global path holder")
      Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Link: http://lore.kernel.org/lkml/20200423115341.562782-1-tommi.t.rantala@nokia.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      dbd660e6
    • Jin Yao's avatar
      perf stat: Improve runtime stat for interval mode · 197ba86f
      Jin Yao authored
      For interval mode, the metric is printed after the '#' character if it
      exists. However, it is not calculated from the counts generated in this
      interval.
      
      See the following examples:
      
        root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
        #           time             counts unit events
             1.000422803            764,809      inst_retired.any          #      2.9 CPI
             1.000422803          2,234,932      cycles
             2.001464585          1,960,061      inst_retired.any          #      1.6 CPI
             2.001464585          4,022,591      cycles
      
      The second CPI should not be 1.6 (4,022,591/1,960,061 is 2.1)
      
        root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
        #           time             counts unit events
             1.000429493          2,869,311      cycles
             1.000429493            816,875      instructions              #    0.28  insn per cycle
             2.001516426          9,260,973      cycles
             2.001516426          5,250,634      instructions              #    0.87  insn per cycle
      
      The second 'insn per cycle' should not be 0.87 (5,250,634/9,260,973 is
      0.57).
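The expected values quoted above follow directly from the per-interval counts. As a quick check, a minimal sketch (the helper name `ratio` is hypothetical; the counts are copied from the two example runs above):

```python
# Recompute the per-interval metrics from the raw counts shown in the
# two example runs. Only the arithmetic is illustrated here; this is
# not how perf stat is implemented.

def ratio(numerator, denominator):
    """Plain ratio of two event counts."""
    return numerator / denominator


# First example: CPI = cycles / inst_retired.any
cpi_first = ratio(2_234_932, 764_809)
cpi_second = ratio(4_022_591, 1_960_061)

# Second example: insn per cycle = instructions / cycles
ipc_first = ratio(816_875, 2_869_311)
ipc_second = ratio(5_250_634, 9_260_973)

print(f"CPI, interval 1: {cpi_first:.1f}")   # matches the printed 2.9
print(f"CPI, interval 2: {cpi_second:.1f}")  # 2.1, not the printed 1.6
print(f"IPC, interval 1: {ipc_first:.2f}")   # matches the printed 0.28
print(f"IPC, interval 2: {ipc_second:.2f}")  # 0.57, not the printed 0.87
```

The first interval matches in both runs and only the second diverges, as expected when state carried over from the earlier interval leaks into the later metric.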
      
      The current code uses a global variable 'rt_stat' for tracking and
      updating the std dev of runtime stat. Unlike the counts, which are reset
      for each interval, 'rt_stat' is not reset.
      
        perf_stat_process_counter()
        {
                if (config->interval)
                        init_stats(ps->res_stats);
        }
      
      So for interval mode, the 'rt_stat' variable should be reset too.
      
      This patch resets 'rt_stat' before read_counters(), so the runtime stat
      is only calculated by the counts generated in this interval.
      
      With this patch:
      
        root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
        #           time             counts unit events
             1.000420924          2,408,818      inst_retired.any          #      2.1 CPI
             1.000420924          5,010,111      cycles
             2.001448579          2,798,407      inst_retired.any          #      1.6 CPI
             2.001448579          4,599,861      cycles
      
        root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
        #           time             counts unit events
             1.000428555          2,769,714      cycles
             1.000428555            774,462      instructions              #    0.28  insn per cycle
             2.001471562          3,595,904      cycles
             2.001471562          1,243,703      instructions              #    0.35  insn per cycle
      
      Now the second 'insn per cycle' and CPI are calculated by the counts
      generated in this interval.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Tested-By: Kajol Jain <kjain@linux.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200420145417.6864-1-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      197ba86f
  4. 22 Apr, 2020 6 commits
  5. 21 Apr, 2020 10 commits
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · 18bf3408
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "15 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        tools/vm: fix cross-compile build
        coredump: fix null pointer dereference on coredump
        mm: shmem: disable interrupt when acquiring info->lock in userfaultfd_copy path
        shmem: fix possible deadlocks on shmlock_user_lock
        vmalloc: fix remap_vmalloc_range() bounds checks
        mm/shmem: fix build without THP
        mm/ksm: fix NULL pointer dereference when KSM zero page is enabled
        tools/build: tweak unused value workaround
        checkpatch: fix a typo in the regex for $allocFunctions
        mm, gup: return EINTR when gup is interrupted by fatal signals
        mm/hugetlb: fix a addressing exception caused by huge_pte_offset
        MAINTAINERS: add an entry for kfifo
        mm/userfaultfd: disable userfaultfd-wp on x86_32
        slub: avoid redzone when choosing freepointer location
        sh: fix build error in mm/init.c
      18bf3408
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 8160a563
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "Bugfixes, and a few cleanups to the newly-introduced assembly language
        vmentry code for AMD"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: PPC: Book3S HV: Handle non-present PTEs in page fault functions
        kvm: Disable objtool frame pointer checking for vmenter.S
        MAINTAINERS: add a reviewer for KVM/s390
        KVM: s390: Fix PV check in deliverable_irqs()
        kvm: Handle reads of SandyBridge RAPL PMU MSRs rather than injecting #GP
        KVM: Remove CREATE_IRQCHIP/SET_PIT2 race
        KVM: SVM: Fix __svm_vcpu_run declaration.
        KVM: SVM: Do not setup frame pointer in __svm_vcpu_run
        KVM: SVM: Fix build error due to missing release_pages() include
        KVM: SVM: Do not mark svm_vcpu_run with STACK_FRAME_NON_STANDARD
        kvm: nVMX: match comment with return type for nested_vmx_exit_reflected
        kvm: nVMX: reflect MTF VM-exits if injected by L1
        KVM: s390: Return last valid slot if approx index is out-of-bounds
        KVM: Check validity of resolved slot when searching memslots
        KVM: VMX: Enable machine check support for 32bit targets
        KVM: SVM: move more vmentry code to assembly
        KVM: SVM: fix compilation with modular PSP and non-modular KVM
      8160a563
    • Linus Torvalds's avatar
      Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · 189522da
      Linus Torvalds authored
      Pull virtio fixes and cleanups from Michael Tsirkin:
      
       - Some bug fixes
      
       - Cleanup a couple of issues that surfaced meanwhile
      
       - Disable vhost on ARM with OABI for now - to be fixed fully later in
         the cycle or in the next release.
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (24 commits)
        vhost: disable for OABI
        virtio: drop vringh.h dependency
        virtio_blk: add a missing include
        virtio-balloon: Avoid using the word 'report' when referring to free page hinting
        virtio-balloon: make virtballoon_free_page_report() static
        vdpa: fix comment of vdpa_register_device()
        vdpa: make vhost, virtio depend on menu
        vdpa: allow a 32 bit vq alignment
        drm/virtio: fix up for include file changes
        remoteproc: pull in slab.h
        rpmsg: pull in slab.h
        virtio_input: pull in slab.h
        remoteproc: pull in slab.h
        virtio-rng: pull in slab.h
        virtgpu: pull in uaccess.h
        tools/virtio: make asm/barrier.h self contained
        tools/virtio: define aligned attribute
        virtio/test: fix up after IOTLB changes
        vhost: Create accessors for virtqueues private_data
        vdpasim: Return status in vdpasim_get_status
        ...
      189522da
    • Linus Torvalds's avatar
      Merge tag 'tpmdd-next-20200421' of git://git.infradead.org/users/jjs/linux-tpmdd · b61f7ff0
      Linus Torvalds authored
      Pull tpm fixes from Jarkko Sakkinen:
       "A few bug fixes"
      
      * tag 'tpmdd-next-20200421' of git://git.infradead.org/users/jjs/linux-tpmdd:
        tpm/tpm_tis: Free IRQ if probing fails
        tpm: fix wrong return value in tpm_pcr_extend
        tpm: ibmvtpm: retry on H_CLOSED in tpm_ibmvtpm_send()
        tpm: Export tpm2_get_cc_attrs_tbl for ibmvtpm driver as module
      b61f7ff0
    • Linus Torvalds's avatar
      Merge tag 'clang-format-for-linus-v5.7-rc3' of git://github.com/ojeda/linux · 20f16489
      Linus Torvalds authored
      Pull clang-format fixlets from Miguel Ojeda:
       "Two trivial clang-format changes:
      
         - Don't indent C++ namespaces (Ian Rogers)
      
         - The usual clang-format macro list update (Miguel Ojeda)"
      
      * tag 'clang-format-for-linus-v5.7-rc3' of git://github.com/ojeda/linux:
        clang-format: Update with the latest for_each macro list
        clang-format: don't indent namespaces
      20f16489
    • Lucas Stach's avatar
      tools/vm: fix cross-compile build · cf01699e
      Lucas Stach authored
      Commit 7ed1c190 ("tools: fix cross-compile var clobbering") moved
      the setup of the CC variable to tools/scripts/Makefile.include to make
      the behavior consistent across all the tools Makefiles.
      
      As the vm tools missed the include, we end up with the wrong CC in a
      cross-compiling environment.
      
      Fixes: 7ed1c190 (tools: fix cross-compile var clobbering)
      Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Martin Kelly <martin@martingkelly.com>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20200416104748.25243-1-l.stach@pengutronix.de
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cf01699e
    • Sudip Mukherjee's avatar
      coredump: fix null pointer dereference on coredump · db973a72
      Sudip Mukherjee authored
      If the core_pattern is set to "|" and any process segfaults then we get
      a null pointer dereference while trying to coredump. The call stack shows:
      
          RIP: do_coredump+0x628/0x11c0
      
      When the core_pattern has only "|" there is no point in attempting the
      coredump; we can check for that while formatting the corename and exit
      with an error.
      
      After this change I get:
      
          format_corename failed
          Aborting core
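The idea of rejecting a pipe pattern with no command can be sketched as follows. This is a user-space illustration only, not the kernel change: the function name `split_pipe_command` and the exact error text are hypothetical, and the real check lives in the kernel's C corename formatting code.

```python
# Illustrative sketch: a core_pattern that starts with "|" names a helper
# program to pipe the dump to. If nothing follows the "|", there is no
# program to run, so the pattern is rejected up front instead of being
# used later with an empty argument vector.

def split_pipe_command(core_pattern):
    """Return the helper argv for a piped core_pattern, or raise ValueError."""
    if not core_pattern.startswith("|"):
        raise ValueError("not a pipe pattern")
    argv = core_pattern[1:].split()  # split the command on whitespace
    if not argv:
        # Hypothetical message, modelled on the log line quoted above.
        raise ValueError("format_corename failed: empty pipe command")
    return argv


print(split_pipe_command("|/usr/bin/helper %p"))

try:
    split_pipe_command("|")
except ValueError as err:
    print(err)
```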
      
      Fixes: 315c6926 ("coredump: split pipe command whitespace before expanding template")
      Reported-by: Matthew Ruffell <matthew.ruffell@canonical.com>
      Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Paul Wise <pabs3@bonedaddy.net>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20200416194612.21418-1-sudipm.mukherjee@gmail.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      db973a72
    • Yang Shi's avatar
      mm: shmem: disable interrupt when acquiring info->lock in userfaultfd_copy path · 94b7cc01
      Yang Shi authored
      Syzbot reported the below lockdep splat:
      
          WARNING: possible irq lock inversion dependency detected
          5.6.0-rc7-syzkaller #0 Not tainted
          --------------------------------------------------------
          syz-executor.0/10317 just changed the state of lock:
          ffff888021d16568 (&(&info->lock)->rlock){+.+.}, at: spin_lock include/linux/spinlock.h:338 [inline]
          ffff888021d16568 (&(&info->lock)->rlock){+.+.}, at: shmem_mfill_atomic_pte+0x1012/0x21c0 mm/shmem.c:2407
          but this lock was taken by another, SOFTIRQ-safe lock in the past:
           (&(&xa->xa_lock)->rlock#5){..-.}
      
          and interrupts could create inverse lock ordering between them.
      
          other info that might help us debug this:
           Possible interrupt unsafe locking scenario:
      
                 CPU0                    CPU1
                 ----                    ----
            lock(&(&info->lock)->rlock);
                                         local_irq_disable();
                                         lock(&(&xa->xa_lock)->rlock#5);
                                         lock(&(&info->lock)->rlock);
            <Interrupt>
              lock(&(&xa->xa_lock)->rlock#5);
      
           *** DEADLOCK ***
      
      The full report is quite lengthy, please see:
      
        https://lore.kernel.org/linux-mm/alpine.LSU.2.11.2004152007370.13597@eggly.anvils/T/#m813b412c5f78e25ca8c6c7734886ed4de43f241d
      
      This happens because CPU 0 holds info->lock with IRQs enabled in the
      userfaultfd_copy path, while CPU 1, splitting a THP, holds xa_lock and
      then takes info->lock with IRQs disabled.  If a softirq then arrives on
      CPU 0 and tries to acquire xa_lock, the deadlock is triggered.
      
      The fix is to acquire/release info->lock with *_irq version instead of
      plain spin_{lock,unlock} to make it softirq safe.
      
      Fixes: 4c27fe4c ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support")
      Reported-by: syzbot+e27980339d305f2dbfd9@syzkaller.appspotmail.com
      Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Tested-by: syzbot+e27980339d305f2dbfd9@syzkaller.appspotmail.com
      Acked-by: Hugh Dickins <hughd@google.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Link: http://lkml.kernel.org/r/1587061357-122619-1-git-send-email-yang.shi@linux.alibaba.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      94b7cc01
    • Hugh Dickins's avatar
      shmem: fix possible deadlocks on shmlock_user_lock · ea0dfeb4
      Hugh Dickins authored
      Recent commit 71725ed1 ("mm: huge tmpfs: try to split_huge_page()
      when punching hole") has allowed syzkaller to probe deeper, uncovering a
      long-standing lockdep issue between the irq-unsafe shmlock_user_lock,
      the irq-safe xa_lock on mapping->i_pages, and shmem inode's info->lock
      which nests inside xa_lock (or tree_lock) since 4.8's shmem_uncharge().
      
      user_shm_lock(), servicing SysV shmctl(SHM_LOCK), wants
      shmlock_user_lock while its caller shmem_lock() holds info->lock with
      interrupts disabled; but hugetlbfs_file_setup() calls user_shm_lock()
      with interrupts enabled, and might be interrupted by a writeback endio
      wanting xa_lock on i_pages.
      
      This may not risk an actual deadlock, since shmem inodes do not take
      part in writeback accounting, but there are several easy ways to avoid
      it.
      
      Requiring interrupts disabled for shmlock_user_lock would be easy, but
      it's a high-level global lock for which that seems inappropriate.
      Instead, recall that the use of info->lock to guard info->flags in
      shmem_lock() dates from pre-3.1 days, when races with SHMEM_PAGEIN and
      SHMEM_TRUNCATE could occur: nowadays it serves no purpose, the only flag
      added or removed is VM_LOCKED itself, and calls to shmem_lock() on an inode
      are already serialized by the caller.
      
      Take info->lock out of the chain and the possibility of deadlock or
      lockdep warning goes away.
      
      Fixes: 4595ef88 ("shmem: make shmem_inode_info::lock irq-safe")
      Reported-by: syzbot+c8a8197c8852f566b9d9@syzkaller.appspotmail.com
      Reported-by: syzbot+40b71e145e73f78f81ad@syzkaller.appspotmail.com
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
      Cc: Yang Shi <yang.shi@linux.alibaba.com>
      Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2004161707410.16322@eggly.anvils
      Link: https://lore.kernel.org/lkml/000000000000e5838c05a3152f53@google.com/
      Link: https://lore.kernel.org/lkml/0000000000003712b305a331d3b1@google.com/
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ea0dfeb4
    • Jann Horn's avatar
      vmalloc: fix remap_vmalloc_range() bounds checks · bdebd6a2
      Jann Horn authored
      remap_vmalloc_range() has had various issues with the bounds checks it
      promises to perform ("This function checks that addr is a valid
      vmalloc'ed area, and that it is big enough to cover the vma") over time,
      e.g.:
      
       - not detecting pgoff<<PAGE_SHIFT overflow
      
       - not detecting (pgoff<<PAGE_SHIFT)+usize overflow
      
       - not checking whether addr and addr+(pgoff<<PAGE_SHIFT) are the same
         vmalloc allocation
      
       - comparing a potentially wildly out-of-bounds pointer with the end of
         the vmalloc region
      
      In particular, since commit fc970227 ("bpf: Add mmap() support for
      BPF_MAP_TYPE_ARRAY"), unprivileged users can cause kernel null pointer
      dereferences by calling mmap() on a BPF map with a size that is bigger
      than the distance from the start of the BPF map to the end of the
      address space.
      
      This could theoretically be used as a kernel ASLR bypass, by using
      whether mmap() with a given offset oopses or returns an error code to
      perform a binary search over the possible address range.
      
      To allow remap_vmalloc_range_partial() to verify that addr and
      addr+(pgoff<<PAGE_SHIFT) are in the same vmalloc region, pass the offset
      to remap_vmalloc_range_partial() instead of adding it to the pointer in
      remap_vmalloc_range().
      
      In remap_vmalloc_range_partial(), fix the check against
      get_vm_area_size() by using size comparisons instead of pointer
      comparisons, and add checks for pgoff.
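The difference between overflow-aware size comparisons and naive wrapped arithmetic can be sketched in a few lines. This is an illustration under stated assumptions, not the kernel code: it assumes a 64-bit unsigned long and 4 KiB pages, and the function names are hypothetical.

```python
# Illustrative model of the bounds checks described above, using Python
# integers to stand in for a 64-bit unsigned long (an assumption).

PAGE_SHIFT = 12              # assumption: 4 KiB pages
ULONG_MAX = (1 << 64) - 1    # assumption: 64-bit unsigned long


def naive_wrapped_check(area_size, pgoff, usize):
    """Flawed check: lets the arithmetic wrap around before comparing."""
    end = ((pgoff << PAGE_SHIFT) + usize) & ULONG_MAX
    return end <= area_size


def remap_request_ok(area_size, pgoff, usize):
    """Reject any request whose offset or end cannot be represented."""
    off = pgoff << PAGE_SHIFT
    if off > ULONG_MAX:          # pgoff << PAGE_SHIFT would overflow
        return False
    end = off + usize
    if end > ULONG_MAX:          # (pgoff << PAGE_SHIFT) + usize would overflow
        return False
    return end <= area_size      # compare sizes, not pointers


AREA = 8 << PAGE_SHIFT                 # an 8-page allocation
HUGE_PGOFF = (1 << 52) - 1             # offset just below the wrap point

print(remap_request_ok(AREA, 2, 6 << PAGE_SHIFT))             # fits in the area
print(remap_request_ok(AREA, 2, 7 << PAGE_SHIFT))             # one page too far
print(naive_wrapped_check(AREA, HUGE_PGOFF, 2 << PAGE_SHIFT))  # wraps and passes
print(remap_request_ok(AREA, HUGE_PGOFF, 2 << PAGE_SHIFT))     # correctly rejected
```

The third call shows why the wrapped comparison is unsafe: the sum wraps to a small value that looks in range, while the overflow-aware version rejects the same request.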
      
      Fixes: 83342314 ("[PATCH] mm: introduce remap_vmalloc_range()")
      Signed-off-by: Jann Horn <jannh@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: stable@vger.kernel.org
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: KP Singh <kpsingh@chromium.org>
      Link: http://lkml.kernel.org/r/20200415222312.236431-1-jannh@google.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bdebd6a2