1. 25 Jan, 2018 9 commits
  2. 23 Jan, 2018 9 commits
  3. 18 Jan, 2018 5 commits
  4. 17 Jan, 2018 17 commits
    • Merge tag 'perf-core-for-mingo-4.16-20180117' of... · a72594ca
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo-4.16-20180117' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      - Fix various per-event 'max-stack' and 'call-graph=dwarf' issues,
        mostly in 'perf trace', so that 'perf trace --call-graph' can be used
        with 'dwarf' and 'fp' to set up the callgraph details for the syscall
        events and have that apply to other events, while still allowing an
        override on a per-event basis, using
        '-e sched:*switch/call-graph=dwarf/' for instance
        (Arnaldo Carvalho de Melo)
      
      - Improve the --time percent support in record/report/script (Jin Yao)
      
      - Fix copyfile_offset update of output offset (Jiri Olsa)
      
      - Add python script to profile and resolve physical mem type (Kan Liang)
      
      - Add ARM Statistical Profiling Extensions (SPE) support (Kim Phillips)
      
      - Remove trailing semicolon in the evlist code (Luis de Bethencourt)
      
      - Fix incorrect handling of type _TERM_DRV_CFG (Mathieu Poirier)
      
      - Use asprintf when possible in libtraceevent (Federico Vaga)
      
      - Fix bad force_token escape sequence in libtraceevent (Michael Sartain)
      
      - Add UL suffix to MISSING_EVENTS in libtraceevent (Michael Sartain)
      
      - Print value of unknown symbolic fields in libtraceevent (Jan Kiszka)
      
      - libtraceevent updates: (Steven Rostedt)
        o Show value of flags that have not been parsed
        o Simplify pointer print logic and fix %pF
        o Handle new pointer processing of bprint strings
        o Show contents (in hex) of data of unrecognized type records
        o Fix get_field_str() for dynamic strings
      
      - Add missing break in FALSE case of pevent_filter_clear_trivial() (Taeung Song)
      
      - Fix failed memory allocation for get_cpuid_str (Thomas Richter)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      a72594ca
    • 7a7368a5
    • perf record: Fix failed memory allocation for get_cpuid_str · 81fccd6c
      Thomas Richter authored
      In the x86 architecture-dependent part, function get_cpuid_str() mallocs a
      128 byte buffer, but does not check whether the memory allocation
      succeeded.

      When the memory allocation fails, function __get_cpuid() is called with
      a NULL pointer as its first parameter.  That function dereferences its
      first parameter, so it operates on a NULL pointer, which might cause
      core dumps.
      Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20180117131611.34319-1-tmricht@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      81fccd6c
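The fix described in 81fccd6c comes down to checking the malloc() result before the buffer is handed on. A minimal C sketch of that pattern follows. The names `cpuid_str_sketch()` and `fill_cpuid()` are hypothetical stand-ins for perf's get_cpuid_str() and __get_cpuid(), and the string written into the buffer is a placeholder, not real CPUID output.

```c
#include <stdio.h>
#include <stdlib.h>

#define CPUID_BUF_SIZE 128	/* buffer size named in the commit message */

/* Hypothetical stand-in for the helper that writes into the buffer. */
static int fill_cpuid(char *buf, size_t len)
{
	/* Placeholder content; the real helper queries the CPU. */
	return snprintf(buf, len, "%s", "placeholder-cpuid") < 0 ? -1 : 0;
}

/* Returns a heap-allocated string the caller must free(), or NULL on failure. */
char *cpuid_str_sketch(void)
{
	char *buf = malloc(CPUID_BUF_SIZE);

	if (buf == NULL)	/* the kind of check the fix adds */
		return NULL;

	if (fill_cpuid(buf, CPUID_BUF_SIZE) < 0) {
		free(buf);
		return NULL;
	}
	return buf;
}
```

A caller would treat a NULL return as an allocation or lookup failure and free() the string otherwise.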
    • perf script: Remove the time slices number limitation · cc2ef584
      Jin Yao authored
      Previously, at most 10 time slices could be used in 'perf
      script --time'.

      This patch removes this limitation.
      For example, the following command line is now accepted (12 time slices):
      
      perf script --time 1%/1,1%/2,1%/3,1%/4,1%/5,1%/6,1%/7,1%/8,1%/9,1%/10,1%/11,1%/12
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Reviewed-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1515596433-24653-9-git-send-email-yao.jin@linux.intel.com
      [ No need to check for NULL to call free, use zfree ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      cc2ef584
    • perf report: Remove the time slices number limitation · 0a3cc3ae
      Jin Yao authored
      Previously, at most 10 time slices could be used in 'perf
      report --time'.

      This patch removes this limitation.
      For example, the following command line is now accepted (12 time slices):
      
      perf report --stdio --time 1%/1,1%/2,1%/3,1%/4,1%/5,1%/6,1%/7,1%/8,1%/9,1%/10,1%/11,1%/12
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Reviewed-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1515596433-24653-8-git-send-email-yao.jin@linux.intel.com
      [ No need to check for NULL to call free, use zfree ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0a3cc3ae
    • perf util: Allocate time slices buffer according to number of comma · 5a031f88
      Jin Yao authored
      Previously a magic number, 10, limited the number of time slices,
      which was an arbitrary restriction.

      This patch creates a new function perf_time__range_alloc() to allocate
      the time slices buffer. The number of buffer entries is determined by the
      number of commas in the string, but at least one entry is allocated even
      if no comma is found.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Reviewed-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1515596433-24653-7-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5a031f88
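The sizing rule in 5a031f88 can be illustrated with a short C sketch. It assumes one buffer entry per comma-separated slice (commas plus one), which is one reading of the commit message rather than a copy of perf_time__range_alloc(). `struct time_range_sketch`, `count_slices()` and `range_alloc_sketch()` are hypothetical names.

```c
#include <stdlib.h>

/* Hypothetical stand-in for perf's time interval type. */
struct time_range_sketch {
	unsigned long long start;
	unsigned long long end;
};

/* One entry per comma-separated slice, so never fewer than one. */
static size_t count_slices(const char *str)
{
	size_t n = 1;

	for (; *str != '\0'; str++)
		if (*str == ',')
			n++;
	return n;
}

/* Allocates a zeroed buffer sized from the string; caller must free() it. */
struct time_range_sketch *range_alloc_sketch(const char *str, size_t *size)
{
	size_t n = count_slices(str);
	struct time_range_sketch *ranges = calloc(n, sizeof(*ranges));

	if (ranges == NULL)
		return NULL;
	*size = n;
	return ranges;
}
```

With this rule a string such as "1%/1,1%/2,1%/3" yields three entries and a string without commas yields one, so no fixed upper bound is needed.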
    • perf report: Add an indication of what time slices are used · 7425664b
      Jin Yao authored
      Add a time slices indication to the perf report header.
      
      For example,
      
        # perf report --stdio --time 10%
      
        # Total Lost Samples: 0
        #
        # Samples: 9K of event 'cycles:ppp' (time slices: 10%)
        # Event count (approx.): 8951288803
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Reviewed-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1515596433-24653-6-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7425664b
    • perf util: Support no index time percent slice · 3002812e
      Jin Yao authored
      Previously, a time percent slice needed an index to specify which
      slice the user wants.

      It is easier to use if the index can be omitted.  So with this
      patch, for example,
      
      perf report --stdio --time 10%/1 should be equivalent to
      perf report --stdio --time 10%
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Reviewed-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1515596433-24653-5-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3002812e
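Commit 3002812e makes the slice index optional, with a bare "10%" treated like "10%/1". A rough C sketch of that defaulting follows, assuming the slice syntax is "P%" or "P%/N" as in the examples above. `struct slice_sketch` and `parse_slice_sketch()` are hypothetical names and the validation rules are assumptions, not perf's actual parser.

```c
#include <stdlib.h>

struct slice_sketch {
	double pct;	/* percentage of the recording, e.g. 10.0 */
	long idx;	/* 1-based slice index */
};

/* Parse "P%" or "P%/N"; returns 0 on success, -1 on malformed input. */
int parse_slice_sketch(const char *s, struct slice_sketch *out)
{
	char *end, *iend;
	double pct = strtod(s, &end);
	long idx = 1;	/* default used when "/N" is omitted */

	if (end == s || *end != '%')
		return -1;
	end++;	/* step over '%' */

	if (*end == '/') {
		idx = strtol(end + 1, &iend, 10);
		if (iend == end + 1 || *iend != '\0' || idx < 1)
			return -1;
	} else if (*end != '\0') {
		return -1;
	}

	out->pct = pct;
	out->idx = idx;
	return 0;
}
```

Under these assumptions "10%" and "10%/1" parse to the same percentage and index, matching the equivalence stated in the commit message.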
    • perf util: Improve error checking for time percent input · 6e761cbc
      Jin Yao authored
      A command line like 'perf report --stdio --time 1abc%/1' used to be
      accepted by perf, even though '1abc' is not a valid number.

      This patch replaces the original atof() with strtod() and checks the
      entire string. Now the same command line returns the error message
      "Invalid time string".
      
      root@skl:/tmp# perf report --stdio --time 1abc%/1
      Invalid time string
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1515596433-24653-4-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6e761cbc
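The difference between atof() and strtod() that 6e761cbc relies on can be shown in a few lines of C. atof() silently stops at the first invalid character, while strtod() reports where parsing stopped, so the caller can reject trailing garbage. `parse_double_strict()` is a hypothetical helper written for this illustration, not the function used in perf.

```c
#include <stdlib.h>

/*
 * Returns 0 and stores the value only if the whole string is a number.
 * Returns -1 for an empty string or when characters are left unparsed.
 */
int parse_double_strict(const char *str, double *val)
{
	char *end;
	double d = strtod(str, &end);

	if (end == str || *end != '\0')
		return -1;
	*val = d;
	return 0;
}
```

For the input "1abc", atof() yields 1.0 with no indication of a problem, whereas the strict helper returns -1 because "abc" is left over.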
    • perf script: Improve error msg when no first/last sample time found · 1e2778e9
      Jin Yao authored
      The following message is now shown to the user when executing 'perf
      script --time' if the perf data file doesn't contain the first/last
      sample time.
      
      "HINT: no first/last sample time found in perf data.
       Please use latest perf binary to execute 'perf record'
       (if '--buildid-all' is enabled, needs to set '--timestamp-boundary')."
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1515596433-24653-3-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1e2778e9
    • perf report: Improve error msg when no first/last sample time found · eb0b419e
      Jin Yao authored
      The following message is now shown to the user when executing
      'perf report --time' if the perf data file doesn't contain the
      first/last sample time.
      
      "HINT: no first/last sample time found in perf data.
       Please use latest perf binary to execute 'perf record'
       (if '--buildid-all' is enabled, needs to set '--timestamp-boundary')."
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1515596433-24653-2-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      eb0b419e
    • perf callchains: Ask for PERF_RECORD_MMAP for data mmaps for DWARF unwinding · 0d3dcc0e
      Arnaldo Carvalho de Melo authored
      When we use a global DWARF setting as in:
      
      	perf record --call-graph dwarf
      
      According to 5c0cf224 ("perf record: Store data mmaps for dwarf unwind") we need
      to set up some extra perf_event_attr bits.
      
      But when we instead do a per event dwarf setting:
      
      	perf record -e cycles/call-graph=dwarf/
      
      This was not being done, make them equivalent.
      
      This didn't produce any output changes in my tests while fixing up loose
      ends in the per-event settings, I found it just by comparing the
      perf_event_attr fields trying to find an explanation for those problems.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Noel Grandin <noelgrandin@gmail.com>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-6476r53h2o38skbs9qa4ust4@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0d3dcc0e
    • perf trace: Allow overriding global --max-stack per event · bd3dda9a
      Arnaldo Carvalho de Melo authored
      The per-event max-stack setting wasn't overriding the global --max-stack
      setting:
      
        # perf trace --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf,max-stack=2/ ping -6 -c 1 ::1
        PING ::1(::1) 56 data bytes
        64 bytes from ::1: icmp_seq=1 ttl=64 time=0.072 ms
      
        --- ::1 ping statistics ---
        1 packets transmitted, 1 received, 0% packet loss, time 0ms
        rtt min/avg/max/mdev = 0.072/0.072/0.072/0.000 ms
             0.000 probe_libc:inet_pton:(7feb7a998350))
                                               __inet_pton (inlined)
                                               gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                               __GI_getaddrinfo (inlined)
                                               [0xffffaa39b6108f3f] (/usr/bin/ping)
        #
      
      Fix it:
      
        # perf trace --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf,max-stack=2/ ping -6 -c 1 ::1
        PING ::1(::1) 56 data bytes
        64 bytes from ::1: icmp_seq=1 ttl=64 time=0.073 ms
      
        --- ::1 ping statistics ---
        1 packets transmitted, 1 received, 0% packet loss, time 0ms
        rtt min/avg/max/mdev = 0.073/0.073/0.073/0.000 ms
             0.000 probe_libc:inet_pton:(7f1083221350))
                                               __inet_pton (inlined)
                                               gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-ic3g837xg8ob3kcpkspxwz0g@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      bd3dda9a
    • perf trace: Setup DWARF callchains for non-syscall events when --max-stack is used · 75d50117
      Arnaldo Carvalho de Melo authored
      If we use:
      
      	perf trace --max-stack=4
      
      then the syscall events will use DWARF callchains, when available
      (libunwind enabled in the build) and the printing will stop at 4 levels.
      
      When we introduced support for tracepoint events this ended up not
      applying for them, fix it.
      
      Before:
      
        # perf trace --call-graph=dwarf --no-syscalls -e probe_libc:inet_pton ping -6 -c 1 ::1
        PING ::1(::1) 56 data bytes
        64 bytes from ::1: icmp_seq=1 ttl=64 time=0.058 ms
      
        --- ::1 ping statistics ---
        1 packets transmitted, 1 received, 0% packet loss, time 0ms
        rtt min/avg/max/mdev = 0.058/0.058/0.058/0.000 ms
             0.000 probe_libc:inet_pton:(7fc6c2a16350))
        #
      
      After:
      
        # perf trace --call-graph=dwarf --no-syscalls -e probe_libc:inet_pton ping -6 -c 1 ::1
        PING ::1(::1) 56 data bytes
        64 bytes from ::1: icmp_seq=1 ttl=64 time=0.087 ms
      
        --- ::1 ping statistics ---
        1 packets transmitted, 1 received, 0% packet loss, time 0ms
        rtt min/avg/max/mdev = 0.087/0.087/0.087/0.000 ms
             0.000 probe_libc:inet_pton:(7fbf9a041350))
                                               __inet_pton (inlined)
                                               gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                               __GI_getaddrinfo (inlined)
                                               [0xffffaa947cb67f3f] (/usr/bin/ping)
                                               __libc_start_main (/usr/lib64/libc-2.26.so)
                                               [0xffffaa947cb68379] (/usr/bin/ping)
        #
      Reported-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-afsu9eegd43ppihiuafhh9qv@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      75d50117
    • perf unwind: Do not look just at the global callchain_param.record_mode · eabad8c6
      Arnaldo Carvalho de Melo authored
      When setting up DWARF callchains on specific events, without using
      'record' or 'trace' --call-graph, but instead doing it like:
      
      	perf trace -e cycles/call-graph=dwarf/
      
      The unwind__prepare_access() call in thread__insert_map(), made when we
      process PERF_RECORD_MMAP(2) metadata events, was not being performed,
      precluding us from using per-event DWARF callchains and handling them
      only when we asked for all events to be DWARF, using "--call-graph dwarf".
      
      We do it in the PERF_RECORD_MMAP because we have to look at one of the
      executable maps to figure out the executable type (64-bit, 32-bit) of
      the DSO laid out in that mmap. Also to look at the architecture where
      the perf.data file was recorded.
      
      All this probably should be deferred to when we process a sample for
      some thread that has callchains, so that we do this processing only for
      the threads with samples, not for all of them.
      
      For now, fix using DWARF on specific events.
      
      Before:
      
        # perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
        PING ::1(::1) 56 data bytes
        64 bytes from ::1: icmp_seq=1 ttl=64 time=0.048 ms
      
        --- ::1 ping statistics ---
        1 packets transmitted, 1 received, 0% packet loss, time 0ms
        rtt min/avg/max/mdev = 0.048/0.048/0.048/0.000 ms
           0.000 probe_libc:inet_pton:(7fe9597bb350))
        Problem processing probe_libc:inet_pton callchain, skipping...
        #
      
      After:
      
        # perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
        PING ::1(::1) 56 data bytes
        64 bytes from ::1: icmp_seq=1 ttl=64 time=0.060 ms
      
        --- ::1 ping statistics ---
        1 packets transmitted, 1 received, 0% packet loss, time 0ms
        rtt min/avg/max/mdev = 0.060/0.060/0.060/0.000 ms
             0.000 probe_libc:inet_pton:(7fd4aa930350))
                                               __inet_pton (inlined)
                                               gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                               __GI_getaddrinfo (inlined)
                                               [0xffffaa804e51af3f] (/usr/bin/ping)
                                               __libc_start_main (/usr/lib64/libc-2.26.so)
                                               [0xffffaa804e51b379] (/usr/bin/ping)
        #
        # perf trace --call-graph=dwarf --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
        PING ::1(::1) 56 data bytes
        64 bytes from ::1: icmp_seq=1 ttl=64 time=0.057 ms
      
        --- ::1 ping statistics ---
        1 packets transmitted, 1 received, 0% packet loss, time 0ms
        rtt min/avg/max/mdev = 0.057/0.057/0.057/0.000 ms
             0.000 probe_libc:inet_pton:(7f9363b9e350))
                                               __inet_pton (inlined)
                                               gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                               __GI_getaddrinfo (inlined)
                                               [0xffffa9e8a14e0f3f] (/usr/bin/ping)
                                               __libc_start_main (/usr/lib64/libc-2.26.so)
                                               [0xffffa9e8a14e1379] (/usr/bin/ping)
        #
        # perf trace --call-graph=fp --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
        PING ::1(::1) 56 data bytes
        64 bytes from ::1: icmp_seq=1 ttl=64 time=0.077 ms
      
        --- ::1 ping statistics ---
        1 packets transmitted, 1 received, 0% packet loss, time 0ms
        rtt min/avg/max/mdev = 0.077/0.077/0.077/0.000 ms
             0.000 probe_libc:inet_pton:(7f4947e1c350))
                                               __inet_pton (inlined)
                                               gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                               __GI_getaddrinfo (inlined)
                                               [0xffffaa716d88ef3f] (/usr/bin/ping)
                                               __libc_start_main (/usr/lib64/libc-2.26.so)
                                               [0xffffaa716d88f379] (/usr/bin/ping)
        #
        # perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=fp/ ping -6 -c 1 ::1
        PING ::1(::1) 56 data bytes
        64 bytes from ::1: icmp_seq=1 ttl=64 time=0.078 ms
      
        --- ::1 ping statistics ---
        1 packets transmitted, 1 received, 0% packet loss, time 0ms
        rtt min/avg/max/mdev = 0.078/0.078/0.078/0.000 ms
             0.000 probe_libc:inet_pton:(7fa157696350))
                                               __GI___inet_pton (/usr/lib64/libc-2.26.so)
                                               getaddrinfo (/usr/lib64/libc-2.26.so)
                                               [0xffffa9ba39c74f40] (/usr/bin/ping)
        #
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/r/20180116182650.GE16107@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      eabad8c6
    • perf callchain: Fix attr.sample_max_stack setting · 249d98e5
      Arnaldo Carvalho de Melo authored
      When setting the "dwarf" unwinder for a specific event and not
      specifying the max-stack, the attr.sample_max_stack ended up using an
      uninitialized callchain_param.max_stack. Fix it by using designated
      initializers for that callchain_param variable, zeroing all struct
      members that are not explicitly initialized.
      
      Here is what happened:
      
        # perf trace -vv --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
        callchain: type DWARF
        callchain: stack dump size 8192
        perf_event_attr:
          type                             2
          size                             112
          config                           0x730
          { sample_period, sample_freq }   1
          sample_type                      IP|TID|TIME|ADDR|CALLCHAIN|CPU|PERIOD|RAW|REGS_USER|STACK_USER|DATA_SRC
          exclude_callchain_user           1
          { wakeup_events, wakeup_watermark } 1
          sample_regs_user                 0xff0fff
          sample_stack_user                8192
          sample_max_stack                 50656
        sys_perf_event_open failed, error -75
        Value too large for defined data type
        # perf trace -vv --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
        callchain: type DWARF
        callchain: stack dump size 8192
        perf_event_attr:
          type                             2
          size                             112
          config                           0x730
          sample_type                      IP|TID|TIME|ADDR|CALLCHAIN|CPU|PERIOD|RAW|REGS_USER|STACK_USER|DATA_SRC
          exclude_callchain_user           1
          sample_regs_user                 0xff0fff
          sample_stack_user                8192
          sample_max_stack                 30448
        sys_perf_event_open failed, error -75
        Value too large for defined data type
        #
      
      Now the attr.sample_max_stack is set to zero and the above works as
      expected:
      
        # perf trace --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
        PING ::1(::1) 56 data bytes
        64 bytes from ::1: icmp_seq=1 ttl=64 time=0.072 ms
      
        --- ::1 ping statistics ---
        1 packets transmitted, 1 received, 0% packet loss, time 0ms
        rtt min/avg/max/mdev = 0.072/0.072/0.072/0.000 ms
             0.000 probe_libc:inet_pton:(7feb7a998350))
                                               __inet_pton (inlined)
                                               gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                               __GI_getaddrinfo (inlined)
                                               [0xffffaa39b6108f3f] (/usr/bin/ping)
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-is9tramondqa9jlxxsgcm9iz@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      249d98e5
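The C rule that 249d98e5 relies on is that members omitted from a designated initializer are zero-initialized, whereas a local struct declared without any initializer holds indeterminate values. A small sketch follows. `struct callchain_param_sketch` is a hypothetical, cut-down stand-in for perf's callchain_param, and the field set and the value 1 are assumptions for illustration.

```c
/* Hypothetical, reduced version of a callchain parameter struct. */
struct callchain_param_sketch {
	int enabled;
	int record_mode;
	unsigned int dump_size;
	unsigned int max_stack;
};

/* Returns the max_stack value seen when only record_mode is initialized. */
unsigned int default_max_stack_sketch(void)
{
	/*
	 * Only record_mode is named here; the C standard guarantees that
	 * the members left out (enabled, dump_size, max_stack) are zeroed.
	 */
	struct callchain_param_sketch param = { .record_mode = 1 };

	return param.max_stack;
}
```

Had `param` been declared without an initializer, reading `max_stack` would have produced an arbitrary value, which is consistent with the large sample_max_stack numbers in the failing output above.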
    • perf tools: Add ARM Statistical Profiling Extensions (SPE) support · ffd3d18c
      Kim Phillips authored
      'perf record' and 'perf report --dump-raw-trace' are supported in this
      release.
      
      Example usage:
      
       # perf record -e arm_spe/ts_enable=1,pa_enable=1/ dd if=/dev/zero of=/dev/null count=10000
       # perf report --dump-raw-trace
      
      Note that the perf.data file is portable, so the report can be run on
      another architecture host if necessary.
      
      Output will contain raw SPE data and its textual representation, such
      as:
      
      0x5c8 [0x30]: PERF_RECORD_AUXTRACE size: 0x200000  offset: 0  ref: 0x1891ad0e  idx: 1  tid: 2227  cpu: 1
      .
      . ... ARM SPE data: size 2097152 bytes
      .  00000000:  49 00                                           LD
      .  00000002:  b2 c0 3b 29 0f 00 00 ff ff                      VA 0xffff00000f293bc0
      .  0000000b:  b3 c0 eb 24 fb 00 00 00 80                      PA 0xfb24ebc0 ns=1
      .  00000014:  9a 00 00                                        LAT 0 XLAT
      .  00000017:  42 16                                           EV RETIRED L1D-ACCESS TLB-ACCESS
      .  00000019:  b0 00 c4 15 08 00 00 ff ff                      PC 0xff00000815c400 el3 ns=1
      .  00000022:  98 00 00                                        LAT 0 TOT
      .  00000025:  71 36 6c 21 2c 09 00 00 00                      TS 39395093558
      .  0000002e:  49 00                                           LD
      .  00000030:  b2 80 3c 29 0f 00 00 ff ff                      VA 0xffff00000f293c80
      .  00000039:  b3 80 ec 24 fb 00 00 00 80                      PA 0xfb24ec80 ns=1
      .  00000042:  9a 00 00                                        LAT 0 XLAT
      .  00000045:  42 16                                           EV RETIRED L1D-ACCESS TLB-ACCESS
      .  00000047:  b0 f4 11 16 08 00 00 ff ff                      PC 0xff0000081611f4 el3 ns=1
      .  00000050:  98 00 00                                        LAT 0 TOT
      .  00000053:  71 36 6c 21 2c 09 00 00 00                      TS 39395093558
      .  0000005c:  48 00                                           INSN-OTHER
      .  0000005e:  42 02                                           EV RETIRED
      .  00000060:  b0 2c ef 7f 08 00 00 ff ff                      PC 0xff0000087fef2c el3 ns=1
      .  00000069:  98 00 00                                        LAT 0 TOT
      .  0000006c:  71 d1 6f 21 2c 09 00 00 00                      TS 39395094481
      ...
      
      Other release notes:
      
      - applies to acme's perf/{core,urgent} branches, likely elsewhere
      
      - Report is self-contained within the tool.
        Record requires enabling the kernel SPE driver by
        setting CONFIG_ARM_SPE_PMU.
      
      - The intel-bts implementation was used as a starting point; its
        min/default/max buffer sizes and power of 2 pages granularity need to be
        revisited for ARM SPE
      
      - Recording across multiple SPE clusters/domains not supported
      
      - Snapshot support (record -S), and conversion to native perf events
        (e.g., via 'perf inject --itrace'), are also not supported
      
      - Technically both cs-etm and spe can be used simultaneously, however
        disabled for simplicity in this release
      Signed-off-by: Kim Phillips <kim.phillips@arm.com>
      Reviewed-by: Dongjiu Geng <gengdongjiu@huawei.com>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Pawel Moll <pawel.moll@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/20180114132850.0b127434b704a26bad13268f@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ffd3d18c