1. 23 Feb, 2015 5 commits
    • perf trace: Introduce --filter-pids · f078c385
      Arnaldo Carvalho de Melo authored
      When tracing in X we get event loops due to the tracing activity, i.e.
      updates to a gnome-terminal that generate syscalls for X.org, etc.
      
      To get a more useful view of what is happening, syscall wise, system
      wide, we need to filter those, like in:
      
       # ps ax|egrep '981|2296|1519' | grep -v egrep
         981 tty1 Ss+ 5:40 /usr/bin/Xorg :0 -background none ...
        1519 ?    Sl  2:22 /usr/bin/gnome-shell
        2296 ?    Sl  4:16 /usr/libexec/gnome-terminal-server
       #
      
       # trace -e write --filter-pids 981,2296,1519
          0.385 ( 0.021 ms): goa-daemon/2061 write(fd: 1</dev/null>, buf: 0x7fbeb017b000, count: 136) = 136
          0.922 ( 0.014 ms): goa-daemon/2061 write(fd: 1</dev/null>, buf: 0x7fbeb017b000, count: 140) = 140
       5006.525 ( 0.029 ms): goa-daemon/2061 write(fd: 1</dev/null>, buf: 0x7fbeb017b000, count: 136) = 136
       5007.235 ( 0.023 ms): goa-daemon/2061 write(fd: 1</dev/null>, buf: 0x7fbeb017b000, count: 140) = 140
       5177.646 ( 0.018 ms): rtkit-daemon/782 write(fd: 5<anon_inode:[eventfd]>, buf: 0x7f7eea70be88, count: 8) = 8
       8314.497 ( 0.004 ms): gsd-locate-poi/2084 write(fd: 5<anon_inode:[eventfd]>, buf: 0x7fffe96af7b0, count: 8) = 8
       8314.518 ( 0.002 ms): gsd-locate-poi/2084 write(fd: 5<anon_inode:[eventfd]>, buf: 0x7fffe96af0e0, count: 8) = 8
       ^C#
      
      When this option is used the tracer pid is also filtered.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-f5qmiyy7c0uxdm21ncatpeek@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
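The exclusion described in this commit can be pictured as a tracepoint filter expression. A minimal sketch, assuming the pid list is turned into a filter on the `common_pid` field and that the tracer's own pid is appended to the list; `build_pid_filter` is a hypothetical helper for illustration, not perf's code:

```python
import os


def build_pid_filter(pids, include_self=True):
    """Return a filter string that rejects events from every pid in `pids`.

    Assumption: the filter uses tracepoint filter syntax on `common_pid`.
    """
    excluded = list(pids)
    if include_self:
        # Also exclude the tracer itself, so it does not trace its own output.
        excluded.append(os.getpid())
    unique = []
    for pid in excluded:
        if pid not in unique:  # keep order, drop duplicates
            unique.append(pid)
    return " && ".join(f"common_pid != {pid}" for pid in unique)


if __name__ == "__main__":
    print(build_pid_filter([981, 2296, 1519]))
```

An event passes such a filter only if its pid differs from every listed pid, which is why the clauses are joined with `&&`.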
    • perf evlist: Introduce set_filter_pids method · be199ada
      Arnaldo Carvalho de Melo authored
      We need to filter multiple pids in trace, i.e. trace itself,
      gnome-terminal, X.org, etc.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-frtpkg7qapqwf7asa35wf8am@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Filter out the trace pid when no threads are specified · 241b057c
      Arnaldo Carvalho de Melo authored
      To avoid tracing the tracer.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-shmwd1khzpaobr3i0j1ygapg@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evlist: Introduce set_filter_pid method · cfd70a26
      Arnaldo Carvalho de Melo authored
      To filter out events for a certain pid, for instance, when tracing
      system wide, so that the tracer itself doesn't create an event loop.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-byoia9dzu4gmkdv87etnd9zf@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Only insert blank duration bracket when tracing syscalls · 0808921a
      Arnaldo Carvalho de Melo authored
      When printing just events, i.e. '--no-sys --ev some:events', it makes no
      sense to waste screen space.
      
      Before:
      
       # trace --no-sys --ev probe:*
       84481.704 (         ): probe:vfs_getname:(ffffffff811ed023) pathname="/etc/services")
       84481.892 (         ): probe:vfs_getname:(ffffffff811ed023) pathname="/etc/services")
       84482.230 (         ): probe:vfs_getname:(ffffffff811ed023) pathname="/etc/resolv.conf")
       84482.481 (         ): probe:vfs_getname:(ffffffff811ed023) pathname="/etc/hosts")
       85097.725 (         ): probe:vfs_getname:(ffffffff811ed023) pathname="/root"
       #
      
      After:
      
       # trace --no-sys --ev probe:*
       0.000 probe:vfs_getname:(ffffffff811ed023) pathname="/root")
       1.711 probe:vfs_getname:(ffffffff811ed023) pathname="/etc/localtime")
       2.103 probe:vfs_getname:(ffffffff811ed023) pathname="/etc/localtime")
      ^C#
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-jhryxgnam8zecq0q0wsy6pyb@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
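The before/after output in this commit boils down to one formatting decision: the duration bracket is only emitted when syscalls are being traced. A small sketch of that rule, with column widths chosen to resemble the transcripts; `format_event` and its parameters are hypothetical and do not mirror perf's internals:

```python
def format_event(timestamp_ms, text, duration_ms=None, trace_syscalls=True):
    """Format one output line.

    The bracket column appears only when syscalls are traced; for a
    non-syscall event it is left blank, and with syscalls disabled it is
    omitted altogether.
    """
    prefix = f"{timestamp_ms:9.3f}"
    if not trace_syscalls:
        # Events only: no bracket column at all.
        return f"{prefix} {text}"
    if duration_ms is None:
        # Mixed output: keep the column so events line up with syscalls.
        bracket = "(" + " " * 9 + "): "
    else:
        bracket = f"({duration_ms:6.3f} ms): "
    return f"{prefix} {bracket}{text}"


if __name__ == "__main__":
    print(format_event(0.385, "goa-daemon/2061 write(...) = 136", 0.021))
    print(format_event(84481.704, "probe:vfs_getname:..."))
    print(format_event(0.0, "probe:vfs_getname:...", trace_syscalls=False))
```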
  2. 18 Feb, 2015 22 commits
  3. 13 Feb, 2015 7 commits
    • perf trace: Support --events foo:bar --no-syscalls · 726f3234
      Arnaldo Carvalho de Melo authored
      I.e. support tracing just tracepoints, without the strace-like
      raw_syscalls:* events.
      
      [acme@ssdandy linux]$ trace --no-sys --ev sched:*exec,sched:*switch,sched:*exit usleep 1
        0.048 (     ): sched:sched_process_exec:filename=/usr/bin/usleep pid=27298 old_pid=27298)
        0.369 (     ): sched:sched_switch:usleep:27298 [120] S ==> swapper/5:0 [120])
        0.452 (     ): sched:sched_process_exit:comm=usleep pid=27298 prio=120)
      [acme@ssdandy linux]$
      
      TODO: remove that (...) thing when --no-syscalls is specified.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-vn0hsixsbhm31b2rpj97r96k@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Allow mixing with other events · 14a052df
      Arnaldo Carvalho de Melo authored
      Basically adopting 'perf record' --event command line argument syntax:
      
       # trace -e \!mprotect,mmap,munmap,open,close,read,fstat,access,arch_prctl --event sched:*switch,sched:*exec,sched:*exit usleep 1
        0.048 (        ): sched:sched_process_exec:filename=/bin/usleep pid=24732 old_pid=24732)
        0.078 (0.002 ms): usleep/24732 brk(                          ) = 0x78f000
        0.430 (0.002 ms): usleep/24732 brk(                          ) = 0x78f000
        0.434 (0.003 ms): usleep/24732 brk(brk: 0x7b0000             ) = 0x7b0000
        0.438 (0.001 ms): usleep/24732 brk(                          ) = 0x7b0000
        0.460 (0.004 ms): usleep/24732 nanosleep(rqtp: 0x7ffff3696a40) ...
        0.460 (        ): sched:sched_switch:prev_comm=usleep prev_pid=24732 prev_prio=120 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120)
        0.515 (0.058 ms): usleep/24732  ... [continued]: nanosleep()) = 0
        0.520 (0.000 ms): usleep/24732 exit_group(
        0.550 (        ): sched:sched_process_exit:comm=usleep pid=24732 prio=120)
       #
      
      Next steps, probably in this order:
      
      1) Use ordered_events code, the logic in trace needs the events to be
         time ordered when needed, i.e. when multiple CPUs are involved.
      
      2) Callchains!
      
      3) Automatically account for interruptions when saying how long things
         took.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-gpst8mph575yb4wgf91qibyb@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
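In the command shown in this commit, `-e` takes a comma-separated syscall list and a leading `!` inverts it, so every syscall except the listed ones is shown; the backslash presumably only keeps the shell from treating `!` as history expansion. A rough sketch of that selection logic, using hypothetical helper names (this is not perf's parser):

```python
def parse_syscall_expr(expr):
    """Split an expression such as '!mmap,open' into (negate, names)."""
    negate = expr.startswith("!")
    body = expr[1:] if negate else expr
    names = {name for name in body.split(",") if name}
    return negate, names


def should_show(syscall, negate, names):
    """Decide whether a syscall is printed under the parsed expression."""
    if negate:
        return syscall not in names
    return syscall in names


if __name__ == "__main__":
    negate, names = parse_syscall_expr(
        "!mprotect,mmap,munmap,open,close,read,fstat,access,arch_prctl"
    )
    for syscall in ("brk", "mmap", "nanosleep", "exit_group"):
        print(syscall, should_show(syscall, negate, names))
```

With the expression from the transcript, `brk`, `nanosleep` and `exit_group` pass the filter while `mmap` does not, which matches the syscalls visible in the output.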
    • perf trace: Handle multiple threads better wrt syscalls being intermixed · e596663e
      Arnaldo Carvalho de Melo authored
       $ trace time taskset -c 0 usleep 1
         0.845 ( 0.021 ms): time/16722 wait4(upid: 4294967295, stat_addr: 0x7fff17f443d4, ru: 0x7fff17f44438 ) ...
         0.865 ( 0.008 ms): time/16723 execve(arg0: 140733595272004, arg1: 140733595272720, arg2: 140733595272768, arg3: 139755107218496, arg4: 7307199665339051828, arg5: 3) = -2
         2.395 ( 1.523 ms): taskset/16723 execve(arg0: 140733595272013, arg1: 140733595272720, arg2: 140733595272768, arg3: 139755107218496, arg4: 7307199665339051828, arg5: 3) = 0
         2.411 ( 0.002 ms): taskset/16723 brk(                                                                  ) = 0x1915000
         3.300 ( 0.058 ms): usleep/16723 nanosleep(rqtp: 0x7ffff4ada190                                        ) = 0
       <SNIP>
         3.305 ( 0.000 ms): usleep/16723 exit_group(
         3.363 ( 2.539 ms): time/16722  ... [continued]: wait4()) = 16723
         3.366 ( 0.001 ms): time/16722 rt_sigaction(sig: INT, act: 0x7fff17f44160, oact: 0x7fff17f44200, sigsetsize: 8) = 0
      
      We weren't seeing this line:
      
        0.845 ( 0.021 ms): time/16722 wait4(upid: 4294967295, stat_addr: 0x7fff17f443d4, ru: 0x7fff17f44438 ) ...
      
      just the one when it finishes:
      
        3.363 ( 2.539 ms): time/16722  ... [continued]: wait4()) = 16723
      
      Still some issues left till we move to ordered_samples when multiple
      CPUs/threads are involved...
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-zq9x30a1ky3djqewqn2v3ja3@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
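The output in this commit suggests the following printing rule: an entry whose syscall has not yet returned is printed with a trailing `...` as soon as another event has to be shown, and its return later appears as a `[continued]` line. A simplified model of that rule, assuming at most one in-flight syscall per thread and ignoring timestamps and arguments; `render` and the event tuples are illustrative only, not perf's data structures:

```python
def render(events):
    """Turn (kind, thread, name, ret) tuples into output lines.

    kind is "enter" or "exit"; ret is ignored for "enter".
    """
    lines = []
    pending = {}         # thread -> syscall it is currently inside
    interrupted = set()  # threads whose entry was already printed with "..."
    open_thread = None   # thread whose entry was seen but not yet printed

    def flush_open():
        nonlocal open_thread
        if open_thread is not None:
            lines.append(f"{open_thread} {pending[open_thread]}() ...")
            interrupted.add(open_thread)
            open_thread = None

    for kind, thread, name, ret in events:
        if kind == "enter":
            flush_open()  # another event arrived: show the unfinished entry
            interrupted.discard(thread)
            pending[thread] = name
            open_thread = thread
        else:
            if open_thread != thread:
                flush_open()
            if thread in interrupted:
                interrupted.discard(thread)
                lines.append(f"{thread}  ... [continued]: {name}()) = {ret}")
            else:
                lines.append(f"{thread} {name}() = {ret}")
            pending.pop(thread, None)
            open_thread = None
    return lines


if __name__ == "__main__":
    demo = [
        ("enter", "time/16722", "wait4", None),
        ("enter", "usleep/16723", "nanosleep", None),
        ("exit", "usleep/16723", "nanosleep", 0),
        ("exit", "time/16722", "wait4", 16723),
    ]
    print("\n".join(render(demo)))
```

The model relies on events arriving in time order, which is the limitation the commit message points at for the multi-CPU case.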
    • perf trace: Print thread info when following children · 42052bea
      Arnaldo Carvalho de Melo authored
      The default for 'trace workload' is to set perf_event_attr.inherit to 1,
      i.e. to make it equivalent to 'strace -f workload', so we were ending
      up with syscalls for multiple processes mixed up. Fix it:
      
      Before:
      
        [root@ssdandy ~]# trace -e brk time usleep 1
           0.071 ( 0.002 ms): brk(              ) = 0x100e000
           0.802 ( 0.001 ms): brk(              ) = 0x1d99000
           1.132 ( 0.003 ms): brk(              ) = 0x1d99000
           1.136 ( 0.003 ms): brk(brk: 0x1dba000) = 0x1dba000
           1.140 ( 0.001 ms): brk(              ) = 0x1dba000
        0.00user 0.00system 0:00.00elapsed 63%CPU (0avgtext+0avgdata 528maxresident)k
        0inputs+0outputs (0major+181minor)pagefaults 0swaps
        [root@ssdandy ~]#
      
      After:
      
        [root@ssdandy ~]# trace -f -e brk time usleep 1
           0.072 ( 0.002 ms): time/26308 brk(               ) = 0x1e6e000
           0.860 ( 0.001 ms): usleep/26309 brk(             ) = 0xb91000
           1.193 ( 0.003 ms): usleep/26309 brk(             ) = 0xb91000
           1.197 ( 0.003 ms): usleep/26309 brk(brk: 0xbb2000) = 0xbb2000
           1.201 ( 0.001 ms): usleep/26309 brk(             ) = 0xbb2000
        0.00user 0.00system 0:00.00elapsed 0%CPU (0avgtext+0avgdata 524maxresident)k
        0inputs+0outputs (0major+180minor)pagefaults 0swaps
        [root@ssdandy ~]#
      
      BTW: to achieve the 'strace workload' behaviour, i.e. without an explicit
      '-f', one has to use --no-inherit.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-2wu2d5n65msxoq1i7vtcaft2@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf list: Place the header text in its right position · 619a303c
      Yunlong Song authored
      The header text 'List of pre-defined events (to be used in -e):' is
      placed in an improper function, which causes an abnormal output, e.g.
      'perf list hw' shows no guiding text at all, and 'perf list hw
      L1-dcache*' shows the guiding text incorrectly in the middle of the
      output.
      
      Example
      Before this patch:
      
       $ perf list hw L1-dcache*
      
         branch-instructions OR branches                    [Hardware event]
         branch-misses                                      [Hardware event]
         bus-cycles                                         [Hardware event]
         cache-misses                                       [Hardware event]
         cache-references                                   [Hardware event]
         cpu-cycles OR cycles                               [Hardware event]
         instructions                                       [Hardware event]
         stalled-cycles-backend OR idle-cycles-backend      [Hardware event]
         stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]
      
       List of pre-defined events (to be used in -e):              <-- incorrect position
         L1-dcache-load-misses                              [Hardware cache event]
         L1-dcache-loads                                    [Hardware cache event]
         L1-dcache-prefetch-misses                          [Hardware cache event]
         L1-dcache-prefetches                               [Hardware cache event]
         L1-dcache-store-misses                             [Hardware cache event]
         L1-dcache-stores                                   [Hardware cache event]
      
      After this patch:
      
       $ perf list hw L1-dcache*
      
       List of pre-defined events (to be used in -e):              <-- correct position
      
         branch-instructions OR branches                    [Hardware event]
         branch-misses                                      [Hardware event]
         bus-cycles                                         [Hardware event]
         cache-misses                                       [Hardware event]
         cache-references                                   [Hardware event]
         cpu-cycles OR cycles                               [Hardware event]
         instructions                                       [Hardware event]
         stalled-cycles-backend OR idle-cycles-backend      [Hardware event]
         stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]
      
         L1-dcache-load-misses                              [Hardware cache event]
         L1-dcache-loads                                    [Hardware cache event]
         L1-dcache-prefetch-misses                          [Hardware cache event]
         L1-dcache-prefetches                               [Hardware cache event]
         L1-dcache-store-misses                             [Hardware cache event]
         L1-dcache-stores                                   [Hardware cache event]
      
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1423833115-11199-8-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf: Remove the extra validity check on nr_pages · 74390aa5
      Kaixu Xia authored
      The function is_power_of_2() also performs this check on nr_pages, so
      the first check performed is unnecessary. On the other hand, the key
      point is to ensure @nr_pages is a power-of-two number, and @nr_pages is
      mostly a nonzero value, so in most cases the function is_power_of_2()
      will be called.
      
      Signed-off-by: Kaixu Xia <xiakaixu@huawei.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Link: http://lkml.kernel.org/r/1422352512-75150-1-git-send-email-xiakaixu@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
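The claim that is_power_of_2() already covers the zero case can be checked with a short sketch. It assumes the helper follows the usual `n != 0 && (n & (n - 1)) == 0` definition; the Python function below is an illustrative port, not the kernel source:

```python
def is_power_of_2(n):
    """True only for 1, 2, 4, 8, ...

    The first clause rejects zero; the second clears the lowest set bit
    and checks that nothing is left.
    """
    return n != 0 and (n & (n - 1)) == 0


if __name__ == "__main__":
    for nr_pages in (0, 1, 2, 3, 8, 12, 16):
        print(nr_pages, is_power_of_2(nr_pages))
```

Because zero is rejected inside the helper, a caller that wants to reject zero does not need its own test first. A caller that wants to accept zero would still need one.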
    • perf tools: Fix a bug of segmentation fault · 3a03005f
      Yunlong Song authored
      Fix the 'segmentation fault' bug of 'perf list --list-cmds', which also
      happens in other cases (e.g. record, report ...). This bug happens when
      there are no cmds to list at all.
      
      Example:
      
      Before this patch:
      
        $ perf list --list-cmds
        Segmentation fault
        $
      
      After this patch:
      
        $ perf list --list-cmds
        $
      
      As shown above, the result prints nothing rather than a segmentation
      fault. The null result means 'perf list' has no cmds to display at this
      time.
      
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1423833115-11199-5-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 12 Feb, 2015 6 commits