1. 09 Oct, 2009 1 commit
  2. 08 Oct, 2009 6 commits
    • Frederic Weisbecker's avatar
      perf tools: Provide backward compatibility with previous perf.data version · 26dd2cb0
      Frederic Weisbecker authored
      We have merged the trace.info file into perf.data by adding one
      section in the perf headers. This makes it incompatible with
      previous version: the new perf tools can't read the older
      perf.data.
      
      To support the previous format, we check the headers size. If they
      have the same size than in the previous format, then ignore the
      trace info section that doesn't exist.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1255032449-12022-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      26dd2cb0
    • Frederic Weisbecker's avatar
      perf tools: Fix thread comm resolution in perf sched · 97ea1a7f
      Frederic Weisbecker authored
      This reverts commit 9a92b479 ("perf
      tools: Improve thread comm resolution in perf sched") and fixes the
      real bug.
      
      The bug was elsewhere:
      
      We are failing to resolve thread names in perf sched because the
      table of threads we are building, on top of comm events, has a per
      process granularity. But perf sched, unlike the other perf tools,
      needs a per thread granularity as we are profiling every tasks
      individually.
      
      So fix it by building our threads table using the tid instead of
      the pid as the thread identifier.
      
      v2: Revert the previous fix - it is not really needed
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1255028657-11158-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      97ea1a7f
    • Arnaldo Carvalho de Melo's avatar
      perf tools: Improve kernel/modules symbol lookup · 2e538c4a
      Arnaldo Carvalho de Melo authored
      This removes the ovelapping of vmlinux addresses with modules,
      using the ELF section name when using --vmlinux and creating a
      unique DSO name when using /proc/kallsyms ([kernel].N).
      
      This is done by creating multiple 'struct map' instances for
      address ranges backed by DSOs that have just the symbols for that
      range and a name that is derived from the ELF section name.o
      
      Now it is possible to ask for just the symbols in some particular
      kernel section:
      
      $ perf report -m --vmlinux ../build/tip-recvmmsg/vmlinux \
      	--dsos [kernel].vsyscall_fn | head -15
          52.73%             Xorg  [.] vread_hpet
          18.61%          firefox  [.] vread_hpet
          14.50%     npviewer.bin  [.] vread_hpet
           6.83%           compiz  [.] vread_hpet
           5.73%         glxgears  [.] vread_hpet
           0.63%             java  [.] vread_hpet
           0.30%   gnome-terminal  [.] vread_hpet
           0.23%             perf  [.] vread_hpet
           0.18%            xchat  [.] vread_hpet
      $
      
      Now we don't have to first lookup the list of modules and then, if
      it fails, vmlinux symbols, its just a simple lookup for the map
      then the symbols, just like for threads.
      
      Reports generated using /proc/kallsyms and --vmlinux should provide
      the same results, modulo the DSO name for sections other than
      ".text".
      
      But they don't right now because things like:
      
       ffffffff81011c20-ffffffff81012068 system_call
       ffffffff81011c30-ffffffff81011c9b system_call_after_swapgs
       ffffffff81011c9c-ffffffff81011cb6 system_call_fastpath
       ffffffff81011cb7-ffffffff81011cbb ret_from_sys_call
      
      I.e. overlapping symbols, again some ASM special case that we have
      to fixup.
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1254934136-8503-1-git-send-email-acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      2e538c4a
    • Arnaldo Carvalho de Melo's avatar
      perf tools: Up the verbose level for some really verbose stuff · da21d1b5
      Arnaldo Carvalho de Melo authored
      Like printing every symbol created.
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1254923340-4870-1-git-send-email-acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      da21d1b5
    • Frederic Weisbecker's avatar
      perf tools: Improve thread comm resolution in perf sched · 9a92b479
      Frederic Weisbecker authored
      When we get sched traces that involve a task that was already
      created before opening the event, we won't have the comm event for
      it.
      
      So if we can't find the comm event for a given thread, we look at
      the traces that may contain these informations.
      
      Before:
      
       ata/1:371             |      0.000 ms |        1 | avg: 3988.693 ms | max: 3988.693 ms |
       kondemand/1:421       |      0.096 ms |        3 | avg:  345.346 ms | max: 1035.989 ms |
       kondemand/0:420       |      0.025 ms |        3 | avg:  421.332 ms | max:  964.014 ms |
       :5124:5124            |      0.103 ms |        5 | avg:   74.082 ms | max:  277.194 ms |
       :6244:6244            |      0.691 ms |        9 | avg:  125.655 ms | max:  271.306 ms |
       firefox:5080          |      0.924 ms |        5 | avg:   53.833 ms | max:  257.828 ms |
       npviewer.bin:6225     |     21.871 ms |       53 | avg:   22.462 ms | max:  220.835 ms |
       :6245:6245            |      9.631 ms |       21 | avg:   41.864 ms | max:  213.349 ms |
      
      After:
      
       ata/1:371             |      0.000 ms |        1 | avg: 3988.693 ms | max: 3988.693 ms |
       kondemand/1:421       |      0.096 ms |        3 | avg:  345.346 ms | max: 1035.989 ms |
       kondemand/0:420       |      0.025 ms |        3 | avg:  421.332 ms | max:  964.014 ms |
       firefox:5124          |      0.103 ms |        5 | avg:   74.082 ms | max:  277.194 ms |
       npviewer.bin:6244     |      0.691 ms |        9 | avg:  125.655 ms | max:  271.306 ms |
       firefox:5080          |      0.924 ms |        5 | avg:   53.833 ms | max:  257.828 ms |
       npviewer.bin:6225     |     21.871 ms |       53 | avg:   22.462 ms | max:  220.835 ms |
       npviewer.bin:6245     |      9.631 ms |       21 | avg:   41.864 ms | max:  213.349 ms |
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1255012632-7882-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      9a92b479
    • Frederic Weisbecker's avatar
      perf tools: Unify perf.data mapping and events handling · 016e92fb
      Frederic Weisbecker authored
      This librarizes the perf.data file mapping and handling in various
      perf tools, roughly reducing the amount of code and fixing the
      places that mmap from beginning of the file whereas we want to mmap
      from the beginning of the data, leading to page fault because the
      mmap window is too small since the trace info are written in the
      file too.
      
      TODO:
      
       - convert perf timechart too
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arjan van de Ven <arjan@infradead.org>
      LKML-Reference: <20091007104729.GD5043@nowhere>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      016e92fb
  3. 07 Oct, 2009 2 commits
    • Frederic Weisbecker's avatar
      perf tools: Merge trace.info content into perf.data · 03456a15
      Frederic Weisbecker authored
      This drops the trace.info file and move its contents into the
      common perf.data file.
      
      This is done by creating a new trace_info section into this file. A
      user of perf headers needs to call perf_header__set_trace_info() to
      save the trace meta informations into the perf.data file.
      
      A file created by perf after his patch is unsupported by previous
      version because the size of the headers have increased.
      
      That said, it's two new fields that have been added in the end of
      the headers, and those could be ignored by previous versions if
      they just handled the dynamic header size and then ignore the
      unknow part. The offsets guarantee the compatibility. We'll do a
      -stable fix for that.
      
      But current previous versions handle the header size using its
      static size, not dynamic, then it's not backward compatible with
      trace records.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <20091006213643.GA5343@nowhere>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      03456a15
    • Frederic Weisbecker's avatar
      perf tools: Start the perf.data mapping at data offset in perf trace · b209aa1f
      Frederic Weisbecker authored
      Currently, we are mapping perf.data in the beginning of the file
      and use the data offset as a buffer offset.
      
      This may exceed the mapping area if the data offset is upper than
      page_size * mmap_window and result in a page fault (thing that
      happen if we merge trace.info in perf.data).
      
      Instead, let's start the mapping in the page that matches our data
      offset.
      
      v2: Drop a junk from another patch (trace_report() removal)
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <1254856886-10348-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      b209aa1f
  4. 06 Oct, 2009 11 commits
  5. 05 Oct, 2009 4 commits
  6. 04 Oct, 2009 2 commits
    • Chris Wilson's avatar
      perf: Propagate term signal to child · 933da83a
      Chris Wilson authored
      If we launch the child on behalf of the user, ensure that it dies
      along with ourselves when we are interrupted.
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      LKML-Reference: <1254616502-4728-1-git-send-email-chris@chris-wilson.co.uk>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      933da83a
    • Arnaldo Carvalho de Melo's avatar
      perf tools: Remove show_mask bitmask · ec218fc4
      Arnaldo Carvalho de Melo authored
      As it was not being exposed via any command line and with --dsos/--comms
      we can do this and even more, like asking for just kernel + some module:
      
      [root@doppio linux-2.6-tip]# perf report --dsos \[kernel\],\[drm\]
      --vmlinux /home/acme/git/build/tip-recvmmsg/vmlinux --modules | head -15
       # Samples: 619669
       #
       # Overhead          Command  Shared Object  Symbol
       # ........  ...............  .............  ......
       #
            7.12%          swapper  [kernel]       [k] read_hpet
            6.86%             init  [kernel]       [k] read_hpet
            6.22%             init  [kernel]       [k] mwait_idle_with_hints
            5.34%          swapper  [kernel]       [k] mwait_idle_with_hints
            3.01%          firefox  [kernel]       [.] vread_hpet
            2.14%             Xorg  [drm]          [k] drm_clflush_pages
            2.09%           pidgin  [kernel]       [.] vread_hpet
            1.58%     npviewer.bin  [kernel]       [.] vread_hpet
            1.37%          swapper  [kernel]       [k] hpet_next_event
            1.23%             Xorg  [kernel]       [k] read_hpet
      [root@doppio linux-2.6-tip]#
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <20091003233048.GA30535@ghostprotocols.net>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      ec218fc4
  7. 03 Oct, 2009 1 commit
  8. 02 Oct, 2009 1 commit
    • Arnaldo Carvalho de Melo's avatar
      perf tools: Rewrite and improve support for kernel modules · 439d473b
      Arnaldo Carvalho de Melo authored
      Representing modules as struct map entries, backed by a DSO, etc,
      using /proc/modules to find where the module is loaded.
      
      DSOs now can have a short and long name, so that in verbose mode we
      can show exactly which .ko or vmlinux image was used.
      
      As kernel modules now are a DSO separate from the kernel, we can
      ask for just the hits for a particular set of kernel modules, just
      like we can do with shared libraries:
      
      [root@doppio linux-2.6-tip]# perf report -n --vmlinux
      /home/acme/git/build/tip-recvmmsg/vmlinux --modules --dsos \[drm\] | head -15
          84.58%      13266             Xorg  [k] drm_clflush_pages
           4.02%        630             Xorg  [k] trace_kmalloc.clone.0
           3.95%        619             Xorg  [k] drm_ioctl
           2.07%        324             Xorg  [k] drm_addbufs
           1.68%        263             Xorg  [k] drm_gem_close_ioctl
           0.77%        120             Xorg  [k] drm_setmaster_ioctl
           0.70%        110             Xorg  [k] drm_lastclose
           0.68%        106             Xorg  [k] drm_open
           0.54%         85             Xorg  [k] drm_mm_search_free
      [root@doppio linux-2.6-tip]#
      
      Specifying --dsos /lib/modules/2.6.31-tip/kernel/drivers/gpu/drm/drm.ko
      would have the same effect. Allowing specifying just 'drm.ko' is left
      for another patch.
      
      Processing kallsyms so that per kernel module struct map are
      instantiated was also left for another patch. That will allow
      removing the module name from each of its symbols.
      
      struct symbol was reduced by removing the ->module backpointer and
      moving it (well now the map) to struct symbol_entry in perf top,
      that is its only user right now.
      
      The total linecount went down by ~500 lines.
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Avi Kivity <avi@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      439d473b
  9. 01 Oct, 2009 5 commits
  10. 30 Sep, 2009 5 commits
    • Arnaldo Carvalho de Melo's avatar
      perf trace: Remove dead code · cad30714
      Arnaldo Carvalho de Melo authored
      Several variables are not used at all, cut'n'paste leftovers.
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      LKML-Reference: <20090928200818.GF3361@ghostprotocols.net>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      cad30714
    • Arnaldo Carvalho de Melo's avatar
      perf sched: Remove dead code · a80deb62
      Arnaldo Carvalho de Melo authored
      Several variables are not used at all, cut'n'paste leftovers.
      
      Also check if the sample_type is RAW earlier, to avoid needless
      searches.
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      a80deb62
    • Arnaldo Carvalho de Melo's avatar
      perf tools: Use rb_tree for maps · 1b46cddf
      Arnaldo Carvalho de Melo authored
      Threads can have many and kernel modules will be represented as a
      tree of maps as well.
      
      Ah, and for a perf.data with 146607 samples:
      
      Before:
      
      [root@doppio ~]# perf stat -r 5 perf report > /dev/null
      
       Performance counter stats for 'perf report' (5 runs):
      
           699.823680  task-clock-msecs         #      0.991 CPUs    ( +-   0.454% )
                   74  context-switches         #      0.000 M/sec   ( +-   1.709% )
                    2  CPU-migrations           #      0.000 M/sec   ( +-  17.008% )
                23114  page-faults              #      0.033 M/sec   ( +-   0.000% )
           1381257019  cycles                   #   1973.721 M/sec   ( +-   0.290% )
           1456894438  instructions             #      1.055 IPC     ( +-   0.007% )
             18779818  cache-references         #     26.835 M/sec   ( +-   0.380% )
               641799  cache-misses             #      0.917 M/sec   ( +-   1.200% )
      
          0.705972729  seconds time elapsed   ( +-   0.501% )
      
      [root@doppio ~]#
      
      After
      
       Performance counter stats for 'perf report' (5 runs):
      
           691.261451  task-clock-msecs         #      0.993 CPUs    ( +-   0.307% )
                   72  context-switches         #      0.000 M/sec   ( +-   0.829% )
                    6  CPU-migrations           #      0.000 M/sec   ( +-  18.409% )
                23127  page-faults              #      0.033 M/sec   ( +-   0.000% )
           1366395876  cycles                   #   1976.670 M/sec   ( +-   0.153% )
           1443136016  instructions             #      1.056 IPC     ( +-   0.012% )
             17956402  cache-references         #     25.976 M/sec   ( +-   0.325% )
               661924  cache-misses             #      0.958 M/sec   ( +-   1.335% )
      
          0.696127275  seconds time elapsed   ( +-   0.377% )
      
      I.e. we see some speedup too.
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      LKML-Reference: <20090928174846.GA3361@ghostprotocols.net>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      1b46cddf
    • John Kacur's avatar
      perf tools: Put common histogram functions in their own file · 3d1d07ec
      John Kacur authored
      Move histogram related functions into their own files (hist.c and
      hist.h) and make use of them in builtin-annotate.c and
      builtin-report.c.
      Signed-off-by: default avatarJohn Kacur <jkacur@redhat.com>
      Acked-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <alpine.LFD.2.00.0909281531180.8316@localhost.localdomain>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      3d1d07ec
    • Arnaldo Carvalho de Melo's avatar
      perf top: Add poll_idle to the skip list · 8357275b
      Arnaldo Carvalho de Melo authored
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <20090925220239.GA5488@ghostprotocols.net>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      8357275b
  11. 27 Sep, 2009 2 commits