  1. 29 Jun, 2009 1 commit
  2. 28 Jun, 2009 1 commit
  3. 27 Jun, 2009 4 commits
    • perf stat: Improve output · 6e750a8f
      Jaswinder Singh Rajput authored
      Increase size for event name to handle bigger names like
      'L1-d$-prefetch-misses'
      
      Changed scaled counters from percentage to a multiplicative
      factor because the latter is more expressive.
      
      Also aligned the scaling factor, otherwise sometimes it looks
      like:
      
                  384  iTLB-load-misses           (4.74x scaled)
               452029  branch-loads               (8.00x scaled)
                 5892  branch-load-misses         (20.39x scaled)
               972315  iTLB-loads                 (3.24x scaled)
      
      Before:
               150708  L1-d$-stores          (scaled from 23.57%)
               428804  L1-d$-prefetches      (scaled from 23.47%)
               314446  L1-d$-prefetch-misses  (scaled from 23.42%)
            252626137  L1-i$-loads           (scaled from 23.24%)
              5297550  dTLB-load-misses      (scaled from 23.96%)
            106992392  branch-loads          (scaled from 23.67%)
              5239561  branch-load-misses    (scaled from 23.43%)
      
      After:
              1731713  L1-d$-loads               (  14.25x scaled)
                44241  L1-d$-prefetches          (   3.88x scaled)
                21076  L1-d$-prefetch-misses     (   3.40x scaled)
              5789421  L1-i$-loads               (   3.78x scaled)
                29645  dTLB-load-misses          (   2.95x scaled)
               461474  branch-loads              (   6.52x scaled)
                 7493  branch-load-misses        (  26.57x scaled)
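The two notations are reciprocals of each other: a counter that was running for 25% of the time it was enabled has a 4.00x scale factor. A minimal sketch of that relationship in C, assuming the count is reported together with the time the counter was enabled and the time it was actually running. The function names are illustrative and not taken from the perf sources:

```c
#include <stdint.h>

/* Hypothetical helper: factor by which a raw count is scaled up when the
 * counter ran for only part of the time it was enabled.
 * Returns 0.0 if the counter never ran (nothing to scale). */
static double scale_factor(uint64_t time_enabled, uint64_t time_running)
{
        if (time_running == 0)
                return 0.0;
        return (double)time_enabled / (double)time_running;
}

/* The older notation: share of the enabled time the counter was running. */
static double running_percent(uint64_t time_enabled, uint64_t time_running)
{
        if (time_enabled == 0)
                return 0.0;
        return 100.0 * (double)time_running / (double)time_enabled;
}

/* Estimated full count, extrapolated from the raw count and rounded. */
static uint64_t scaled_count(uint64_t raw, uint64_t time_enabled,
                             uint64_t time_running)
{
        double f = scale_factor(time_enabled, time_running);

        return (uint64_t)((double)raw * f + 0.5);
}
```

With time_enabled = 100 and time_running = 25, a raw count of 10 is extrapolated to 40, which the old format would label "scaled from 25.00%" and the new one "4.00x scaled".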
      
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <1246051927.2988.10.camel@hpdv5.satnam>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf stat: Fix multi-run stats · 566747e6
      Ingo Molnar authored
      In multi-run (-r/--repeat) printouts, print out the noise of
      the wall-clock average as well.
      
      Also, fix a bug in printing out scaled counters: if it was not
      scaled then we should not update the average with -1.
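The averaging issue can be illustrated with a small sketch in C. It assumes that a run without a usable scaled value is stored as -1 and must be skipped when averaging. The struct, the sentinel name and the noise measure (mean absolute deviation as a percentage of the mean) are assumptions made for this example; perf stat's own formula may differ:

```c
#include <stddef.h>

/* Hypothetical sentinel for a run that has no scaled value. */
#define NOT_SCALED (-1.0)

struct run_stats {
        double mean;    /* average over the runs that contributed */
        double noise;   /* mean absolute deviation, in percent of the mean */
        size_t used;    /* number of runs that contributed */
};

static struct run_stats summarize_runs(const double *values, size_t nr_runs)
{
        struct run_stats st = { 0.0, 0.0, 0 };
        double sum = 0.0, dev = 0.0;
        size_t i;

        for (i = 0; i < nr_runs; i++) {
                if (values[i] == NOT_SCALED)
                        continue;       /* do not fold -1 into the average */
                sum += values[i];
                st.used++;
        }
        if (st.used == 0)
                return st;
        st.mean = sum / (double)st.used;

        for (i = 0; i < nr_runs; i++) {
                double d;

                if (values[i] == NOT_SCALED)
                        continue;
                d = values[i] - st.mean;
                dev += d < 0 ? -d : d;
        }
        if (st.mean != 0.0)
                st.noise = 100.0 * (dev / (double)st.used) / st.mean;
        return st;
}
```

For the runs {10, -1, 20, 30} only three values contribute, so the mean is 20 rather than the 14.75 that results from averaging the sentinel in.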
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf stat: Add -n/--null option to run without counters · 0cfb7a13
      Ingo Molnar authored
      Allow a no-counters run. This can be useful to measure just
      elapsed wall-clock time - or to assess the raw overhead of perf
      stat itself, without running any counters.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Remove dead code · fde953c1
      Ingo Molnar authored
      Vince Weaver reported that there's a handful of #ifdef __MINGW32__
      sections in the code.
      
      Remove them as they are in essence dead code - as unlike upstream
      Git, the perf tool is unlikely to be ported to Windows.
      
      Reported-by: Vince Weaver <vince@deater.net>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  4. 26 Jun, 2009 3 commits
    • perf_counter: Complete counter swap · 19d2e755
      Peter Zijlstra authored
      Complete the counter swap by indeed switching the times too and
      updating the userpage after modifying the counter values.
      
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1246014623.31755.195.camel@twins>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf report: Print sorted callchains per histogram entries · f55c5552
      Frederic Weisbecker authored
      Use the newly created callchains radix tree to gather the chains stats
      from the recorded events and then print the callchains for all of them,
      sorted by hits, using the "-c" parameter with perf report.
      
      Example:
      
       66.15%  [k] atm_clip_exit
                  63.08%
                      0xffffffffffffff80
                      0xffffffff810196a8
                      0xffffffff810c14c8
                      0xffffffff8101a79c
                      0xffffffff810194f3
                      0xffffffff8106ab7f
                      0xffffffff8106abe5
                      0xffffffff8106acde
                      0xffffffff8100d94b
                      0xffffffff8153e7ea
                      [...]
      
                   1.54%
                      0xffffffffffffff80
                      0xffffffff810196a8
                      0xffffffff810c14c8
                      0xffffffff8101a79c
                      [...]
      
      Symbols are not yet resolved.
      
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1246026481-8314-3-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Prepare a small callchain framework · 8cb76d99
      Frederic Weisbecker authored
      We plan to display the callchains depending on some user-configurable
      parameters.
      
      To gather the callchain stats from the recorded stream quickly,
      this patch introduces an ad hoc radix tree adapted for callchains and also
      an rbtree to sort these callchains once we have gathered every event
      from the stream.
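The gather-then-sort idea can be sketched in C as follows. This is not the patch's code: it swaps in a plain prefix tree (one node per address, no path compression) for the radix tree and qsort() for the rbtree, and every name in it is hypothetical. Nodes are never freed in this sketch:

```c
#include <stdint.h>
#include <stdlib.h>

/* One node per address; chains with a common prefix share nodes. */
struct chain_node {
        uint64_t ip;
        unsigned long hits;             /* chains ending exactly here */
        struct chain_node *child;       /* first child */
        struct chain_node *sibling;     /* next node with the same parent */
};

/* Record one callchain of nr addresses under root; returns 0 on success. */
static int chain_add(struct chain_node *root, const uint64_t *ips, size_t nr)
{
        struct chain_node *cur = root;
        size_t i;

        for (i = 0; i < nr; i++) {
                struct chain_node *c = cur->child;

                while (c && c->ip != ips[i])
                        c = c->sibling;
                if (!c) {
                        c = calloc(1, sizeof(*c));
                        if (!c)
                                return -1;
                        c->ip = ips[i];
                        c->sibling = cur->child;
                        cur->child = c;
                }
                cur = c;
        }
        cur->hits++;
        return 0;
}

/* Collect every node that terminates a chain; returns the new length. */
static size_t chain_collect(struct chain_node *n, struct chain_node **out,
                            size_t max, size_t len)
{
        for (; n; n = n->sibling) {
                if (n->hits && len < max)
                        out[len++] = n;
                len = chain_collect(n->child, out, max, len);
        }
        return len;
}

/* qsort() comparator: most hits first. */
static int by_hits_desc(const void *a, const void *b)
{
        const struct chain_node *x = *(struct chain_node *const *)a;
        const struct chain_node *y = *(struct chain_node *const *)b;

        return (x->hits < y->hits) - (x->hits > y->hits);
}

/* Fill out[] with chain end nodes sorted by hits; returns their count. */
static size_t chain_sorted(struct chain_node *root, struct chain_node **out,
                           size_t max)
{
        size_t n = chain_collect(root->child, out, max, 0);

        qsort(out, n, sizeof(*out), by_hits_desc);
        return n;
}
```

Inserting is a walk along existing nodes for the shared prefix, so repeated chains only bump a counter. Sorting is deferred until the whole stream has been consumed, which mirrors the two-phase design described above.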
      
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1246026481-8314-2-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  5. 25 Jun, 2009 14 commits
    • perf record: Fix unhandled io return value · 3928ddbe
      Frederic Weisbecker authored
      Building the latest perfcounter tree fails with the following error:
      
       builtin-record.c: In function ‘create_counter’:
       builtin-record.c:451: error: ignoring return value of ‘read’, declared with attribute warn_unused_result
       make: *** [builtin-record.o] Error 1
      
      Just check if we successfully read the perf file descriptor.
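A checked read of this kind might look like the sketch below. The helper name read_u64() is hypothetical and the actual change to create_counter() is not shown on this page; the point is only that the result of read() is compared against the expected size instead of being discarded:

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* Read exactly one 64-bit value from fd.
 * Returns 0 on success, -1 on error, end of file or a short read. */
static int read_u64(int fd, uint64_t *value)
{
        ssize_t ret;

        do {
                ret = read(fd, value, sizeof(*value));
        } while (ret < 0 && errno == EINTR);    /* retry if interrupted */

        if (ret != (ssize_t)sizeof(*value))
                return -1;
        return 0;
}
```

A caller can then report the failure and bail out, which also satisfies the warn_unused_result attribute that triggered the build error.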
      
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1245961287-5327-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Add alias for 'l1d' and 'l1i' · 4418351f
      Jaswinder Singh Rajput authored
      Add 'l1d' and 'l1i' aliases again as shortcuts - just don't make them
      the primary display alias.
      
      Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1245945462.9157.11.camel@hpdv5.satnam>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf-report: Add bare minimum PERF_EVENT_READ parsing · e9ea2fde
      Peter Zijlstra authored
      Provide the basic infrastructure to provide per task stats.
      
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf-report: Add modes for inherited stats and no-samples · 649c48a9
      Peter Zijlstra authored
      Now that we can collect per task statistics, add modes that
      make use of that facility.
      
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter: Rework the sample ABI · e6e18ec7
      Peter Zijlstra authored
      The PERF_EVENT_READ implementation made me realize we don't
      actually need the sample_type in the output sample, since
      we already have that in the perf_counter_attr information.
      
      Therefore, remove the PERF_EVENT_MISC_OVERFLOW bit and the
      event->type overloading, and simply put counter overflow
      samples in a PERF_EVENT_SAMPLE type.
      
      This also fixes the issue that event->type was only 32-bit
      and sample_type had 64 usable bits.
      
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter: Implement more accurate per task statistics · bfbd3381
      Peter Zijlstra authored
      With the introduction of PERF_EVENT_READ we have the
      possibility to provide accurate counter values for
      individual tasks in a task hierarchy.
      
      However, due to the lazy context switching used for similar
      counter contexts our current per task counts are way off.
      
      In order to maintain some of the lazy switch benefits we
      don't disable it outright, but simply iterate the active
      counters and flip the values between the contexts.
      
      This only reads the counters but does not need to reprogram
      the full PMU.
      
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter: Add PERF_EVENT_READ · 38b200d6
      Peter Zijlstra authored
      Provide a read() like event which can be used to log the
      counter value at specific sites such as child->parent
      folding on exit.
      
      In order to be useful, we log the counter parent ID, not the
      actual counter ID, since userspace can only relate parent
      IDs to perf_counter_attr constructs.
      
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter, x86: Add mmap counter read support · 194002b2
      Peter Zijlstra authored
      Update the mmap control page with the needed information to
      use the userspace RDPMC instruction for self monitoring.
      
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter: Add scale information to the mmap control page · 7f8b4e4e
      Peter Zijlstra authored
      Add the needed time scale to the self-profile mmap information.
      
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter: Split the mmap control page in two parts · 41f95331
      Peter Zijlstra authored
      Since there are two distinct sections to the control page,
      move them apart so that possible extensions don't overlap.
      
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Rework the file format · 7c6a1c65
      Peter Zijlstra authored
      Create a structured file format that includes the full
      perf_counter_attr and all its relevant counter IDs so that
      the reporting program has full information.
      
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Shorten names for events · e5c59547
      Jaswinder Singh Rajput authored
      Added new, shorter aliases for events.
      
      On AMD box:
      
       $ ./perf stat -e l1d -e l1d-misses -e l1d-write -e l1d-prefetch -e l1d-prefetch-miss -e l1i -e l1i-misses -e l1i-prefetch -e l2 -e l2-misses -e l2-write -e dtlb -e dtlb-misses -e itlb -e itlb-misses -e bpu -e bpu-misses -- ls -lR /usr/include/ > /dev/null
      
      Before:
      
       Performance counter stats for 'ls -lR /usr/include/':
      
            248064467  L1-data-Cache-Load-Referencees  (scaled from 23.27%)
              1001433  L1-data-Cache-Load-Misses  (scaled from 23.34%)
               153691  L1-data-Cache-Store-Referencees  (scaled from 23.34%)
               423248  L1-data-Cache-Prefetch-Referencees  (scaled from 23.33%)
               302138  L1-data-Cache-Prefetch-Misses  (scaled from 23.25%)
            251217546  L1-instruction-Cache-Load-Referencees  (scaled from 23.25%)
              5757005  L1-instruction-Cache-Load-Misses  (scaled from 23.23%)
                93435  L1-instruction-Cache-Prefetch-Referencees  (scaled from 23.24%)
              6496073  L2-Cache-Load-Referencees  (scaled from 23.32%)
               609485  L2-Cache-Load-Misses  (scaled from 23.45%)
              6876991  L2-Cache-Store-Referencees  (scaled from 23.71%)
            248922840  Data-TLB-Cache-Load-Referencees  (scaled from 23.94%)
              5828386  Data-TLB-Cache-Load-Misses  (scaled from 24.17%)
            257613506  Instruction-TLB-Cache-Load-Referencees  (scaled from 24.20%)
                 6833  Instruction-TLB-Cache-Load-Misses  (scaled from 23.88%)
            109043606  Branch-Cache-Load-Referencees  (scaled from 23.64%)
              5552296  Branch-Cache-Load-Misses  (scaled from 23.42%)
      
          0.413702461  seconds time elapsed.
      
      After:
      
       Performance counter stats for 'ls -lR /usr/include/':
      
            266590464  L1-d$-loads           (scaled from 23.03%)
              1222273  L1-d$-load-misses     (scaled from 23.58%)
               146204  L1-d$-stores          (scaled from 23.83%)
               406344  L1-d$-prefetches      (scaled from 24.09%)
               283748  L1-d$-prefetch-misses (scaled from 24.10%)
            249650965  L1-i$-loads           (scaled from 23.80%)
              3353961  L1-i$-load-misses     (scaled from 23.82%)
               104599  L1-i$-prefetches      (scaled from 23.68%)
              4836405  LLC-loads             (scaled from 23.67%)
               498214  LLC-load-misses       (scaled from 23.66%)
              4953994  LLC-stores            (scaled from 23.64%)
            243354097  dTLB-loads            (scaled from 23.77%)
              6468584  dTLB-load-misses      (scaled from 23.74%)
            249719549  iTLB-loads            (scaled from 23.25%)
                 5060  iTLB-load-misses      (scaled from 23.00%)
            112343016  branch-loads          (scaled from 22.76%)
              5528876  branch-load-misses    (scaled from 22.54%)
      
          0.427154051  seconds time elapsed.
      
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <1245934522.5308.39.camel@hpdv5.satnam>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter tools: Check for valid cache operations · 06813f6c
      Jaswinder Singh Rajput authored
      Made a new table of valid cache operations, 'hw_cache_stat':
      
       L1I : Read and prefetch only
       ITLB and BPU : Read-only
      
      Introduce is_cache_op_valid() to test cache operation validity,
      and check for valid cache operations.
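A table-driven check along these lines could be sketched as below. Only the names hw_cache_stat and is_cache_op_valid() and the L1I, ITLB and BPU restrictions come from the description above. The enum names, the bitmask encoding, the function signature, and the assumption that the remaining caches accept every operation are made up for this example:

```c
#include <stdbool.h>

/* Hypothetical identifiers for cache types and operations. */
enum cache_type {
        CACHE_L1D, CACHE_L1I, CACHE_LL, CACHE_DTLB, CACHE_ITLB, CACHE_BPU,
        CACHE_TYPE_MAX
};
enum cache_op { OP_READ, OP_WRITE, OP_PREFETCH, OP_MAX };

#define OP_BIT(op)      (1u << (op))
#define OPS_ALL         (OP_BIT(OP_READ) | OP_BIT(OP_WRITE) | OP_BIT(OP_PREFETCH))

/* One bitmask of permitted operations per cache type. */
static const unsigned int hw_cache_stat[CACHE_TYPE_MAX] = {
        [CACHE_L1D]  = OPS_ALL,                 /* assumed unrestricted */
        [CACHE_L1I]  = OP_BIT(OP_READ) | OP_BIT(OP_PREFETCH),
        [CACHE_LL]   = OPS_ALL,                 /* assumed unrestricted */
        [CACHE_DTLB] = OPS_ALL,                 /* assumed unrestricted */
        [CACHE_ITLB] = OP_BIT(OP_READ),
        [CACHE_BPU]  = OP_BIT(OP_READ),
};

/* True if the operation makes sense for the given cache type. */
static bool is_cache_op_valid(enum cache_type type, enum cache_op op)
{
        if ((unsigned int)type >= (unsigned int)CACHE_TYPE_MAX ||
            (unsigned int)op >= (unsigned int)OP_MAX)
                return false;
        return (hw_cache_stat[type] & OP_BIT(op)) != 0;
}
```

With this table a request such as "L1I write" or "ITLB prefetch" is rejected, while "L1I prefetch" and "BPU read" are accepted.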
      
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <1245930367.5308.33.camel@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf record: Fix filemap pathname parsing in /proc/pid/maps · 76c64c5e
      Johannes Weiner authored
      Looking backward for the first space from the end of a line in
      /proc/pid/maps does not find the start of the pathname of the mapped
      file if it contains a space.
      
      Since the only slashes we have in this file occur in the (absolute!)
      pathname column of file mappings, looking for the first slash in a
      line is a safe method to find the name.
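The difference between the two approaches shows up on a path that contains a space. The sketch below uses two hypothetical helper names and a made-up maps line. It is not the code from builtin-record.c, and it leaves out details such as stripping the trailing newline:

```c
#include <string.h>

/* Old approach: take whatever follows the last space in the line. */
static const char *pathname_by_last_space(const char *line)
{
        const char *sp = strrchr(line, ' ');

        return sp ? sp + 1 : NULL;
}

/* New approach: the first slash starts the (absolute) pathname.
 * Returns NULL for lines without a file path, e.g. [stack]. */
static const char *pathname_by_first_slash(const char *line)
{
        return strchr(line, '/');
}
```

For the line "00400000-0040b000 r-xp 00000000 08:01 1234 /home/u/my dir/prog" the first helper yields "dir/prog", while the second yields the full "/home/u/my dir/prog".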
      
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Stefani Seibold <stefani@seibold.net>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20090624190835.GA25548@cmpxchg.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  6. 24 Jun, 2009 4 commits
  7. 23 Jun, 2009 7 commits
  8. 22 Jun, 2009 4 commits
  9. 21 Jun, 2009 2 commits
    • perf_counter tools: Fix vmlinux fallback when running on a different kernel · c1f47b45
      Ingo Molnar authored
      Lucas De Marchi reported that perf report and perf annotate
      display a mismatching profile if a perf.data is analyzed on
      an older kernel - even if the correct vmlinux is specified
      via the -k option.
      
      The reason is the fallback path in util/symbol.c:dso__load_kernel():
      
      int dso__load_kernel(struct dso *self, const char *vmlinux,
                           symbol_filter_t filter, int verbose)
      {
              int err = -1;
      
              if (vmlinux)
                      err = dso__load_vmlinux(self, vmlinux, filter, verbose);
      
              if (err)
                      err = dso__load_kallsyms(self, filter, verbose);
      
              return err;
      }
      
      dso__load_vmlinux() returns negative on error, but on success it
      returns the number of symbols loaded - which the function mistakes
      for an error, so it goes on to load the kallsyms as well.
      
      This is normally harmless, as reporting is usually performed on the
      same kernel that is analyzed - but if there's a mismatch then we
      load the wrong kallsyms and create a nonsensical symbol tree.
      
      The fix is to only fall back to kallsyms on errors.
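A reduced model of the corrected control flow is sketched below. It is an illustration rather than the actual patch: the loaders are passed in as function pointers and replaced by stand-ins so the fragment is self-contained, and all names other than the negative-on-error convention described above are hypothetical:

```c
#include <stddef.h>

/* Loader contract assumed from the description: negative on error,
 * otherwise the number of symbols loaded. */
typedef int (*loader_fn)(const char *path);

static int load_kernel_symbols(const char *vmlinux, loader_fn load_vmlinux,
                               loader_fn load_kallsyms)
{
        int err = -1;

        if (vmlinux)
                err = load_vmlinux(vmlinux);

        if (err < 0)    /* fall back only on error, not on a symbol count */
                err = load_kallsyms(NULL);

        return err;
}

/* Stand-in loaders used only to exercise the control flow. */
static int kallsyms_calls;

static int fake_vmlinux_ok(const char *path)
{
        (void)path;
        return 1234;    /* pretend 1234 symbols were loaded */
}

static int fake_vmlinux_fail(const char *path)
{
        (void)path;
        return -1;
}

static int fake_kallsyms(const char *path)
{
        (void)path;
        kallsyms_calls++;
        return 99;
}
```

With fake_vmlinux_ok the kallsyms loader is never called. With fake_vmlinux_fail, or with no vmlinux path at all, it is called exactly once.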
      
      Reported-by: Lucas De Marchi <lucas.de.marchi@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter, x86: Fix L1-data-Cache-Store-Referencees for AMD · d9f2a5ec
      Jaswinder Singh Rajput authored
      Fix AMD's Data Cache Refills from System event.
      
      After this patch:
      
       ./tools/perf/perf stat -e l1d -e l1d-misses -e l1d-write -e l1d-prefetch -e l1d-prefetch-miss -e l1i -e l1i-misses -e l1i-prefetch -e l2 -e l2-misses -e l2-write -e dtlb -e dtlb-misses -e itlb -e itlb-misses -e bpu -e bpu-misses ls /dev/ > /dev/null
      
       Performance counter stats for 'ls /dev/':
      
              2499484  L1-data-Cache-Load-Referencees             (scaled from 3.97%)
                70347  L1-data-Cache-Load-Misses                  (scaled from 7.30%)
                 9360  L1-data-Cache-Store-Referencees            (scaled from 8.64%)
                32804  L1-data-Cache-Prefetch-Referencees         (scaled from 17.72%)
                 7693  L1-data-Cache-Prefetch-Misses              (scaled from 22.97%)
              2180945  L1-instruction-Cache-Load-Referencees      (scaled from 28.48%)
                14518  L1-instruction-Cache-Load-Misses           (scaled from 35.00%)
                 2405  L1-instruction-Cache-Prefetch-Referencees  (scaled from 34.89%)
                71387  L2-Cache-Load-Referencees                  (scaled from 34.94%)
                18732  L2-Cache-Load-Misses                       (scaled from 34.92%)
                79918  L2-Cache-Store-Referencees                 (scaled from 36.02%)
              1295294  Data-TLB-Cache-Load-Referencees            (scaled from 35.99%)
                30896  Data-TLB-Cache-Load-Misses                 (scaled from 33.36%)
              1222030  Instruction-TLB-Cache-Load-Referencees     (scaled from 29.46%)
                  357  Instruction-TLB-Cache-Load-Misses          (scaled from 20.46%)
               530888  Branch-Cache-Load-Referencees              (scaled from 11.48%)
                 8638  Branch-Cache-Load-Misses                   (scaled from 5.09%)
      
          0.011295149  seconds time elapsed.
      
      Earlier it always showed the value 0.
      
      Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
      LKML-Reference: <1245484165.3102.6.camel@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>