1. 10 May, 2010 13 commits
    • Tom Zanussi's avatar
      perf/trace/scripting: wakeup-latency script cleanup · e366728d
      Tom Zanussi authored
      Some minor fixes for the wakeup-latency script:
      
       - Fix nuisance 'use of uninitialized value' warnings
      
       - Avoid divide-by-zero error
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      LKML-Reference: <1273466820-9330-5-git-send-email-tzanussi@gmail.com>
      Signed-off-by: default avatarTom Zanussi <tzanussi@gmail.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      e366728d
    • Tom Zanussi's avatar
      perf/trace/scripting: rwtop script cleanup · e88a4bfb
      Tom Zanussi authored
      A couple of fixes for the rwtop script:
      
      - printing the totals and clearing the hashes in the signal handler
        eventually leads to various random and serious problems when running
        the rwtop script continuously.  Moving the print_totals() calls to
        the event handlers solves that problem, and the event handlers are
        invoked frequently enough that it doesn't affect the timeliness of
        the output.
      
      - Fix nuisance 'use of uninitialized value' warnings
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Message-Id: <1273466820-9330-4-git-send-email-tzanussi@gmail.com>
      Signed-off-by: default avatarTom Zanussi <tzanussi@gmail.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      e88a4bfb
    • Tom Zanussi's avatar
      perf/trace/scripting: rw-by-pid script cleanup · 6922c3d7
      Tom Zanussi authored
      Some minor fixes for the rw-by-pid script:
      
      - Fix nuisance 'use of uninitialized value' warnings
      
      - Change the failed read/write sections to sort by error counts
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      LKML-Reference: <1273466820-9330-3-git-send-email-tzanussi@gmail.com>
      Signed-off-by: default avatarTom Zanussi <tzanussi@gmail.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      6922c3d7
    • Tom Zanussi's avatar
      perf/trace/scripting: failed-syscalls script cleanup · c3f5fd28
      Tom Zanussi authored
      A couple small fixes for the failed syscalls script:
      
      - The script description says it can be restricted to a specific comm,
        make it so.
      
      - silence the match output in the shell script
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      LKML-Reference: <1273466820-9330-2-git-send-email-tzanussi@gmail.com>
      Signed-off-by: default avatarTom Zanussi <tzanussi@gmail.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      c3f5fd28
    • Arnaldo Carvalho de Melo's avatar
      perf hist: Calculate max_sym name len and nr_entries · fefb0b94
      Arnaldo Carvalho de Melo authored
      Better done when we are adding entries, be it initially of when we're
      re-sorting the histograms.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      fefb0b94
    • Arnaldo Carvalho de Melo's avatar
      perf hist: Introduce hists class and move lots of methods to it · 1c02c4d2
      Arnaldo Carvalho de Melo authored
      In cbbc79a we introduced support for multiple events by introducing a
      new "event_stat_id" struct and then made several perf_session methods
      receive a point to it instead of a pointer to perf_session, and kept the
      event_stats and hists rb_tree in perf_session.
      
      While working on the new newt based browser, I realised that it would be
      better to introduce a new class, "hists" (short for "histograms"),
      renaming the "event_stat_id" struct and the perf_session methods that
      were really "hists" methods, as they manipulate only struct hists
      members, not touching anything in the other perf_session members.
      
      Other optimizations, such as calculating the maximum lenght of a symbol
      name present in an hists instance will be possible as we add them,
      avoiding a re-traversal just for finding that information.
      
      The rationale for the name "hists" to replace "event_stat_id" is that we
      may have multiple sets of hists for the same event_stat id, as, for
      instance, the 'perf diff' tool has, so event stat id is not what
      characterizes what this struct and the functions that manipulate it do.
      
      Cc: Eric B Munson <ebmunson@us.ibm.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      1c02c4d2
    • Arnaldo Carvalho de Melo's avatar
      perf session: create_kernel_maps should use ->host_machine · d118f8ba
      Arnaldo Carvalho de Melo authored
      Using machines__create_kernel_maps(..., HOST_KERNEL_ID) it would create
      another machine instance for the host machine, and since 1f626bc3 we have
      it out of the machines rb_tree.
      
      Fix it by using machine__create_kernel_maps(&self->host_machine)
      directly.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      d118f8ba
    • Arnaldo Carvalho de Melo's avatar
      perf callchains: Use zalloc to allocate objects · cdd5b75b
      Arnaldo Carvalho de Melo authored
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      cdd5b75b
    • Arnaldo Carvalho de Melo's avatar
      perf newt: Use newtAddComponent() · 7f826453
      Arnaldo Carvalho de Melo authored
      Instead of newtAddComponents(just-one-entry, NULL), that is not needed
      if, like in this browser, we're adding just one component at a time.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      7f826453
    • Ingo Molnar's avatar
      Merge branch 'perf/test' of... · 1f0ac718
      Ingo Molnar authored
      Merge branch 'perf/test' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core
      1f0ac718
    • Arnaldo Carvalho de Melo's avatar
      perf report: Allow limiting the number of entries to print in callchains · 232a5c94
      Arnaldo Carvalho de Melo authored
      Works by adding a third parameter to the '-g' argument, after the graph
      type and minimum percentage, for example:
      
      [root@doppio linux-2.6-tip]# perf report -g fractal,0.5,2
      
      Will show only the first two symbols where at least 0.5% of the samples
      took place.
      
      All the other symbols that don't fall outside these constraints will be
      put together in the last entry, prefixed with "[...]" and the total
      percentage for them.
      Suggested-by: default avatarArjan van de Ven <arjan@linux.intel.com>
      Acked-by: default avatarArjan van de Ven <arjan@linux.intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      232a5c94
    • Arnaldo Carvalho de Melo's avatar
      perf session: Embed the host machine data on perf_session · 1f626bc3
      Arnaldo Carvalho de Melo authored
      We have just one host on a given session, and that is the most common
      setup right now, so embed a ->host_machine struct machine instance
      directly in the perf_session class, check if we're looking for it before
      going to the rb_tree.
      
      This also fixes a problem found when we try to process old perf.data
      files where we didn't have MMAP events for the kernel and modules and
      thus don't create the kernel maps, do it in event__preprocess_sample if
      it wasn't already.
      Reported-by: default avatarIngo Molnar <mingo@elte.hu>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      1f626bc3
    • Arnaldo Carvalho de Melo's avatar
      perf symbols: Check if a struct machine instance was found · 4cc49458
      Arnaldo Carvalho de Melo authored
      Which can happen when processing old files that had no fake kernel MMAP,
      events.
      
      That shouldn't result in perf_session__create_kernel_maps not being
      called, this will be fixed in a followup patch, for now do these checks
      to avoid segfaulting.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      4cc49458
  2. 09 May, 2010 16 commits
    • Arnaldo Carvalho de Melo's avatar
      perf symbols: Consider unresolved DSOs in the dso__col_widt calculation · 3ceb0d44
      Arnaldo Carvalho de Melo authored
      By using BITS_PER_LONG / 4, that is the number of chars that will be
      used in such cases as the DSO "name".
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      3ceb0d44
    • Hitoshi Mitake's avatar
      perf lock: Drop "-a" option from cmd_record() default arguments set · 76ba7e84
      Hitoshi Mitake authored
      This patch drops "-a" from the default arguments passed to
      perf record by perf lock.
      
      If a user wants to do a system wide record of lock events,
              perf lock record -a <program> <argument> ...
      is enough for this purpose.
      
      This can reduce the size of the perf.data file.
      
      % sudo ./perf lock record whoami
      root
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.439 MB perf.data (~19170 samples) ]
      % sudo ./perf lock record -a whoami   # with -a option
      root
      [ perf record: Woken up 0 times to write data ]
      [ perf record: Captured and wrote 48.962 MB perf.data (~2139197 samples) ]
      Signed-off-by: default avatarHitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      LKML-Reference: Message-Id: <1273306229-5216-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      76ba7e84
    • Arnaldo Carvalho de Melo's avatar
      perf hist: Simplify the insertion of new hist_entry instances · 28e2a106
      Arnaldo Carvalho de Melo authored
      And with that fix at least one bug:
      
      The first hit for an entry, the one that calls malloc to create a new
      instance in __perf_session__add_hist_entry, wasn't adding the count to
      the per cpumode (PERF_RECORD_MISC_USER, etc) total variable.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      28e2a106
    • Arnaldo Carvalho de Melo's avatar
      perf report: Fix leak of resolved callchains array on error path · 39d1e1b1
      Arnaldo Carvalho de Melo authored
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      39d1e1b1
    • Arnaldo Carvalho de Melo's avatar
      perf callchain: Move validate_callchain to callchain lib · 139633c6
      Arnaldo Carvalho de Melo authored
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      139633c6
    • Tom Zanussi's avatar
      perf/live-mode: Handle payload-less events · 794e43b5
      Tom Zanussi authored
      Some events, such as the PERF_RECORD_FINISHED_ROUND event consist of
      only an event header and no data.  In this case, a 0-length payload
      will be read, and the 0 return value will be wrongly interpreted as an
      'unexpected end of event stream'.
      
      This patch allows for proper handling of data-less events by skipping
      0-length reads.
      Signed-off-by: default avatarTom Zanussi <tzanussi@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      LKML-Reference: <1273038527.6383.51.camel@tropicana>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      794e43b5
    • Frederic Weisbecker's avatar
      tracing: Factorize lock events in a lock class · 2c193c73
      Frederic Weisbecker authored
      lock_acquired, lock_contended and lock_release now share the
      same prototype and format. Let's factorize them into a lock
      event class.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      2c193c73
    • Frederic Weisbecker's avatar
      tracing: Drop the nested field from lock_release event · 93135439
      Frederic Weisbecker authored
      Drop the nested field as we don't use it. Every nested state can
      be computed from a state machine on post processing already.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      93135439
    • Frederic Weisbecker's avatar
      tracing: Drop lock_acquired waittime field · 883a2a31
      Frederic Weisbecker authored
      Drop the waittime field from the lock_acquired event, we can
      calculate it by substracting the lock_acquired event timestamp
      with the matching lock_acquire one.
      
      It is not needed and takes useless space in the traces.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      883a2a31
    • Frederic Weisbecker's avatar
      perf lock: Always check min AND max wait time · 90c0e5fc
      Frederic Weisbecker authored
      When a lock is acquired after beeing contended, we update the
      wait time statistics for the given lock.
      But if the min wait time is updated, we don't check the max wait
      time. This is wrong because the first time we update the wait time,
      we want to update both min and max wait time.
      
      Before:
      	Name   acquired  contended total wait (ns)   max wait (ns)   min wait (ns)
      	key          8          1           21656           0           21656
      
      After:
      	Name   acquired  contended total wait (ns)   max wait (ns)   min wait (ns)
      	key          8          1           21656           21656           21656
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      90c0e5fc
    • Frederic Weisbecker's avatar
      perf: Fix perf lock bad rate · 5efe08cf
      Frederic Weisbecker authored
      Fix the cast made to get the bad rate. It is made in the result
      instead of the operands. We need the operands to be cast in double,
      otherwise the result will always be zero.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      5efe08cf
    • Frederic Weisbecker's avatar
      perf: Humanize lock flags in perf lock · 84c7a217
      Frederic Weisbecker authored
      Use an enum instead of plain constants for lock flags.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      84c7a217
    • Frederic Weisbecker's avatar
      perf: Cleanup perf lock broken states · 10350ec3
      Frederic Weisbecker authored
      Use enum to get a human view of bad_hist indexes and
      put bad histogram output in its own function.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      10350ec3
    • Hitoshi Mitake's avatar
      perf lock: Add "info" subcommand for dumping misc information · 26242d85
      Hitoshi Mitake authored
      This adds the "info" subcommand to perf lock which can be used
      to dump metadata like threads or addresses of lock instances.
      "map" was removed because info should do the work for it.
      
      This will be useful not only for debugging but also for ordinary
      analyzing.
      
      v2: adding example of usage
      % sudo ./perf lock info -t
       | Thread ID: comm
       | 	 0: swapper
       |         1: init
       |        18: migration/5
       |        29: events/2
       |        32: events/5
       |        33: events/6
      ...
      
      % sudo ./perf lock info -m
      | Address of instance: name of class
      |  0xffff8800b95adae0: &(&sighand->siglock)->rlock
      |  0xffff8800bbb41ae0: &(&sighand->siglock)->rlock
      |  0xffff8800bf165ae0: &(&sighand->siglock)->rlock
      |  0xffff8800b9576a98: &p->cred_guard_mutex
      |  0xffff8800bb890a08: &(&p->alloc_lock)->rlock
      |  0xffff8800b9522a08: &(&p->alloc_lock)->rlock
      |  0xffff8800bb8aaa08: &(&p->alloc_lock)->rlock
      |  0xffff8800bba72a08: &(&p->alloc_lock)->rlock
      |  0xffff8800bf18ea08: &(&p->alloc_lock)->rlock
      |  0xffff8800b8a0d8a0: &(&ip->i_lock)->mr_lock
      |  0xffff88009bf818a0: &(&ip->i_lock)->mr_lock
      |  0xffff88004c66b8a0: &(&ip->i_lock)->mr_lock
      |  0xffff8800bb6478a0: &(shost->host_lock)->rlock
      
      v3: fixed some problems Frederic pointed out
       * better rbtree tracking in dump_threads()
       * removed printf() and used pr_info() and pr_debug()
      Signed-off-by: default avatarHitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      LKML-Reference: <1272863520-16179-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      26242d85
    • Frederic Weisbecker's avatar
      perf: Provide a new deterministic events reordering algorithm · d6b17beb
      Frederic Weisbecker authored
      The current events reordering algorithm is based on a heuristic that
      gets broken once we deal with a very fast flow of events.
      
      Indeed the time period based flushing is not suitable anymore
      in the following case, assuming we have a flush period of two
      seconds.
      
          CPU 0           |        CPU 1
                          |
        cnt1 timestamps   |      cnt1 timestamps
                          |
          0               |         0
          1               |         1
          2               |         2
          3               |         3
          [...]           |        [...]
          4 seconds later
      
      If we spend too much time to read the buffers (case of a lot of
      events to record in each buffers or when we have a lot of CPU buffers
      to read), in the next pass the CPU 0 buffer could contain a slice
      of several seconds of events. We'll read them all and notice we've
      reached the period to flush. In the above example we flush the first
      half of the CPU 0 buffer, then we read the CPU 1 buffer where we
      have events that were on the flush slice and then the reordering
      fails.
      
      It's simple to reproduce with:
      
      	perf lock record perf bench sched messaging
      
      To solve this, we use a new solution that doesn't rely on an
      heuristical time slice period anymore but on a deterministic basis
      based on how perf record does its job.
      
      perf record saves the buffers through passes. A pass is a tour
      on every buffers from every CPUs. This is made in order: for
      each CPU we read the buffers of every counters. So the more
      buffers we visit, the later will be the timstamps of their events.
      
      When perf record finishes a pass it records a
      PERF_RECORD_FINISHED_ROUND pseudo event.
      We record the max timestamp t found in the pass n. Assuming these
      timestamps are monotonic across cpus, we know that if a buffer
      still has events with timestamps below t, they will be all available
      and then read in the pass n + 1.
      Hence when we start to read the pass n + 2, we can safely flush every
      events with timestamps below t.
      
            ============ PASS n =================
               CPU 0         |   CPU 1
                             |
            cnt1 timestamps  |   cnt2 timestamps
                  1          |         2
                  2          |         3
                  -          |         4  <--- max recorded
      
            ============ PASS n + 1 ==============
               CPU 0         |   CPU 1
                             |
            cnt1 timestamps  |   cnt2 timestamps
                  3          |         5
                  4          |         6
                  5          |         7 <---- max recorded
      
              Flush every events below timestamp 4
      
            ============ PASS n + 2 ==============
               CPU 0         |   CPU 1
                             |
            cnt1 timestamps  |   cnt2 timestamps
                  6          |         8
                  7          |         9
                  -          |         10
      
              Flush every events below timestamp 7
              etc...
      
      It also works on perf.data versions that don't have
      PERF_RECORD_FINISHED_ROUND pseudo events. The difference is that
      the events will be only flushed in the end of the perf.data
      processing. It will then consume more memory and scale less with
      large perf.data files.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      d6b17beb
    • Frederic Weisbecker's avatar
      perf: Introduce a new "round of buffers read" pseudo event · 98402807
      Frederic Weisbecker authored
      In order to provide a more rubust and deterministic reordering
      algorithm, we need to know when we reach a point where we just
      did a pass through over every counter buffers to read every thing
      they had.
      
      This patch introduces a new PERF_RECORD_FINISHED_ROUND pseudo event
      that only consist in an event header and doesn't need to contain
      anything.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      98402807
  3. 08 May, 2010 8 commits
  4. 07 May, 2010 3 commits
    • Arnaldo Carvalho de Melo's avatar
      perf list: Improve the raw hw event descriptor documentation · 1cf4a063
      Arnaldo Carvalho de Melo authored
      It was x86 specific and imcomplete at that, improve the situation by
      making it clear where the example provided applies and by adding the
      URLs for the Intel and AMD manuals where this is discussed in depth.
      Acked-by: default avatarRobert Richter <robert.richter@amd.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Reported-by: Robert Richter <robert.richter@amd.com
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      1cf4a063
    • Lin Ming's avatar
      perf, x86: implement group scheduling transactional APIs · 4d1c52b0
      Lin Ming authored
      Convert to the transactional PMU API and remove the duplication of
      group_sched_in().
      Reviewed-by: default avatarStephane Eranian <eranian@google.com>
      Signed-off-by: default avatarLin Ming <ming.m.lin@intel.com>
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: David Miller <davem@davemloft.net>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1272002172.5707.61.camel@minggr.sh.intel.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      4d1c52b0
    • Lin Ming's avatar
      perf: Add group scheduling transactional APIs · 6bde9b6c
      Lin Ming authored
      Add group scheduling transactional APIs to struct pmu.
      These APIs will be implemented in arch code, based on Peter's idea as
      below.
      
      > the idea behind hw_perf_group_sched_in() is to not perform
      > schedulability tests on each event in the group, but to add the group
      > as a whole and then perform one test.
      >
      > Of course, when that test fails, you'll have to roll-back the whole
      > group again.
      >
      > So start_txn (or a better name) would simply toggle a flag in the pmu
      > implementation that will make pmu::enable() not perform the
      > schedulablilty test.
      >
      > Then commit_txn() will perform the schedulability test (so note the
      > method has to have a !void return value.
      >
      > This will allow us to use the regular
      > kernel/perf_event.c::group_sched_in() and all the rollback code.
      > Currently each hw_perf_group_sched_in() implementation duplicates all
      > the rolllback code (with various bugs).
      
      ->start_txn:
      Start group events scheduling transaction, set a flag to make
      pmu::enable() not perform the schedulability test, it will be performed
      at commit time.
      
      ->commit_txn:
      Commit group events scheduling transaction, perform the group
      schedulability as a whole
      
      ->cancel_txn:
      Stop group events scheduling transaction, clear the flag so
      pmu::enable() will perform the schedulability test.
      Reviewed-by: default avatarStephane Eranian <eranian@google.com>
      Reviewed-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: default avatarLin Ming <ming.m.lin@intel.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1272002160.5707.60.camel@minggr.sh.intel.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      6bde9b6c