1. 22 Jan, 2011 22 commits
    • perf callchain: Rename cumul_hits into callchain_cumul_hits · f08c3154
      Frederic Weisbecker authored
      That makes the callchain API naming more consistent and
      reduces potential naming clashes.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1294977121-5700-3-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf callchain: Feed callchains into a cursor · 1b3a0e95
      Frederic Weisbecker authored
      The callchains are fed through an array of a fixed size.
      As a result we iterate over each callchain three times:
      
      - 1st to resolve symbols
      - 2nd to filter out context boundaries
      - 3rd for the insertion into the tree
      
      This also involves some pairs of memory allocation/deallocation
      every time we insert a callchain, for the filtered out array of
      addresses and for the array of symbols that comes along.
      
      Instead, feed the callchains through a linked list with persistent
      allocations. This brings several benefits:
      
      - Merge the 1st and 2nd iterations in one. That was possible before
      but in a way that would involve allocating an array slightly larger
      than necessary because we don't know in advance the number of context
      boundaries to filter out.
      
      - Far fewer allocations/deallocations. The linked list keeps
      persistent empty entries for the next usages and is extendable at
      will.
      
      - Makes it easier for multiple sources of callchains to feed a
      stacktrace together. This is meant to pave the way for cfi based
      callchains wherein traditional frame pointer based kernel
      stacktraces will precede cfi based user ones, producing an overall
      callchain whose size is hard to predict. This requirement
      makes the static array obsolete and makes a linked list based
      iterator a much more flexible fit.
      
      Basic testing on a big perf file containing callchains (~ 176 MB)
      has shown a throughput gain of about 11% with perf report.
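
The persistent-allocation idea described above can be sketched as follows. This is only an illustration under assumptions: the names `cursor`, `cursor_node`, `cursor_reset` and `cursor_append` are hypothetical and are not taken from the patch. Resetting the cursor rewinds the write position without freeing anything, so later callchains reuse the nodes left by earlier ones.

```c
#include <stdint.h>
#include <stdlib.h>

/* One entry of the cursor; nodes survive resets and are reused. */
struct cursor_node {
	uint64_t ip;
	struct cursor_node *next;
};

struct cursor {
	size_t nr;                  /* entries valid for the current callchain */
	struct cursor_node *first;  /* head of the persistent list */
	struct cursor_node **last;  /* slot where the next append goes */
};

static void cursor_init(struct cursor *c)
{
	c->nr = 0;
	c->first = NULL;
	c->last = &c->first;
}

/* Start a new callchain: keep the nodes, only rewind the write position. */
static void cursor_reset(struct cursor *c)
{
	c->nr = 0;
	c->last = &c->first;
}

/* Append an address, allocating only when no spare node is left. */
static int cursor_append(struct cursor *c, uint64_t ip)
{
	struct cursor_node *node = *c->last;

	if (!node) {
		node = calloc(1, sizeof(*node));
		if (!node)
			return -1;
		*c->last = node;
	}
	node->ip = ip;
	c->nr++;
	c->last = &node->next;
	return 0;
}

/* Release every node, including spare ones beyond the current length. */
static void cursor_free(struct cursor *c)
{
	struct cursor_node *node = c->first;

	while (node) {
		struct cursor_node *next = node->next;

		free(node);
		node = next;
	}
	cursor_init(c);
}
```

A caller would invoke `cursor_reset()` once per sample, append each resolved address while skipping context boundaries in the same pass, and then walk the first `nr` nodes to insert into the tree.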
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1294977121-5700-2-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Add test for the evlist mmap routines · de5fa3a8
      Arnaldo Carvalho de Melo authored
      This test will generate random numbers of calls to some getpid syscalls,
      then establish an mmap for a group of events that are created to monitor
      these syscalls.
      
      It will receive the events via mmap and use the sample.id field
      generated by PERF_SAMPLE_ID to map each one back to its respective
      perf_evsel instance.
      
      Then it checks if the number of syscalls reported as perf events by the
      kernel corresponds to the number of syscalls made.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evlist: Steal mmap reading routine from 'perf top' · 04391deb
      Arnaldo Carvalho de Melo authored
      Will be used in the upcoming 'perf test' entry for the evlist mmap
      routines.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: check if cpu_map__new() return NULL · 98d77b78
      Han Pingtian authored
      It looks like we should check if cpus is NULL after
      
      	cpus = cpu_map__new(NULL);
      
      in test__open_syscall_event_on_all_cpus().
      
      LKML-Reference: <20110114230050.GA7011@localhost>
      Signed-off-by: Han Pingtian <phan@redhat.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Check counts on all cpus in test__open_syscall_event_on_all_cpus · d2af9687
      Arnaldo Carvalho de Melo authored
      We were bailing out after the first count mismatch. Check all CPUs
      instead, to see if only some of them are not getting the expected
      number of events.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Add missing cpu_map__delete() · 915fce20
      Arnaldo Carvalho de Melo authored
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Use perf_evlist__mmap · 0a27d7f9
      Arnaldo Carvalho de Melo authored
      There is more stuff that can go to the perf_ev{sel,list} layer, like
      detecting if sample_id_all is available, etc, but let's try using this in
      'perf test' first.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evlist: Move the mmap array from perf_evsel · 70db7533
      Arnaldo Carvalho de Melo authored
      Adopt the new model used in 'perf record', where we don't have an
      mmap per thread per cpu. Instead we have one mmap per cpu,
      established on the first fd for that cpu, and ask the kernel, using
      the PERF_EVENT_IOC_SET_OUTPUT ioctl, to send events for the other
      fds on that cpu to the one with the mmap.
      
      The methods moved from perf_evsel to perf_evlist, but for easing review
      they were modified in place, in evsel.c; the next patch will move the
      migrated methods to evlist.c.
      
      With this 'perf top' now uses the same mmap model used by 'perf record'
      and the next patches will make 'perf record' use these new routines,
      establishing a common codebase for both tools.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Move perf_mmap__write_tail to perf.h · 115d2d89
      Arnaldo Carvalho de Melo authored
      Close to perf_mmap__read_head() and the perf_mmap struct definition.
      This is useful for any recorder, and we will need it in 'perf test'.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Use struct perf_mmap and helpers · 744bd8aa
      Arnaldo Carvalho de Melo authored
      Paving the way to using perf_evsel->mmap, do this to reduce the patch
      noise in the next ones.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evsel: Introduce mmap support · 70082dd9
      Arnaldo Carvalho de Melo authored
      Out of the code in 'perf top'. Record is next in line.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Use perf_evsel__open · dd7927f4
      Arnaldo Carvalho de Melo authored
      Now it's time to factor out the mmap handling bits into the perf_evsel
      class.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf top: Use perf_evsel__open · 72cb7013
      Arnaldo Carvalho de Melo authored
      Now that it handles group_fd and inherit we can use it, sharing it with
      stat.
      
      Next step: 'perf record' should use it, then move the mmap_array out of
      ->priv and into perf_evsel, with top and record sharing this, and at the
      same time, write a 'perf test' stress test.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evsel: Allow specifying if the inherit bit should be set · 9d04f178
      Arnaldo Carvalho de Melo authored
      As this is a per-cpu attribute, we can't set it up in advance and use it
      for all the calls.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evsel: Support event groups · f08199d3
      Arnaldo Carvalho de Melo authored
      perf_evsel__open() now has an extra boolean argument specifying if
      event grouping is desired.
      
      The first file descriptor created on a CPU becomes the group leader.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evlist: Adopt the pollfd array · 5c581041
      Arnaldo Carvalho de Melo authored
      Allocating just the space needed for nr_cpus * nr_threads * nr_evsels,
      not the MAX_NR_CPUS and counters.
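
As a rough illustration of the sizing rule above, the array can be allocated for exactly the number of file descriptors that will be opened rather than for a compile-time maximum. The helper name `alloc_pollfd` and its parameters are hypothetical and do not come from the patch.

```c
#include <poll.h>
#include <stdlib.h>

/*
 * Hypothetical helper: one pollfd slot per (cpu, thread, evsel) triple.
 * On success *nr_slots holds the number of slots; on failure it is 0.
 */
static struct pollfd *alloc_pollfd(size_t nr_cpus, size_t nr_threads,
				   size_t nr_evsels, size_t *nr_slots)
{
	size_t n = nr_cpus * nr_threads * nr_evsels;
	struct pollfd *fds = calloc(n, sizeof(*fds));

	*nr_slots = fds ? n : 0;
	return fds;
}
```

For 4 CPUs, 2 threads and 3 events this yields 24 slots, however large a static upper bound on CPUs and counters would have been.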
      
      LKML-Reference: <new-submission>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evsel: Introduce perf_evlist · 361c99a6
      Arnaldo Carvalho de Melo authored
      Killing two more perf wide global variables: nr_counters and evsel_list
      as a list_head.
      
      There are more operations that will need more fields in perf_evlist,
      like the pollfd for polling all the fds in a list of evsel instances.
      
      Use option->value to pass the evsel_list to parse_{events,filters}.
      
      LKML-Reference: <new-submission>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Fix time function double declaration with glibc · 00e99a49
      Thomas Renninger authored
      It's enough to include the local "debug.h" file to trigger it.
      
      man time reveals this is already declared in glibc:
      
      time - get time in seconds
      -> rename the variable.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: arjan@infradead.org
      LPU-Reference: <1295620209-13859-2-git-send-email-trenn@suse.de>
      Signed-off-by: Thomas Renninger <trenn@suse.de>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Fix build by checking if extra warnings are supported · 065bef5a
      Arnaldo Carvalho de Melo authored
      The -Wstack-protector and -Wvolatile-register-var warnings, for
      instance, are not supported by gcc 3.4.6.
      
      So fix by doing the same check we already do for -fstack-protector-all.
      
      With this and the other patches in this series, perf builds unmodified
      on, for instance, RHEL4.
      
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Fix build when using gcc 3.4.6 · 5c7a6682
      Arnaldo Carvalho de Melo authored
      [acme@localhost linux]$ make O=~acme/git/build/perf -C tools/perf
      make: Entering directory `/home/acme/git/linux/tools/perf'
      Makefile:526: No libdw.h found or old libdw.h found or elfutils is older than 0.138, disables dwarf support. Please install new elfutils-devel/libdw-dev
      Makefile:582: newt not found, disables TUI support. Please install newt-devel or libnewt-dev
          CC /home/acme/git/build/perf/builtin-annotate.o
      In file included from builtin-annotate.c:23:
      util/parse-events.h:26: warning: declaration of 'evsel_list' shadows a global declaration
      util/parse-events.h:12: warning: shadowed declaration is here
      make: *** [/home/acme/git/build/perf/builtin-annotate.o] Error 1
      make: Leaving directory `/home/acme/git/linux/tools/perf'
      [acme@localhost linux]$ gcc --version | head -1
      gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-11)
      [acme@localhost linux]$
      
      Fix it by renaming the parameter to evlist.
      
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Add missing header, fixes build · a860a608
      Arnaldo Carvalho de Melo authored
      We need the definition for __always_inline in bitops.h to fix the build
      on distros where it isn't available or compiler.h doesn't get included
      indirectly.
      
      One of the fixes needed to build perf on RHEL4 systems, for instance.
      
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 23 Jan, 2011 2 commits
    • perf tools: Fix 64 bit integer format strings · 9486aa38
      Arnaldo Carvalho de Melo authored
      Using %L[uxd] has issues in some architectures, like on ppc64.  Fix it
      by making our 64 bit integers typedefs of stdint.h types and using
      PRI[ux]64 like, for instance, git does.
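
A minimal sketch of the approach, assuming a 64-bit value typed as `uint64_t` from stdint.h: the `PRIu64` and `PRIx64` macros from inttypes.h expand to the correct length modifier on each platform, so the format string matches the argument whether the underlying type is `unsigned long` or `unsigned long long`. The helper name `format_u64` is made up for this example.

```c
#include <inttypes.h>
#include <stdio.h>

/* Format a 64-bit counter portably; returns what snprintf() returns. */
static int format_u64(char *buf, size_t len, uint64_t value)
{
	return snprintf(buf, len, "%" PRIu64 " (0x%" PRIx64 ")", value, value);
}
```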
      
      Reported by Denis Kirjanov, who provided a patch for one case; I went
      and changed all cases.

      Reported-by: Denis Kirjanov <dkirjanov@kernel.org>
      Tested-by: Denis Kirjanov <dkirjanov@kernel.org>
      LKML-Reference: <20110120093246.GA8031@hera.kernel.org>
      Cc: Denis Kirjanov <dkirjanov@kernel.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Pingtian Han <phan@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Fix build on older glibcs · 57b84e53
      Arnaldo Carvalho de Melo authored
      Older glibcs don't have CPU_ALLOC & friends, and the tools are being
      used in older distros, like RHEL4 and 5, where the only allowed change
      is to replace the kernel.

      Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 22 Jan, 2011 1 commit
  4. 21 Jan, 2011 4 commits
    • perf: perf_event_exit_task_context: s/rcu_dereference/rcu_dereference_raw/ · 806839b2
      Oleg Nesterov authored
      In theory, almost every user of task->child->perf_event_ctxp[]
      is wrong. find_get_context() can install the new context at any
      moment, we need read_barrier_depends().
      
      dbe08d82 "perf: Fix find_get_context() vs perf_event_exit_task()
      race" added rcu_dereference() into perf_event_exit_task_context()
      to set a precedent, but this makes __rcu_dereference_check()
      unhappy. Use rcu_dereference_raw() to shut up the warning.

      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: acme@redhat.com
      Cc: paulus@samba.org
      Cc: stern@rowland.harvard.edu
      Cc: a.p.zijlstra@chello.nl
      Cc: fweisbec@gmail.com
      Cc: roland@redhat.com
      Cc: prasad@linux.vnet.ibm.com
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      LKML-Reference: <20110121174547.GA8796@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf test: Use cpu_map->[cpu] when setting affinity · ffb5e0fb
      Han Pingtian authored
      When some of CPUs are offline:
      
       # cat /sys/devices/system/cpu/online
       0,6-31
      
      perf test will fail on #3 testcase:
      
         3: detect open syscall event on all cpus:
         --- start ---
         perf_evsel__read_on_cpu: expected to intercept 111 calls on cpu 0, got 681
         perf_evsel__read_on_cpu: expected to intercept 112 calls on cpu 1, got 117
         perf_evsel__read_on_cpu: expected to intercept 113 calls on cpu 2, got 118
         perf_evsel__read_on_cpu: expected to intercept 114 calls on cpu 3, got 119
         perf_evsel__read_on_cpu: expected to intercept 115 calls on cpu 4, got 120
         perf_evsel__read_on_cpu: expected to intercept 116 calls on cpu 5, got 121
         perf_evsel__read_on_cpu: expected to intercept 117 calls on cpu 6, got 122
         perf_evsel__read_on_cpu: expected to intercept 118 calls on cpu 7, got 123
         perf_evsel__read_on_cpu: expected to intercept 119 calls on cpu 8, got 124
         perf_evsel__read_on_cpu: expected to intercept 120 calls on cpu 9, got 125
         perf_evsel__read_on_cpu: expected to intercept 121 calls on cpu 10, got 126
         ....
      
      This patch tries to use 'cpus->map[cpu]' when setting cpu affinity, and
      will check the return code of sched_setaffinity()
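
The difference between a map index and a CPU id can be sketched with a small parser for the sysfs list format shown above. This is an illustration only: `cpu_list`, `cpu_list_parse` and `MAX_MAP` are hypothetical names, not perf's cpu_map implementation. With "0,6-31" online, index 1 refers to CPU 6, so passing the loop index itself as the CPU number would target an offline CPU.

```c
#include <stdlib.h>

#define MAX_MAP 64	/* arbitrary cap for this sketch */

struct cpu_list {
	int nr;			/* number of valid entries in map[] */
	int map[MAX_MAP];	/* map[index] is the real CPU id */
};

/* Parse a list such as "0,6-31" into cl->map; returns 0 on success. */
static int cpu_list_parse(struct cpu_list *cl, const char *s)
{
	cl->nr = 0;
	while (*s) {
		char *end;
		long first = strtol(s, &end, 10);
		long last = first;

		if (end == s)
			return -1;
		if (*end == '-') {
			s = end + 1;
			last = strtol(s, &end, 10);
			if (end == s)
				return -1;
		}
		if (first < 0 || last < first)
			return -1;
		for (long cpu = first; cpu <= last; cpu++) {
			if (cl->nr == MAX_MAP)
				return -1;
			cl->map[cl->nr++] = (int)cpu;
		}
		if (*end == ',')
			s = end + 1;
		else if (*end == '\0' || *end == '\n')
			break;
		else
			return -1;
	}
	return 0;
}
```

A test loop would then iterate over the indices 0 to nr - 1, build the affinity mask from `map[index]`, and treat a failing sched_setaffinity() call as an error rather than ignoring it.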
      
      LKML-Reference: <20110120114707.GA11781@hpt.nay.redhat.com>
      Signed-off-by: Han Pingtian <phan@redhat.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf symbols: Fix annotation of thumb code · b2f8fb23
      Dr. David Alan Gilbert authored
      In ARM's Thumb mode the bottom bit of the symbol address is set to mark
      the function as Thumb; the instructions are in reality 2 or 4 bytes
      long on a 2-byte alignment, and when the +1 address is used in
      annotate it causes
      objdump to disassemble invalid instructions.
      
      The patch removes that bottom bit during symbol loading.
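
The masking step can be sketched as below. Restricting it to function symbols in ARM objects is an assumption made for this example, and the helper name `fixup_sym_addr` is hypothetical; the actual condition used by the patch is not shown in this log.

```c
#include <elf.h>
#include <stdint.h>

/*
 * Clear bit 0 of a symbol value when it marks a Thumb function.
 * Assumption: only STT_FUNC symbols in EM_ARM objects are adjusted.
 */
static uint64_t fixup_sym_addr(uint16_t e_machine, unsigned char st_info,
			       uint64_t st_value)
{
	if (e_machine == EM_ARM && ELF32_ST_TYPE(st_info) == STT_FUNC)
		return st_value & ~(uint64_t)1;
	return st_value;
}
```

With the bit cleared, a Thumb function recorded at 0x8001 is handed to the disassembler as 0x8000, which is where its first instruction actually starts.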
      
      Many thanks to Dave Martin for comments on an initial version of the
      patch.
      
      (For reference this corresponds to this bug
      https://bugs.launchpad.net/linux-linaro/+bug/677547 )
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Dave Martin <dave.martin@linaro.org>
      LKML-Reference: <20110121163922.GA31398@davesworkthinkpad>
      Signed-off-by: Dr. David Alan Gilbert <david.gilbert@linaro.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf: Annotate cpuctx->ctx.mutex to avoid a lockdep splat · 547e9fd7
      Peter Zijlstra authored
      Lockdep spotted:
      
      	loop_1b_instruc/1899 is trying to acquire lock:
      	 (event_mutex){+.+.+.}, at: [<ffffffff810e1908>] perf_trace_init+0x3b/0x2f7
      
      	but task is already holding lock:
      	 (&ctx->mutex){+.+.+.}, at: [<ffffffff810eb45b>] perf_event_init_context+0xc0/0x218
      
      	which lock already depends on the new lock.
      
      	the existing dependency chain (in reverse order) is:
      
      	-> #3 (&ctx->mutex){+.+.+.}:
      	-> #2 (cpu_hotplug.lock){+.+.+.}:
      	-> #1 (module_mutex){+.+...}:
      	-> #0 (event_mutex){+.+.+.}:
      
      But because the deadlock would be cpuhotplug (cpu-event) vs fork
      (task-event) it cannot, in fact, happen. We can annotate this by giving the
      perf_event_context used for the cpuctx a different lock class from those
      used by tasks.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  5. 19 Jan, 2011 3 commits
    • powerpc, perf: Fix frequency calculation for overflowing counters (FSL version) · 8c8a9b25
      Anton Blanchard authored
      When fixing the frequency calculations for perf on powerpc I
      forgot to fix the FSL version.
      
      If we don't set event->hw.last_period the frequency to period
      calculations in perf go haywire and we continually
      throttle/unthrottle the PMU.

      Signed-off-by: Anton Blanchard <anton@samba.org>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Scott Wood <scottwood@freescale.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20110118214404.2f42e634@kryten>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Fix perf_event_init_task()/perf_event_free_task() interaction · 8550d7cb
      Oleg Nesterov authored
      perf_event_init_task() should clear child->perf_event_ctxp[]
      before anything else. Otherwise, if
      perf_event_init_context(perf_hw_context) fails,
      perf_event_free_task() can free perf_event_ctxp[perf_sw_context]
      copied from parent->perf_event_ctxp[] by dup_task_struct().
      
      Also move the initialization of perf_event_mutex and
      perf_event_list from perf_event_init_context() to
      perf_event_init_task().

      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      Cc: Roland McGrath <roland@redhat.com>
      LKML-Reference: <20110119182228.GC12183@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Fix find_get_context() vs perf_event_exit_task() race · dbe08d82
      Oleg Nesterov authored
      find_get_context() must not install the new perf_event_context
      if the task has already passed perf_event_exit_task().
      
      If nothing else, this means a memory leak. Initially
      ctx->refcount == 2, on the assumption that
      perf_event_exit_task_context() will participate and do the
      necessary put_ctx().
      
      find_lively_task_by_vpid() checks PF_EXITING but this buys
      nothing, by the time we call find_get_context() this task can be
      already dead. To the point, cmpxchg() can succeed when the task
      has already done the last schedule().
      
      Change find_get_context() to populate task->perf_event_ctxp[]
      under task->perf_event_mutex, this way we can trust PF_EXITING
      because perf_event_exit_task() takes the same mutex.
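
The locking rule in this paragraph can be modelled in user space. The sketch below is only an analogy under stated assumptions: a pthread mutex stands in for task->perf_event_mutex, a boolean stands in for PF_EXITING, and all names (`task`, `task_install_ctx`, `task_exit_ctx`) are hypothetical. Unlike the kernel, this model sets the exiting flag inside the locked region, which keeps the example short.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct task {
	pthread_mutex_t ctx_mutex;	/* stands in for task->perf_event_mutex */
	bool exiting;			/* stands in for PF_EXITING */
	void *ctx;			/* stands in for one perf_event_ctxp[] slot */
};

/* Install ctx unless the task is exiting or already has a context. */
static int task_install_ctx(struct task *t, void *ctx)
{
	int err = 0;

	pthread_mutex_lock(&t->ctx_mutex);
	if (t->exiting)
		err = -ESRCH;
	else if (t->ctx)
		err = -EAGAIN;
	else
		t->ctx = ctx;
	pthread_mutex_unlock(&t->ctx_mutex);
	return err;
}

/* Exit path: takes the same mutex, so no install can slip in afterwards. */
static void *task_exit_ctx(struct task *t)
{
	void *ctx;

	pthread_mutex_lock(&t->ctx_mutex);
	t->exiting = true;
	ctx = t->ctx;
	t->ctx = NULL;
	pthread_mutex_unlock(&t->ctx_mutex);
	return ctx;	/* the caller drops the reference it now owns */
}
```

Because both paths serialize on one mutex, an installer either runs before the exit path, which then sees and releases the context, or after it, in which case it observes the exiting flag and backs off. Neither ordering leaks the context.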
      
      Also, change perf_event_exit_task_context() to use
      rcu_dereference(). Probably this is not strictly needed, but
      with or without this change find_get_context() can race with
      setup_new_exec()->perf_event_exit_task(), rcu_dereference()
      looks better.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      Cc: Roland McGrath <roland@redhat.com>
      LKML-Reference: <20110119182207.GB12183@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  6. 18 Jan, 2011 8 commits