1. 23 May, 2012 1 commit
    • Jiri Olsa's avatar
      Revert "sched, perf: Use a single callback into the scheduler" · ab0cce56
      Jiri Olsa authored
      This reverts commit cb04ff9a ("sched, perf: Use a single
      callback into the scheduler").
      
      Before this change was introduced, the process switch worked
      like this (wrt. to perf event schedule):
      
           schedule (prev, next)
             - schedule out all perf events for prev
             - switch to next
             - schedule in all perf events for current (next)
      
      After the commit, the process switch looks like:
      
           schedule (prev, next)
             - schedule out all perf events for prev
             - schedule in all perf events for (next)
             - switch to next
      
      The problem is, that after we schedule perf events in, the pmu
      is enabled and we can receive events even before we make the
      switch to next - so "current" still being prev process (event
      SAMPLE data are filled based on the value of the "current"
      process).
      
      Thats exactly what we see for test__PERF_RECORD test. We receive
      SAMPLES with PID of the process that our tracee is scheduled
      from.
      
      Discussed with Peter Zijlstra:
      
       > Bah!, yeah I guess reverting is the right thing for now. Sad
       > though.
       >
       > So by having the two hooks we have a black-spot between them
       > where we receive no events at all, this black-spot covers the
       > hand-over of current and we thus don't receive the 'wrong'
       > events.
       >
       > I rather liked we could do away with both that black-spot and
       > clean up the code a little, but apparently people rely on it.
      Signed-off-by: default avatarJiri Olsa <jolsa@redhat.com>
      Acked-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: acme@redhat.com
      Cc: paulus@samba.org
      Cc: cjashfor@linux.vnet.ibm.com
      Cc: fweisbec@gmail.com
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/20120523111302.GC1638@m.brq.redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      ab0cce56
  2. 22 May, 2012 19 commits
  3. 21 May, 2012 3 commits
  4. 19 May, 2012 3 commits
  5. 18 May, 2012 3 commits
  6. 17 May, 2012 5 commits
    • Jiri Olsa's avatar
      perf hists: Fix callchain ip printf format · a0187060
      Jiri Olsa authored
      The callchain address is stored as u64. Current code uses following
      format string to display callchain address:
      
        "%p\n", (void *)(long)chain->ip
      
      This way we lose upper 32 bits if we report 64 bit addresses in 32 bit
      environment. Fixing this to always display whole 64 bits.
      
      Note, running following to test perf endianity handling:
      test 1)
        - origin system:
          # perf record -a -- sleep 10 (any perf record will do)
          # perf report > report.origin
          # perf archive perf.data
      
        - copy the perf.data, report.origin and perf.data.tar.bz2
          to a target system and run:
          # tar xjvf perf.data.tar.bz2 -C ~/.debug
          # perf report > report.target
          # diff -u report.origin report.target
      
        - the diff should produce no output
          (besides some white space stuff and possibly different
           date/TZ output)
      
      test 2)
        - origin system:
          # perf record -ag -fo /tmp/perf.data -- sleep 1
        - mount origin system root to the target system on /mnt/origin
        - target system:
          # perf script --symfs /mnt/origin -I -i /mnt/origin/tmp/perf.data \
           --kallsyms /mnt/origin/proc/kallsyms
        - complete perf.data header is displayed
      Signed-off-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1337151548-2396-8-git-send-email-jolsa@redhat.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      a0187060
    • Namhyung Kim's avatar
      perf target: Add uses_mmap field · d1cb9fce
      Namhyung Kim authored
      If perf doesn't mmap on event (like perf stat), it should not create
      per-task-per-cpu events. So just use a dummy cpu map to create a
      per-task event for this case.
      Signed-off-by: default avatarNamhyung Kim <namhyung.kim@lge.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1337161549-9870-3-git-send-email-namhyung.kim@lge.com
      [ committer note: renamed .need_mmap to .uses_mmap ]
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      d1cb9fce
    • Steven Rostedt's avatar
      ftrace: Remove selecting FRAME_POINTER with FUNCTION_TRACER · b732d439
      Steven Rostedt authored
      The function tracer will enable the -pg option with gcc, which requires
      that frame pointers. When FRAME_POINTER is defined in the kernel config
      it adds the gcc option -fno-omit-frame-pointer which causes some problems
      on some architectures. For those architectures, the FRAME_POINTER select
      was not set.
      
      When FUNCTION_TRACER was selected on these architectures that can not have
      -fno-omit-frame-pointer, the -pg option is still set. But when
      FRAME_POINTER is not selected, the kernel config would add the gcc option
      -fomit-frame-pointer. Adding this option is incompatible with -pg
      even on archs that do not need frame pointers with -pg.
      
      The answer to this was to just not add either -fno-omit-frame-pointer
      or -fomit-frame-pointer on these archs that want function tracing
      but do not set FRAME_POINTER.
      
      As it turns out, for archs that require frame pointers for function
      tracing, the same can be used. If gcc requires frame pointers with
      -pg, it will simply add it. The best thing to do is not select FRAME_POINTER
      when function tracing is selected, and let gcc add it if needed.
      
      Only add the -fno-omit-frame-pointer when something else selects
      FRAME_POINTER, but do not add -fomit-frame-pointer if function tracing
      is selected.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      b732d439
    • Steven Rostedt's avatar
      ftrace/x86: Have x86 ftrace use the ftrace_modify_all_code() · e4f5d544
      Steven Rostedt authored
      To remove duplicate code, have the ftrace arch_ftrace_update_code()
      use the generic ftrace_modify_all_code(). This requires that the
      default ftrace_replace_code() becomes a weak function so that an
      arch may override it.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      e4f5d544
    • Steven Rostedt's avatar
      ftrace: Make ftrace_modify_all_code() global for archs to use · 8ed3e2cf
      Steven Rostedt authored
      Rename __ftrace_modify_code() to ftrace_modify_all_code() and make
      it global for all archs to use. This will remove the duplication
      of code, as archs that can modify code without stop_machine()
      can use it directly outside of the stop_machine() call.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      8ed3e2cf
  7. 16 May, 2012 6 commits
    • Steven Rostedt's avatar
      ftrace: Return record ip addr for ftrace_location() · f0cf973a
      Steven Rostedt authored
      ftrace_location() is passed an addr, and returns 1 if the addr is
      on a ftrace nop (or caller to ftrace_caller), and 0 otherwise.
      
      To let kprobes know if it should move a breakpoint or not, it
      must return the actual addr that is the start of the ftrace nop.
      This way a kprobe placed on the location of a ftrace nop, can
      instead be placed on the instruction after the nop. Even if the
      probe addr is on the second or later byte of the nop, it can
      simply be moved forward.
      
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      f0cf973a
    • Steven Rostedt's avatar
      ftrace: Consolidate ftrace_location() and ftrace_text_reserved() · a650e02a
      Steven Rostedt authored
      Both ftrace_location() and ftrace_text_reserved() do basically the same thing.
      They search to see if an address is in the ftace table (contains an address
      that may change from nop to call ftrace_caller). The difference is
      that ftrace_location() searches a single address, but ftrace_text_reserved()
      searches a range.
      
      This also makes the ftrace_text_reserved() faster as it now uses a bsearch()
      instead of linearly searching all the addresses within a page.
      
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      a650e02a
    • Steven Rostedt's avatar
      ftrace: Speed up search by skipping pages by address · 9644302e
      Steven Rostedt authored
      As all records in a page of the ftrace table are sorted, we can
      speed up the search algorithm by checking if the address to look for
      falls in between the first and last record ip on the page.
      
      This speeds up both the ftrace_location() and ftrace_text_reserved()
      algorithms, as it can skip full pages when the search address is
      not in them.
      
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      9644302e
    • Steven Rostedt's avatar
      ftrace: Remove extra helper functions · 706c81f8
      Steven Rostedt authored
      The ftrace_record_ip() and ftrace_alloc_dyn_node() were from the
      time of the ftrace daemon. Although they were still used, they
      still make things a bit more complex than necessary.
      
      Move the code into the one function that uses it, and remove the
      helper functions.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      706c81f8
    • Steven Rostedt's avatar
      ftrace: Sort all function addresses, not just per page · 9fd49328
      Steven Rostedt authored
      Instead of just sorting the ip's of the functions per ftrace page,
      sort the entire list before adding them to the ftrace pages.
      
      This will allow the bsearch algorithm to be sped up as it can
      also sort by pages, not just records within a page.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      9fd49328
    • Vaibhav Nagarnaik's avatar
      tracing: change CPU ring buffer state from tracing_cpumask · 71babb27
      Vaibhav Nagarnaik authored
      According to Documentation/trace/ftrace.txt:
      
      tracing_cpumask:
      
              This is a mask that lets the user only trace
              on specified CPUS. The format is a hex string
              representing the CPUS.
      
      The tracing_cpumask currently doesn't affect the tracing state of
      per-CPU ring buffers.
      
      This patch enables/disables CPU recording as its corresponding bit in
      tracing_cpumask is set/unset.
      
      Link: http://lkml.kernel.org/r/1336096792-25373-3-git-send-email-vnagarnaik@google.com
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Laurent Chavey <chavey@google.com>
      Cc: Justin Teravest <teravest@google.com>
      Cc: David Sharp <dhsharp@google.com>
      Signed-off-by: default avatarVaibhav Nagarnaik <vnagarnaik@google.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      71babb27