1. 13 Apr, 2009 4 commits
    • tracing/filters: use ring_buffer_discard_commit for discarded events · 77d9f465
      Steven Rostedt authored
      The ring_buffer_discard_commit makes better use of the ring_buffer
      when an event has been discarded. It tries to remove it completely if
      possible.
      
      This patch converts the trace event filtering to use
      ring_buffer_discard_commit instead of the ring_buffer_event_discard.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ring-buffer: add ring_buffer_discard_commit · fa1b47dd
      Steven Rostedt authored
      The ring_buffer_discard_commit is similar to ring_buffer_event_discard
      but it can only be done on an event that has yet to be committed.
      Unpredictable results can happen otherwise.
      
      The main difference between ring_buffer_discard_commit and
      ring_buffer_event_discard is that ring_buffer_discard_commit will try
      to free the data in the ring buffer if nothing has added data
      after the reserved event. If something did, then it acts almost the
      same as ring_buffer_event_discard followed by a
      ring_buffer_unlock_commit.
      
      Note, either ring_buffer_discard_commit or ring_buffer_unlock_commit
      can be called on an event, not both.
      
      This commit also exports both discard functions to be usable by
      GPL modules.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
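The difference between the two discard functions can be pictured with a toy model. The sketch below assumes a simplified linear buffer with a single write offset, not the kernel's per-CPU page ring, and every name in it (toy_buffer, toy_reserve, toy_event_discard, toy_discard_commit) is hypothetical. It only illustrates the rule stated in the commit message: the space is reclaimed when nothing was added after the reserved event, and otherwise the event is left in place as dead space.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a ring buffer: a write offset plus a count of dead bytes. */
struct toy_buffer {
	size_t tail;     /* offset of the next free byte */
	size_t padding;  /* bytes held by discarded events that could not be freed */
};

/* A reserved, not yet committed event (hypothetical type). */
struct toy_event {
	size_t start;
	size_t len;
};

/* Reserve len bytes at the tail of the buffer. */
static struct toy_event toy_reserve(struct toy_buffer *b, size_t len)
{
	struct toy_event e = { b->tail, len };

	b->tail += len;
	return e;
}

/* Models ring_buffer_event_discard: the event stays in place as padding. */
static void toy_event_discard(struct toy_buffer *b, const struct toy_event *e)
{
	b->padding += e->len;
}

/*
 * Models ring_buffer_discard_commit: if the event is still the last
 * reservation, give its space back; otherwise fall back to padding.
 * Returns 1 when the space was reclaimed, 0 when it was not.
 */
static int toy_discard_commit(struct toy_buffer *b, const struct toy_event *e)
{
	assert(e->start + e->len <= b->tail);

	if (e->start + e->len == b->tail) {
		b->tail = e->start;
		return 1;
	}
	toy_event_discard(b, e);
	return 0;
}
```

In this model, discarding the most recent reservation moves the tail back, while discarding an older one only grows the padding count.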
    • tracing/filters: add TRACE_EVENT_FORMAT_NOFILTER event macro · e45f2e2b
      Tom Zanussi authored
      Frederic Weisbecker suggested that the trace_special event shouldn't be
      filterable; this patch adds a TRACE_EVENT_FORMAT_NOFILTER event macro
      that allows an event format to be exported without having a filter
      attached, and removes filtering from the trace_special event.
      Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/filters: add run-time field descriptions to TRACE_EVENT_FORMAT events · e1112b4d
      Tom Zanussi authored
      This patch adds run-time field descriptions to all the event formats
      exported using TRACE_EVENT_FORMAT.  It also hooks up all the tracers
      that use them (i.e. the tracers in the 'ftrace subsystem') so they can
      also have their output filtered by the event-filtering mechanism.
      
      When I was testing this, there were a couple of things that fooled me
      into thinking the filters weren't working, when actually they were -
      I'll mention them here so others don't make the same mistakes (and file
      bug reports. ;-)
      
      One is that some of the tracers trace multiple events e.g. the
      sched_switch tracer uses the context_switch and wakeup events, and if
      you don't set filters on all of the traced events, the unfiltered output
      from the events without filters on them can make it look like the
      filtering as a whole isn't working properly, when actually it is doing
      what it was asked to do - it just wasn't asked to do the right thing.
      
      The other is that for the really high-volume tracers e.g. the function
      tracer, the volume of filtered events can be so high that it pushes the
      unfiltered events out of the ring buffer before they can be read so e.g.
      cat'ing the trace file repeatedly shows either no output, or once in
      a while some output but that isn't there the next time you read the
      trace, which isn't what you normally expect when reading the trace file.
      If you read from the trace_pipe file though, you can catch them before
      they disappear.
      
      Changes from v1:
      
      As suggested by Frederic Weisbecker:
      
      - get rid of externs in functions
      - added unlikely() to filter_check_discard()
      Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 12 Apr, 2009 6 commits
    • blktrace: fix output of BLK_TC_PC events · 66de7792
      Li Zefan authored
      BLK_TC_PC events should be treated differently from BLK_TC_FS events.
      
      Before this patch:
      
       # echo 1 > /sys/block/sda/sda1/trace/enable
       # echo pc > /sys/block/sda/sda1/trace/act_mask
       # echo blk > /debugfs/tracing/current_tracer
       # (generate some BLK_TC_PC events)
       # cat trace
              bash-2184  [000]  1774.275413:   8,7    I   N [bash]
              bash-2184  [000]  1774.275435:   8,7    D   N [bash]
              bash-2184  [000]  1774.275540:   8,7    I   R [bash]
              bash-2184  [000]  1774.275547:   8,7    D   R [bash]
       ksoftirqd/0-4     [000]  1774.275580:   8,7    C   N 0 [0]
              bash-2184  [000]  1774.275648:   8,7    I   R [bash]
              bash-2184  [000]  1774.275653:   8,7    D   R [bash]
       ksoftirqd/0-4     [000]  1774.275682:   8,7    C   N 0 [0]
              bash-2184  [000]  1774.275739:   8,7    I   R [bash]
              bash-2184  [000]  1774.275744:   8,7    D   R [bash]
       ksoftirqd/0-4     [000]  1774.275771:   8,7    C   N 0 [0]
              bash-2184  [000]  1774.275804:   8,7    I   R [bash]
              bash-2184  [000]  1774.275808:   8,7    D   R [bash]
       ksoftirqd/0-4     [000]  1774.275836:   8,7    C   N 0 [0]
      
      After this patch:
      
       # cat trace
              bash-2263  [000]   366.782149:   8,7    I   N 0 (00 ..) [bash]
              bash-2263  [000]   366.782323:   8,7    D   N 0 (00 ..) [bash]
              bash-2263  [000]   366.782557:   8,7    I   R 8 (25 00 ..) [bash]
              bash-2263  [000]   366.782560:   8,7    D   R 8 (25 00 ..) [bash]
       ksoftirqd/0-4     [000]   366.782582:   8,7    C   N (25 00 ..) [0]
              bash-2263  [000]   366.782648:   8,7    I   R 8 (5a 00 3f 00) [bash]
              bash-2263  [000]   366.782650:   8,7    D   R 8 (5a 00 3f 00) [bash]
       ksoftirqd/0-4     [000]   366.782669:   8,7    C   N (5a 00 3f 00) [0]
              bash-2263  [000]   366.782710:   8,7    I   R 8 (5a 00 08 00) [bash]
              bash-2263  [000]   366.782713:   8,7    D   R 8 (5a 00 08 00) [bash]
       ksoftirqd/0-4     [000]   366.782730:   8,7    C   N (5a 00 08 00) [0]
              bash-2263  [000]   366.783375:   8,7    I   R 36 (5a 00 08 00) [bash]
              bash-2263  [000]   366.783379:   8,7    D   R 36 (5a 00 08 00) [bash]
       ksoftirqd/0-4     [000]   366.783404:   8,7    C   N (5a 00 08 00) [0]
      
      This is what we do with PC events in user-space blktrace.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <49D32387.9040106@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
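The parenthesised byte dumps in the "After this patch" output appear to print the command bytes in hex and collapse a run of trailing zero bytes into "..". The sketch below reproduces that shape under this assumption. The elision rule is inferred from the sample output only, and format_pdu is a hypothetical user-space helper, not the kernel's function.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/*
 * Format pdu[0..pdu_len) as "(aa bb ..)" into out.
 * Assumed rule: print every byte up to and including the first zero that
 * follows the last non-zero byte, then append ".." if more zeros remain.
 * Returns the number of characters written, excluding the trailing NUL.
 */
static int format_pdu(char *out, size_t out_len,
		      const unsigned char *pdu, size_t pdu_len)
{
	size_t end, i;
	int n = 0;

	/* Worst case is 3 characters per byte plus brackets, "..", and NUL. */
	assert(out != NULL && out_len >= 3 * pdu_len + 8);

	/* end = index one past the last non-zero byte (0 if all bytes are zero) */
	for (end = pdu_len; end > 0 && pdu[end - 1] == 0; end--)
		;

	n += snprintf(out + n, out_len - n, "(");
	for (i = 0; i < pdu_len; i++) {
		n += snprintf(out + n, out_len - n, "%s%02x",
			      i ? " " : "", (unsigned int)pdu[i]);
		/* The rest is all zeros: mark the elision and stop. */
		if (i == end && end != pdu_len - 1) {
			n += snprintf(out + n, out_len - n, " ..");
			break;
		}
	}
	n += snprintf(out + n, out_len - n, ")");
	return n;
}
```

With this rule, six zero bytes format as "(00 ..)", a buffer starting with 0x25 followed by zeros formats as "(25 00 ..)", and the four bytes 5a 00 3f 00 format in full, matching the shapes seen in the trace above.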
    • blktrace: fix output of unknown events · b78825d6
      Li Zefan authored
      Not all events are pc (packet command) events. An event is a pc
      event only if it has BLK_TC_PC bit set.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <49D3236D.3090705@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing, kmemtrace: Make kmem tracepoints use TRACE_EVENT macro · fc182a43
      Zhaolei authored
      TRACE_EVENT is a more generic way to define tracepoints.
      Doing so adds these new capabilities to this tracepoint:
      
        - zero-copy and per-cpu splice() tracing
        - binary tracing without printf overhead
        - structured logging records exposed under /debug/tracing/events
        - trace events embedded in function tracer output and other plugins
        - user-defined, per tracepoint filter expressions
      Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
      Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <49DEE6DA.80600@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing, kmemtrace: Separate include/trace/kmemtrace.h to kmemtrace part and tracepoint part · 02af61bb
      Zhaolei authored
      Impact: refactor code for future changes
      
      Currently kmemtrace.h serves both as the header file of kmemtrace and as
      the definition of kmem's tracepoints.
      
      A tracepoint definition file may be used by other code, and should contain
      only tracepoint definitions.
      
      We can separate include/trace/kmemtrace.h into 2 files:
      
        include/linux/kmemtrace.h: header file for kmemtrace
        include/trace/kmem.h:      definition of kmem tracepoints
      Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
      Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <49DEE68A.5040902@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: Document the event tracing system · abd41443
      Theodore Ts'o authored
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1239479479-2603-3-git-send-email-tytso@mit.edu>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: Add documentation for the power tracer · 56c49951
      Theodore Ts'o authored
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Acked-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <1239479479-2603-4-git-send-email-tytso@mit.edu>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 10 Apr, 2009 8 commits
    • tracing, net, skb tracepoint: make skb tracepoint use the TRACE_EVENT() macro · 5cb3d1d9
      Zhaolei authored
      TRACE_EVENT is a more generic way to define a tracepoint.
      Doing so adds these new capabilities to this tracepoint:
      
        - zero-copy and per-cpu splice() tracing
        - binary tracing without printf overhead
        - structured logging records exposed under /debug/tracing/events
        - trace events embedded in function tracer output and other plugins
        - user-defined, per tracepoint filter expressions
      Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <49DD90D2.5020604@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, function-graph: only save return values on x86_64 · e71e99c2
      Steven Rostedt authored
      Impact: speed up
      
      The return to handler portion of the function graph tracer should only
      need to save the return values. The caller already saved off the
      registers that the callee can modify. The returning function already
      saved the registers it modified. When we call our own trace function
      it too will save the registers that the callee must restore.
      
      There's no reason to save off anything more than the registers used
      to return the values.
      
      Note, I did a complete kernel build with this modification and the
      function graph tracer running on x86_64.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/lockdep: report the time waited for a lock · 2062501a
      Frederic Weisbecker authored
      While trying to optimize the new lock on reiserfs to replace
      the bkl, I find the lock tracing very useful though it lacks
      something important for performance (and latency) instrumentation:
      the time a task waits for a lock.
      
      That's what this patch implements:
      
        bash-4816  [000]   202.652815: lock_contended: lock_contended: &sb->s_type->i_mutex_key
        bash-4816  [000]   202.652819: lock_acquired: &rq->lock (0.000 us)
       <...>-4787  [000]   202.652825: lock_acquired: &rq->lock (0.000 us)
       <...>-4787  [000]   202.652829: lock_acquired: &rq->lock (0.000 us)
        bash-4816  [000]   202.652833: lock_acquired: &sb->s_type->i_mutex_key (16.005 us)
      
      As shown above, the "lock acquired" field is followed by the time
      it has been waiting for the lock. Usually, a lock contended entry
      is followed by a nearby lock_acquired entry with a non-zero time waited.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1238975373-15739-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
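The "(16.005 us)" suffix in the sample output reads as a wait time in microseconds with three decimal places. As a minimal sketch, assuming the wait is measured in nanoseconds between a contended timestamp and an acquired timestamp, it could be computed and printed as follows. Both lock_wait_ns and format_wait_us are hypothetical user-space helpers, not the kernel's code.

```c
#include <assert.h>
#include <stdio.h>

/* Nanoseconds spent waiting, given the two timestamps (assumed monotonic). */
static unsigned long long lock_wait_ns(unsigned long long contended_ns,
				       unsigned long long acquired_ns)
{
	assert(acquired_ns >= contended_ns);
	return acquired_ns - contended_ns;
}

/*
 * Print a nanosecond count as "(<usecs>.<3-digit remainder> us)".
 * Returns the number of characters written, excluding the trailing NUL.
 */
static int format_wait_us(char *out, size_t out_len, unsigned long long wait_ns)
{
	unsigned long long usecs = wait_ns / 1000;
	unsigned long long nsec_rem = wait_ns % 1000;

	assert(out != NULL && out_len >= 32);
	return snprintf(out, out_len, "(%llu.%03llu us)", usecs, nsec_rem);
}
```

An uncontended acquisition has a wait of 0 ns and prints as "(0.000 us)", which is the form most lock_acquired lines take in the trace above.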
    • Merge branch 'tracing/urgent' into tracing/core · 1cad1252
      Ingo Molnar authored
      Merge reason: pick up both v2.6.30-rc1 [which includes tracing/urgent fixes]
                    and pick up the current lineup of tracing/urgent fixes as well
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: fix splice return too large · 93cfb3c9
      Lai Jiangshan authored
      I got these from strace:
      
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 16384
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
      
      I wanted to splice_read 4096 bytes, but it returns 8192 or larger.
      
      It is because the return value of tracing_buffers_splice_read()
      does not include "zero out any left over data" bytes.
      
      But tracing_buffers_read() includes these bytes, so we make them
      consistent.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <49D46674.9030804@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: update file->f_pos when splice(2) it · c7625a55
      Lai Jiangshan authored
      Impact: Cleanup
      
      These two lines:
      
      	if (unlikely(*ppos))
      		return -ESPIPE;
      
      in tracing_buffers_splice_read() are not needed, since the VFS layer
      has disabled seek(2).
      
      We remove these two lines, and then we can update file->f_pos.
      
      And tracing_buffers_read() updates file->f_pos; this fix
      makes tracing_buffers_splice_read() update file->f_pos too.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <49D46670.4010503@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: allocate page when needed · ddd538f3
      Lai Jiangshan authored
      Impact: Cleanup
      
      Sometimes we open trace_pipe_raw but only splice(2) it and never
      read(2) it, so the page is never used.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <49D4666B.4010608@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: disable seeking for trace_pipe_raw · d1e7e02f
      Lai Jiangshan authored
      Impact: disable pread()
      
      We set tracing_buffers_fops.llseek to no_llseek,
      but we can still perform pread() to read this file.
      
      That is not expected.
      
      This fix uses nonseekable_open() to disable it.
      
      tracing_buffers_fops.llseek is still set to no_llseek;
      it marks this file as a "non-seekable device" and is used by
      sys_splice(). See also do_splice() or the manual of splice(2):
      
      ERRORS
             EINVAL Target file system doesn't support  splicing;
                    neither  of the descriptors refers to a pipe;
                    or offset given for non-seekable device.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <49D46668.8030806@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
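From user space, the effect of disabling positional reads can be checked by attempting a pread() and looking for ESPIPE. The sketch below uses an ordinary pipe as a stand-in for a non-seekable file, on the assumption that trace_pipe_raw behaves the same way once it is opened through nonseekable_open(). The helper name pread_rejected is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Returns 1 if a positional read on fd is refused with ESPIPE,
 * which is what a non-seekable descriptor is expected to do,
 * and 0 in every other case.
 */
static int pread_rejected(int fd)
{
	char byte;

	assert(fd >= 0);
	errno = 0;
	return pread(fd, &byte, 1, 0) == -1 && errno == ESPIPE;
}
```

Calling pread_rejected() on the read end of a pipe returns 1. Running the same check against an opened trace_pipe_raw is one way to confirm the behaviour this commit describes.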
  4. 09 Apr, 2009 22 commits