  1. 12 Apr, 2009 6 commits
    • blktrace: fix output of BLK_TC_PC events · 66de7792
      Li Zefan authored
      BLK_TC_PC events should be treated differently from BLK_TC_FS events.
      
      Before this patch:
      
       # echo 1 > /sys/block/sda/sda1/trace/enable
       # echo pc > /sys/block/sda/sda1/trace/act_mask
       # echo blk > /debugfs/tracing/current_tracer
       # (generate some BLK_TC_PC events)
       # cat trace
              bash-2184  [000]  1774.275413:   8,7    I   N [bash]
              bash-2184  [000]  1774.275435:   8,7    D   N [bash]
              bash-2184  [000]  1774.275540:   8,7    I   R [bash]
              bash-2184  [000]  1774.275547:   8,7    D   R [bash]
       ksoftirqd/0-4     [000]  1774.275580:   8,7    C   N 0 [0]
              bash-2184  [000]  1774.275648:   8,7    I   R [bash]
              bash-2184  [000]  1774.275653:   8,7    D   R [bash]
       ksoftirqd/0-4     [000]  1774.275682:   8,7    C   N 0 [0]
              bash-2184  [000]  1774.275739:   8,7    I   R [bash]
              bash-2184  [000]  1774.275744:   8,7    D   R [bash]
       ksoftirqd/0-4     [000]  1774.275771:   8,7    C   N 0 [0]
              bash-2184  [000]  1774.275804:   8,7    I   R [bash]
              bash-2184  [000]  1774.275808:   8,7    D   R [bash]
       ksoftirqd/0-4     [000]  1774.275836:   8,7    C   N 0 [0]
      
      After this patch:
      
       # cat trace
              bash-2263  [000]   366.782149:   8,7    I   N 0 (00 ..) [bash]
              bash-2263  [000]   366.782323:   8,7    D   N 0 (00 ..) [bash]
              bash-2263  [000]   366.782557:   8,7    I   R 8 (25 00 ..) [bash]
              bash-2263  [000]   366.782560:   8,7    D   R 8 (25 00 ..) [bash]
       ksoftirqd/0-4     [000]   366.782582:   8,7    C   N (25 00 ..) [0]
              bash-2263  [000]   366.782648:   8,7    I   R 8 (5a 00 3f 00) [bash]
              bash-2263  [000]   366.782650:   8,7    D   R 8 (5a 00 3f 00) [bash]
       ksoftirqd/0-4     [000]   366.782669:   8,7    C   N (5a 00 3f 00) [0]
              bash-2263  [000]   366.782710:   8,7    I   R 8 (5a 00 08 00) [bash]
              bash-2263  [000]   366.782713:   8,7    D   R 8 (5a 00 08 00) [bash]
       ksoftirqd/0-4     [000]   366.782730:   8,7    C   N (5a 00 08 00) [0]
              bash-2263  [000]   366.783375:   8,7    I   R 36 (5a 00 08 00) [bash]
              bash-2263  [000]   366.783379:   8,7    D   R 36 (5a 00 08 00) [bash]
       ksoftirqd/0-4     [000]   366.783404:   8,7    C   N (5a 00 08 00) [0]
      
      This is what we do with PC events in user-space blktrace.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <49D32387.9040106@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
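
      The "After this patch" output in the BLK_TC_PC commit above prints the
      command bytes in hex and abbreviates a run of trailing zero bytes as "..".
      The following is a minimal user-space sketch of that formatting, inferred
      only from the sample output. It is not the kernel's code, and the names
      format_pdu() and append() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Append s to out at *pos; returns 0 on success, -1 if it would not fit. */
static int append(char *out, size_t size, size_t *pos, const char *s)
{
    size_t n = strlen(s);

    if (n >= size - *pos)
        return -1;
    memcpy(out + *pos, s, n + 1);   /* copy the terminating NUL too */
    *pos += n;
    return 0;
}

/*
 * Hypothetical helper: format a command buffer as in the sample trace,
 * e.g. "(25 00 ..)". Returns the string length, or -1 if out is too small.
 */
static int format_pdu(const unsigned char *pdu, size_t len,
                      char *out, size_t size)
{
    char hex[16];
    size_t end, i, pos = 0;

    assert(out != NULL && size > 0);
    out[0] = '\0';
    if (pdu == NULL || len == 0)
        return 0;

    /* end = index just past the last non-zero byte (0 if all bytes are zero). */
    for (end = len; end > 0 && pdu[end - 1] == 0; end--)
        ;

    for (i = 0; i < len; i++) {
        snprintf(hex, sizeof(hex), "%02x", (unsigned int)pdu[i]);
        if (append(out, size, &pos, i == 0 ? "(" : " ") ||
            append(out, size, &pos, hex))
            return -1;

        /* If more zero bytes follow the one just printed, abbreviate them. */
        if (i == end && end != len - 1) {
            if (append(out, size, &pos, " ..)"))
                return -1;
            return (int)pos;
        }
    }

    if (append(out, size, &pos, ")"))
        return -1;
    return (int)pos;
}
```

      Under these assumptions, a buffer whose only non-zero byte is a leading
      0x25 formats as "(25 00 ..)", and the four bytes 5a 00 3f 00 format in
      full because only a single zero byte trails them.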
    • blktrace: fix output of unknown events · b78825d6
      Li Zefan authored
      Not all events are pc (packet command) events. An event is a pc
      event only if it has BLK_TC_PC bit set.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <49D3236D.3090705@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
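
      The rule stated in the "unknown events" commit above (an event is a pc
      event only if its BLK_TC_PC bit is set) can be sketched as a bit test. The
      constants below are assumed to mirror include/linux/blktrace_api.h and
      should be checked against the tree in use; is_pc_event() is a hypothetical
      helper, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, mirroring include/linux/blktrace_api.h; verify locally. */
#define BLK_TC_FS       (1u << 8)   /* file system requests */
#define BLK_TC_PC       (1u << 9)   /* packet command requests */
#define BLK_TC_SHIFT    16
#define BLK_TC_ACT(act) ((uint32_t)(act) << BLK_TC_SHIFT)

static_assert(BLK_TC_ACT(BLK_TC_PC) != BLK_TC_ACT(BLK_TC_FS),
              "PC and FS categories must be distinct bits");

/* Hypothetical helper: true only when the PC category bit is set. */
static bool is_pc_event(uint32_t action)
{
    return (action & BLK_TC_ACT(BLK_TC_PC)) != 0;
}
```

      With this test, an action word that carries neither the FS nor the PC bit
      is not treated as a packet command by default.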
    • tracing, kmemtrace: Make kmem tracepoints use TRACE_EVENT macro · fc182a43
      Zhaolei authored
      TRACE_EVENT is a more generic way to define tracepoints.
      Using it adds these new capabilities to the kmem tracepoints:
      
        - zero-copy and per-cpu splice() tracing
        - binary tracing without printf overhead
        - structured logging records exposed under /debug/tracing/events
        - trace events embedded in function tracer output and other plugins
        - user-defined, per tracepoint filter expressions
      Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
      Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <49DEE6DA.80600@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing, kmemtrace: Separate include/trace/kmemtrace.h to kmemtrace part and tracepoint part · 02af61bb
      Zhaolei authored
      Impact: refactor code for future changes
      
      The current kmemtrace.h serves both as the header file for kmemtrace and
      as the definition of the kmem tracepoints.
      
      A tracepoint definition file may be used by other code, and should contain
      only tracepoint definitions.
      
      We can separate include/trace/kmemtrace.h into 2 files:
      
        include/linux/kmemtrace.h: header file for kmemtrace
        include/trace/kmem.h:      definition of kmem tracepoints
      Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
      Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <49DEE68A.5040902@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: Document the event tracing system · abd41443
      Theodore Ts'o authored
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1239479479-2603-3-git-send-email-tytso@mit.edu>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: Add documentation for the power tracer · 56c49951
      Theodore Ts'o authored
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Acked-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <1239479479-2603-4-git-send-email-tytso@mit.edu>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 10 Apr, 2009 8 commits
    • tracing, net, skb tracepoint: make skb tracepoint use the TRACE_EVENT() macro · 5cb3d1d9
      Zhaolei authored
      TRACE_EVENT is a more generic way to define a tracepoint.
      Using it adds these new capabilities to the skb tracepoint:
      
        - zero-copy and per-cpu splice() tracing
        - binary tracing without printf overhead
        - structured logging records exposed under /debug/tracing/events
        - trace events embedded in function tracer output and other plugins
        - user-defined, per tracepoint filter expressions
      Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <49DD90D2.5020604@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, function-graph: only save return values on x86_64 · e71e99c2
      Steven Rostedt authored
      Impact: speed up
      
      The return to handler portion of the function graph tracer should only
      need to save the return values. The caller already saved off the
      registers that the callee can modify. The returning function already
      saved the registers it modified. When we call our own trace function
      it too will save the registers that the callee must restore.
      
      There's no reason to save off anything more than the registers used
      to return the values.
      
      Note, I did a complete kernel build with this modification and the
      function graph tracer running on x86_64.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/lockdep: report the time waited for a lock · 2062501a
      Frederic Weisbecker authored
      While trying to optimize the new lock that replaces the BKL in
      reiserfs, I found lock tracing very useful, though it lacks
      something important for performance (and latency) instrumentation:
      the time a task waits for a lock.
      
      That's what this patch implements:
      
        bash-4816  [000]   202.652815: lock_contended: lock_contended: &sb->s_type->i_mutex_key
        bash-4816  [000]   202.652819: lock_acquired: &rq->lock (0.000 us)
       <...>-4787  [000]   202.652825: lock_acquired: &rq->lock (0.000 us)
       <...>-4787  [000]   202.652829: lock_acquired: &rq->lock (0.000 us)
        bash-4816  [000]   202.652833: lock_acquired: &sb->s_type->i_mutex_key (16.005 us)
      
      As shown above, each lock_acquired entry is followed by the time
      the task waited for the lock. Usually, a lock_contended entry
      is followed shortly by a lock_acquired entry with a non-zero wait time.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1238975373-15739-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
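
      The wait time shown in the lock tracing commit above can be thought of as
      the gap between a contended event and the matching acquired event. The
      sketch below illustrates that bookkeeping in user space under that
      assumption. struct lock_wait and the three functions are hypothetical
      names and do not come from the lockdep code.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-lock-attempt state. */
struct lock_wait {
    uint64_t contended_ns;  /* timestamp of the contended event */
    bool     contended;     /* true between contended and acquired */
};

/* Record the moment the task starts waiting. */
static void on_lock_contended(struct lock_wait *w, uint64_t now_ns)
{
    assert(w != NULL);
    w->contended_ns = now_ns;
    w->contended = true;
}

/* Return the time waited in ns; 0 when the lock was never contended. */
static uint64_t on_lock_acquired(struct lock_wait *w, uint64_t now_ns)
{
    uint64_t waited = 0;

    assert(w != NULL);
    if (w->contended && now_ns >= w->contended_ns)
        waited = now_ns - w->contended_ns;
    w->contended = false;
    return waited;
}

/* Format nanoseconds as microseconds with three decimals, e.g. "16.005 us". */
static int format_wait(uint64_t wait_ns, char *buf, size_t size)
{
    return snprintf(buf, size, "%" PRIu64 ".%03" PRIu64 " us",
                    wait_ns / 1000, wait_ns % 1000);
}
```

      An uncontended acquisition reports "0.000 us" in this sketch, which is
      consistent with the &rq->lock lines in the sample trace.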
    • Merge branch 'tracing/urgent' into tracing/core · 1cad1252
      Ingo Molnar authored
      Merge reason: pick up both v2.6.30-rc1 [which includes tracing/urgent fixes]
                    and pick up the current lineup of tracing/urgent fixes as well
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: fix splice return too large · 93cfb3c9
      Lai Jiangshan authored
      I got these from strace:
      
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 12288
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 16384
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
       splice(0x3, 0, 0x5, 0, 0x1000, 0x1) = 8192
      
      I asked splice to read 4096 bytes, but it returned 8192 or more.
      
      This is because the return value of tracing_buffers_splice_read()
      does not include the "zero out any left over data" bytes.
      
      tracing_buffers_read() does include these bytes, so we make the two
      consistent.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <49D46674.9030804@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: update file->f_pos when splice(2) it · c7625a55
      Lai Jiangshan authored
      Impact: Cleanup
      
      These two lines:
      
      	if (unlikely(*ppos))
      		return -ESPIPE;
      
      in tracing_buffers_splice_read() are not needed, because the VFS layer
      has already disabled seek(2).
      
      We remove these two lines so that we can update file->f_pos.
      
      tracing_buffers_read() already updates file->f_pos; this fix
      makes tracing_buffers_splice_read() update file->f_pos too.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <49D46670.4010503@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: allocate page when needed · ddd538f3
      Lai Jiangshan authored
      Impact: Cleanup
      
      Sometimes we open trace_pipe_raw but never read(2) it and only
      splice(2) it, in which case the page is not used.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <49D4666B.4010608@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: disable seeking for trace_pipe_raw · d1e7e02f
      Lai Jiangshan authored
      Impact: disable pread()
      
      We set tracing_buffers_fops.llseek to no_llseek,
      but pread() can still be used to read this file.
      
      That is not expected.
      
      This fix uses nonseekable_open() to disable it.
      
      tracing_buffers_fops.llseek is still set to no_llseek;
      it marks this file as a "non-seekable device" and is used by
      sys_splice(). See also do_splice() or the manual page of splice(2):
      
      ERRORS
             EINVAL Target file system doesn't support  splicing;
                    neither  of the descriptors refers to a pipe;
                    or offset given for non-seekable device.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      LKML-Reference: <49D46668.8030806@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
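
      From user space, the effect described in the trace_pipe_raw commit above
      should show up as pread() failing with ESPIPE once positional reads are
      disabled. The sketch below checks for that error on an arbitrary
      descriptor. It can be tried on a pipe, which is used here only as a
      stand-in for a non-seekable file and is an assumption of this example;
      pread_is_rejected() is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Returns true if a positional read on fd is rejected with ESPIPE,
 * the error reported for descriptors that do not support pread().
 */
static bool pread_is_rejected(int fd)
{
    char byte;
    ssize_t n;

    assert(fd >= 0);
    errno = 0;
    n = pread(fd, &byte, sizeof(byte), 0);  /* offset 0, one byte */
    return n == -1 && errno == ESPIPE;
}
```

      Calling pread_is_rejected() on the read end of a pipe returns true on
      Linux; whether a given kernel rejects pread() on trace_pipe_raw depends on
      this fix being present.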
  3. 09 Apr, 2009 25 commits
  4. 08 Apr, 2009 1 commit
    • dm kcopyd: fix callback race · 340cd444
      Mikulas Patocka authored
      If the thread calling dm_kcopyd_copy is delayed due to scheduling inside
      split_job/segment_complete and the subjobs complete before the loop in
      split_job completes, the kcopyd callback could be invoked from the
      thread that called dm_kcopyd_copy instead of the kcopyd workqueue.
      
      dm_kcopyd_copy -> split_job -> segment_complete -> job->fn()
      
      Snapshots depend on the fact that callbacks are called from the single-threaded
      kcopyd workqueue and expect that there is no racing between individual
      callbacks. Racing between callbacks can corrupt the exception
      store, and it can also mean that exception store callbacks are called twice
      for the same exception - a likely reason for crashes reported inside
      pending_complete() / remove_exception().
      
      This patch fixes two problems:
      
      1. job->fn being called from the thread that submitted the job (see above).
      
      - Fix: hand over the completion callback to the kcopyd thread.
      
      2. job->fn(read_err, write_err, job->context); in segment_complete
      reports the error of the last subjob, not the union of all errors.
      
      - Fix: pass job->write_err to the callback to report all error bits
        (it is done already in run_complete_job)
      
      Cc: stable@kernel.org
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
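
      The second problem fixed by the kcopyd commit above (reporting the union
      of all sub-job errors rather than the last sub-job's error) can be
      illustrated with a small single-threaded sketch. It accumulates error bits
      across sub-jobs and passes the accumulated values to the callback. The
      types and names here are hypothetical, locking is omitted, and the
      callback is invoked directly, so the sketch does not model the first fix
      (handing the callback over to the kcopyd thread).

```c
#include <assert.h>
#include <stddef.h>

typedef void (*copy_notify_fn)(int read_err, unsigned long write_err,
                               void *context);

/* Hypothetical master job split into several sub-jobs. */
struct copy_job {
    int            sub_jobs_left;
    int            read_err;    /* set if any sub-job failed to read */
    unsigned long  write_err;   /* union of per-destination error bits */
    copy_notify_fn fn;
    void          *context;
};

/* Called once per finished sub-job; fires the callback after the last one. */
static void sub_job_complete(struct copy_job *job, int read_err,
                             unsigned long write_err)
{
    assert(job != NULL && job->sub_jobs_left > 0);

    if (read_err)
        job->read_err = 1;
    job->write_err |= write_err;

    /* Report the accumulated errors, not this sub-job's errors. */
    if (--job->sub_jobs_left == 0 && job->fn)
        job->fn(job->read_err, job->write_err, job->context);
}

/* Example callback that records what it was told. */
struct copy_result {
    int           calls;
    int           read_err;
    unsigned long write_err;
};

static void record_result(int read_err, unsigned long write_err, void *context)
{
    struct copy_result *res = context;

    res->calls++;
    res->read_err = read_err;
    res->write_err = write_err;
}
```

      In this sketch, if earlier sub-jobs fail and the last one succeeds, the
      callback still sees the earlier error bits, because it receives the job's
      accumulated state.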