1. 02 Apr, 2015 20 commits
    • Yunlong Song's avatar
      perf kvm: Support using -f to override perf.data.guest file ownership · 8cc5ec1f
      Yunlong Song authored
      Enable perf kvm to use perf.data.guest when it is not owned by current
      user or root.
      
      Example:
      
       # perf kvm stat record ls
       # chown Yunlong.Song:Yunlong.Song perf.data.guest
       # ls -al perf.data.guest
       -rw------- 1 Yunlong.Song Yunlong.Song 4128937 Apr  2 11:05 perf.data.guest
       # id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       # perf kvm stat report
       File perf.data.guest not owned by current user or root (use -f to override)
       Initializing perf session failed
       # perf kvm stat report -f
         Error: unknown switch `f'
      
        usage: perf kvm stat report [<options>]
      
               --event <report event>
                                 event for reporting: vmexit, mmio (x86 only),
                                 ioport (x86 only)
               --vcpu <n>        vcpu id to report
           -k, --key <sort-key>  key for sorting: sample(sort by samples
       						   number) time (sort by avg time)
           -p, --pid <pid>       analyze events only for given process id(s)
      
      As shown above, the -f option does not work at all.
      
      After this patch:
      
       # perf kvm stat report
       File perf.data.guest not owned by current user or root (use -f to override)
       Initializing perf session failed
       # perf kvm stat report -f
       Analyze events for all VMs, all VCPUs:
      
         VM-EXIT    Samples  Samples%     Time%    Min Time    Max Time   Avg time
      
       Total Samples:0, Total events handled time:0.00us.
      
      As shown above, the -f option really works now. Since we have not
      launched any KVM related process, the result shows 0 sample here.
      Signed-off-by: default avatarYunlong Song <yunlong.song@huawei.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427982439-27388-5-git-send-email-yunlong.song@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      8cc5ec1f
    • Yunlong Song's avatar
      perf kmem: Support using -f to override perf.data file ownership · d1eeb77c
      Yunlong Song authored
      Enable perf kmem to use perf.data when it is not owned by current user
      or root.
      
      Example:
      
       # perf kmem record ls
       # chown Yunlong.Song:Yunlong.Song perf.data
       # ls -al perf.data
       -rw------- 1 Yunlong.Song Yunlong.Song 5315665 Apr  2 10:54 perf.data
       # id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       # perf kmem stat
       File perf.data not owned by current user or root (use -f to override)
       # perf kmem stat -f
         Error: unknown switch `f'
      
        usage: perf kmem [<options>] {record|stat}
      
           -i, --input <file>    input file name
           -v, --verbose         be more verbose (show symbol address, etc)
               --caller          show per-callsite statistics
               --alloc           show per-allocation statistics
           -s, --sort <key[,key2...]>
                                 sort by keys: ptr, call_site, bytes, hit,
                                 pingpong, frag
           -l, --line <num>      show n lines
               --raw-ip          show raw ip instead of symbol
      
      As shown above, the -f option does not work at all.
      
      After this patch:
      
       # perf kmem stat
       File perf.data not owned by current user or root (use -f to override)
       # perf kmem stat -f
       SUMMARY
       =======
       Total bytes requested: 437599
       Total bytes allocated: 615472
       Total bytes wasted on internal fragmentation: 177873
       Internal fragmentation: 28.900259%
       Cross CPU allocations: 6/1192
      
      As shown above, the -f option really works now.
      Signed-off-by: default avatarYunlong Song <yunlong.song@huawei.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427982439-27388-4-git-send-email-yunlong.song@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      d1eeb77c
    • Yunlong Song's avatar
      perf inject: Support using -f to override perf.data file ownership · ccaa474c
      Yunlong Song authored
      Enable perf inject to use perf.data when it is not owned by current user
      or root.
      
      Example:
      
       # perf record ls
       # chown Yunlong.Song:Yunlong.Song perf.data
       # ls -al perf.data
       -rw------- 1 Yunlong.Song Yunlong.Song 28260 Apr  2 10:37 perf.data
       # id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       # perf inject -v -b -i perf.data -o perf.data.new
       File perf.data not owned by current user or root (use -f to override)
       # perf inject -v -b -i perf.data -o perf.data.new -f
         Error: unknown switch `f'
      
        usage: perf inject [<options>]
      
           -b, --build-ids       Inject build-ids into the output stream
           -i, --input <file>    input file name
           -o, --output <file>   output file name
           -s, --sched-stat      Merge sched-stat and sched-switch for getting
           events where and how long tasks slept
           -v, --verbose         be more verbose (show build ids, etc)
               --kallsyms <file>
                                 kallsyms pathname
      
      As shown above, the -f option does not work at all.
      
      After this patch:
      
       # perf inject -v -b -i perf.data -o perf.data.new
       File perf.data not owned by current user or root (use -f to override)
       # perf inject -v -b -i perf.data -o perf.data.new -f
       build id event received for [kernel.kallsyms]:
       f6dcb66d8b98f1c0d9eb87bf043444b69f91d30c
       symsrc__init: cannot get elf header.
       Looking at the vmlinux_path (7 entries long)
       Using /proc/kcore for kernel object code
       Using /proc/kallsyms for symbols
      
      As shown above, the -f option really works now.
      Signed-off-by: default avatarYunlong Song <yunlong.song@huawei.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427982439-27388-3-git-send-email-yunlong.song@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      ccaa474c
    • Yunlong Song's avatar
      perf evlist: Support using -f to override perf.data file ownership · 9e3b6ec1
      Yunlong Song authored
      Enable perf evlist to use perf.data when it is not owned by current user
      or root.
      
      Example:
      
       # perf record ls
       # chown Yunlong.Song:Yunlong.Song perf.data
       # ls -al perf.data
       -rw------- 1 Yunlong.Song Yunlong.Song 28260 Apr  2 10:18 perf.data
       # id
       uid=0(root) gid=0(root) groups=0(root),64(pkcs11)
      
      Before this patch:
      
       # perf evlist
       File perf.data not owned by current user or root (use -f to override)
       # perf evlist -f
         Error: unknown switch `f'
      
        usage: perf evlist [<options>]
      
           -i, --input <file>    Input file name
           -F, --freq            Show the sample frequency
           -v, --verbose         Show all event attr details
           -g, --group           Show event group information
      
      As shown above, the -f option does not work at all.
      
      After this patch:
      
       # perf evlist
       File perf.data not owned by current user or root (use -f to override)
       # perf evlist -f
       cycles
      
      As shown above, the -f option really works now.
      Signed-off-by: default avatarYunlong Song <yunlong.song@huawei.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1427982439-27388-2-git-send-email-yunlong.song@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      9e3b6ec1
    • Masami Hiramatsu's avatar
      perf probe: Fix to track down unnamed union/structure members · c7273835
      Masami Hiramatsu authored
      Fix 'perf probe' to track down unnamed union/structure members.
      
      perf probe did not track down the tree of unnamed union/structure
      members, since it just failed to find given "name" in a parent
      structure/union.  To solve this issue, I've introduced 2 changes.
      
      - Fix die_find_member() to track down the type-DIE if it is
        unnamed, and if it contains the specified member, returns the
        unnamed member.
        (note that we don't return found member, since unnamed member
         has the offset in the parent structure)
      - Fix convert_variable_fields() to track down the unnamed union/
        structure (one-by-one).
      
      With this patch, perf probe can access unnamed fields:
        -----
        #./perf probe -nfx ./perf lock__delete ops 'locked_ops=ops->locked.ops'
        Added new event:
          probe_perf:lock__delete (on lock__delete in /home/mhiramat/ksrc/linux-3/tools/perf/perf with ops locked_ops=ops->locked.ops)
      
        You can now use it in all perf tools, such as:
      
                perf record -e probe_perf:lock__delete -aR sleep 1
        -----
      Reported-by: default avatarArnaldo Carvalho de Melo <acme@kernel.org>
      Report-Link: https://lkml.org/lkml/2015/3/5/431Signed-off-by: default avatarMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20150402073312.14482.37942.stgit@localhost.localdomainSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      c7273835
    • Arnaldo Carvalho de Melo's avatar
      perf db-export: No need to have ->thread twice in struct export_sample · b83e868d
      Arnaldo Carvalho de Melo authored
      As it comes from address_location->thread, that is already stored as
      export_sample->al, where the thread can be obtained.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20150402141542.GA9630@kernel.org
      Link: http://lkml.kernel.org/n/tip-bzotbl4epoztw0jd6sm2stpf@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      b83e868d
    • Arnaldo Carvalho de Melo's avatar
      perf db-export: No need to pass thread twice to db_export__sample · 7327259d
      Arnaldo Carvalho de Melo authored
      As it is available via another parameter, address_location->thread.
      Acked-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: lkml.kernel.org/r/551D08F8.3040706@intel.com
      Link: http://lkml.kernel.org/n/tip-6dbn0tcm9hyv92g7h3zj2dbt@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      7327259d
    • Arnaldo Carvalho de Melo's avatar
      perf scripting: No need to pass thread twice to the scripting callbacks · f9d5d549
      Arnaldo Carvalho de Melo authored
      It is already in the addr_location, so remove the redundant 'thread'
      parameter from the callback signatures.
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1427906210-10519-3-git-send-email-acme@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      f9d5d549
    • Arnaldo Carvalho de Melo's avatar
      perf script: No need to lookup thread twice · 79628f2c
      Arnaldo Carvalho de Melo authored
      We get the thread when we call perf_event__preprocess_sample(), no need
      to do it before that.
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1427906210-10519-2-git-send-email-acme@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      79628f2c
    • Ingo Molnar's avatar
      bpf: Fix the build on BPF_SYSCALL=y && !CONFIG_TRACING kernels, make it more configurable · e1abf2cc
      Ingo Molnar authored
      So bpf_tracing.o depends on CONFIG_BPF_SYSCALL - but that's not its only
      dependency, it also depends on the tracing infrastructure and on kprobes,
      without which it will fail to build with:
      
        In file included from kernel/trace/bpf_trace.c:14:0:
        kernel/trace/trace.h: In function ‘trace_test_and_set_recursion’:
        kernel/trace/trace.h:491:28: error: ‘struct task_struct’ has no member named ‘trace_recursion’
          unsigned int val = current->trace_recursion;
        [...]
      
      It took quite some time to trigger this build failure, because right now
      BPF_SYSCALL is very obscure, depends on CONFIG_EXPERT. So also make BPF_SYSCALL
      more configurable, not just under CONFIG_EXPERT.
      
      If BPF_SYSCALL, tracing and kprobes are enabled then enable the bpf_tracing
      gateway as well.
      
      We might want to make this an interactive option later on, although
      I'd not complicate it unnecessarily: enabling BPF_SYSCALL is enough of
      an indicator that the user wants BPF support.
      
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      e1abf2cc
    • Alexei Starovoitov's avatar
      samples/bpf: Add kmem_alloc()/free() tracker tool · 9811e353
      Alexei Starovoitov authored
      One BPF program attaches to kmem_cache_alloc_node() and
      remembers all allocated objects in the map.
      Another program attaches to kmem_cache_free() and deletes
      corresponding object from the map.
      
      User space walks the map every second and prints any objects
      which are older than 1 second.
      
      Usage:
      
      	$ sudo tracex4
      
      Then start few long living processes. The 'tracex4' will print
      something like this:
      
      	obj 0xffff880465928000 is 13sec old was allocated at ip ffffffff8105dc32
      	obj 0xffff88043181c280 is 13sec old was allocated at ip ffffffff8105dc32
      	obj 0xffff880465848000 is  8sec old was allocated at ip ffffffff8105dc32
      	obj 0xffff8804338bc280 is 15sec old was allocated at ip ffffffff8105dc32
      
      	$ addr2line -fispe vmlinux ffffffff8105dc32
      	do_fork at fork.c:1665
      
      As soon as processes exit the memory is reclaimed and 'tracex4'
      prints nothing.
      
      Similar experiment can be done with the __kmalloc()/kfree() pair.
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1427312966-8434-10-git-send-email-ast@plumgrid.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      9811e353
    • Alexei Starovoitov's avatar
      samples/bpf: Add IO latency analysis (iosnoop/heatmap) tool · 5c7fc2d2
      Alexei Starovoitov authored
      BPF C program attaches to
      blk_mq_start_request()/blk_update_request() kprobe events to
      calculate IO latency.
      
      For every completed block IO event it computes the time delta
      in nsec and records in a histogram map:
      
      	map[log10(delta)*10]++
      
      User space reads this histogram map every 2 seconds and prints
      it as a 'heatmap' using gray shades of text terminal. Black
      spaces have many events and white spaces have very few events.
      Left most space is the smallest latency, right most space is
      the largest latency in the range.
      
      Usage:
      
      	$ sudo ./tracex3
      	and do 'sudo dd if=/dev/sda of=/dev/null' in other terminal.
      
      Observe IO latencies and how different activity (like 'make
      kernel') affects it.
      
      Similar experiments can be done for network transmit latencies,
      syscalls, etc.
      
      '-t' flag prints the heatmap using normal ascii characters:
      
      $ sudo ./tracex3 -t
        heatmap of IO latency
        # - many events with this latency
          - few events
      	|1us      |10us     |100us    |1ms      |10ms     |100ms    |1s |10s
      				 *ooo. *O.#.                                    # 221
      			      .  *#     .                                       # 125
      				 ..   .o#*..                                    # 55
      			    .  . .  .  .#O                                      # 37
      				 .#                                             # 175
      				       .#*.                                     # 37
      				  #                                             # 199
      		      .              . *#*.                                     # 55
      				       *#..*                                    # 42
      				  #                                             # 266
      			      ...***Oo#*OO**o#* .                               # 629
      				  #                                             # 271
      				      . .#o* o.*o*                              # 221
      				. . o* *#O..                                    # 50
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1427312966-8434-9-git-send-email-ast@plumgrid.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      5c7fc2d2
    • Alexei Starovoitov's avatar
      samples/bpf: Add counting example for kfree_skb() function calls and the write() syscall · d822a192
      Alexei Starovoitov authored
      this example has two probes in one C file that attach to
      different kprove events and use two different maps.
      
      1st probe is x64 specific equivalent of dropmon. It attaches to
      kfree_skb, retrevies 'ip' address of kfree_skb() caller and
      counts number of packet drops at that 'ip' address. User space
      prints 'location - count' map every second.
      
      2nd probe attaches to kprobe:sys_write and computes a histogram
      of different write sizes
      
      Usage:
      	$ sudo tracex2
      	location 0xffffffff81695995 count 1
      	location 0xffffffff816d0da9 count 2
      
      	location 0xffffffff81695995 count 2
      	location 0xffffffff816d0da9 count 2
      
      	location 0xffffffff81695995 count 3
      	location 0xffffffff816d0da9 count 2
      
      	557145+0 records in
      	557145+0 records out
      	285258240 bytes (285 MB) copied, 1.02379 s, 279 MB/s
      		   syscall write() stats
      	     byte_size       : count     distribution
      	       1 -> 1        : 3        |                                      |
      	       2 -> 3        : 0        |                                      |
      	       4 -> 7        : 0        |                                      |
      	       8 -> 15       : 0        |                                      |
      	      16 -> 31       : 2        |                                      |
      	      32 -> 63       : 3        |                                      |
      	      64 -> 127      : 1        |                                      |
      	     128 -> 255      : 1        |                                      |
      	     256 -> 511      : 0        |                                      |
      	     512 -> 1023     : 1118968  |************************************* |
      
      Ctrl-C at any time. Kernel will auto cleanup maps and programs
      
      	$ addr2line -ape ./bld_x64/vmlinux 0xffffffff81695995
      	0xffffffff816d0da9 0xffffffff81695995:
      	./bld_x64/../net/ipv4/icmp.c:1038 0xffffffff816d0da9:
      	./bld_x64/../net/unix/af_unix.c:1231
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1427312966-8434-8-git-send-email-ast@plumgrid.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      d822a192
    • Alexei Starovoitov's avatar
      samples/bpf: Add simple non-portable kprobe filter example · b896c4f9
      Alexei Starovoitov authored
      tracex1_kern.c - C program compiled into BPF.
      
      It attaches to kprobe:netif_receive_skb()
      
      When skb->dev->name == "lo", it prints sample debug message into
      trace_pipe via bpf_trace_printk() helper function.
      
      tracex1_user.c - corresponding user space component that:
        - loads BPF program via bpf() syscall
        - opens kprobes:netif_receive_skb event via perf_event_open()
          syscall
        - attaches the program to event via ioctl(event_fd,
          PERF_EVENT_IOC_SET_BPF, prog_fd);
        - prints from trace_pipe
      
      Note, this BPF program is non-portable. It must be recompiled
      with current kernel headers. kprobe is not a stable ABI and
      BPF+kprobe scripts may no longer be meaningful when kernel
      internals change.
      
      No matter in what way the kernel changes, neither the kprobe,
      nor the BPF program can ever crash or corrupt the kernel,
      assuming the kprobes, perf and BPF subsystem has no bugs.
      
      The verifier will detect that the program is using
      bpf_trace_printk() and the kernel will print 'this is a DEBUG
      kernel' warning banner, which means that bpf_trace_printk()
      should be used for debugging of the BPF program only.
      
      Usage:
      $ sudo tracex1
                  ping-19826 [000] d.s2 63103.382648: : skb ffff880466b1ca00 len 84
                  ping-19826 [000] d.s2 63103.382684: : skb ffff880466b1d300 len 84
      
                  ping-19826 [000] d.s2 63104.382533: : skb ffff880466b1ca00 len 84
                  ping-19826 [000] d.s2 63104.382594: : skb ffff880466b1d300 len 84
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1427312966-8434-7-git-send-email-ast@plumgrid.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      b896c4f9
    • Alexei Starovoitov's avatar
      tracing: Allow BPF programs to call bpf_trace_printk() · 9c959c86
      Alexei Starovoitov authored
      Debugging of BPF programs needs some form of printk from the
      program, so let programs call limited trace_printk() with %d %u
      %x %p modifiers only.
      
      Similar to kernel modules, during program load verifier checks
      whether program is calling bpf_trace_printk() and if so, kernel
      allocates trace_printk buffers and emits big 'this is debug
      only' banner.
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Reviewed-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1427312966-8434-6-git-send-email-ast@plumgrid.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      9c959c86
    • Alexei Starovoitov's avatar
      tracing: Allow BPF programs to call bpf_ktime_get_ns() · d9847d31
      Alexei Starovoitov authored
      bpf_ktime_get_ns() is used by programs to compute time delta
      between events or as a timestamp
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Reviewed-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1427312966-8434-5-git-send-email-ast@plumgrid.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      d9847d31
    • Alexei Starovoitov's avatar
      tracing, perf: Implement BPF programs attached to kprobes · 2541517c
      Alexei Starovoitov authored
      BPF programs, attached to kprobes, provide a safe way to execute
      user-defined BPF byte-code programs without being able to crash or
      hang the kernel in any way. The BPF engine makes sure that such
      programs have a finite execution time and that they cannot break
      out of their sandbox.
      
      The user interface is to attach to a kprobe via the perf syscall:
      
      	struct perf_event_attr attr = {
      		.type	= PERF_TYPE_TRACEPOINT,
      		.config	= event_id,
      		...
      	};
      
      	event_fd = perf_event_open(&attr,...);
      	ioctl(event_fd, PERF_EVENT_IOC_SET_BPF, prog_fd);
      
      'prog_fd' is a file descriptor associated with BPF program
      previously loaded.
      
      'event_id' is an ID of the kprobe created.
      
      Closing 'event_fd':
      
      	close(event_fd);
      
      ... automatically detaches BPF program from it.
      
      BPF programs can call in-kernel helper functions to:
      
        - lookup/update/delete elements in maps
      
        - probe_read - wraper of probe_kernel_read() used to access any
          kernel data structures
      
      BPF programs receive 'struct pt_regs *' as an input ('struct pt_regs' is
      architecture dependent) and return 0 to ignore the event and 1 to store
      kprobe event into the ring buffer.
      
      Note, kprobes are a fundamentally _not_ a stable kernel ABI,
      so BPF programs attached to kprobes must be recompiled for
      every kernel version and user must supply correct LINUX_VERSION_CODE
      in attr.kern_version during bpf_prog_load() call.
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Reviewed-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Reviewed-by: default avatarMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1427312966-8434-4-git-send-email-ast@plumgrid.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      2541517c
    • Alexei Starovoitov's avatar
      tracing: Add kprobe flag · 72cbbc89
      Alexei Starovoitov authored
      add TRACE_EVENT_FL_KPROBE flag to differentiate kprobe type of
      tracepoints, since bpf programs can only be attached to kprobe
      type of PERF_TYPE_TRACEPOINT perf events.
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Reviewed-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Reviewed-by: default avatarMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1427312966-8434-3-git-send-email-ast@plumgrid.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      72cbbc89
    • Daniel Borkmann's avatar
      bpf: Make internal bpf API independent of CONFIG_BPF_SYSCALL #ifdefs · 4e537f7f
      Daniel Borkmann authored
      Socket filter code and other subsystems with upcoming eBPF
      support should not need to deal with the fact that we have
      CONFIG_BPF_SYSCALL defined or not.
      
      Having the bpf syscall as a config option is a nice thing and
      I'd expect it to stay that way for expert users (I presume one
      day the default setting of it might change, though), but code
      making use of it should not care if it's actually enabled or
      not.
      
      Instead, hide this via header files and let the rest deal with it.
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Reviewed-by: default avatarMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1427312966-8434-2-git-send-email-ast@plumgrid.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      4e537f7f
    • Ingo Molnar's avatar
      Merge branch 'perf/timer' into perf/core · 223aa646
      Ingo Molnar authored
      This WIP branch is now ready to be merged.
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      223aa646
  2. 01 Apr, 2015 1 commit
  3. 31 Mar, 2015 5 commits
  4. 30 Mar, 2015 1 commit
  5. 27 Mar, 2015 13 commits