1. 03 Mar, 2016 8 commits
    • perf record: Ensure return non-zero rc when mmap fail · 95c36561
      Wang Nan authored
      perf_evlist__mmap_ex() can fail without setting errno, for example when
      it fails a condition check even though every system call succeeded.
      
      When this happens, record__open() incorrectly returns 0. Forcing rc to a
      non-zero value is a quick way to avoid the problem; the alternative is to
      follow every possible code path in perf_evlist__mmap_ex() and make sure
      at least one system call has set errno before it returns an error.
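      
      A minimal sketch of the fallback described above, not the actual perf
      code: fake_mmap_ex() and open_sketch() are hypothetical stand-ins for
      perf_evlist__mmap_ex() and record__open(), and using -EINVAL as the
      forced value is an assumption.
      
      ```c
      #include <errno.h>
      
      /*
       * Hypothetical stand-in for perf_evlist__mmap_ex(): it rejects a bad
       * argument without making any system call, so errno is left untouched.
       */
      static int fake_mmap_ex(unsigned int pages)
      {
      	if (pages == 0)
      		return -1;	/* condition check failed, errno not set */
      	return 0;
      }
      
      /* Hypothetical caller modelled on record__open(). */
      int open_sketch(unsigned int pages)
      {
      	int rc = 0;
      
      	errno = 0;	/* reset so the sketch behaves deterministically */
      	if (fake_mmap_ex(pages) < 0) {
      		/* Use errno when it is set, else force -EINVAL (assumed). */
      		rc = errno ? -errno : -EINVAL;
      	}
      	return rc;
      }
      ```
      
      With this shape, open_sketch(0) returns a negative value even though no
      system call failed, while open_sketch(8) still returns 0.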
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1456479154-136027-30-git-send-email-wangnan0@huawei.com
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Introduce record__finish_output() to finish a perf.data · e1ab48ba
      Wang Nan authored
      Move the code that finalizes 'perf.data' into record__finish_output().
      It will be used by the following commits to split the output into
      multiple files.
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1456479154-136027-23-git-send-email-wangnan0@huawei.com
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Extract synthesize code to record__synthesize() · c45c86eb
      Wang Nan authored
      Create record__synthesize(). It can be used to create tracking events
      for each perf.data once perf supports splitting its output into
      multiple files.
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1456479154-136027-20-git-send-email-wangnan0@huawei.com
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Use WARN_ONCE to replace 'if' condition · d8871ea7
      Wang Nan authored
      Commits in a BPF patchkit will extract the kernel and module
      synthesizing code into a separate function and call it multiple times.
      This patch replaces 'if (err < 0)' with WARN_ONCE, which makes sure the
      error message is shown only once.
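      
      The warn-once idea can be sketched as follows. This is a simplified
      illustration, not the WARN_ONCE macro from the perf tree: the macro
      name, the helper names, the message text and the counter are all
      hypothetical.
      
      ```c
      #include <stdio.h>
      
      /* Counts printed warnings so the behaviour can be observed. */
      static int warn_count;
      
      /*
       * Simplified warn-once macro: every expansion site owns a static flag,
       * so the message is printed on the first failing call only.
       */
      #define WARN_ONCE_SKETCH(cond, msg)			\
      	do {						\
      		static int warned;			\
      		if ((cond) && !warned) {		\
      			warned = 1;			\
      			warn_count++;			\
      			fprintf(stderr, "%s", (msg));	\
      		}					\
      	} while (0)
      
      /* Hypothetical helper that would run once per output file. */
      void synthesize_sketch(int err)
      {
      	WARN_ONCE_SKETCH(err < 0, "synthesizing failed (placeholder text)\n");
      }
      
      /* Accessor for the number of warnings printed so far. */
      int warn_count_sketch(void)
      {
      	return warn_count;
      }
      ```
      
      Calling synthesize_sketch(-1) several times prints the message once,
      whereas a plain 'if (err < 0)' would print it on every call.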
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1456479154-136027-19-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf data: Explicitly set byte order for integer types · f8dd2d5f
      Wang Nan authored
      After babeltrace commit 5cec03e402aa ("ir: copy variants and sequences
      when setting a field path"), 'perf data convert' produces an incorrect
      result when there is BPF output data. For example:
      
       # perf data convert --to-ctf ./out.ctf
       # babeltrace ./out.ctf
       [10:44:31.186045346] (+?.?????????) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810E7DD1, perf_tid = 23819, perf_pid = 23819, perf_id = 518, raw_len = 3, raw_data = [ [0] = 0xC028E32F, [1] = 0x815D0100, [2] = 0x1000000 ] }
       [10:44:31.286101003] (+0.100055657) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF8105B609, perf_tid = 23819, perf_pid = 23819, perf_id = 518, raw_len = 3, raw_data = [ [0] = 0x35D9F1EB, [1] = 0x15D81, [2] = 0x2 ] }
      
      The expected result of the first sample should be:
      
       raw_data = [ [0] = 0x2FE328C0, [1] = 0x15D81, [2] = 0x1 ] }
      
      However, 'perf data convert' writes big-endian values to the resulting
      CTF file.
      
      The reason is an internal change (or a bug?) in babeltrace.
      
      Before this patch, at the first call to add_bpf_output_values(), the
      byte order of every integer type is undetermined (it is 0, neither 1234
      (le) nor 4321 (be)). It used to be fixed up by:
      
      perf_evlist__deliver_sample
       -> process_sample_event
         -> ctf_stream
            ...
            ->bt_ctf_trace_add_stream_class
              ->bt_ctf_field_type_structure_set_byte_order
                ->bt_ctf_field_type_integer_set_byte_order
      
      while the stream is being created.
      
      However, the babeltrace commit mentioned above duplicates the types in
      a sequence to prevent potential conflicts, and links the newly allocated
      type into the 'raw_data' sequence, in the following call stack:
      
      perf_evlist__deliver_sample
       -> process_sample_event
         -> ctf_stream
            ...
            -> bt_ctf_trace_add_stream_class
              -> bt_ctf_stream_class_resolve_types
                 ...
                 -> bt_ctf_field_type_sequence_copy
                   ->bt_ctf_field_type_integer_copy
      
      This happens before the byte order is set, so only the newly allocated
      type is initialized; the byte order of the original type, which perf
      uses to create the first raw_data, remains undetermined.
      
      The byte order of the CTF output is not related to the byte order of
      perf.data. Setting it to anything other than BT_CTF_BYTE_ORDER_NATIVE
      solves this problem (only BT_CTF_BYTE_ORDER_NATIVE needs to be fixed).
      To minimize the change in behavior, set the byte order according to the
      compile-time options.
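      
      The three raw_data words printed for the first sample above are the
      expected words with their bytes reversed, which can be checked with a
      small byte-swap helper. swap32_sketch() is a hypothetical name used
      only for this illustration; it is not part of perf or babeltrace.
      
      ```c
      #include <stdint.h>
      
      /* Reverse the four bytes of a 32-bit value. */
      uint32_t swap32_sketch(uint32_t v)
      {
      	return ((v & 0x000000ffu) << 24) |
      	       ((v & 0x0000ff00u) << 8)  |
      	       ((v & 0x00ff0000u) >> 8)  |
      	       ((v & 0xff000000u) >> 24);
      }
      ```
      
      Applied to 0xC028E32F, 0x815D0100 and 0x1000000, it yields 0x2FE328C0,
      0x15D81 and 0x1, the values listed as the expected result.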
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jérémie Galarneau <jeremie.galarneau@efficios.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1456479154-136027-10-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf data: Support converting data from bpf_perf_event_output() · 6122d57e
      Wang Nan authored
      bpf_perf_event_output() outputs data through sample->raw_data. This
      patch adds support for converting that data into CTF. A Python script
      can then be used to process the output data from BPF programs.
      
      Test result:
      
        # cat ./test_bpf_output_2.c
        /************************ BEGIN **************************/
        #include <uapi/linux/bpf.h>
        struct bpf_map_def {
       	unsigned int type;
       	unsigned int key_size;
       	unsigned int value_size;
       	unsigned int max_entries;
        };
        #define SEC(NAME) __attribute__((section(NAME), used))
        static u64 (*ktime_get_ns)(void) =
       	(void *)BPF_FUNC_ktime_get_ns;
        static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
       	(void *)BPF_FUNC_trace_printk;
        static int (*get_smp_processor_id)(void) =
       	(void *)BPF_FUNC_get_smp_processor_id;
        static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) =
       	(void *)BPF_FUNC_perf_event_output;
      
        struct bpf_map_def SEC("maps") channel = {
       	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
       	.key_size = sizeof(int),
       	.value_size = sizeof(u32),
       	.max_entries = __NR_CPUS__,
        };
      
        static inline int __attribute__((always_inline))
        func(void *ctx, int type)
        {
       	struct {
       		u64 ktime;
       		int type;
       	} __attribute__((packed)) output_data;
       	char error_data[] = "Error: failed to output\n";
       	int err;
      
       	output_data.type = type;
       	output_data.ktime = ktime_get_ns();
       	err = perf_event_output(ctx, &channel, get_smp_processor_id(),
       				&output_data, sizeof(output_data));
       	if (err)
       		trace_printk(error_data, sizeof(error_data));
       	return 0;
        }
        SEC("func_begin=sys_nanosleep")
        int func_begin(void *ctx) {return func(ctx, 1);}
        SEC("func_end=sys_nanosleep%return")
        int func_end(void *ctx) { return func(ctx, 2);}
        char _license[] SEC("license") = "GPL";
        int _version SEC("version") = LINUX_VERSION_CODE;
        /************************* END ***************************/
      
        # ./perf record -e bpf-output/no-inherit,name=evt/ \
                       -e ./test_bpf_output_2.c/map:channel.event=evt/ \
                       usleep 100000
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.012 MB perf.data (2 samples) ]
      
        # ./perf script
                usleep 14942 92503.198504: evt:  ffffffff810e0ba1 sys_nanosleep (/lib/modules/4.3.0....
                usleep 14942 92503.298562: evt:  ffffffff810585e9 kretprobe_trampoline_holder (/lib....
      
        # ./perf data convert --to-ctf ./out.ctf
        [ perf data convert: Converted 'perf.data' into CTF data './out.ctf' ]
        [ perf data convert: Converted and wrote 0.000 MB (2 samples) ]
      
        # babeltrace ./out.ctf
        [01:41:43.198504134] (+?.?????????) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810E0BA1, perf_tid = 14942, perf_pid = 14942, perf_id = 1044, raw_len = 3, raw_data = [ [0] = 0x32C0C07B, [1] = 0x5421, [2] = 0x1 ] }
        [01:41:43.298562257] (+0.100058123) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810585E9, perf_tid = 14942, perf_pid = 14942, perf_id = 1044, raw_len = 3, raw_data = [ [0] = 0x38B77FAA, [1] = 0x5421, [2] = 0x2 ] }
      
        # cat ./test_bpf_output_2.py
        from babeltrace import TraceCollection
        tc = TraceCollection()
        tc.add_trace('./out.ctf', 'ctf')
        d = {1:[], 2:[]}
        for event in tc.events:
           if not event.name.startswith('evt'):
               continue
           raw_data = event['raw_data']
           (time, type) = ((raw_data[0] + (raw_data[1] << 32)), raw_data[2])
           d[type].append(time)
        print(list(map(lambda i: d[2][i] - d[1][i], range(len(d[1])))));
      
        # python3 ./test_bpf_output_2.py
        [100056879]
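      
      The arithmetic done by the script above can be checked by hand. The
      packed output_data struct holds a u64 followed by an int, 12 bytes in
      total, which matches raw_len = 3 32-bit words. The sketch below
      rebuilds the 64-bit ktime from the first two words, assuming a
      little-endian layout; ktime_from_raw() is a hypothetical helper name.
      
      ```c
      #include <stdint.h>
      
      /* Rebuild the 64-bit ktime from the two 32-bit raw_data words. */
      uint64_t ktime_from_raw(uint32_t lo, uint32_t hi)
      {
      	return (uint64_t)lo | ((uint64_t)hi << 32);
      }
      ```
      
      For the two events shown by babeltrace,
      ktime_from_raw(0x38B77FAA, 0x5421) - ktime_from_raw(0x32C0C07B, 0x5421)
      is 100056879 ns, the value printed by the script.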
      
      Committer note:
      
      Make sure you have python3-devel installed, not python-devel: the
      latter may be for python2, which leads to "PyInstance_Type" errors.
      Also make sure that you use the right libbabeltrace: Fedora, for
      instance, ships one, but it is an older version.
      
      To build libbabeltrace's python binding one also needs to use:
      
       ./configure --enable-python-bindings
      
      And then set PYTHONPATH=/usr/local/lib64/python3.4/site-packages/.
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1456479154-136027-9-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Check existence of frontend/backed stalled cycles · 9dec4473
      Andi Kleen authored
      Only put the frontend/backend stalled cycles into the default perf stat
      events when the CPU actually supports them.
      
      This avoids empty columns with --metric-only on newer Intel CPUs.
      
      Committer note:
      
      Before:
      
        $ perf stat ls
      
          Performance counter stats for 'ls':
      
                1.080893     task-clock (msec)      #    0.619 CPUs utilized
                       0     context-switches       #    0.000 K/sec
                       0     cpu-migrations         #    0.000 K/sec
                      97     page-faults            #    0.090 M/sec
               3,327,741     cycles                 #    3.079 GHz
         <not supported>     stalled-cycles-frontend
         <not supported>     stalled-cycles-backend
               1,609,544     instructions           #    0.48  insn per cycle
                 319,117     branches               #  295.235 M/sec
                  12,246     branch-misses          #    3.84% of all branches
      
             0.001746508 seconds time elapsed
        $
      
      After:
      
        $ perf stat ls
      
          Performance counter stats for 'ls':
      
                0.693948     task-clock (msec)      #    0.662 CPUs utilized
                       0     context-switches       #    0.000 K/sec
                       0     cpu-migrations         #    0.000 K/sec
                      95     page-faults            #    0.137 M/sec
               1,792,509     cycles                 #    2.583 GHz
               1,599,047     instructions           #    0.89  insn per cycle
                 316,328     branches               #  455.838 M/sec
                  12,453     branch-misses          #    3.94% of all branches
      
             0.001048987 seconds time elapsed
        $
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1456532881-26621-2-git-send-email-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Fix locale handling in pmu parsing · f9a5978a
      Jiri Olsa authored
      Ingo reported a regression in the display format of big numbers, which
      are missing their separators in the default perf stat output.
      
       triton:~/tip> perf stat -a sleep 1
               ...
               127008602      cycles                    #    0.011 GHz
               279538533      stalled-cycles-frontend   #  220.09% frontend cycles idle
               119213269      instructions              #    0.94  insn per cycle
      
      This is caused by a recent change:
      
        perf stat: Check existence of frontend/backed stalled cycles
      
      which added a call to pmu_have_event, which in turn calls
      perf_pmu__parse_scale, which has a bug in its locale handling.
      
      The lc string returned by setlocale, which we use to store the old
      locale value, may be allocated in static storage. Take a dynamic copy
      so that it survives another setlocale call.
      
        $ perf stat ls
               ...
               2,360,602      cycles                    #    3.080 GHz
               2,703,090      instructions              #    1.15  insn per cycle
                 546,031      branches                  #  712.511 M/sec
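      
      The save-and-restore pattern described above can be sketched as below.
      This is an illustration under assumptions, not the perf code:
      parse_scale_sketch() is a hypothetical helper, and the use of the
      LC_NUMERIC category and of strtod() is assumed for the example.
      
      ```c
      #include <locale.h>
      #include <stdlib.h>
      #include <string.h>
      
      /*
       * Parse a scale string such as "1.5" under the "C" numeric locale,
       * then restore the locale that was active before.
       * Returns 0 on success, -1 on failure.
       */
      int parse_scale_sketch(const char *s, double *out)
      {
      	const char *cur = setlocale(LC_NUMERIC, NULL);
      	char *saved;
      	size_t len;
      
      	if (!cur)
      		return -1;
      
      	/*
      	 * Take a private copy: the buffer returned by setlocale() may be
      	 * overwritten by the next setlocale() call. strdup() would do the
      	 * same job where it is available.
      	 */
      	len = strlen(cur) + 1;
      	saved = malloc(len);
      	if (!saved)
      		return -1;
      	memcpy(saved, cur, len);
      
      	setlocale(LC_NUMERIC, "C");
      	*out = strtod(s, NULL);
      
      	setlocale(LC_NUMERIC, saved);	/* restore from the private copy */
      	free(saved);
      	return 0;
      }
      ```
      
      Restoring from the pointer returned by the first setlocale() call,
      instead of from the copy, is the pattern that the fix replaces.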
      
      Committer note:
      
      Since the patch that introduced the regression has not made it to
      perf/core yet, move this fix to just before the point where the
      regression is introduced, so that we don't break bisection for this
      feature.
      Reported-by: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20160303095348.GA24511@krava.redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 29 Feb, 2016 32 commits