1. 25 Apr, 2016 9 commits
    • Jiri Olsa's avatar
      tools build: Fix perf_clean target · ab362f5a
      Jiri Olsa authored
      Fix perf_clean target to follow the same logic as perf target.
      
      Fixes the following make invokation:
      
        $ cd <kernelsrc> && make tools/perf_clean
      Reported-by: default avatarTJ <linux@iam.tj>
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=116411
      Link: http://lkml.kernel.org/r/1461615438-27894-2-git-send-email-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      ab362f5a
    • Jiri Olsa's avatar
      perf tools: Make the x86 clean quiet · 6404436a
      Jiri Olsa authored
      Turn current clean output:
      
        $ make clean
        rm -f arch/x86/include/generated/asm/syscalls_64.c
          CLEAN    libbpf
          CLEAN    libapi
      
      into:
      
        $ make clean
          CLEAN    x86
          CLEAN    libapi
          CLEAN    libbpf
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: TJ <linux@iam.tj>
      Link: http://lkml.kernel.org/r/1461615438-27894-1-git-send-email-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      6404436a
    • Arnaldo Carvalho de Melo's avatar
      perf evlist: Decode perf_event_attr->branch_sample_type · a213b92e
      Arnaldo Carvalho de Melo authored
      While trying to use --call-graph lbr in 'perf trace', since we only are
      interested in the callchain for userspace, up to the callchain, I found
      that 'perf evlist' is not decoding the branch_sample_type field, fix it.
      
      Before:
      
        # perf record --call-graph lbr usleep 1
        # perf evlist -v
        cycles:ppp: size: 112, { sample_period, sample_freq }: 4000,
        sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|BRANCH_STACK,
        disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1,
        precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1,
        comm_exec: 1, branch_sample_type: 51201
                      ^^^^^^^^^^^^^^^^^^^^^^^^^
      
      After:
      
        # perf evlist -v
        cycles:ppp: size: 112, { sample_period, sample_freq }: 4000,
        sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|BRANCH_STACK,
        disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1,
        precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1,
        comm_exec: 1, branch_sample_type: USER|CALL_STACK|NO_FLAGS|NO_CYCLES
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-hozai7974u0ulgx13k96fcaw@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      a213b92e
    • Arnaldo Carvalho de Melo's avatar
      perf trace: Make --pf honour --min-stack too · 1df54290
      Arnaldo Carvalho de Melo authored
      To check deeply nested page fault callchains.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-wuji34xx003kr88nmqt6jkgf@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      1df54290
    • Arnaldo Carvalho de Melo's avatar
      perf trace: Make --event honour --min-stack too · 7ad35615
      Arnaldo Carvalho de Melo authored
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-shj0fazntmskhjild5i6x73l@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      7ad35615
    • Chris Phlipot's avatar
      perf script: Fix segfault when printing callchains · e557b674
      Chris Phlipot authored
      This fixes a bug caused by an unitialized callchain cursor. The crash
      frist appeared in:
      
      6f736735 ("perf evsel: Require that callchains be resolved before
      calling fprintf_{sym,callchain}")
      
      The callchain cursor is a struct that contains pointers, that when
      uninitialized will cause unpredictable behavior (usually a crash)
      when trying to append to the callchain.
      
      The existing implementation has the following issues:
      
      1. The callchain cursor used is not initialized, resulting in
      	unpredictable behavior when used.
      2. The cursor is declared on the stack. Even if it is properly initalized,
      	the implmentation will leak memory when the function returns,
      	since all the references to the callchain_nodes allocated by
      	callchain_cursor_append will be lost when the cursor goes out of
      	scope.
      3. Storing the cursor on the stack is inefficient. Even if memory is
      	properly freed when it goes out of scope, a performance penalty
      	will be incurred due to reallocation of callchain nodes.
      	callchain_cursor_append is designed to avoid these reallocations
      	when an existing cursor is reused.
      
      This patch fixes the crash by replacing cursor_callchain with a reference
      to the global callchain_cursor which also resolves all 3 issues mentioned
      above.
      
      How to reproduce the crash:
      
        $ perf record --call-graph=dwarf stress -t 1 -c 1
        $ perf script > /dev/null
        Segfault
      Signed-off-by: default avatarChris Phlipot <cphlipot0@gmail.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Fixes: 6f736735 ("perf evsel: Require that callchains be resolved before calling fprintf_{sym,callchain}")
      Link: http://lkml.kernel.org/r/1461119531-2529-1-git-send-email-cphlipot0@gmail.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      e557b674
    • Arnaldo Carvalho de Melo's avatar
      perf trace: Make --pf maj/min/all use callchains too · 0c3a6ef4
      Arnaldo Carvalho de Melo authored
      Forgot about page faults, a software event, when adding support for callchains,
      fix it:
      
        # trace --no-syscalls --pf maj --call dwarf
           0.000 ( 0.000 ms): Xorg/2068 majfault [sfbSegment1+0x0] => /usr/lib64/xorg/modules/drivers/intel_drv.so@0x11b490 (x.)
                                             sfbSegment1+0x0 (/usr/lib64/xorg/modules/drivers/intel_drv.so)
                                             fbPolySegment32+0x361 (/usr/lib64/xorg/modules/drivers/intel_drv.so)
                                             sna_poly_segment+0x743 (/usr/lib64/xorg/modules/drivers/intel_drv.so)
                                             damagePolySegment+0x77 (/usr/libexec/Xorg)
                                             ProcPolySegment+0xe7 (/usr/libexec/Xorg)
                                             Dispatch+0x25f (/usr/libexec/Xorg)
                                             dix_main+0x3c3 (/usr/libexec/Xorg)
                                             __libc_start_main+0xf0 (/usr/lib64/libc-2.22.so)
                                             _start+0x29 (/usr/libexec/Xorg)
           0.257 ( 0.000 ms): Xorg/2068 majfault [miZeroClipLine+0x0] => /usr/libexec/Xorg@0x18e830 (x.)
                                             miZeroClipLine+0x0 (/usr/libexec/Xorg)
                                             _fbSegment+0x2c0 (/usr/lib64/xorg/modules/drivers/intel_drv.so)
                                             sfbSegment1+0x67 (/usr/lib64/xorg/modules/drivers/intel_drv.so)
                                             fbPolySegment32+0x361 (/usr/lib64/xorg/modules/drivers/intel_drv.so)
                                             sna_poly_segment+0x743 (/usr/lib64/xorg/modules/drivers/intel_drv.so)
                                             damagePolySegment+0x77 (/usr/libexec/Xorg)
                                             ProcPolySegment+0xe7 (/usr/libexec/Xorg)
                                             Dispatch+0x25f (/usr/libexec/Xorg)
                                             dix_main+0x3c3 (/usr/libexec/Xorg)
                                             __libc_start_main+0xf0 (/usr/lib64/libc-2.22.so)
                                             _start+0x29 (/usr/libexec/Xorg)
      ^C#
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-8h6ssirw5z15qyhy2lwd6f89@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      0c3a6ef4
    • Arnaldo Carvalho de Melo's avatar
      perf trace: Extract evsel contructor from perf_evlist__add_pgfault · 0ae537cb
      Arnaldo Carvalho de Melo authored
      Prep work for next patches, where we'll need access to the created
      evsels, to possibly configure callchains.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-2pcgsgnkgellhlcao4aub8tu@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      0ae537cb
    • Andrey Ryabinin's avatar
      perf buildid: Fix off-by-one in write_buildid() · 70a2cba9
      Andrey Ryabinin authored
      write_buildid() increments 'name_len' with intention to take into
      account trailing zero byte. However, 'name_len' was already incremented
      in machine__write_buildid_table() before.  So this leads to
      out-of-bounds read in do_write():
      
        $ ./perf record sleep 0
        [ perf record: Woken up 1 times to write data ]
        =================================================================
        ==15899==ERROR: AddressSanitizer: global-buffer-overflow on address 0x00000099fc92 at pc 0x7f1aa9c7eab5 bp 0x7fff940f84d0 sp 0x7fff940f7c78
        READ of size 19 at 0x00000099fc92 thread T0
            #0 0x7f1aa9c7eab4  (/usr/lib/gcc/x86_64-pc-linux-gnu/5.3.0/libasan.so.2+0x44ab4)
            #1 0x649c5b in do_write util/header.c:67
            #2 0x649c5b in write_padded util/header.c:82
            #3 0x57e8bc in write_buildid util/build-id.c:239
            #4 0x57e8bc in machine__write_buildid_table util/build-id.c:278
        ...
      
        0x00000099fc92 is located 0 bytes to the right of global variable '*.LC99' defined in 'util/symbol.c' (0x99fc80) of size 18
          '*.LC99' is ascii string '[kernel.kallsyms]'
        ...
      
        Shadow bytes around the buggy address:
          0x00008012bf80: f9 f9 f9 f9 00 00 00 00 00 00 03 f9 f9 f9 f9 f9
        =>0x00008012bf90: 00 00[02]f9 f9 f9 f9 f9 00 00 00 00 00 05 f9 f9
          0x00008012bfa0: f9 f9 f9 f9 00 03 f9 f9 f9 f9 f9 f9 00 00 00 00
      Signed-off-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1461053847-5633-1-git-send-email-aryabinin@virtuozzo.com
      [ Remove the off-by one at the origin, to keep len(s) == strlen(s) assumption ]
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      70a2cba9
  2. 23 Apr, 2016 11 commits
    • Ingo Molnar's avatar
      Merge tag 'perf-core-for-mingo-20160419' of... · 67d61296
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo-20160419' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      Build fixes:
      
      - Fix 'perf trace' build when DWARF unwind isn't available (Arnaldo Carvalho de Melo)
      
      - Remove x86 references from arch-neutral Build, fixing it in !x86 arches,
        reported as breaking the build for powerpc64le in linux-next (Arnaldo Carvalho de Melo)
      
      Infrastructure changes:
      
      - Do memset() variable 'st' using the correct size in the jit code (Colin Ian King)
      
      - Fix postgresql ubuntu 'perf script' install instructions (Chris Phlipot)
      
      - Use callchain_param more thoroughly when checking how callchains were
        configured, eventually will be the only way to look for callchain parameters
        (Arnaldo Carvalho de Melo)
      
      - Fix some issues in the 'perf test kallsyms' entry (Arnaldo Carvalho de Melo)
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      67d61296
    • Peter Zijlstra's avatar
      x86/perf/rapl: Add missing Broadwell model · 31b84310
      Peter Zijlstra authored
      With the array aligned as per events/intel/core.c it was fairly
      obvious we missed one, add it in.
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      31b84310
    • Peter Zijlstra's avatar
      x86/perf/rapl: Reorder model numbers · c416e5aa
      Peter Zijlstra authored
      Re-order the model array to match the order in events/intel/core.c,
      to easier spot gaps and such.
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      c416e5aa
    • Srinivas Pandruvada's avatar
      perf/x86/intel/rapl: Support Skylake RAPL domains · dcee75b3
      Srinivas Pandruvada authored
      Add Skylake client support for RAPL domains. In addition to RAPL domains
      in Broadwell clients, it has support for platform domain (aka PSys). The
      PSys domain controls the entire SoC instead of just a CPU package. Unlike
      package domain, PSys support requires more than just processor level
      implementation. The other parts in the system need additional HW level
      signaling, which OEMs need to support. When not supported, the energy
      counter register in PSys domain returns 0.
      
      Also corrected error in comment for GPU counter, which previously was
      DRAM counter.
      
      Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com
      [ Cnverted to model_match stuff. ]
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: bp@alien8.de
      Cc: hpa@zytor.com
      Cc: jacob.jun.pan@linux.intel.com
      Cc: rjw@rjwysocki.net
      Link: http://lkml.kernel.org/r/1460930581-29748-2-git-send-email-srinivas.pandruvada@linux.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      dcee75b3
    • Wang Nan's avatar
      perf/core: Add ::write_backward attribute to perf event · 9ecda41a
      Wang Nan authored
      This patch introduces 'write_backward' bit to perf_event_attr, which
      controls the direction of a ring buffer. After set, the corresponding
      ring buffer is written from end to beginning. This feature is design to
      support reading from overwritable ring buffer.
      
      Ring buffer can be created by mapping a perf event fd. Kernel puts event
      records into ring buffer, user tooling like perf fetch them from
      address returned by mmap(). To prevent racing between kernel and tooling,
      they communicate to each other through 'head' and 'tail' pointers.
      Kernel maintains 'head' pointer, points it to the next free area (tail
      of the last record). Tooling maintains 'tail' pointer, points it to the
      tail of last consumed record (record has already been fetched). Kernel
      determines the available space in a ring buffer using these two
      pointers to avoid overwrite unfetched records.
      
      By mapping without 'PROT_WRITE', an overwritable ring buffer is created.
      Different from normal ring buffer, tooling is unable to maintain 'tail'
      pointer because writing is forbidden. Therefore, for this type of ring
      buffers, kernel overwrite old records unconditionally, works like flight
      recorder. This feature would be useful if reading from overwritable ring
      buffer were as easy as reading from normal ring buffer. However,
      there's an obscure problem.
      
      The following figure demonstrates a full overwritable ring buffer. In
      this figure, the 'head' pointer points to the end of last record, and a
      long record 'E' is pending. For a normal ring buffer, a 'tail' pointer
      would have pointed to position (X), so kernel knows there's no more
      space in the ring buffer. However, for an overwritable ring buffer,
      kernel ignore the 'tail' pointer.
      
         (X)                              head
          .                                |
          .                                V
          +------+-------+----------+------+---+
          |A....A|B.....B|C........C|D....D|   |
          +------+-------+----------+------+---+
      
      Record 'A' is overwritten by event 'E':
      
            head
             |
             V
          +--+---+-------+----------+------+---+
          |.E|..A|B.....B|C........C|D....D|E..|
          +--+---+-------+----------+------+---+
      
      Now tooling decides to read from this ring buffer. However, none of these
      two natural positions, 'head' and the start of this ring buffer, are
      pointing to the head of a record. Even the full ring buffer can be
      accessed by tooling, it is unable to find a position to start decoding.
      
      The first attempt tries to solve this problem AFAIK can be found from
      [1]. It makes kernel to maintain 'tail' pointer: updates it when ring
      buffer is half full. However, this approach introduces overhead to
      fast path. Test result shows a 1% overhead [2]. In addition, this method
      utilizes no more tham 50% records.
      
      Another attempt can be found from [3], which allows putting the size of
      an event at the end of each record. This approach allows tooling to find
      records in a backward manner from 'head' pointer by reading size of a
      record from its tail. However, because of alignment requirement, it
      needs 8 bytes to record the size of a record, which is a huge waste. Its
      performance is also not good, because more data need to be written.
      This approach also introduces some extra branch instructions to fast
      path.
      
      'write_backward' is a better solution to this problem.
      
      Following figure demonstrates the state of the overwritable ring buffer
      when 'write_backward' is set before overwriting:
      
             head
              |
              V
          +---+------+----------+-------+------+
          |   |D....D|C........C|B.....B|A....A|
          +---+------+----------+-------+------+
      
      and after overwriting:
                                           head
                                            |
                                            V
          +---+------+----------+-------+---+--+
          |..E|D....D|C........C|B.....B|A..|E.|
          +---+------+----------+-------+---+--+
      
      In each situation, 'head' points to the beginning of the newest record.
      From this record, tooling can iterate over the full ring buffer and fetch
      records one by one.
      
      The only limitation that needs to be considered is back-to-back reading.
      Due to the non-deterministic of user programs, it is impossible to ensure
      the ring buffer keeps stable during reading. Consider an extreme situation:
      tooling is scheduled out after reading record 'D', then a burst of events
      come, eat up the whole ring buffer (one or multiple rounds). When the
      tooling process comes back, reading after 'D' is incorrect now.
      
      To prevent this problem, we need to find a way to ensure the ring buffer
      is stable during reading. ioctl(PERF_EVENT_IOC_PAUSE_OUTPUT) is
      suggested because its overhead is lower than
      ioctl(PERF_EVENT_IOC_ENABLE).
      
      By carefully verifying 'header' pointer, reader can avoid pausing the
      ring-buffer. For example:
      
          /* A union of all possible events */
          union perf_event event;
      
          p = head = perf_mmap__read_head();
          while (true) {
              /* copy header of next event */
              fetch(&event.header, p, sizeof(event.header));
      
              /* read 'head' pointer */
              head = perf_mmap__read_head();
      
              /* check overwritten: is the header good? */
              if (!verify(sizeof(event.header), p, head))
                  break;
      
              /* copy the whole event */
              fetch(&event, p, event.header.size);
      
              /* read 'head' pointer again */
              head = perf_mmap__read_head();
      
              /* is the whole event good? */
              if (!verify(event.header.size, p, head))
                  break;
              p += event.header.size;
          }
      
      However, the overhead is high because:
      
       a) In-place decoding is not safe.
          Copying-verifying-decoding is required.
       b) Fetching 'head' pointer requires additional synchronization.
      
      (From Alexei Starovoitov:
      
      Even when this trick works, pause is needed for more than stability of
      reading. When we collect the events into overwrite buffer we're waiting
      for some other trigger (like all cpu utilization spike or just one cpu
      running and all others are idle) and when it happens the buffer has
      valuable info from the past. At this point new events are no longer
      interesting and buffer should be paused, events read and unpaused until
      next trigger comes.)
      
      This patch utilizes event's default overflow_handler introduced
      previously. perf_event_output_backward() is created as the default
      overflow handler for backward ring buffers. To avoid extra overhead to
      fast path, original perf_event_output() becomes __perf_event_output()
      and marked '__always_inline'. In theory, there's no extra overhead
      introduced to fast path.
      
      Performance testing:
      
      Calling 3000000 times of 'close(-1)', use gettimeofday() to check
      duration.  Use 'perf record -o /dev/null -e raw_syscalls:*' to capture
      system calls. In ns.
      
      Testing environment:
      
        CPU    : Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
        Kernel : v4.5.0
                          MEAN         STDVAR
       BASE            800214.950    2853.083
       PRE1           2253846.700    9997.014
       PRE2           2257495.540    8516.293
       POST           2250896.100    8933.921
      
      Where 'BASE' is pure performance without capturing. 'PRE1' is test
      result of pure 'v4.5.0' kernel. 'PRE2' is test result before this
      patch. 'POST' is test result after this patch. See [4] for the detailed
      experimental setup.
      
      Considering the stdvar, this patch doesn't introduce performance
      overhead to the fast path.
      
       [1] http://lkml.iu.edu/hypermail/linux/kernel/1304.1/04584.html
       [2] http://lkml.iu.edu/hypermail/linux/kernel/1307.1/00535.html
       [3] http://lkml.iu.edu/hypermail/linux/kernel/1512.0/01265.html
       [4] http://lkml.kernel.org/g/56F89DCD.1040202@huawei.comSigned-off-by: default avatarWang Nan <wangnan0@huawei.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Cc: <acme@kernel.org>
      Cc: <pi3orama@163.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Zefan Li <lizefan@huawei.com>
      Link: http://lkml.kernel.org/r/1459865478-53413-1-git-send-email-wangnan0@huawei.com
      [ Fixed the changelog some more. ]
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      9ecda41a
    • Kan Liang's avatar
      perf/x86/intel: Add LBR filter support for Silvermont and Airmont CPUs · f21d5adc
      Kan Liang authored
      LBR filtering is also supported on the Silvermont and Airmont
      microarchitectures. The layout of MSR_LBR_SELECT is the same as Nehalem.
      Signed-off-by: default avatarKan Liang <kan.liang@intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1460706825-46163-1-git-send-email-kan.liang@intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      f21d5adc
    • Kan Liang's avatar
      perf/x86/intel: Add Goldmont CPU support · 8b92c3a7
      Kan Liang authored
      Add perf core PMU support for Intel Goldmont CPU cores:
      
       - The init code is based on Silvermont.
      
       - There is a new cache event list, based on the Silvermont cache event list.
      
       - Goldmont has 32 LBR entries. It also uses new LBRv6 format, which
         report the cycle information using upper 16-bit of the LBR_TO.
      
       - It's recommended to use CPU_CLK_UNHALTED.CORE_P + NPEBS for precise cycles.
      
      For details, please refer to the latest SDM058:
      
       http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3b-part-2-manual.pdfSigned-off-by: default avatarKan Liang <kan.liang@intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1460706167-45320-1-git-send-email-kan.liang@intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      8b92c3a7
    • Ingo Molnar's avatar
    • Peter Zijlstra's avatar
      perf/core: Make sysctl_perf_cpu_time_max_percent conform to documentation · b303e7c1
      Peter Zijlstra authored
      Markus reported that 0 should also disable the throttling we per
      Documentation/sysctl/kernel.txt.
      Reported-by: default avatarMarkus Trippelsdorf <markus@trippelsdorf.de>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: 91a612ee ("perf/core: Fix dynamic interrupt throttle")
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      b303e7c1
    • Srinivas Pandruvada's avatar
      perf/x86/intel/rapl: Add missing Haswell model · e1089602
      Srinivas Pandruvada authored
      Added one missing Haswell model.
      Signed-off-by: default avatarSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: bp@alien8.de
      Cc: hpa@zytor.com
      Link: http://lkml.kernel.org/r/1460907809-11897-1-git-send-email-srinivas.pandruvada@linux.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      e1089602
    • Andi Kleen's avatar
      perf/x86/intel: Add model number for Skylake Server to perf · b89c1737
      Andi Kleen authored
      Everything the same as base Skylake, just a new model number.
      Signed-off-by: default avatarAndi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1460751933-2264-1-git-send-email-andi@firstfloor.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      b89c1737
  3. 19 Apr, 2016 7 commits
    • Arnaldo Carvalho de Melo's avatar
      perf test: Add missing verbose output explaining the reason for failure · 6566feaf
      Arnaldo Carvalho de Melo authored
      One of the branches leading to an error had no debug message emitted,
      fix it, the new lines are:
      
        # perf test -v kallsyms
      <SNIP>
        0xffffffff81001000: diff name v: xen_hypercall_set_trap_table k: hypercall_page
        0xffffffff810691f0: diff name v: try_to_free_pud_page k: try_to_free_pmd_page
      <SNIP>
        0xffffffff8150bb20: diff name v: wakeup_expire_count_show.part.5 k: wakeup_active_count_show.part.7
        0xffffffff816bc7f0: diff name v: phys_switch_id_show.part.11 k: phys_port_name_show.part.12
        0xffffffff817bbb90: diff name v: __do_softirq k: __softirqentry_text_start
      <SNIP>
      
      This in turn exercises another bug, still under investigation, because those
      aliases _are_ in kallsyms, with the same name...
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: ab414dcd ("perf test: Fixup aliases checking in the 'vmlinux matches kallsyms' test")
      Link: http://lkml.kernel.org/n/tip-5fhea7a54a54gsmagu9obpr4@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      6566feaf
    • Arnaldo Carvalho de Melo's avatar
      perf test: Ignore kcore files in the "vmlinux matches kallsyms" test · 53d0fe68
      Arnaldo Carvalho de Melo authored
      Before:
      
        # perf test -v kallsyms
      <SNIP>
        Maps only in vmlinux:
         ffffffff81d5e000-ffffffff81ec3ac8 115e000 [kernel].init.text
         ffffffff81ec3ac8-ffffffffa0000000 12c3ac8 [kernel].exit.text
         ffffffffa0000000-ffffffffa000c000 0 [fjes]
         ffffffffa000c000-ffffffffa0017000 0 [video]
         ffffffffa0017000-ffffffffa001c000 0 [grace]
      <SNIP>
         ffffffffa0a7f000-ffffffffa0ba5000 0 [xfs]
         ffffffffa0ba5000-ffffffffffffffff 0 [veth]
        Maps in vmlinux with a different name in kallsyms:
        Maps only in kallsyms:
         ffff880000100000-ffff88001000b000 80000103000 [kernel.kallsyms]
         ffff88001000b000-ffff880100000000 8001000e000 [kernel.kallsyms]
         ffff880100000000-ffffc90000000000 80100003000 [kernel.kallsyms]
      <SNIP>
         ffffffffa0000000-ffffffffff600000 7fffa0003000 [kernel.kallsyms]
         ffffffffff600000-ffffffffffffffff 7fffff603000 [kernel.kallsyms]
        test child finished with -1
        ---- end ----
        vmlinux symtab matches kallsyms: FAILED!
        #
      
      After:
      
        # perf test -v 1
         1: vmlinux symtab matches kallsyms                          :
        --- start ---
        test child forked, pid 7058
        Looking at the vmlinux_path (8 entries long)
        Using /lib/modules/4.6.0-rc1+/build/vmlinux for symbols
        0xffffffff81076870: diff end addr for aesni_gcm_dec v: 0xffffffff810791f2 k: 0xffffffff81076902
        0xffffffff81079200: diff end addr for aesni_gcm_enc v: 0xffffffff8107bb03 k: 0xffffffff81079292
        0xffffffff8107e8d0: diff end addr for aesni_gcm_enc_avx_gen2 v: 0xffffffff81083e76 k: 0xffffffff8107e943
        0xffffffff81083e80: diff end addr for aesni_gcm_dec_avx_gen2 v: 0xffffffff81089611 k: 0xffffffff81083ef3
        0xffffffff81089990: diff end addr for aesni_gcm_enc_avx_gen4 v: 0xffffffff8108e7c4 k: 0xffffffff81089a03
        0xffffffff8108e7d0: diff end addr for aesni_gcm_dec_avx_gen4 v: 0xffffffff810937ef k: 0xffffffff8108e843
        Maps only in vmlinux:
         ffffffff81d5e000-ffffffff81ec3ac8 115e000 [kernel].init.text
         ffffffff81ec3ac8-ffffffffa0000000 12c3ac8 [kernel].exit.text
        Maps in vmlinux with a different name in kallsyms:
        Maps only in kallsyms:
        test child finished with -1
        ---- end ----
       vmlinux symtab matches kallsyms: FAILED!
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: 8e0cf965 ("perf symbols: Add support for reading from /proc/kcore")
      Link: http://lkml.kernel.org/n/tip-n6vrwt9t89w8k769y349govx@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      53d0fe68
    • Arnaldo Carvalho de Melo's avatar
      perf symbols: Allow loading kallsyms without considering kcore files · e02092b9
      Arnaldo Carvalho de Melo authored
      Before the support for using /proc/kcore was introduced, the kallsyms
      routines used /proc/modules and the first 'perf test' entry expected
      finding maps for each module in the system, which is not the case with
      the kcore code. Provide a way to ignore kcore files so that the test can
      have its expectations met.
      
      Improving the test to cover kcore files as well needs to be done.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-ek5urnu103dlhfk4l6pcw041@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      e02092b9
    • Arnaldo Carvalho de Melo's avatar
      perf build: Remove x86 references from arch-neutral Build · 2cc46669
      Arnaldo Carvalho de Melo authored
      It will already be dealt with generating the syscalltbl.c file in the
      x86 arch specific Build files, namely via 'archheaders'.
      
      This fixes the build on !x86 arches, as reported for powerpcle
      Reported-by: default avatarStephen Rothwell <sfr@canb.auug.org.au>
      Tested-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: 1b700c99 ("perf tools: Build syscall table .c header from kernel's syscall_64.tbl")
      Link: http://lkml.kernel.org/r/20160415212831.GT9056@kernel.org
      [ Removed the syscalltbl.o altogether, as per Jiri's suggestion ]
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      2cc46669
    • Colin Ian King's avatar
      perf jit: memset() variable 'st' using the correct size · f56ebf20
      Colin Ian King authored
      The current code is memsetting the 'struct stat' variable 'st' with the size of
      'stat' (which turns out to be 1 byte) rather than the size of variable 'sz'.
      
      Committer notes:
      
      sizeof(function) isn't valid, the result depends on the compiler used, with
      gcc, enabling pedantic warnings we get:
      
        $ cat sizeof_function.c
        #include <sys/types.h>
        #include <sys/stat.h>
        #include <unistd.h>
        #include <stdio.h>
      
        int main(void)
        {
      	  printf("sizeof(stat)=%zd, stat=%p\n", sizeof(stat), stat);
      	  return 0;
        }
        $ readelf -sW sizeof_function | grep -w stat
            49: 0000000000400630    16 FUNC    WEAK   HIDDEN    13 stat
        $ cc -pedantic sizeof_function.c   -o sizeof_function
        sizeof_function.c: In function ‘main’:
        sizeof_function.c:8:46: warning: invalid application of ‘sizeof’ to a function type [-Wpointer-arith]
          printf("sizeof(stat)=%zd, stat=%p\n", sizeof(stat), stat);
                                                    ^
        $ ./sizeof_function
        sizeof(stat)=1, stat=0x400630
        $
      
        Standard C, section 6.5.3.4:
      
        "The sizeof operator shall not be applied to an expression that has function
         type or an incomplete type, to the parenthesized name of such a type,
         or to an expression that designates a bit-field member."
      
        http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdfSigned-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Fixes: 9b07e27f ("perf inject: Add jitdump mmap injection support")
      Link: http://lkml.kernel.org/r/1461020838-9260-1-git-send-email-colin.king@canonical.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      f56ebf20
    • Chris Phlipot's avatar
      perf script: Fix postgresql ubuntu install instructions · d6632dd5
      Chris Phlipot authored
      The current instructions for setting up an Ubuntu system for using the
      export-to-postgresql.py script are incorrect.
      
      The instructions in the script have been updated to work on newer
      versions of ubuntu.
      
      -Add missing dependencies to apt-get command:
          python-pyside.qtsql, libqt4-sql-psql
      -Add '-s' option to createuser command to force the user to be a
          superuser since the command doesn't prompt as indicated in the
          current instructions.
      
      Tested on: Ubuntu 14.04, Ubuntu 16.04(beta)
      Signed-off-by: default avatarChris Phlipot <cphlipot0@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1461056164-14914-3-git-send-email-cphlipot0@gmail.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      d6632dd5
    • Ingo Molnar's avatar
      Merge tag 'perf-urgent-for-mingo-20160418' of... · a19cad6d
      Ingo Molnar authored
      Merge tag 'perf-urgent-for-mingo-20160418' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull a perf/urgent fix from Arnaldo Carvalho de Melo:
      
      - Fix segfault tracing transactions in Intel PT (Adrian Hunter)
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      a19cad6d
  4. 18 Apr, 2016 9 commits
  5. 17 Apr, 2016 4 commits