1. 30 Mar, 2016 8 commits
  2. 29 Mar, 2016 2 commits
  3. 28 Mar, 2016 1 commit
  4. 25 Mar, 2016 2 commits
  5. 24 Mar, 2016 3 commits
    • Arnaldo Carvalho de Melo's avatar
      perf bench: Fix detached tarball building due to missing 'perf bench memcpy' headers · 6a1a77ba
      Arnaldo Carvalho de Melo authored
      A change on kernel files included by the 'perf bench memcpy' code grew some new
      include deps, breaking the detached tarball build:
      
        $ make -C tools/perf build-test
        make: Entering directory '/home/acme/git/linux/tools/perf'
        - tarpkg: ./tests/perf-targz-src-pkg .
        tests/make:302: recipe for target 'tarpkg' failed
        make[1]: *** [tarpkg] Error 2
        Makefile:102: recipe for target 'build-test' failed
        make: *** [build-test] Error 2
        make: Leaving directory '/home/acme/git/linux/tools/perf'
        $ cat tools/perf/tarpkg
        ./tests/perf-targz-src-pkg .
          PERF_VERSION = 4.5.g05f5ec
          PERF_VERSION = 4.5.g05f5ec
        In file included from bench/mem-memcpy-x86-64-asm.S:9:0:
        bench/../../../arch/x86/lib/memcpy_64.S:5:29: fatal error: asm/cpufeatures.h: No such file or directory
        compilation terminated.
        mv: cannot stat ‘bench/.mem-memcpy-x86-64-asm.o.tmp’: No such file or directory
        make[5]: *** [bench/mem-memcpy-x86-64-asm.o] Error 1
        make[5]: *** Waiting for unfinished jobs....
        make[4]: *** [bench] Error 2
        make[4]: *** Waiting for unfinished jobs....
        make[3]: *** [perf-in.o] Error 2
        make[3]: *** Waiting for unfinished jobs....
        make[2]: *** [all] Error 2
        $
      
      Add arch/*/include/asm/*features.h to tools/perf/MANIFEST so that we can
      continue to use detached tarballs to build perf.
      
      Now it builds ok, doing it manually:
      
        $ make help | grep perf
          perf-tar-src-pkg    - Build perf-4.5.0.tar source tarball
          perf-targz-src-pkg  - Build perf-4.5.0.tar.gz source tarball
          perf-tarbz2-src-pkg - Build perf-4.5.0.tar.bz2 source tarball
          perf-tarxz-src-pkg  - Build perf-4.5.0.tar.xz source tarball
        $ ls -la perf-4.5.0.tar
        ls: cannot access perf-4.5.0.tar: No such file or directory
        $ make perf-tar-src-pkg
          TAR
          PERF_VERSION = 4.5.g32c25b
        $ ls -la perf-4.5.0.tar
        -rw-rw-r--. 1 acme acme 6318080 Mar 24 11:52 perf-4.5.0.tar
        $ mv perf-4.5.0.tar /tmp
        $ cd /tmp
        $ tar xf perf-4.5.0.tar
        $ cd perf-4.5.0/tools/perf
        $ make > /dev/null
        PERF_VERSION = 4.5.g32c25b
        $ ls -la perf
        -rwxrwxr-x. 1 acme acme 14046416 Mar 24 11:53 perf
        $ ./perf --version
        perf version 4.5.g32c25b
        $ perf bench
        Usage:
      	perf bench [<common options>] <collection> <benchmark> [<options>]
      
              # List of all available benchmark collections:
      
               sched: Scheduler and IPC benchmarks
                 mem: Memory access benchmarks
                numa: NUMA scheduling and MM benchmarks
               futex: Futex stressing benchmarks
                 all: All benchmarks
      
        $ perf bench mem
      
              # List of available benchmarks for collection 'mem':
      
              memcpy: Benchmark for memcpy() functions
              memset: Benchmark for memset() functions
                 all: Run all memory access benchmarks
      
        $ perf bench mem memcpy
        # Running 'mem/memcpy' benchmark:
        # function 'default' (Default memcpy() provided by glibc)
        # Copying 1MB bytes ...
      
              15.024038 GB/sec
        # function 'x86-64-unrolled' (unrolled memcpy() in arch/x86/lib/memcpy_64.S)
        # Copying 1MB bytes ...
      
              17.438616 GB/sec
        # function 'x86-64-movsq' (movsq-based memcpy() in arch/x86/lib/memcpy_64.S)
        # Copying 1MB bytes ...
      
              25.040064 GB/sec
        # function 'x86-64-movsb' (movsb-based memcpy() in arch/x86/lib/memcpy_64.S)
        # Copying 1MB bytes ...
      
              25.040064 GB/sec
        $
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-2c2sncwffuabw58fj1pw86gu@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      6a1a77ba
    • Arnaldo Carvalho de Melo's avatar
      perf tests: Fix tarpkg build test error output redirection · cde88355
      Arnaldo Carvalho de Melo authored
      So we need to trow away just stdout, leaving stderr to be caught by
      the build tests infrastructure, so that we can see what went wrong
      when the tarpkg build test fails:
      
        $ make -C tools/perf build-test
        make: Entering directory '/home/acme/git/linux/tools/perf'
        - tarpkg: ./tests/perf-targz-src-pkg .
        tests/make:302: recipe for target 'tarpkg' failed
        make[1]: *** [tarpkg] Error 2
        Makefile:102: recipe for target 'build-test' failed
        make: *** [build-test] Error 2
        make: Leaving directory '/home/acme/git/linux/tools/perf'
        $ cat tools/perf/tarpkg
        ./tests/perf-targz-src-pkg .
          PERF_VERSION = 4.5.g05f5ec
          PERF_VERSION = 4.5.g05f5ec
        In file included from bench/mem-memcpy-x86-64-asm.S:9:0:
        bench/../../../arch/x86/lib/memcpy_64.S:5:29: fatal error: asm/cpufeatures.h: No such file or directory
        compilation terminated.
        mv: cannot stat ‘bench/.mem-memcpy-x86-64-asm.o.tmp’: No such file or directory
        make[5]: *** [bench/mem-memcpy-x86-64-asm.o] Error 1
        make[5]: *** Waiting for unfinished jobs....
        make[4]: *** [bench] Error 2
        make[4]: *** Waiting for unfinished jobs....
        make[3]: *** [perf-in.o] Error 2
        make[3]: *** Waiting for unfinished jobs....
        make[2]: *** [all] Error 2
        $
      
      So the test flow is:
      
      1. Run: 'make -C tools/perf build-test'
      
      2. One of its tests failed, in this case, the 'tarpkg' one
      
      3. Look at what went wrong, by looking at the output of that test, in
         tools/perf/tarpkg
      
      Admittedly, this should be shortcircuited to showing what went wrong directly
      from the 'make build-test' step, but lets first fix this tarpkg one and the
      problem it spotted, which should be fixed by adding some extra file to the
      tools/perf/MANIFEST so that detached tarballs continue being self contained and
      build successfully.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-ynld6egoxolmftcddpnd7oh6@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      cde88355
    • Ingo Molnar's avatar
      Merge tag 'perf-core-for-mingo-20160323' of... · 05f5ece7
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo-20160323' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/core improvements and fixes:
      
      User visible fixes:
      
       - Fix documentation of :ppp modifier in 'perf list' (Andi Kleen)
      
       - Fix silly nodes bitfield bits/bytes length assertion in 'perf bench numa' (Jakub Jelen)
      
       - Remove redundant CPU output in libtraceevent (Steven Rostedt)
      
       - Remove 'core_id' check in topology 'perf test' (Sukadev Bhattiprolu)
      
      Infrastructure changes/fixes:
      
       - Record text offset in dso to calculate objdump address, to use with
         modules in addition to vDSO symbol address calculations (Wang Nan)
      
       - Move utilities.mak from perf to tools/scripts/ (Arnaldo Carvalho de Melo)
      
       - Add cpumode to the perf_sample struct, this way we don't need to pass
         the union event to the machine and thread resolving routines, shortening
         function signatures and allowing the future introduction of a way
         to use tracepoint events instead of the unavailable HW cycles counter on
         powerpc guests in perf kvm by just hooking on perf_evsel__parse_sample,
         at the end (Arnaldo Carvalho de Melo)
      
       - Remove/unexport die() related infrastructure, that at some point will
         finally be removed (Arnaldo Carvalho de Melo)
      
       - Adopt linux/stringify.h from the kernel sources, not to touch this
         kernel header from tools/ (Arnaldo Carvalho de Melo)
      
       - Stop using strbuf for things we can instead trivially use libc's asprintf()
         (Arnaldo Carvalho de Melo)
      
       - Ditch tools/lib/util/abspath.c, its only exported function was used at just
         one place and can be replaced by libc's realpath() (Arnaldo Carvalho de Melo)
      
       - Use strerror_r() in the llvm infrastructure, tread safe, its what is used
         elsewhere in tools/perf/ (Arnaldo Carvalho de Melo)
      
      Cleanups:
      
       - Removed misplaced or needless __maybe_unused/export (Arnaldo Carvalho de Melo)
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      05f5ece7
  6. 23 Mar, 2016 18 commits
  7. 22 Mar, 2016 1 commit
  8. 21 Mar, 2016 5 commits
    • Jakub Jelen's avatar
      perf bench numa: Fix assertion for nodes bitfield · 3c52b658
      Jakub Jelen authored
      Comparing bits and bytes in numa benchmark assertion
      
      I hit the issue on two socket Power8 machine presenting its numa nodes
      as 0,1,16,17 (according to numactl). Therefore I got error (and hang of
      parent process):
      
          perf: bench/numa.c:296: bind_to_memnode: Assertion `!(g->p.nr_nodes > (int)sizeof(nodemask))' failed.
      
      This is obviously false positive. We can fit all the 18 nodes into
      bitfield of 8 bytes (long on 64b architecture).
      Signed-off-by: default avatarJakub Jelen <jakuje@gmail.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jakub Jelen <jjelen@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: trivial@kernel.org
      Link: http://lkml.kernel.org/r/1458388687-24421-1-git-send-email-jakuje@gmail.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      3c52b658
    • Srinivas Pandruvada's avatar
      perf/x86/intel/rapl: Add missing Broadwell models · 7b0fd569
      Srinivas Pandruvada authored
      Added Broadwell-H and Broadwell-Server.
      Signed-off-by: default avatarSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: bp@alien8.de
      Link: http://lkml.kernel.org/r/1458517938-25308-1-git-send-email-srinivas.pandruvada@linux.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      7b0fd569
    • Kan Liang's avatar
      perf/x86/intel/uncore: Remove ev_sel_ext bit support for PCU · cb225252
      Kan Liang authored
      The ev_sel_ext in PCU_MSR_PMON_CTL is locked on some CPU models, so despite
      it being documented in the SDM, if we write 1 to that bit then we can get a #GP
      fault.
      
      Which #GP the perf fuzzer happily triggered in Peter Zijlstra's testing.
      
      Also, there are no public events which use that bit, so remove ev_sel_ext
      bit support for PCU.
      Reported-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarKan Liang <kan.liang@intel.com>
      Acked-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1458500301-3594-1-git-send-email-kan.liang@intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      cb225252
    • Huang Rui's avatar
      perf/x86/amd/power: Add AMD accumulated power reporting mechanism · c7ab62bf
      Huang Rui authored
      Introduce an AMD accumlated power reporting mechanism for the Family
      15h, Model 60h processor that can be used to calculate the average
      power consumed by a processor during a measurement interval. The
      feature support is indicated by CPUID Fn8000_0007_EDX[12].
      
      This feature will be implemented both in hwmon and perf. The current
      design provides one event to report per package/processor power
      consumption by counting each compute unit power value.
      
      Here the gory details of how the computation is done:
      
      * Tsample: compute unit power accumulator sample period
      * Tref: the PTSC counter period (PTSC: performance timestamp counter)
      * N: the ratio of compute unit power accumulator sample period to the
        PTSC period
      
      * Jmax: max compute unit accumulated power which is indicated by
        MSR_C001007b[MaxCpuSwPwrAcc]
      
      * Jx/Jy: compute unit accumulated power which is indicated by
        MSR_C001007a[CpuSwPwrAcc]
      
      * Tx/Ty: the value of performance timestamp counter which is indicated
        by CU_PTSC MSR_C0010280[PTSC]
      * PwrCPUave: CPU average power
      
      i. Determine the ratio of Tsample to Tref by executing CPUID Fn8000_0007.
      	N = value of CPUID Fn8000_0007_ECX[CpuPwrSampleTimeRatio[15:0]].
      
      ii. Read the full range of the cumulative energy value from the new
          MSR MaxCpuSwPwrAcc.
      	Jmax = value returned.
      
      iii. At time x, software reads CpuSwPwrAcc and samples the PTSC.
      	Jx = value read from CpuSwPwrAcc and Tx = value read from PTSC.
      
      iv. At time y, software reads CpuSwPwrAcc and samples the PTSC.
      	Jy = value read from CpuSwPwrAcc and Ty = value read from PTSC.
      
      v. Calculate the average power consumption for a compute unit over
      time period (y-x). Unit of result is uWatt:
      
      	if (Jy < Jx) // Rollover has occurred
      		Jdelta = (Jy + Jmax) - Jx
      	else
      		Jdelta = Jy - Jx
      	PwrCPUave = N * Jdelta * 1000 / (Ty - Tx)
      
      Simple example:
      
        root@hr-zp:/home/ray/tip# ./tools/perf/perf stat -a -e 'power/power-pkg/' make -j4
          CHK     include/config/kernel.release
          CHK     include/generated/uapi/linux/version.h
          CHK     include/generated/utsrelease.h
          CHK     include/generated/timeconst.h
          CHK     include/generated/bounds.h
          CHK     include/generated/asm-offsets.h
          CALL    scripts/checksyscalls.sh
          CHK     include/generated/compile.h
          SKIPPED include/generated/compile.h
          Building modules, stage 2.
        Kernel: arch/x86/boot/bzImage is ready  (#40)
          MODPOST 4225 modules
      
         Performance counter stats for 'system wide':
      
                    183.44 mWatts power/power-pkg/
      
             341.837270111 seconds time elapsed
      
        root@hr-zp:/home/ray/tip# ./tools/perf/perf stat -a -e 'power/power-pkg/' sleep 10
      
         Performance counter stats for 'system wide':
      
                      0.18 mWatts power/power-pkg/
      
              10.012551815 seconds time elapsed
      Suggested-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Suggested-by: default avatarIngo Molnar <mingo@kernel.org>
      Suggested-by: default avatarBorislav Petkov <bp@suse.de>
      Signed-off-by: default avatarHuang Rui <ray.huang@amd.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: jacob.w.shin@gmail.com
      Link: http://lkml.kernel.org/r/1457502306-2559-1-git-send-email-ray.huang@amd.com
      [ Fixed the modular build. ]
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      c7ab62bf
    • Huang Rui's avatar
      x86/cpufeature, perf/x86: Add AMD Accumulated Power Mechanism feature flag · 01fe03ff
      Huang Rui authored
      AMD CPU family 15h model 0x60 introduces a mechanism for measuring
      accumulated power. It is used to report the processor power consumption
      and support for it is indicated by CPUID Fn8000_0007_EDX[12].
      Signed-off-by: default avatarHuang Rui <ray.huang@amd.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andreas Herrmann <herrmann.der.user@googlemail.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Hector Marco-Gisbert <hecmargi@upv.es>
      Cc: Jacob Shin <jacob.w.shin@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Kristen Carlson Accardi <kristen@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Wan Zongshun <Vincent.Wan@amd.com>
      Cc: spg_linux_kernel@amd.com
      Link: http://lkml.kernel.org/r/1452739808-11871-4-git-send-email-ray.huang@amd.com
      [ Resolved conflict and moved the synthetic CPUID slot to 19. ]
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      01fe03ff