1. 08 Jan, 2018 11 commits
    • Jin Yao's avatar
      perf report: Support time percent and multiple time ranges · 5b969bc7
      Jin Yao authored
      perf report has a --time option to limit the time range of output.  It
      only supports absolute time.
      
      Now this option is extended to support multiple time ranges and support
      the percent of time.
      
      For example:
      
      1. Select the first and second 10% time slices:
      
      perf report --time 10%/1,10%/2
      
      2. Select from 0% to 10% and 30% to 40% slices:
      
      perf report --time 0%-10%,30%-40%
      
      Changelog:
      
      v6: Fix the merge issue with latest perf/core branch.
          No functional changes.
      
      v5: Add checking of first/last sample time to detect if it's recorded
          in perf.data. If it's not recorded, returns error message to user.
      
      v4: Remove perf_time__skip_sample, only uses perf_time__ranges_skip_sample
      
      v3: Since the definitions of first_sample_time/last_sample_time
          are moved from perf_session to perf_evlist so change the
          related code.
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512738826-2628-6-git-send-email-yao.jin@linux.intel.com
      [ Add missing colons at end of examples in the man page ]
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      5b969bc7
    • Jin Yao's avatar
      perf tools: Create function to perform multiple time range checking · 9a9b8b4b
      Jin Yao authored
      Previous patch supports the multiple time range.
      
      For example, select the first and second 10% time slices.
      perf report --time 10%/1,10%/2
      
      We need a function to check if a timestamp is in the ranges of
      [0, 10%) and [10%, 20%].
      
      Note that it includes the last element in [10%, 20%] but it doesn't
      include the last element in [0, 10%). It's to avoid the overlap.
      
      This patch implments a new function perf_time__ranges_skip_sample
      for this checking.
      
      Change log:
      
      v4: Let perf_time__ranges_skip_sample be compatible with
          perf_time__skip_sample when only one time range.
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512738826-2628-5-git-send-email-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      9a9b8b4b
    • Jin Yao's avatar
      perf tools: Create function to parse time percent · 13a70f35
      Jin Yao authored
      Current perf report/script/... have a --time option to limit the time
      range of output. But right now it only supports absolute time, add
      support for time percentage.
      
      For example:
      
      1. Select the second 10% time slice
         perf report --time 10%/2
      
      2. Select from 0% to 10% time slice
         perf report --time 0%-10%
      
      It also support the multiple time ranges.
      
      3. Select the first and second 10% time slices
         perf report --time 10%/1,10%/2
      
      4. Select from 0% to 10% and 30% to 40% slices
         perf report --time 0%-10%,30%-40%
      
      Changelog:
      
      v4: An issue is found. Following passes.
          perf script --time 10%/10x12321xsdfdasfdsafdsafdsa
      
          Now it uses strtol to replace atoi.
      
      Committer notes:
      
      This just puts in place the infrastructure, so the examples in this cset
      comment will only work later, after more patches in this series are
      applied.
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512738826-2628-4-git-send-email-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      13a70f35
    • Jin Yao's avatar
      perf record: Record the first and last sample time in the header · 68588baf
      Jin Yao authored
      In the default 'perf record' configuration, all samples are processed,
      to create the HEADER_BUILD_ID table. So it's very easy to get the
      first/last samples and save the time to perf file header via the
      function write_sample_time().
      
      Later, at post processing time, perf report/script will fetch the time
      from perf file header.
      
      Committer testing:
      
        # perf record -a sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 2.099 MB perf.data (1101 samples) ]
        [root@jouet home]# perf report --header | grep "time of "
        # time of first sample : 22947.909226
        # time of last sample : 22948.910704
        #
        # perf report -D | grep PERF_RECORD_SAMPLE\(
        0 22947909226101 0x20bb68 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa21b1af3 period: 1 addr: 0
        0 22947909229928 0x20bb98 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa200d204 period: 1 addr: 0
        <SNIP>
        3 22948910397351 0x219360 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 28251/28251: 0xffffffffa22071d8 period: 169518 addr: 0
        0 22948910652380 0x20f120 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa2856816 period: 198807 addr: 0
        2 22948910704034 0x2172d0 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa2856816 period: 88111 addr: 0
        #
      
      Changelog:
      
      v7: Just update the patch description according to Arnaldo's suggestion.
      
      v6: Currently '--buildid-all' is not enabled at default. So the walking
          on all samples is the default operation. There is no big overhead
          to calculate the timestamp boundary in process_sample_event handler
          once we already go through all samples. So the timestamp boundary
          calculation is enabled by default when '--buildid-all' is not enabled.
      
          While if '--buildid-all' is enabled, we creates a new option
          "--timestamp-boundary" for user to decide if it enables the
          timestamp boundary calculation.
      
      v5: There is an issue that the sample walking can only work when
          '--buildid-all' is not enabled. So we need to let the walking
          be able to work even if '--buildid-all' is enabled and let the
          processing skips the dso hit marking for this case.
      
          At first, I want to provide a new option "--record-time-boundaries".
          While after consideration, I think a new option is not very
          necessary.
      
      v3: Remove the definitions of first_sample_time and last_sample_time
          from struct record and directly save them in perf_evlist.
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512738826-2628-3-git-send-email-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      68588baf
    • Jin Yao's avatar
      perf header: Add infrastructure to record first and last sample time · 6011518d
      Jin Yao authored
      perf report/script/... have a --time option to limit the time range of
      output. That's very useful to slice large traces, e.g. when processing
      the output of perf script for some analysis.
      
      But right now --time only supports absolute time. Also there is no fast
      way to get the start/end times of a given trace except for looking at
      it.  This makes it hard to e.g. only decode the first half of the trace,
      which is useful for parallelization of scripts
      
      Another problem is that perf records are variable size and there is no
      synchronization mechanism. So the only way to find the last sample
      reliably would be to walk all samples. But we want to avoid that in perf
      report/...  because it is already quite expensive. That is why storing
      the first sample time and last sample time in perf record is better.
      
      This patch creates a new header feature type HEADER_SAMPLE_TIME and
      related ops. Save the first sample time and the last sample time to the
      feature section in perf file header. That will be done when, for
      instance, processing build-ids, where we already have to process all
      samples to create the build-id table, take advantage of that to further
      amortize that processing by storing HEADER_SAMPLE_TIME to make 'perf
      report/script' faster when using --time.
      
      Committer testing:
      
      After this patch is applied the header is written with zeroes, we need
      the next patch, for "perf record" to actually write the timestamps:
      
        # perf report -D | grep PERF_RECORD_SAMPLE\(
        22501155244406 0x44f0 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4001): 25016/25016: 0xffffffffa21be8c5 period: 1 addr: 0
        <SNIP>
        22501155793625 0x4a30 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4001): 25016/25016: 0xffffffffa21ffd50 period: 2828043 addr: 0
        # perf report --header | grep "time of "
        # time of first sample : 0.000000
        # time of last sample : 0.000000
        #
      
      Changelog:
      
      v7: 1. Rebase to latest perf/core branch.
      
          2. Add following clarification in patch description according to
             Arnaldo's suggestion.
      
             "That will be done when, for instance, processing build-ids,
      	where we already have to process all samples to create the
      	build-id table, take advantage of that to further amortize
      	that processing by storing HEADER_SAMPLE_TIME to make
      	'perf report/script' faster when using --time."
      
      v4: Use perf script time style for timestamp printing. Also add with
          the printing of sample duration.
      
      v3: Remove the definitions of first_sample_time/last_sample_time from
          perf_session. Just define them in perf_evlist
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512738826-2628-2-git-send-email-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      6011518d
    • Jin Yao's avatar
      perf report: Fix a no annotate browser displayed issue · 40c39e30
      Jin Yao authored
      When enabling '-b' option in perf record, for example,
      
        perf record -b ...
        perf report
      
      and then browsing the annotate browser from perf report (press 'A'), it
      would fail (annotate browser can't be displayed).
      
      It's because the '.add_entry_cb' op of struct report is overwritten by
      hist_iter__branch_callback() in builtin-report.c. But this function doesn't do
      something like mapping symbols and sources. So next, do_annotate() will return
      directly.
      
              notes = symbol__annotation(act->ms.sym);
              if (!notes->src)
                      return 0;
      
      This patch adds the lost code to hist_iter__branch_callback (refer to
      hist_iter__report_callback).
      
      v2:
      
      Fix a crash bug when perform 'perf report --stdio'.
      
      The reason is that we init the symbol annotation only in browser mode, it
      doesn't allocate/init resources for stdio mode.
      
      So now in hist_iter__branch_callback(), it will return directly if it's not in
      browser mode.
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1514284963-18587-1-git-send-email-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      40c39e30
    • Jin Yao's avatar
      perf report: Fix a wrong offset issue when using /proc/kcore · 935f5a9d
      Jin Yao authored
      When a valid vmlinux is not found, 'perf report' falls back to look at
      /proc/kcore. In this case, it will report the impossible large offset.
      
      For example:
      
        # perf record -b -e cycles:k find /etc/ > /dev/null
        # perf report --stdio --branch-history
      
          22.77%  _vm_normal_page+18446603336221188162
                  |
                  ---page_remove_rmap +18446603336221188324
                     page_remove_rmap +18446603336221188487 (cycles:5)
                     unlock_page_memcg +18446603336221188096
                     page_remove_rmap +18446603336221188327 (cycles:1)
      
      The issue is the value which is passed to parameter 'addr' in
      __get_srcline() is the objdump address. It's not correct if we calculate
      the offset by using 'addr - sym->start'.
      
      This patch creates a new parameter 'ip' in __get_srcline(). It is not
      converted to objdump address.
      
      With this patch, the perf report output is:
      
          22.77%  _vm_normal_page+66
                  |
                  ---page_remove_rmap +228
                     page_remove_rmap +391 (cycles:5)
                     unlock_page_memcg +0
                     page_remove_rmap +231 (cycles:1)
                     page_remove_rmap +236
      
      Committer testing:
      
      Make sure you get any valid vmlinux out of the way, using '-v' on the
      'perf report' case and deleting it from places where perf searches them,
      like your kernel build dir and the build-id cache, in ~/.debug/.
      Reported-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1514564812-17344-1-git-send-email-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      935f5a9d
    • Wang Nan's avatar
      perf tools: Fix compile error with libunwind x86 · 44df1afd
      Wang Nan authored
      Fix a compile error:
      
       ...
         CC       util/libunwind/x86_32.o
       In file included from util/libunwind/x86_32.c:33:0:
       util/libunwind/../../arch/x86/util/unwind-libunwind.c: In function 'libunwind__x86_reg_id':
       util/libunwind/../../arch/x86/util/unwind-libunwind.c:110:11: error: 'EINVAL' undeclared (first use in this function)
          return -EINVAL;
                  ^
       util/libunwind/../../arch/x86/util/unwind-libunwind.c:110:11: note: each undeclared identifier is reported only once for each function it appears in
       mv: cannot stat 'util/libunwind/.x86_32.o.tmp': No such file or directory
       make[4]: *** [util/libunwind/x86_32.o] Error 1
       make[3]: *** [util] Error 2
       make[2]: *** [libperf-in.o] Error 2
       make[1]: *** [sub-make] Error 2
       make: *** [all] Error 2
      
      It happens when libunwind-x86 feature is detected.
      Signed-off-by: default avatarWang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20171206015040.114574-1-wangnan0@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      44df1afd
    • Arnaldo Carvalho de Melo's avatar
      perf test bpf: Hook on epoll_pwait() · e0337f4f
      Arnaldo Carvalho de Melo authored
      The 'perf test bpf' was hooking a eBPF program on the SyS_epoll_wait()
      kernel function, that was what the epoll_wait() glibc function ended up
      calling, but since at least glibc 2.26, the one that comes with, for
      instance, Fedora 27, glibc ends up calling SyS_epoll_pwait() when
      epoll_wait() is used, causing this 'perf test' entry to fail.
      
      So switch to using epoll_pwait() and hook the eBPF program to the
      SyS_epoll_pwait() kernel function to make it work on a wider range of
      glibc and kernel versions.
      Tested-by: default avatarWang Nan <wangnan0@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-zynvquy63er8s5mrgsz65pto@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      e0337f4f
    • Arnaldo Carvalho de Melo's avatar
      perf test bpf: Use designated struct field initializers · 13cb2d0f
      Arnaldo Carvalho de Melo authored
      To follow standard practice in the kernel sources, documenting the
      initialization better and helping quickly finding the value for some
      field in a struct with many entries.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-syn3hz9hz7ukxlxbx5x6hv20@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      13cb2d0f
    • Arnaldo Carvalho de Melo's avatar
      perf test bpf: Improve message about expected samples · 6703c977
      Arnaldo Carvalho de Melo authored
      When failing on one of the BPF tests we were just stating:
      
        BPF filter result incorrect
      
      Add some more info to help figuring out the problem:
      
       BPF filter result incorrect, expected 56, got 0 samples
      
      This came out while investigating this failure, first seen after
      updating the kernel to the 4.15.0-rc6 tag:
      
        [root@jouet ~]# perf test bpf
        39: BPF filter               :
        39.1: Basic BPF filtering    : FAILED!
        39.2: BPF pinning            : Skip
        39.3: BPF prologue generation: Skip
        39.4: BPF relocation checker : Skip
        [root@jouet ~]#
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-403npu7daupv6b2bmxliv5pk@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      6703c977
  2. 06 Jan, 2018 3 commits
    • Ingo Molnar's avatar
      perf/x86/msr: Clean up the code · 9128d3ed
      Ingo Molnar authored
      Recent changes made a bit of an inconsistent mess out of arch/x86/events/msr.c,
      fix it:
      
       - re-align the initialization tables to be vertically aligned and readable again
      
       - harmonize comment style in terms of punctuation, capitalization and spelling
      
       - use curly braces for multi-condition branches
      
       - remove extra newlines
      
       - simplify the code a bit
      
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: kan.liang@intel.com
      Link: http://lkml.kernel.org/r/1515169132-3980-1-git-send-email-eranian@google.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      9128d3ed
    • Stephane Eranian's avatar
      perf/x86/msr: Add support for MSR_IA32_THERM_STATUS · 9ae21dd6
      Stephane Eranian authored
      This patch adds support for the Digital Readout provided by the
      IA32_THERM_STATUS MSR (0x19C) on Intel X86 processors. The readout
      shows the number of degrees Celcius to the TCC (critical temperature)
      supported by the processor. Thus, the larger, the better.
      
      The perf_event support is provided via the msr PMU. The new
      logical event is called cpu_thermal_margin. It comes with a unit and
      snapshot files. The event shows the current temprature distance (margin).
      It is not an accumulating event. The unit is degrees C. The event
      is provided per logical CPU to make things simpler but it is the
      same for both hyper-threads sharing a physical core.
      
      $ perf stat -I 1000 -a -A -e msr/cpu_thermal_margin/
      
      This will print the temperature for all logical CPUs.
                   time CPU                counts unit events
           1.000123741 CPU0                    38 C    msr/cpu_thermal_margin/
           1.000161837 CPU1                    37 C    msr/cpu_thermal_margin/
           1.000187906 CPU2                    36 C    msr/cpu_thermal_margin/
           1.000189046 CPU3                    39 C    msr/cpu_thermal_margin/
           1.000283044 CPU4                    40 C    msr/cpu_thermal_margin/
           1.000344297 CPU5                    40 C    msr/cpu_thermal_margin/
           1.000365832 CPU6                    39 C    msr/cpu_thermal_margin/
           ...
      
      In case the temperature margin cannot be read, the reported value would be -1.
      
      Works on all processors supporting the Digital Readout (dtherm in cpuinfo)
      Signed-off-by: default avatarStephane Eranian <eranian@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: kan.liang@intel.com
      Link: http://lkml.kernel.org/r/1515169132-3980-1-git-send-email-eranian@google.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      9ae21dd6
    • Ingo Molnar's avatar
      b6815f35
  3. 05 Jan, 2018 22 commits
  4. 04 Jan, 2018 4 commits
    • Thomas Gleixner's avatar
      x86/tlb: Drop the _GPL from the cpu_tlbstate export · 1e547681
      Thomas Gleixner authored
      The recent changes for PTI touch cpu_tlbstate from various tlb_flush
      inlines. cpu_tlbstate is exported as GPL symbol, so this causes a
      regression when building out of tree drivers for certain graphics cards.
      
      Aside of that the export was wrong since it was introduced as it should
      have been EXPORT_PER_CPU_SYMBOL_GPL().
      
      Use the correct PER_CPU export and drop the _GPL to restore the previous
      state which allows users to utilize the cards they payed for.
      
      As always I'm really thrilled to make this kind of change to support the
      #friends (or however the hot hashtag of today is spelled) from that closet
      sauce graphics corp.
      
      Fixes: 1e02ce4c ("x86: Store a per-cpu shadow copy of CR4")
      Fixes: 6fd166aa ("x86/mm: Use/Fix PCID to optimize user/kernel switches")
      Reported-by: default avatarKees Cook <keescook@google.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: stable@vger.kernel.org
      1e547681
    • Peter Zijlstra's avatar
      x86/events/intel/ds: Use the proper cache flush method for mapping ds buffers · 42f3bdc5
      Peter Zijlstra authored
      Thomas reported the following warning:
      
       BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4498
       caller is native_flush_tlb_single+0x57/0xc0
       native_flush_tlb_single+0x57/0xc0
       __set_pte_vaddr+0x2d/0x40
       set_pte_vaddr+0x2f/0x40
       cea_set_pte+0x30/0x40
       ds_update_cea.constprop.4+0x4d/0x70
       reserve_ds_buffers+0x159/0x410
       x86_reserve_hardware+0x150/0x160
       x86_pmu_event_init+0x3e/0x1f0
       perf_try_init_event+0x69/0x80
       perf_event_alloc+0x652/0x740
       SyS_perf_event_open+0x3f6/0xd60
       do_syscall_64+0x5c/0x190
      
      set_pte_vaddr is used to map the ds buffers into the cpu entry area, but
      there are two problems with that:
      
       1) The resulting flush is not supposed to be called in preemptible context
      
       2) The cpu entry area is supposed to be per CPU, but the debug store
          buffers are mapped for all CPUs so these mappings need to be flushed
          globally.
      
      Add the necessary preemption protection across the mapping code and flush
      TLBs globally.
      
      Fixes: c1961a46 ("x86/events/intel/ds: Map debug buffers in cpu_entry_area")
      Reported-by: default avatarThomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at>
      Signed-off-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Tested-by: default avatarThomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20180104170712.GB3040@hirez.programming.kicks-ass.net
      42f3bdc5
    • Thomas Gleixner's avatar
      x86/kaslr: Fix the vaddr_end mess · 1dddd251
      Thomas Gleixner authored
      vaddr_end for KASLR is only documented in the KASLR code itself and is
      adjusted depending on config options. So it's not surprising that a change
      of the memory layout causes KASLR to have the wrong vaddr_end. This can map
      arbitrary stuff into other areas causing hard to understand problems.
      
      Remove the whole ifdef magic and define the start of the cpu_entry_area to
      be the end of the KASLR vaddr range.
      
      Add documentation to that effect.
      
      Fixes: 92a0f81d ("x86/cpu_entry_area: Move it out of the fixmap")
      Reported-by: default avatarBenjamin Gilbert <benjamin.gilbert@coreos.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Tested-by: default avatarBenjamin Gilbert <benjamin.gilbert@coreos.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: stable <stable@vger.kernel.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Garnier <thgarnie@google.com>,
      Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801041320360.1771@nanos
      1dddd251
    • Dave Airlie's avatar
      Merge tag 'drm-intel-fixes-2018-01-04' of... · bc6fe533
      Dave Airlie authored
      Merge tag 'drm-intel-fixes-2018-01-04' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
      
      drm/i915 fixes for v4.15-rc7
      - couple of documentation build fixes
      - serialize non-blocking modesets
      - prevent DMC from messing up GMBUS transfers
      - PSR regression fix
      
      * tag 'drm-intel-fixes-2018-01-04' of git://anongit.freedesktop.org/drm/drm-intel:
        drm/i915: Apply Display WA #1183 on skl, kbl, and cfl
        docs: fix, intel_guc_loader.c has been moved to intel_guc_fw.c
        documentation/gpu/i915: fix docs build error after file rename
        drm/i915: Put all non-blocking modesets onto an ordered wq
        drm/i915: Disable DC states around GMBUS on GLK
        drm/i915/psr: Fix register name mess up.
      bc6fe533