1. 08 May, 2019 6 commits
  2. 06 May, 2019 1 commit
  3. 05 May, 2019 5 commits
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7178fb0b
      Linus Torvalds authored
      Pull perf fixes from Ingo Molnar:
       "I'd like to apologize for this very late pull request: I was dithering
        through the week whether to send the fixes, and then yesterday Jiri's
        crash fix for a regression introduced in this cycle clearly marked
        perf/urgent as 'must merge now'.
      
        Most of the commits are tooling fixes, plus there's three kernel fixes
        via four commits:
      
          - race fix in the Intel PEBS code
      
          - fix an AUX bug and roll back a previous attempt
      
          - fix AMD family 17h generic HW cache-event perf counters
      
        The largest diffstat contribution comes from the AMD fix - a new event
        table is introduced, which is a fairly low risk change but has a large
        linecount"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/intel: Fix race in intel_pmu_disable_event()
        perf/x86/intel/pt: Remove software double buffering PMU capability
        perf/ring_buffer: Fix AUX software double buffering
        perf tools: Remove needless asm/unistd.h include fixing build in some places
        tools arch uapi: Copy missing unistd.h headers for arc, hexagon and riscv
        tools build: Add -ldl to the disassembler-four-args feature test
        perf cs-etm: Always allocate memory for cs_etm_queue::prev_packet
        perf cs-etm: Don't check cs_etm_queue::prev_packet validity
        perf report: Report OOM in status line in the GTK UI
        perf bench numa: Add define for RUSAGE_THREAD if not present
        tools lib traceevent: Change tag string for error
        perf annotate: Fix build on 32 bit for BPF annotation
        tools uapi x86: Sync vmx.h with the kernel
        perf bpf: Return value with unlocking in perf_env__find_btf()
        MAINTAINERS: Include vendor specific files under arch/*/events/*
        perf/x86/amd: Update generic hardware cache events for Family 17h
      7178fb0b
    • Linus Torvalds's avatar
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 70c9fb57
      Linus Torvalds authored
      Pull scheduler fix from Ingo Molnar:
       "Fix a kobject memory leak in the cpufreq code"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/cpufreq: Fix kobject memleak
      70c9fb57
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 13369e83
      Linus Torvalds authored
      Pull x86 fix from Ingo Molnar:
       "Disable function tracing during early SME setup to fix a boot crash on
        SME-enabled kernels running distro kernels (some of which have
        function tracing enabled)"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mm/mem_encrypt: Disable all instrumentation for early SME setup
      13369e83
    • Linus Torvalds's avatar
      Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 51987aff
      Linus Torvalds authored
      Pull vfs fixes from Al Viro:
      
       - a couple of ->i_link use-after-free fixes
      
       - regression fix for wrong errno on absent device name in mount(2)
         (this cycle stuff)
      
       - ancient UFS braino in large GID handling on Solaris UFS images (bogus
         cut'n'paste from large UID handling; wrong field checked to decide
         whether we should look at old (16bit) or new (32bit) field)
      
      * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        ufs: fix braino in ufs_get_inode_gid() for solaris UFS flavour
        Abort file_remove_privs() for non-reg. files
        [fix] get rid of checking for absent device name in vfs_get_tree()
        apparmorfs: fix use-after-free on symlink traversal
        securityfs: fix use-after-free on symlink traversal
      51987aff
    • Jiri Olsa's avatar
      perf/x86/intel: Fix race in intel_pmu_disable_event() · 6f55967a
      Jiri Olsa authored
      New race in x86_pmu_stop() was introduced by replacing the
      atomic __test_and_clear_bit() of cpuc->active_mask by separate
      test_bit() and __clear_bit() calls in the following commit:
      
        3966c3fe ("x86/perf/amd: Remove need to check "running" bit in NMI handler")
      
      The race causes panic for PEBS events with enabled callchains:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
        ...
        RIP: 0010:perf_prepare_sample+0x8c/0x530
        Call Trace:
         <NMI>
         perf_event_output_forward+0x2a/0x80
         __perf_event_overflow+0x51/0xe0
         handle_pmi_common+0x19e/0x240
         intel_pmu_handle_irq+0xad/0x170
         perf_event_nmi_handler+0x2e/0x50
         nmi_handle+0x69/0x110
         default_do_nmi+0x3e/0x100
         do_nmi+0x11a/0x180
         end_repeat_nmi+0x16/0x1a
        RIP: 0010:native_write_msr+0x6/0x20
        ...
         </NMI>
         intel_pmu_disable_event+0x98/0xf0
         x86_pmu_stop+0x6e/0xb0
         x86_pmu_del+0x46/0x140
         event_sched_out.isra.97+0x7e/0x160
        ...
      
      The event is configured to make samples from PEBS drain code,
      but when it's disabled, we'll go through NMI path instead,
      where data->callchain will not get allocated and we'll crash:
      
                x86_pmu_stop
                  test_bit(hwc->idx, cpuc->active_mask)
                  intel_pmu_disable_event(event)
                  {
                    ...
                    intel_pmu_pebs_disable(event);
                    ...
      
      EVENT OVERFLOW ->  <NMI>
                           intel_pmu_handle_irq
                             handle_pmi_common
         TEST PASSES ->        test_bit(bit, cpuc->active_mask))
                                 perf_event_overflow
                                   perf_prepare_sample
                                   {
                                     ...
                                     if (!(sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY))
                                           data->callchain = perf_callchain(event, regs);
      
               CRASH ->              size += data->callchain->nr;
                                   }
                         </NMI>
                    ...
                    x86_pmu_disable_event(event)
                  }
      
                  __clear_bit(hwc->idx, cpuc->active_mask);
      
      Fixing this by disabling the event itself before setting
      off the PEBS bit.
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Arcari <darcari@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Lendacky Thomas <Thomas.Lendacky@amd.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: 3966c3fe ("x86/perf/amd: Remove need to check "running" bit in NMI handler")
      Link: http://lkml.kernel.org/r/20190504151556.31031-1-jolsa@kernel.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      6f55967a
  4. 04 May, 2019 1 commit
  5. 03 May, 2019 8 commits
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · aa1be08f
      Linus Torvalds authored
      Pull KVM fixes from Paolo Bonzini:
      
       - PPC and ARM bugfixes from submaintainers
      
       - Fix old Windows versions on AMD (recent regression)
      
       - Fix old Linux versions on processors without EPT
      
       - Fixes for LAPIC timer optimizations
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits)
        KVM: nVMX: Fix size checks in vmx_set_nested_state
        KVM: selftests: make hyperv_cpuid test pass on AMD
        KVM: lapic: Check for in-kernel LAPIC before deferencing apic pointer
        KVM: fix KVM_CLEAR_DIRTY_LOG for memory slots of unaligned size
        x86/kvm/mmu: reset MMU context when 32-bit guest switches PAE
        KVM: x86: Whitelist port 0x7e for pre-incrementing %rip
        Documentation: kvm: fix dirty log ioctl arch lists
        KVM: VMX: Move RSB stuffing to before the first RET after VM-Exit
        KVM: arm/arm64: Don't emulate virtual timers on userspace ioctls
        kvm: arm: Skip stage2 huge mappings for unaligned ipa backed by THP
        KVM: arm/arm64: Ensure vcpu target is unset on reset failure
        KVM: lapic: Convert guest TSC to host time domain if necessary
        KVM: lapic: Allow user to disable adaptive tuning of timer advancement
        KVM: lapic: Track lapic timer advance per vCPU
        KVM: lapic: Disable timer advancement if adaptive tuning goes haywire
        x86: kvm: hyper-v: deal with buggy TLB flush requests from WS2012
        KVM: x86: Consider LAPIC TSC-Deadline timer expired if deadline too short
        KVM: PPC: Book3S: Protect memslots while validating user address
        KVM: PPC: Book3S HV: Perserve PSSCR FAKE_SUSPEND bit on guest exit
        KVM: arm/arm64: vgic-v3: Retire pending interrupts on disabling LPIs
        ...
      aa1be08f
    • Linus Torvalds's avatar
      Merge branch 'i2c/for-current-fixed' of... · 82463436
      Linus Torvalds authored
      Merge branch 'i2c/for-current-fixed' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/wsa/linux
      
      Pull i2c fixes from Wolfram Sang:
       "I2C driver bugfixes and a MAINTAINERS update for you"
      
      * 'i2c/for-current-fixed' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: Prevent runtime suspend of adapter when Host Notify is required
        i2c: synquacer: fix enumeration of slave devices
        MAINTAINERS: friendly takeover of i2c-gpio driver
        i2c: designware: ratelimit 'transfer when suspended' errors
        i2c: imx: correct the method of getting private data in notifier_call
      82463436
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-2019-05-03' of git://anongit.freedesktop.org/drm/drm · a4ccb5f9
      Linus Torvalds authored
      Pull drm fix from Dave Airlie:
       "Just a single qxl revert"
      
      * tag 'drm-fixes-2019-05-03' of git://anongit.freedesktop.org/drm/drm:
        Revert "drm/qxl: drop prime import/export callbacks"
      a4ccb5f9
    • Linus Torvalds's avatar
      Merge tag 'clk-fixes-for-linus' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/clk/linux · 8f76216c
      Linus Torvalds authored
      Pull clk fixes from Stephen Boyd:
       "Two fixes for the NKMP clks on Allwinner SoCs, a locking fix for
        clkdev where we forgot to hold a lock while iterating a list that can
        change, and finally a build fix that adds some stubs for clk APIs that
        are used by devfreq drivers on platforms without the clk APIs"
      
      * tag 'clk-fixes-for-linus' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: Add missing stubs for a few functions
        clkdev: Hold clocks_mutex while iterating clocks list
        clk: sunxi-ng: nkmp: Explain why zero width check is needed
        clk: sunxi-ng: nkmp: Avoid GENMASK(-1, 0)
      8f76216c
    • Linus Torvalds's avatar
      Merge tag 'sound-5.1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 46572f78
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "A few stable fixes at this round.
      
        The USB Line6 audio fixes are a bit large, but they are rather trivial
        and pretty much device-specific, so should be safe to apply at this
        late stage. Ditto for other HD-audio quirks"
      
      * tag 'sound-5.1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda/realtek - Apply the fixup for ASUS Q325UAR
        ALSA: line6: use dynamic buffers
        ALSA: hda/realtek - Fixed Dell AIO speaker noise
        ALSA: hda/realtek - Add new Dell platform for headset mode
      46572f78
    • Alexander Shishkin's avatar
      perf/x86/intel/pt: Remove software double buffering PMU capability · 72e830f6
      Alexander Shishkin authored
      Now that all AUX allocations are high-order by default, the software
      double buffering PMU capability doesn't make sense any more, get rid
      of it. In case some PMUs choose to opt out, we can re-introduce it.
      Signed-off-by: default avatarAlexander Shishkin <alexander.shishkin@linux.intel.com>
      Acked-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: adrian.hunter@intel.com
      Link: http://lkml.kernel.org/r/20190503085536.24119-3-alexander.shishkin@linux.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      72e830f6
    • Alexander Shishkin's avatar
      perf/ring_buffer: Fix AUX software double buffering · 26ae4f44
      Alexander Shishkin authored
      This recent commit:
      
        5768402f ("perf/ring_buffer: Use high order allocations for AUX buffers optimistically")
      
      overlooked the fact that the previous one page granularity of the AUX buffer
      provided an implicit double buffering capability to the PMU driver, which
      went away when the entire buffer became one high-order page.
      
      Always make the full-trace mode AUX allocation at least two-part to preserve
      the previous behavior and allow the implicit double buffering to continue.
      Reported-by: default avatarAmmy Yi <ammy.yi@intel.com>
      Signed-off-by: default avatarAlexander Shishkin <alexander.shishkin@linux.intel.com>
      Acked-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: adrian.hunter@intel.com
      Fixes: 5768402f ("perf/ring_buffer: Use high order allocations for AUX buffers optimistically")
      Link: http://lkml.kernel.org/r/20190503085536.24119-2-alexander.shishkin@linux.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      26ae4f44
    • Ingo Molnar's avatar
      Merge tag 'perf-urgent-for-mingo-5.1-20190502' of... · 221856b1
      Ingo Molnar authored
      Merge tag 'perf-urgent-for-mingo-5.1-20190502' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
      
      tools UAPI:
      
        Arnaldo Carvalho de Melo:
      
        - Sync x86's vmx.h with the kernel.
      
        - Copy missing unistd.h headers for arc, hexagon and riscv, fixing
          a reported build regression on the ARC 32-bit architecture.
      
      perf bench numa:
      
        Arnaldo Carvalho de Melo:
      
        - Add define for RUSAGE_THREAD if not present, fixing the build on the
          ARC architecture when only zlib and libnuma are present.
      
      perf BPF:
      
        Arnaldo Carvalho de Melo:
      
        - The disassembler-four-args feature test needs -ldl on distros such as
          Mageia 7.
      
        Bo YU:
      
        - Fix unlocking on success in perf_env__find_btf(), detected with
          the coverity tool.
      
      libtraceevent:
      
        Leo Yan:
      
        - Change misleading hard coded 'trace-cmd' string in error messages.
      
      ARM hardware tracing:
      
        Leo Yan:
      
        - Always allocate memory for cs_etm_queue::prev_packet, fixing a segfault
          when processing CoreSight perf data.
      
      perf annotate:
      
        Thadeu Lima de Souza Cascardo:
      
        - Fix build on 32 bit for BPF.
      
      perf report:
      
        Thomas Richter:
      
        - Report OOM in status line in the GTK UI.
      
      core libs:
      
        - Remove needless asm/unistd.h that, used with sys/syscall.h ended
          up redefining the syscalls defines in environments such as the
          ARC arch when using uClibc.
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      221856b1
  6. 02 May, 2019 19 commits
    • Dave Airlie's avatar
      Merge tag 'drm-misc-fixes-2019-05-02' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes · 1daa0449
      Dave Airlie authored
      - One revert for QXL for a DRI3 breakage
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      
      From: Maxime Ripard <maxime.ripard@bootlin.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190502122529.hguztj3kncaixe3d@flea
      1daa0449
    • Arnaldo Carvalho de Melo's avatar
      perf tools: Remove needless asm/unistd.h include fixing build in some places · 7e221b81
      Arnaldo Carvalho de Melo authored
      We were including sys/syscall.h and asm/unistd.h, since sys/syscall.h
      includes asm/unistd.h, sometimes this leads to the redefinition of
      defines, breaking the build.
      
      Noticed on ARC with uCLibc.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
      Link: https://lkml.kernel.org/n/tip-xjpf80o64i2ko74aj2jih0qg@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      7e221b81
    • Arnaldo Carvalho de Melo's avatar
      tools arch uapi: Copy missing unistd.h headers for arc, hexagon and riscv · 18f90d37
      Arnaldo Carvalho de Melo authored
      Since those were introduced in:
      
        c8ce48f0 ("asm-generic: Make time32 syscall numbers optional")
      
      But when the asm-generic/unistd.h was sync'ed with tools/ in:
      
        1a787fc5 ("tools headers uapi: Sync copy of asm-generic/unistd.h with the kernel sources")
      
      I forgot to copy the files for the architectures that define
      __ARCH_WANT_TIME32_SYSCALLS, so the perf build was breaking there, as
      reported by Vineet Gupta for the ARC architecture.
      
      After updating my ARC container to use the glibc based toolchain + cross
      building libnuma, zlib and elfutils, I finally managed to reproduce the
      problem and verify that this now is fixed and will not regress as will
      be tested before each pull req sent upstream.
      Reported-by: default avatarVineet Gupta <Vineet.Gupta1@synopsys.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Jiri Olsa <jolsa@kernel.org>
      CC: linux-snps-arc@lists.infradead.org
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/r/20190426193531.GC28586@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      18f90d37
    • Arnaldo Carvalho de Melo's avatar
      tools build: Add -ldl to the disassembler-four-args feature test · c638417e
      Arnaldo Carvalho de Melo authored
      Thomas Backlund reported that the perf build was failing on the Mageia 7
      distro, that is because it uses:
      
        cat /tmp/build/perf/feature/test-disassembler-four-args.make.output
        /usr/bin/ld: /usr/lib64/libbfd.a(plugin.o): in function `try_load_plugin':
        /home/iurt/rpmbuild/BUILD/binutils-2.32/objs/bfd/../../bfd/plugin.c:243:
        undefined reference to `dlopen'
        /usr/bin/ld:
        /home/iurt/rpmbuild/BUILD/binutils-2.32/objs/bfd/../../bfd/plugin.c:271:
        undefined reference to `dlsym'
        /usr/bin/ld:
        /home/iurt/rpmbuild/BUILD/binutils-2.32/objs/bfd/../../bfd/plugin.c:256:
        undefined reference to `dlclose'
        /usr/bin/ld:
        /home/iurt/rpmbuild/BUILD/binutils-2.32/objs/bfd/../../bfd/plugin.c:246:
        undefined reference to `dlerror'
        as we allow dynamic linking and loading
      
      Mageia 7 uses these linker flags:
        $ rpm --eval %ldflags
          -Wl,--as-needed -Wl,--no-undefined -Wl,-z,relro -Wl,-O1 -Wl,--build-id -Wl,--enable-new-dtags
      
      So add -ldl to this feature LDFLAGS.
      Reported-by: default avatarThomas Backlund <tmb@mageia.org>
      Tested-by: default avatarThomas Backlund <tmb@mageia.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Link: https://lkml.kernel.org/r/20190501173158.GC21436@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      c638417e
    • Leo Yan's avatar
      perf cs-etm: Always allocate memory for cs_etm_queue::prev_packet · 35bb59c1
      Leo Yan authored
      Robert Walker reported a segmentation fault is observed when process
      CoreSight trace data; this issue can be easily reproduced by the command
      'perf report --itrace=i1000i' for decoding tracing data.
      
      If neither the 'b' flag (synthesize branches events) nor 'l' flag
      (synthesize last branch entries) are specified to option '--itrace',
      cs_etm_queue::prev_packet will not been initialised.  After merging the
      code to support exception packets and sample flags, there introduced a
      number of uses of cs_etm_queue::prev_packet without checking whether it
      is valid, for these cases any accessing to uninitialised prev_packet
      will cause crash.
      
      As cs_etm_queue::prev_packet is used more widely now and it's already
      hard to follow which functions have been called in a context where the
      validity of cs_etm_queue::prev_packet has been checked, this patch
      always allocates memory for cs_etm_queue::prev_packet.
      Reported-by: default avatarRobert Walker <robert.walker@arm.com>
      Suggested-by: default avatarRobert Walker <robert.walker@arm.com>
      Signed-off-by: default avatarLeo Yan <leo.yan@linaro.org>
      Tested-by: default avatarRobert Walker <robert.walker@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Fixes: 7100b12c ("perf cs-etm: Generate branch sample for exception packet")
      Fixes: 24fff5eb ("perf cs-etm: Avoid stale branch samples when flush packet")
      Link: http://lkml.kernel.org/r/20190428083228.20246-1-leo.yan@linaro.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      35bb59c1
    • Leo Yan's avatar
      perf cs-etm: Don't check cs_etm_queue::prev_packet validity · cf0c37b6
      Leo Yan authored
      Since cs_etm_queue::prev_packet is allocated for all cases, it will
      never be NULL pointer; now validity checking prev_packet is pointless,
      remove all of them.
      Signed-off-by: default avatarLeo Yan <leo.yan@linaro.org>
      Tested-by: default avatarRobert Walker <robert.walker@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/20190428083228.20246-2-leo.yan@linaro.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      cf0c37b6
    • Thomas Richter's avatar
      perf report: Report OOM in status line in the GTK UI · 167e418f
      Thomas Richter authored
      An -ENOMEM error is not reported in the GTK GUI.  Instead this error
      message pops up on the screen:
      
      [root@m35lp76 perf]# ./perf  report -i perf.data.error68-1
      
      	Processing events... [974K/3M]
      	Error:failed to process sample
      
      	0xf4198 [0x8]: failed to process type: 68
      
      However when I use the same perf.data file with --stdio it works:
      
      [root@m35lp76 perf]# ./perf  report -i perf.data.error68-1 --stdio \
      		| head -12
      
        # Total Lost Samples: 0
        #
        # Samples: 76K of event 'cycles'
        # Event count (approx.): 99056160000
        #
        # Overhead  Command          Shared Object      Symbol
        # ........  ...............  .................  .........
        #
           8.81%  find             [kernel.kallsyms]  [k] ftrace_likely_update
           8.74%  swapper          [kernel.kallsyms]  [k] ftrace_likely_update
           8.34%  sshd             [kernel.kallsyms]  [k] ftrace_likely_update
           2.19%  kworker/u512:1-  [kernel.kallsyms]  [k] ftrace_likely_update
      
      The sample precentage is a bit low.....
      
      The GUI always fails in the FINISHED_ROUND event (68) and does not
      indicate the reason why.
      
      When happened is the following. Perf report calls a lot of functions and
      down deep when a FINISHED_ROUND event is processed, these functions are
      called:
      
        perf_session__process_event()
        + perf_session__process_user_event()
          + process_finished_round()
            + ordered_events__flush()
              + __ordered_events__flush()
      	  + do_flush()
      	    + ordered_events__deliver_event()
      	      + perf_session__deliver_event()
      	        + machine__deliver_event()
      	          + perf_evlist__deliver_event()
      	            + process_sample_event()
      	              + hist_entry_iter_add() --> only called in GUI case!!!
      	                + hist_iter__report__callback()
      	                  + symbol__inc_addr_sample()
      
      	                    Now this functions runs out of memory and
      			    returns -ENOMEM. This is reported all the way up
      			    until function
      
      perf_session__process_event() returns to its caller, where -ENOMEM is
      changed to -EINVAL and processing stops:
      
       if ((skip = perf_session__process_event(session, event, head)) < 0) {
            pr_err("%#" PRIx64 " [%#x]: failed to process type: %d\n",
      	     head, event->header.size, event->header.type);
            err = -EINVAL;
            goto out_err;
       }
      
      This occurred in the FINISHED_ROUND event when it has to process some
      10000 entries and ran out of memory.
      
      This patch indicates the root cause and displays it in the status line
      of ther perf report GUI.
      
      Output before (on GUI status line):
      
        0xf4198 [0x8]: failed to process type: 68
      
      Output after:
      
        0xf4198 [0x8]: failed to process type: 68 [not enough memory]
      
      Committer notes:
      
      the 'skip' variable needs to be initialized to -EINVAL, so that when the
      size is less than sizeof(struct perf_event_attr) we avoid this valid
      compiler warning:
      
        util/session.c: In function ‘perf_session__process_events’:
        util/session.c:1936:7: error: ‘skip’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
           err = skip;
           ~~~~^~~~~~
        util/session.c:1874:6: note: ‘skip’ was declared here
          s64 skip;
              ^~~~
        cc1: all warnings being treated as errors
      Signed-off-by: default avatarThomas Richter <tmricht@linux.ibm.com>
      Reviewed-by: default avatarHendrik Brueckner <brueckner@linux.ibm.com>
      Reviewed-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20190423105303.61683-1-tmricht@linux.ibm.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      167e418f
    • Arnaldo Carvalho de Melo's avatar
      perf bench numa: Add define for RUSAGE_THREAD if not present · bf561d3c
      Arnaldo Carvalho de Melo authored
      While cross building perf to the ARC architecture on a fedora 30 host,
      we were failing with:
      
            CC       /tmp/build/perf/bench/numa.o
        bench/numa.c: In function ‘worker_thread’:
        bench/numa.c:1261:12: error: ‘RUSAGE_THREAD’ undeclared (first use in this function); did you mean ‘SIGEV_THREAD’?
          getrusage(RUSAGE_THREAD, &rusage);
                    ^~~~~~~~~~~~~
                    SIGEV_THREAD
        bench/numa.c:1261:12: note: each undeclared identifier is reported only once for each function it appears in
      
      [perfbuilder@60d5802468f6 perf]$ /arc_gnu_2019.03-rc1_prebuilt_uclibc_le_archs_linux_install/bin/arc-linux-gcc --version | head -1
      arc-linux-gcc (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1) 8.3.1 20190225
      [perfbuilder@60d5802468f6 perf]$
      
      Trying to reproduce a report by Vineet, I noticed that, with just
      cross-built zlib and numactl libraries, I ended up with the above
      failure.
      
      So, since RUSAGE_THREAD is available as a define, check for that and
      numactl libraries, I ended up with the above failure.
      
      So, since RUSAGE_THREAD is available as a define in the system headers,
      check if it is defined in the 'perf bench numa' sources and define it if
      not.
      
      Now it builds and I have to figure out if the problem reported by Vineet
      only takes place if we have libelf or some other library available.
      
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: linux-snps-arc@lists.infradead.org
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
      Link: https://lkml.kernel.org/n/tip-2wb4r1gir9xrevbpq7qp0amk@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      bf561d3c
    • Leo Yan's avatar
      tools lib traceevent: Change tag string for error · 5f05182f
      Leo Yan authored
      The traceevent lib is used by the perf tool, and when executing
      
        perf test -v 6
      
      it outputs error log on the ARM64 platform:
      
        running test 33 '*:*'trace-cmd: No such file or directory
      
        [...]
      
        trace-cmd: Invalid argument
      
      The trace event parsing code originally came from trace-cmd so it keeps
      the tag string "trace-cmd" for errors, this easily introduces the
      impression that the perf tool launches trace-cmd command for trace event
      parsing, but in fact the related parsing is accomplished by the
      traceevent lib.
      
      This patch changes the tag string to "libtraceevent" so that we can
      avoid confusion and let users to more easily connect the error with
      traceevent lib.
      Signed-off-by: default avatarLeo Yan <leo.yan@linaro.org>
      Acked-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/20190424013802.27569-1-leo.yan@linaro.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      5f05182f
    • Thadeu Lima de Souza Cascardo's avatar
      perf annotate: Fix build on 32 bit for BPF annotation · 01e985e9
      Thadeu Lima de Souza Cascardo authored
      Commit 6987561c ("perf annotate: Enable annotation of BPF programs") adds
      support for BPF programs annotations but the new code does not build on 32-bit.
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Acked-by: default avatarSong Liu <songliubraving@fb.com>
      Fixes: 6987561c ("perf annotate: Enable annotation of BPF programs")
      Link: http://lkml.kernel.org/r/20190403194452.10845-1-cascardo@canonical.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      01e985e9
    • Arnaldo Carvalho de Melo's avatar
      tools uapi x86: Sync vmx.h with the kernel · 24e45b49
      Arnaldo Carvalho de Melo authored
      To pick up the changes from:
      
        2b27924b ("KVM: nVMX: always use early vmcs check when EPT is disabled")
      
      That causes this object in the tools/perf build process to be rebuilt:
      
        CC       /tmp/build/perf/arch/x86/util/kvm-stat.o
      
      But it isn't using VMX_ABORT_ prefixed constants, so no change in
      behaviour.
      
      This silences this perf build warning:
      
        Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/vmx.h' differs from latest version at 'arch/x86/include/uapi/asm/vmx.h'
        diff -u tools/arch/x86/include/uapi/asm/vmx.h arch/x86/include/uapi/asm/vmx.h
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Link: https://lkml.kernel.org/n/tip-bjbo3zc0r8i8oa0udpvftya6@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      24e45b49
    • Bo YU's avatar
      perf bpf: Return value with unlocking in perf_env__find_btf() · 2e712675
      Bo YU authored
      In perf_env__find_btf(), we're returning without unlocking
      "env->bpf_progs.lock". There may be cause lockdep issue.
      
      Detected by CoversityScan, CID# 1444762:(program hangs(LOCK))
      Signed-off-by: default avatarBo YU <tsu.yubo@gmail.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Fixes: 2db7b1e0: (perf bpf: Return NULL when RB tree lookup fails in perf_env__find_btf())
      Link: http://lkml.kernel.org/r/20190422080138.10088-1-tsu.yubo@gmail.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      2e712675
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · ea986679
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Out of bounds access in xfrm IPSEC policy unlink, from Yue Haibing.
      
       2) Missing length check for esp4 UDP encap, from Sabrina Dubroca.
      
       3) Fix byte order of RX STBC access in mac80211, from Johannes Berg.
      
       4) Inifnite loop in bpftool map create, from Alban Crequy.
      
       5) Register mark fix in ebpf verifier after pkt/null checks, from Paul
          Chaignon.
      
       6) Properly use rcu_dereference_sk_user_data in L2TP code, from Eric
          Dumazet.
      
       7) Buffer overrun in marvell phy driver, from Andrew Lunn.
      
       8) Several crash and statistics handling fixes to bnxt_en driver, from
          Michael Chan and Vasundhara Volam.
      
       9) Several fixes to the TLS layer from Jakub Kicinski (copying negative
          amounts of data in reencrypt, reencrypt frag copying, blind nskb->sk
          NULL deref, etc).
      
      10) Several UDP GRO fixes, from Paolo Abeni and Eric Dumazet.
      
      11) PID/UID checks on ipv6 flow labels are inverted, from Willem de
          Bruijn.
      
      12) Use after free in l2tp, from Eric Dumazet.
      
      13) IPV6 route destroy races, also from Eric Dumazet.
      
      14) SCTP state machine can erroneously run recursively, fix from Xin
          Long.
      
      15) Adjust AF_PACKET msg_name length checks, add padding bytes if
          necessary. From Willem de Bruijn.
      
      16) Preserve skb_iif, so that forwarded packets have consistent values
          even if fragmentation is involved. From Shmulik Ladkani.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (69 commits)
        udp: fix GRO packet of death
        ipv6: A few fixes on dereferencing rt->from
        rds: ib: force endiannes annotation
        selftests: fib_rule_tests: print the result and return 1 if any tests failed
        ipv4: ip_do_fragment: Preserve skb_iif during fragmentation
        net/tls: avoid NULL pointer deref on nskb->sk in fallback
        selftests: fib_rule_tests: Fix icmp proto with ipv6
        packet: validate msg_namelen in send directly
        packet: in recvmsg msg_name return at least sizeof sockaddr_ll
        sctp: avoid running the sctp state machine recursively
        stmmac: pci: Fix typo in IOT2000 comment
        Documentation: fix netdev-FAQ.rst markup warning
        ipv6: fix races in ip6_dst_destroy()
        l2ip: fix possible use-after-free
        appletalk: Set error code if register_snap_client failed
        net: dsa: bcm_sf2: fix buffer overflow doing set_rxnfc
        rxrpc: Fix net namespace cleanup
        ipv6/flowlabel: wait rcu grace period before put_pid()
        vrf: Use orig netdev to count Ip6InNoRoutes and a fresh route lookup when sending dest unreach
        tcp: add sanity tests in tcp_add_backlog()
        ...
      ea986679
    • Linus Torvalds's avatar
      Merge tag 'for-linus-20190502' of git://git.kernel.dk/linux-block · 5ce3307b
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "This is mostly io_uring fixes/tweaks. Most of these were actually done
        in time for the last -rc, but I wanted to ensure that everything
        tested out great before including them. The code delta looks larger
        than it really is, as it's mostly just comment additions/changes.
      
        Outside of the comment additions/changes, this is mostly removal of
        unnecessary barriers. In all, this pull request contains:
      
         - Tweak to how we handle errors at submission time. We now post a
           completion event if the error occurs on behalf of an sqe, instead
           of returning it through the system call. If the error happens
           outside of a specific sqe, we return the error through the system
           call. This makes it nicer to use and makes the "normal" use case
           behave the same as the offload cases. (me)
      
         - Fix for a missing req reference drop from async context (me)
      
         - If an sqe is submitted with RWF_NOWAIT, don't punt it to async
           context. Return -EAGAIN directly, instead of using it as a hint to
           do async punt. (Stefan)
      
         - Fix notes on barriers (Stefan)
      
         - Remove unnecessary barriers (Stefan)
      
         - Fix potential double free of memory in setup error (Mark)
      
         - Further improve sq poll CPU validation (Mark)
      
         - Fix page allocation warning and leak on buffer registration error
           (Mark)
      
         - Fix iov_iter_type() for new no-ref flag (Ming)
      
         - Fix a case where dio doesn't honor bio no-page-ref (Ming)"
      
      * tag 'for-linus-20190502' of git://git.kernel.dk/linux-block:
        io_uring: avoid page allocation warnings
        iov_iter: fix iov_iter_type
        block: fix handling for BIO_NO_PAGE_REF
        io_uring: drop req submit reference always in async punt
        io_uring: free allocated io_memory once
        io_uring: fix SQPOLL cpu validation
        io_uring: have submission side sqe errors post a cqe
        io_uring: remove unnecessary barrier after unsetting IORING_SQ_NEED_WAKEUP
        io_uring: remove unnecessary barrier after incrementing dropped counter
        io_uring: remove unnecessary barrier before reading SQ tail
        io_uring: remove unnecessary barrier after updating SQ head
        io_uring: remove unnecessary barrier before reading cq head
        io_uring: remove unnecessary barrier before wq_has_sleeper
        io_uring: fix notes on barriers
        io_uring: fix handling SQEs requesting NOWAIT
      5ce3307b
    • Jarkko Nikula's avatar
      i2c: Prevent runtime suspend of adapter when Host Notify is required · 72bfcee1
      Jarkko Nikula authored
      Multiple users have reported their Synaptics touchpad has stopped
      working between v4.20.1 and v4.20.2 when using SMBus interface.
      
      The culprit for this appeared to be commit c5eb1190 ("PCI / PM: Allow
      runtime PM without callback functions") that fixed the runtime PM for
      i2c-i801 SMBus adapter. Those Synaptics touchpad are using i2c-i801
      for SMBus communication and testing showed they are able to get back
      working by preventing the runtime suspend of adapter.
      
      Normally when i2c-i801 SMBus adapter transmits with the client it resumes
      before operation and autosuspends after.
      
      However, if client requires SMBus Host Notify protocol, what those
      Synaptics touchpads do, then the host adapter must not go to runtime
      suspend since then it cannot process incoming SMBus Host Notify commands
      the client may send.
      
      Fix this by keeping I2C/SMBus adapter active in case client requires
      Host Notify.
      Reported-by: default avatarKeijo Vaara <ferdasyn@rocketmail.com>
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=203297
      Fixes: c5eb1190 ("PCI / PM: Allow runtime PM without callback functions")
      Cc: stable@vger.kernel.org # v4.20+
      Signed-off-by: default avatarJarkko Nikula <jarkko.nikula@linux.intel.com>
      Acked-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Tested-by: default avatarKeijo Vaara <ferdasyn@rocketmail.com>
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      72bfcee1
    • Ard Biesheuvel's avatar
      i2c: synquacer: fix enumeration of slave devices · 95e0cf3c
      Ard Biesheuvel authored
      The I2C host driver for SynQuacer fails to populate the of_node and
      ACPI companion fields of the struct i2c_adapter it instantiates,
      resulting in enumeration of the subordinate I2C bus to fail.
      
      Fixes: 0d676a6c ("i2c: add support for Socionext SynQuacer I2C controller")
      Cc: <stable@vger.kernel.org> # v4.19+
      Signed-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      95e0cf3c
    • Wolfram Sang's avatar
      MAINTAINERS: friendly takeover of i2c-gpio driver · fb31fbef
      Wolfram Sang authored
      I haven't heard from Haavard in years despite putting him to the CC list for
      i2c-gpio related mails. Since I was doing the work on this driver for a while
      now, let me take official maintainership, so it will be more clear to users.
      Signed-off-by: default avatarWolfram Sang <wsa+renesas@sang-engineering.com>
      Acked-by: default avatarHaavard Skinnemoen <hskinnemoen@gmail.com>
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      fb31fbef
    • Kim Phillips's avatar
      MAINTAINERS: Include vendor specific files under arch/*/events/* · 1804569d
      Kim Phillips authored
      Add an explicit subdirectory specification for arch/x86/events/amd to
      the MAINTAINERS file, to distinguish it from its parent. This will
      produce the correct set of maintainers for the files found therein.
      Signed-off-by: default avatarKim Phillips <kim.phillips@amd.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Gary Hook <Gary.Hook@amd.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Pu Wen <puwen@hygon.cn>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
      Fixes: 39b0332a ("perf/x86: Move perf_event_amd.c ........... => x86/events/amd/core.c")
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      1804569d
    • Kim Phillips's avatar
      perf/x86/amd: Update generic hardware cache events for Family 17h · 0e3b74e2
      Kim Phillips authored
      Add a new amd_hw_cache_event_ids_f17h assignment structure set
      for AMD families 17h and above, since a lot has changed.  Specifically:
      
      L1 Data Cache
      
      The data cache access counter remains the same on Family 17h.
      
      For DC misses, PMCx041's definition changes with Family 17h,
      so instead we use the L2 cache accesses from L1 data cache
      misses counter (PMCx060,umask=0xc8).
      
      For DC hardware prefetch events, Family 17h breaks compatibility
      for PMCx067 "Data Prefetcher", so instead, we use PMCx05a "Hardware
      Prefetch DC Fills."
      
      L1 Instruction Cache
      
      PMCs 0x80 and 0x81 (32-byte IC fetches and misses) are backward
      compatible on Family 17h.
      
      For prefetches, we remove the erroneous PMCx04B assignment which
      counts how many software data cache prefetch load instructions were
      dispatched.
      
      LL - Last Level Cache
      
      Removing PMCs 7D, 7E, and 7F assignments, as they do not exist
      on Family 17h, where the last level cache is L3.  L3 counters
      can be accessed using the existing AMD Uncore driver.
      
      Data TLB
      
      On Intel machines, data TLB accesses ("dTLB-loads") are assigned
      to counters that count load/store instructions retired.  This
      is inconsistent with instruction TLB accesses, where Intel
      implementations report iTLB misses that hit in the STLB.
      
      Ideally, dTLB-loads would count higher level dTLB misses that hit
      in lower level TLBs, and dTLB-load-misses would report those
      that also missed in those lower-level TLBs, therefore causing
      a page table walk.  That would be consistent with instruction
      TLB operation, remove the redundancy between dTLB-loads and
      L1-dcache-loads, and prevent perf from producing artificially
      low percentage ratios, i.e. the "0.01%" below:
      
              42,550,869      L1-dcache-loads
              41,591,860      dTLB-loads
                   4,802      dTLB-load-misses          #    0.01% of all dTLB cache hits
               7,283,682      L1-dcache-stores
               7,912,392      dTLB-stores
                     310      dTLB-store-misses
      
      On AMD Families prior to 17h, the "Data Cache Accesses" counter is
      used, which is slightly better than load/store instructions retired,
      but still counts in terms of individual load/store operations
      instead of TLB operations.
      
      So, for AMD Families 17h and higher, this patch assigns "dTLB-loads"
      to a counter for L1 dTLB misses that hit in the L2 dTLB, and
      "dTLB-load-misses" to a counter for L1 DTLB misses that caused
      L2 DTLB misses and therefore also caused page table walks.  This
      results in a much more accurate view of data TLB performance:
      
              60,961,781      L1-dcache-loads
                   4,601      dTLB-loads
                     963      dTLB-load-misses          #   20.93% of all dTLB cache hits
      
      Note that for all AMD families, data loads and stores are combined
      in a single accesses counter, so no 'L1-dcache-stores' are reported
      separately, and stores are counted with loads in 'L1-dcache-loads'.
      
      Also note that the "% of all dTLB cache hits" string is misleading
      because (a) "dTLB cache": although TLBs can be considered caches for
      page tables, in this context, it can be misinterpreted as data cache
      hits because the figures are similar (at least on Intel), and (b) not
      all those loads (technically accesses) technically "hit" at that
      hardware level.  "% of all dTLB accesses" would be more clear/accurate.
      
      Instruction TLB
      
      On Intel machines, 'iTLB-loads' measure iTLB misses that hit in the
      STLB, and 'iTLB-load-misses' measure iTLB misses that also missed in
      the STLB and completed a page table walk.
      
      For AMD Family 17h and above, for 'iTLB-loads' we replace the
      erroneous instruction cache fetches counter with PMCx084
      "L1 ITLB Miss, L2 ITLB Hit".
      
      For 'iTLB-load-misses' we still use PMCx085 "L1 ITLB Miss,
      L2 ITLB Miss", but set a 0xff umask because without it the event
      does not get counted.
      
      Branch Predictor (BPU)
      
      PMCs 0xc2 and 0xc3 continue to be valid across all AMD Families.
      
      Node Level Events
      
      Family 17h does not have a PMCx0e9 counter, and corresponding counters
      have not been made available publicly, so for now, we mark them as
      unsupported for Families 17h and above.
      
      Reference:
      
        "Open-Source Register Reference For AMD Family 17h Processors Models 00h-2Fh"
        Released 7/17/2018, Publication #56255, Revision 3.03:
        https://www.amd.com/system/files/TechDocs/56255_OSRR.pdf
      
      [ mingo: tidied up the line breaks. ]
      Signed-off-by: default avatarKim Phillips <kim.phillips@amd.com>
      Cc: <stable@vger.kernel.org> # v4.9+
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Pu Wen <puwen@hygon.cn>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-perf-users@vger.kernel.org
      Fixes: e40ed154 ("perf/x86: Add perf support for AMD family-17h processors")
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      0e3b74e2