1. 17 Apr, 2019 4 commits
    • Jiri Olsa's avatar
      perf tools: Fix map reference counting · b9abbdfa
      Jiri Olsa authored
      By calling maps__insert() we assume to get 2 references on the map,
      which we relese within maps__remove call.
      
      However if there's already same map name, we currently don't bump the
      reference and can crash, like:
      
        Program received signal SIGABRT, Aborted.
        0x00007ffff75e60f5 in raise () from /lib64/libc.so.6
      
        (gdb) bt
        #0  0x00007ffff75e60f5 in raise () from /lib64/libc.so.6
        #1  0x00007ffff75d0895 in abort () from /lib64/libc.so.6
        #2  0x00007ffff75d0769 in __assert_fail_base.cold () from /lib64/libc.so.6
        #3  0x00007ffff75de596 in __assert_fail () from /lib64/libc.so.6
        #4  0x00000000004fc006 in refcount_sub_and_test (i=1, r=0x1224e88) at tools/include/linux/refcount.h:131
        #5  refcount_dec_and_test (r=0x1224e88) at tools/include/linux/refcount.h:148
        #6  map__put (map=0x1224df0) at util/map.c:299
        #7  0x00000000004fdb95 in __maps__remove (map=0x1224df0, maps=0xb17d80) at util/map.c:953
        #8  maps__remove (maps=0xb17d80, map=0x1224df0) at util/map.c:959
        #9  0x00000000004f7d8a in map_groups__remove (map=<optimized out>, mg=<optimized out>) at util/map_groups.h:65
        #10 machine__process_ksymbol_unregister (sample=<optimized out>, event=0x7ffff7279670, machine=<optimized out>) at util/machine.c:728
        #11 machine__process_ksymbol (machine=<optimized out>, event=0x7ffff7279670, sample=<optimized out>) at util/machine.c:741
        #12 0x00000000004fffbb in perf_session__deliver_event (session=0xb11390, event=0x7ffff7279670, tool=0x7fffffffc7b0, file_offset=13936) at util/session.c:1362
        #13 0x00000000005039bb in do_flush (show_progress=false, oe=0xb17e80) at util/ordered-events.c:243
        #14 __ordered_events__flush (oe=0xb17e80, how=OE_FLUSH__ROUND, timestamp=<optimized out>) at util/ordered-events.c:322
        #15 0x00000000005005e4 in perf_session__process_user_event (session=session@entry=0xb11390, event=event@entry=0x7ffff72a4af8,
        ...
      
      Add the map to the list and getting the reference event if we find the
      map with same name.
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Fixes: 1e628569 ("perf symbols: Fix slowness due to -ffunction-section")
      Link: http://lkml.kernel.org/r/20190416160127.30203-10-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      b9abbdfa
    • Jiri Olsa's avatar
      perf evlist: Fix side band thread draining · adc6257c
      Jiri Olsa authored
      Current perf_evlist__poll_thread() code could finish without draining
      the data. Adding the logic that makes sure we won't finish before the
      drain.
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Fixes: 657ee553 ("perf evlist: Introduce side band thread")
      Link: http://lkml.kernel.org/r/20190416160127.30203-9-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      adc6257c
    • Song Liu's avatar
      perf tools: Check maps for bpf programs · a93e0b23
      Song Liu authored
      As reported by Jiri Olsa in:
      
        "[BUG] perf: intel_pt won't display kernel function"
        https://lore.kernel.org/lkml/20190403143738.GB32001@krava
      
      Recent changes to support PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT
      broke --kallsyms option. This is because it broke test __map__is_kmodule.
      
      This patch fixes this by adding check for bpf program, so that these maps
      are not mistaken as kernel modules.
      Signed-off-by: default avatarSong Liu <songliubraving@fb.com>
      Reported-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Yonghong Song <yhs@fb.com>
      Link: http://lkml.kernel.org/r/20190416160127.30203-8-jolsa@kernel.org
      Fixes: 76193a94 ("perf, bpf: Introduce PERF_RECORD_KSYMBOL")
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      a93e0b23
    • Jiri Olsa's avatar
      perf bpf: Return NULL when RB tree lookup fails in perf_env__find_bpf_prog_info() · aa526602
      Jiri Olsa authored
      We currently don't return NULL in case we don't find the
      bpf_prog_info_node, fixing that.
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Acked-by: default avatarSong Liu <songliubraving@fb.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Fixes: e4378f0c ("perf bpf: Save bpf_prog_info in a rbtree in perf_env")
      Link: http://lkml.kernel.org/r/20190416134151.15282-1-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      aa526602
  2. 16 Apr, 2019 12 commits
  3. 15 Apr, 2019 2 commits
    • Linus Torvalds's avatar
      Merge tag 'libnvdimm-fixes-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · 618d919c
      Linus Torvalds authored
      Pull libnvdimm fixes from Dan Williams:
       "I debated holding this back for the v5.2 merge window due to the size
        of the "zero-key" changes, but affected users would benefit from
        having the fixes sooner. It did not make sense to change the zero-key
        semantic in isolation for the "secure-erase" command, but instead
        include it for all security commands.
      
        The short background on the need for these changes is that some NVDIMM
        platforms enable security with a default zero-key rather than let the
        OS specify the initial key. This makes the security enabling that
        landed in v5.0 unusable for some users.
      
        Summary:
      
         - Compatibility fix for nvdimm-security implementations with a
           default zero-key.
      
         - Miscellaneous small fixes for out-of-bound accesses, cleanup after
           initialization failures, and missing debug messages"
      
      * tag 'libnvdimm-fixes-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        tools/testing/nvdimm: Retain security state after overwrite
        libnvdimm/pmem: fix a possible OOB access when read and write pmem
        libnvdimm/security, acpi/nfit: unify zero-key for all security commands
        libnvdimm/security: provide fix for secure-erase to use zero-key
        libnvdimm/btt: Fix a kmemdup failure check
        libnvdimm/namespace: Fix a potential NULL pointer dereference
        acpi/nfit: Always dump _DSM output payload
      618d919c
    • Linus Torvalds's avatar
      Merge tag 'fsdax-fix-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · 5512320c
      Linus Torvalds authored
      Pull fsdax fix from Dan Williams:
       "A single filesystem-dax fix. It has been lingering in -next for a long
        while and there are no other fsdax fixes on the horizon:
      
         - Avoid a crash scenario with architectures like powerpc that require
           'pgtable_deposit' for the zero page"
      
      * tag 'fsdax-fix-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        fs/dax: Deposit pagetable even when installing zero page
      5512320c
  4. 14 Apr, 2019 6 commits
    • Linus Torvalds's avatar
      Linux 5.1-rc5 · dc4060a5
      Linus Torvalds authored
      dc4060a5
    • Linus Torvalds's avatar
      Merge branch 'page-refs' (page ref overflow) · 6b3a7077
      Linus Torvalds authored
      Merge page ref overflow branch.
      
      Jann Horn reported that he can overflow the page ref count with
      sufficient memory (and a filesystem that is intentionally extremely
      slow).
      
      Admittedly it's not exactly easy.  To have more than four billion
      references to a page requires a minimum of 32GB of kernel memory just
      for the pointers to the pages, much less any metadata to keep track of
      those pointers.  Jann needed a total of 140GB of memory and a specially
      crafted filesystem that leaves all reads pending (in order to not ever
      free the page references and just keep adding more).
      
      Still, we have a fairly straightforward way to limit the two obvious
      user-controllable sources of page references: direct-IO like page
      references gotten through get_user_pages(), and the splice pipe page
      duplication.  So let's just do that.
      
      * branch page-refs:
        fs: prevent page refcount overflow in pipe_buf_get
        mm: prevent get_user_pages() from overflowing page refcount
        mm: add 'try_get_page()' helper function
        mm: make page ref count overflow check tighter and more explicit
      6b3a7077
    • Matthew Wilcox's avatar
      fs: prevent page refcount overflow in pipe_buf_get · 15fab63e
      Matthew Wilcox authored
      Change pipe_buf_get() to return a bool indicating whether it succeeded
      in raising the refcount of the page (if the thing in the pipe is a page).
      This removes another mechanism for overflowing the page refcount.  All
      callers converted to handle a failure.
      Reported-by: default avatarJann Horn <jannh@google.com>
      Signed-off-by: default avatarMatthew Wilcox <willy@infradead.org>
      Cc: stable@kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      15fab63e
    • Linus Torvalds's avatar
      mm: prevent get_user_pages() from overflowing page refcount · 8fde12ca
      Linus Torvalds authored
      If the page refcount wraps around past zero, it will be freed while
      there are still four billion references to it.  One of the possible
      avenues for an attacker to try to make this happen is by doing direct IO
      on a page multiple times.  This patch makes get_user_pages() refuse to
      take a new page reference if there are already more than two billion
      references to the page.
      Reported-by: default avatarJann Horn <jannh@google.com>
      Acked-by: default avatarMatthew Wilcox <willy@infradead.org>
      Cc: stable@kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8fde12ca
    • Linus Torvalds's avatar
      mm: add 'try_get_page()' helper function · 88b1a17d
      Linus Torvalds authored
      This is the same as the traditional 'get_page()' function, but instead
      of unconditionally incrementing the reference count of the page, it only
      does so if the count was "safe".  It returns whether the reference count
      was incremented (and is marked __must_check, since the caller obviously
      has to be aware of it).
      
      Also like 'get_page()', you can't use this function unless you already
      had a reference to the page.  The intent is that you can use this
      exactly like get_page(), but in situations where you want to limit the
      maximum reference count.
      
      The code currently does an unconditional WARN_ON_ONCE() if we ever hit
      the reference count issues (either zero or negative), as a notification
      that the conditional non-increment actually happened.
      
      NOTE! The count access for the "safety" check is inherently racy, but
      that doesn't matter since the buffer we use is basically half the range
      of the reference count (ie we look at the sign of the count).
      Acked-by: default avatarMatthew Wilcox <willy@infradead.org>
      Cc: Jann Horn <jannh@google.com>
      Cc: stable@kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      88b1a17d
    • Linus Torvalds's avatar
      mm: make page ref count overflow check tighter and more explicit · f958d7b5
      Linus Torvalds authored
      We have a VM_BUG_ON() to check that the page reference count doesn't
      underflow (or get close to overflow) by checking the sign of the count.
      
      That's all fine, but we actually want to allow people to use a "get page
      ref unless it's already very high" helper function, and we want that one
      to use the sign of the page ref (without triggering this VM_BUG_ON).
      
      Change the VM_BUG_ON to only check for small underflows (or _very_ close
      to overflowing), and ignore overflows which have strayed into negative
      territory.
      Acked-by: default avatarMatthew Wilcox <willy@infradead.org>
      Cc: Jann Horn <jannh@google.com>
      Cc: stable@kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f958d7b5
  5. 13 Apr, 2019 14 commits
  6. 12 Apr, 2019 2 commits