1. 19 Jun, 2020 3 commits
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 84bc1993
      Linus Torvalds authored
      Pull arm64 fixes from Will Deacon:
       "Unfortunately, we still have a number of outstanding issues so there
        will be more fixes to come, but this lot are a good start.
      
         - Fix handling of watchpoints triggered by uaccess routines
      
         - Fix initialisation of gigantic pages for CMA buffers
      
         - Raise minimum clang version for BTI to avoid miscompilation
      
         - Fix data race in SVE vector length configuration code
      
         - Ensure address tags are ignored in kern_addr_valid()
      
         - Dump register state on fatal BTI exception
      
         - kexec_file() cleanup to use struct_size() macro"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: hw_breakpoint: Don't invoke overflow handler on uaccess watchpoints
        arm64: kexec_file: Use struct_size() in kmalloc()
        arm64: mm: reserve hugetlb CMA after numa_init
        arm64: bti: Require clang >= 10.0.1 for in-kernel BTI support
        arm64: sve: Fix build failure when ARM64_SVE=y and SYSCTL=n
        arm64: pgtable: Clear the GP bit for non-executable kernel pages
        arm64: mm: reset address tag set by kasan sw tagging
        arm64: traps: Dump registers prior to panic() in bad_mode()
        arm64/sve: Eliminate data races on sve_default_vl
        docs/arm64: Fix typo'd #define in sve.rst
        arm64: remove TEXT_OFFSET randomization
    • Merge tag 'overflow-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 98b76994
      Linus Torvalds authored
      Pull flex-array size helper from Kees Cook:
       "During the treewide clean-ups of zero-length "flexible arrays", the
        struct_size() helper was heavily used, but it was noticed that many
        times it would have been nice to have an additional helper to get the
        size of just the flexible array itself.
      
        This need appears to be even more common when cleaning up the 1-byte
        array "flexible arrays", so Gustavo implemented it.
      
        I'd love to get this landed early so it can be used during the v5.9
        dev cycle to ease the 1-byte array cleanups."
      
      * tag 'overflow-v5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        overflow.h: Add flex_array_size() helper
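
The idea behind the new helper can be sketched in userspace C. This is only an illustration under stated assumptions: `struct packet`, `size_mul_saturating()` and `flex_array_size_sketch()` are hypothetical names, not the kernel's implementation, and the sketch assumes a GCC- or clang-compatible compiler for `__builtin_mul_overflow()`. It computes the size of just the trailing array, saturating at SIZE_MAX rather than wrapping on overflow.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example structure with a trailing flexible array. */
struct packet {
	uint32_t count;
	uint16_t samples[];	/* flexible array member */
};

/* Multiply two sizes, saturating at SIZE_MAX instead of wrapping. */
static size_t size_mul_saturating(size_t a, size_t b)
{
	size_t out;

	if (__builtin_mul_overflow(a, b, &out))
		return SIZE_MAX;
	return out;
}

/*
 * Userspace stand-in for the helper described above: the number of
 * bytes needed for 'count' elements of the flexible array 'member'
 * of '*p', with the structure header excluded. 'p' is only used
 * inside sizeof, so it is never dereferenced.
 */
#define flex_array_size_sketch(p, member, count) \
	size_mul_saturating((count), sizeof(*(p)->member))
```

A caller would typically pass the result to memcpy() or memset() when only the array part of an already-allocated object needs to be copied or cleared.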
    • Merge tag 'perf-tools-fixes-2020-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux · 98d7e741
      Linus Torvalds authored
      Pull perf tooling fixes from Arnaldo Carvalho de Melo:
      
       - Update various UAPI headers, some automatically adding support for a
         new MSR and the faccessat2 syscall.
      
       - Fix corner case NULL deref in the histograms code.
      
       - Fix corner case NULL deref in 'perf stat' aggregation code.
      
       - Fix array pointer deref and old style declaration in the parsing of
         events.
      
       - Fix segfault when processing ZSTD compressed perf.data files in 'perf
         script' due to lack of initialization of the ZSTD library.
      
       - Handle __attribute__((user)) in libtraceevent fixing the parsing of
         syscall tracepoints with user buffers.
      
       - Make libtraceevent aware of __builtin_expect() appearing in tracepoint
         fields.
      
       - Make the BPF prologue generation use bpf_probe_read_{user,kernel}().
      
       - Fix the '@user' attribute parsing in kprobes variables in 'perf
         probe'.
      
       - Fix error message when asking for -fsanitize=address without required
         libraries.
      
      * tag 'perf-tools-fixes-2020-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (22 commits)
        perf build: Fix error message when asking for -fsanitize=address without required libraries
        tools lib traceevent: Add handler for __builtin_expect()
        tools lib traceevent: Handle __attribute__((user)) in field names
        tools lib traceevent: Add append() function helper for appending strings
        tools headers UAPI: Sync linux/fs.h with the kernel sources
        tools include UAPI: Sync linux/vhost.h with the kernel sources
        tools arch x86: Sync the msr-index.h copy with the kernel sources
        perf script: Initialize zstd_data
        perf pmu: Remove unused declaration
        perf parse-events: Fix an old style declaration
        perf parse-events: Fix an incompatible pointer
        perf bpf: Fix bpf prologue generation
        perf probe: Fix user attribute access in kprobes
        perf stat: Fix NULL pointer dereference
        perf report: Fix NULL pointer dereference in hists__fprintf_nr_sample_events()
        tools headers UAPI: Sync kvm.h headers with the kernel sources
        tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
        tools headers UAPI: Sync linux/fscrypt.h with the kernel sources
        perf beauty: Add support to STATX_MNT_ID in the 'statx' syscall 'mask' argument
        tools headers uapi: Sync linux/stat.h with the kernel sources
        ...
  2. 18 Jun, 2020 11 commits
    • Merge branch 'hch' (maccess patches from Christoph Hellwig) · 5e857ce6
      Linus Torvalds authored
      Merge non-faulting memory access cleanups from Christoph Hellwig:
       "Andrew and I decided to drop the patches implementing your suggested
        rename of the probe_kernel_* and probe_user_* helpers from -mm as
        there were way too many conflicts.
      
        After -rc1 might be a good time for this as all the conflicts are
        resolved now"
      
      This also adds a type safety checking patch on top of the renaming
      series to make the subtle behavioral difference between 'get_user()' and
      'get_kernel_nofault()' less potentially dangerous and surprising.
      
      * emailed patches from Christoph Hellwig <hch@lst.de>:
        maccess: make get_kernel_nofault() check for minimal type compatibility
        maccess: rename probe_kernel_address to get_kernel_nofault
        maccess: rename probe_user_{read,write} to copy_{from,to}_user_nofault
        maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault
    • maccess: make get_kernel_nofault() check for minimal type compatibility · 0c389d89
      Linus Torvalds authored
      Now that we've renamed probe_kernel_address() to get_kernel_nofault()
      and made it look and behave more in line with get_user(), some of the
      subtle type behavior differences end up being more obvious and possibly
      dangerous.
      
      When you do
      
              get_user(val, user_ptr);
      
      the type of the access comes from the "user_ptr" part, and the above
      basically acts as
      
              val = *user_ptr;
      
      by design (except, of course, for the fact that the actual dereference
      is done with a user access).
      
      Note how in the above case, the type of the end result comes from the
      pointer argument, and then the value is cast to the type of 'val' as
      part of the assignment.
      
      So the type of the pointer is ultimately the more important type for
      the access itself.
      
      But 'get_kernel_nofault()' may now _look_ similar, but it behaves very
      differently.  When you do
      
              get_kernel_nofault(val, kernel_ptr);
      
      it behaves like
      
              val = *(typeof(val) *)kernel_ptr;
      
      except, of course, for the fact that the actual dereference is done with
      exception handling so that a faulting access is suppressed and returned
      as the error code.
      
      But note how different the casting behavior of the two superficially
      similar accesses is: one does the actual access in the size of the type
      the pointer points to, while the other does the access in the size of
      the target, and ignores the pointer type entirely.
      
      Actually changing get_kernel_nofault() to act like get_user() is almost
      certainly the right thing to do eventually, but in the meantime this
      patch adds logic to at least verify that the pointer type is compatible
      with the type of the result.
      
      In many cases, this involves just casting the pointer to 'void *' to
      make it obvious that the type of the pointer is not the important part.
      It's not how 'get_user()' acts, but at least the behavioral difference
      is now obvious and explicit.
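
The difference in access width described above can be reproduced in plain userspace C. This is a minimal sketch, not kernel code: the three macro names are hypothetical, memcpy() stands in for the exception-protected load, and the "checked" variant only approximates the kind of pointer-type check this patch describes (assigning through a pointer to the result type so the compiler complains about incompatible pointers, while 'void *' is accepted). It assumes a compiler that supports `__typeof__`.

```c
#include <stdint.h>
#include <string.h>

/*
 * get_user()-like: the access type comes from the pointer, and the
 * loaded value is converted to the type of 'val' by the assignment.
 */
#define get_like_user(val, ptr) \
	((val) = *(ptr))

/*
 * get_kernel_nofault()-like: the access is sizeof(val) bytes wide and
 * the pointer type is ignored entirely.
 */
#define get_like_kernel_nofault(val, ptr) \
	memcpy(&(val), (const void *)(ptr), sizeof(val))

/*
 * Same access, but the pointer must be compatible with a pointer to
 * the type of 'val' (or be a 'void *'); anything else draws a
 * compiler diagnostic at the assignment.
 */
#define get_like_kernel_nofault_checked(val, ptr) do {		\
	const __typeof__(val) *gk_ptr_ = (ptr);			\
	memcpy(&(val), gk_ptr_, sizeof(val));			\
} while (0)
```

With a `uint32_t` result and a `uint8_t *` source, the first macro reads one byte and widens it, while the second reads four bytes; the checked variant would reject the `uint8_t *` unless it is first cast to 'void *'.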
      
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • maccess: rename probe_kernel_address to get_kernel_nofault · 25f12ae4
      Christoph Hellwig authored
      Better describe what this helper does, and match the naming of
      copy_from_kernel_nofault.
      
      Also switch the argument order around, so that it acts and looks
      like get_user().

      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • sparse: use identifiers to define address spaces · 670d0a4b
      Luc Van Oostenryck authored
      Currently, address spaces in warnings are displayed as '<asn:X>' with
      'X' being the address space's arbitrary number.
      
      But since sparse v0.6.0-rc1 (late December 2018), sparse allows you to
      define the address spaces using an identifier instead of a number.  This
      identifier is then directly used in the warnings.
      
      So, use the identifiers '__user', '__iomem', '__percpu' & '__rcu' for
      the corresponding address spaces.  The default address space, __kernel,
      being not displayed in warnings, stays defined as '0'.
      
      With this change, warnings that used to be displayed as:
      
      	cast removes address space '<asn:1>' of expression
      	... void [noderef] <asn:2> *
      
      will now be displayed as:
      
      	cast removes address space '__user' of expression
      	... void [noderef] __iomem *
      
      This also moves the __kernel annotation to be the first one, since it is
      quite different from the others because it's the default one, and so:
      
       - it's never displayed
      
       - it's normally not needed, neither in type annotations nor in casts
         between address spaces. The only time it's needed is when it's
         combined with a typeof to express "the same type as this one but
         without the address space"
      
       - it can't be defined with a name, '0' must be used.
      
      So, it seemed strange to me to have it in the middle of the other
      ones.
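
For illustration, identifier-based address spaces might be declared along the following lines. This is a sketch under assumptions, not a copy of the kernel header: the annotations are only meaningful when the file is processed by sparse (which defines `__CHECKER__`), they expand to nothing for a normal compiler, and `copy_in_sketch()` is a hypothetical helper added to show where a cast between address spaces occurs.

```c
#include <stddef.h>
#include <string.h>

#ifdef __CHECKER__
/* Seen only by sparse: the default space stays '0', the others are named. */
# define __kernel	__attribute__((address_space(0)))
# define __user		__attribute__((noderef, address_space(__user)))
# define __iomem	__attribute__((noderef, address_space(__iomem)))
# define __force	__attribute__((force))
#else
/* A regular compiler sees no annotations at all. */
# define __kernel
# define __user
# define __iomem
# define __force
#endif

/*
 * Hypothetical helper: 'src' is annotated as a user pointer. Casting the
 * annotation away without __force is what would make sparse print
 * "cast removes address space '__user' of expression".
 */
static size_t copy_in_sketch(void *dst, const void __user *src, size_t len)
{
	memcpy(dst, (__force const void *)src, len);
	return len;
}
```

Built with an ordinary compiler this is just a memcpy() wrapper; the named address spaces only change what sparse reports.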

      Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
      Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • perf build: Fix error message when asking for -fsanitize=address without required libraries · 6a1515c9
      Tiezhu Yang authored
      When building perf with ASan or UBSan, if libasan or libubsan cannot be
      found, feature-glibc is 0 and the following error is printed. The message
      is wrong, because gnu/libc-version.h exists in /usr/include and
      glibc-devel is installed.
      
        [yangtiezhu@linux perf]$ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=address'
          BUILD:   Doing 'make -j4' parallel build
          HOSTCC   fixdep.o
          HOSTLD   fixdep-in.o
          LINK     fixdep
        <stdin>:1:0: warning: -fsanitize=address and -fsanitize=kernel-address are not supported for this target
        <stdin>:1:0: warning: -fsanitize=address not supported for this target
      
        Auto-detecting system features:
        ...                         dwarf: [ OFF ]
        ...            dwarf_getlocations: [ OFF ]
        ...                         glibc: [ OFF ]
        ...                          gtk2: [ OFF ]
        ...                      libaudit: [ OFF ]
        ...                        libbfd: [ OFF ]
        ...                        libcap: [ OFF ]
        ...                        libelf: [ OFF ]
        ...                       libnuma: [ OFF ]
        ...        numa_num_possible_cpus: [ OFF ]
        ...                       libperl: [ OFF ]
        ...                     libpython: [ OFF ]
        ...                     libcrypto: [ OFF ]
        ...                     libunwind: [ OFF ]
        ...            libdw-dwarf-unwind: [ OFF ]
        ...                          zlib: [ OFF ]
        ...                          lzma: [ OFF ]
        ...                     get_cpuid: [ OFF ]
        ...                           bpf: [ OFF ]
        ...                        libaio: [ OFF ]
        ...                       libzstd: [ OFF ]
        ...        disassembler-four-args: [ OFF ]
      
        Makefile.config:393: *** No gnu/libc-version.h found, please install glibc-dev[el].  Stop.
        Makefile.perf:224: recipe for target 'sub-make' failed
        make[1]: *** [sub-make] Error 2
        Makefile:69: recipe for target 'all' failed
        make: *** [all] Error 2
        [yangtiezhu@linux perf]$ ls /usr/include/gnu/libc-version.h
        /usr/include/gnu/libc-version.h
      
      After installing libasan and libubsan, feature-glibc is 1 and the build
      succeeds, so the failure is caused by the missing libasan or libubsan. We
      should check for them and print an error message that reflects the real cause.
      
      Committer testing:
      
        $ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf
        $ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=address' O=/tmp/build/perf -C tools/perf/ install-bin
        make: Entering directory '/home/acme/git/perf/tools/perf'
          BUILD:   Doing 'make -j12' parallel build
          HOSTCC   /tmp/build/perf/fixdep.o
          HOSTLD   /tmp/build/perf/fixdep-in.o
          LINK     /tmp/build/perf/fixdep
      
        Auto-detecting system features:
        ...                         dwarf: [ OFF ]
        ...            dwarf_getlocations: [ OFF ]
        ...                         glibc: [ OFF ]
        ...                          gtk2: [ OFF ]
        ...                        libbfd: [ OFF ]
        ...                        libcap: [ OFF ]
        ...                        libelf: [ OFF ]
        ...                       libnuma: [ OFF ]
        ...        numa_num_possible_cpus: [ OFF ]
        ...                       libperl: [ OFF ]
        ...                     libpython: [ OFF ]
        ...                     libcrypto: [ OFF ]
        ...                     libunwind: [ OFF ]
        ...            libdw-dwarf-unwind: [ OFF ]
        ...                          zlib: [ OFF ]
        ...                          lzma: [ OFF ]
        ...                     get_cpuid: [ OFF ]
        ...                           bpf: [ OFF ]
        ...                        libaio: [ OFF ]
        ...                       libzstd: [ OFF ]
        ...        disassembler-four-args: [ OFF ]
      
        Makefile.config:401: *** No libasan found, please install libasan.  Stop.
        make[1]: *** [Makefile.perf:231: sub-make] Error 2
        make: *** [Makefile:70: all] Error 2
        make: Leaving directory '/home/acme/git/perf/tools/perf'
        $
        $
        $ sudo dnf install libasan
        <SNIP>
        Installed:
          libasan-9.3.1-2.fc31.x86_64
        $
        $
        $ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=address' O=/tmp/build/perf -C tools/perf/ install-bin
        make: Entering directory '/home/acme/git/perf/tools/perf'
          BUILD:   Doing 'make -j12' parallel build
      
        Auto-detecting system features:
        ...                         dwarf: [ on  ]
        ...            dwarf_getlocations: [ on  ]
        ...                         glibc: [ on  ]
        ...                          gtk2: [ on  ]
        ...                        libbfd: [ on  ]
        ...                        libcap: [ on  ]
        ...                        libelf: [ on  ]
        ...                       libnuma: [ on  ]
        ...        numa_num_possible_cpus: [ on  ]
        ...                       libperl: [ on  ]
        ...                     libpython: [ on  ]
        ...                     libcrypto: [ on  ]
        ...                     libunwind: [ on  ]
        ...            libdw-dwarf-unwind: [ on  ]
        ...                          zlib: [ on  ]
        ...                          lzma: [ on  ]
        ...                     get_cpuid: [ on  ]
        ...                           bpf: [ on  ]
        ...                        libaio: [ on  ]
        ...                       libzstd: [ on  ]
        ...        disassembler-four-args: [ on  ]
         <SNIP>
          CC       /tmp/build/perf/util/pmu-flex.o
          FLEX     /tmp/build/perf/util/expr-flex.c
          CC       /tmp/build/perf/util/expr-bison.o
          CC       /tmp/build/perf/util/expr.o
          CC       /tmp/build/perf/util/expr-flex.o
          CC       /tmp/build/perf/util/parse-events-flex.o
          CC       /tmp/build/perf/util/parse-events.o
          LD       /tmp/build/perf/util/intel-pt-decoder/perf-in.o
          LD       /tmp/build/perf/util/perf-in.o
          LD       /tmp/build/perf/perf-in.o
          LINK     /tmp/build/perf/perf
        <SNIP>
          INSTALL  python-scripts
          INSTALL  perf_completion-script
          INSTALL  perf-tip
        make: Leaving directory '/home/acme/git/perf/tools/perf'
        $ ldd ~/bin/perf | grep asan
        	libasan.so.5 => /lib64/libasan.so.5 (0x00007f0904164000)
        $
      
      And if we rebuild without -fsanitize=address:
      
        $ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf
        $ make O=/tmp/build/perf -C tools/perf/ install-bin
        make: Entering directory '/home/acme/git/perf/tools/perf'
          BUILD:   Doing 'make -j12' parallel build
          HOSTCC   /tmp/build/perf/fixdep.o
          HOSTLD   /tmp/build/perf/fixdep-in.o
          LINK     /tmp/build/perf/fixdep
      
        Auto-detecting system features:
        ...                         dwarf: [ on  ]
        ...            dwarf_getlocations: [ on  ]
        ...                         glibc: [ on  ]
        ...                          gtk2: [ on  ]
        ...                        libbfd: [ on  ]
        ...                        libcap: [ on  ]
        ...                        libelf: [ on  ]
        ...                       libnuma: [ on  ]
        ...        numa_num_possible_cpus: [ on  ]
        ...                       libperl: [ on  ]
        ...                     libpython: [ on  ]
        ...                     libcrypto: [ on  ]
        ...                     libunwind: [ on  ]
        ...            libdw-dwarf-unwind: [ on  ]
        ...                          zlib: [ on  ]
        ...                          lzma: [ on  ]
        ...                     get_cpuid: [ on  ]
        ...                           bpf: [ on  ]
        ...                        libaio: [ on  ]
        ...                       libzstd: [ on  ]
        ...        disassembler-four-args: [ on  ]
      
          GEN      /tmp/build/perf/common-cmds.h
          CC       /tmp/build/perf/exec-cmd.o
        <SNIP>
          INSTALL  perf_completion-script
          INSTALL  perf-tip
        make: Leaving directory '/home/acme/git/perf/tools/perf'
        $ ldd ~/bin/perf | grep asan
        $

      Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: tiezhu yang <yangtiezhu@loongson.cn>
      Cc: xuefeng li <lixuefeng@loongson.cn>
      Link: http://lore.kernel.org/lkml/1592445961-28044-1-git-send-email-yangtiezhu@loongson.cn
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools lib traceevent: Add handler for __builtin_expect() · 1b20d949
      Steven Rostedt (VMware) authored
      In order to move pointer checks like IS_ERR_VALUE() out of the hotpath
      and into the reader path of a trace event, user space tools need to be
      able to parse that. IS_ERR_VALUE() is defined as:
      
       #define IS_ERR_VALUE(x) unlikely((unsigned long)(void *)(x) >= (unsigned long)-MAX_ERRNO)
      
      Which eventually turns into:
      
        __builtin_expect(!!((unsigned long)(void *)(x) >= (unsigned long)-4095), 0)
      
      Now the traceevent parser can handle most of that except for the
      __builtin_expect(), which needs to be added.
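
The expansion above can be checked with a small userspace sketch. The macro bodies follow the definitions quoted in this message; `MAX_ERRNO` is set to 4095 to match the expanded form shown, and `is_err_value()` is a hypothetical wrapper added only so the macro can be called like a function. A GCC- or clang-compatible compiler is assumed for `__builtin_expect()`.

```c
#include <stdbool.h>

#define MAX_ERRNO	4095

/* Branch hint: the condition is expected to be false. */
#define unlikely(x)	__builtin_expect(!!(x), 0)

/* True when 'x' falls in the top MAX_ERRNO values of the address range. */
#define IS_ERR_VALUE(x) \
	unlikely((unsigned long)(void *)(x) >= (unsigned long)-MAX_ERRNO)

/* Hypothetical wrapper so the check can be used as an ordinary function. */
static bool is_err_value(unsigned long v)
{
	return IS_ERR_VALUE(v);
}
```

`__builtin_expect()` does not change the value of the comparison, which is why a parser can treat it as a pass-through of its first argument.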
      
      Link: https://lore.kernel.org/linux-mm/20200320055823.27089-3-jaewon31.kim@samsung.com/
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jaewon Kim <jaewon31.kim@samsung.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kees Kook <keescook@chromium.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: linux-mm@kvack.org
      Cc: linux-trace-devel@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20200324200956.821799393@goodmis.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools lib traceevent: Handle __attribute__((user)) in field names · 74621d92
      Steven Rostedt (VMware) authored
      Commit c61f13ea ("gcc-plugins: Add structleak for more stack
      initialization") added "__attribute__((user))" to the '__user' annotation
      when the stackleak detector is enabled. This now appears in the field format of
      system call trace events for system calls that have user buffers. The
      "__attribute__((user))" breaks the parsing in libtraceevent. That needs
      to be handled.

      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jaewon Kim <jaewon31.kim@samsung.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kees Kook <keescook@chromium.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: linux-mm@kvack.org
      Cc: linux-trace-devel@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20200324200956.663647256@goodmis.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools lib traceevent: Add append() function helper for appending strings · 27d4d336
      Steven Rostedt (VMware) authored
      There are several locations that open code realloc() and strcat() to append
      text to strings. Add an append() function that takes a delimiter and a
      string to append to another string.
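
A helper of the kind described here could look roughly like the sketch below. The name `append_sketch()`, its signature and its return convention are assumptions made for illustration; the message only states that the helper takes a delimiter and a string to append to another string. The destination must be a NUL-terminated, heap-allocated string.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Append 'delim' followed by 'str' to the heap string '*buf', growing
 * the allocation as needed. Returns 0 on success; on failure returns
 * -1 and leaves '*buf' valid and unchanged.
 */
static int append_sketch(char **buf, const char *delim, const char *str)
{
	char *new_buf;

	new_buf = realloc(*buf, strlen(*buf) + strlen(delim) + strlen(str) + 1);
	if (!new_buf)
		return -1;

	strcat(new_buf, delim);
	strcat(new_buf, str);
	*buf = new_buf;
	return 0;
}
```

Assigning the realloc() result to a temporary first is the point of centralising this: each open-coded copy otherwise has to remember not to overwrite the original pointer on failure.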

      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jaewon Lim <jaewon31.kim@samsung.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kees Kook <keescook@chromium.org>
      Cc: linux-mm@kvack.org
      Cc: linux-trace-devel@vger.kernel.org
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Link: http://lore.kernel.org/lkml/20200324200956.515118403@goodmis.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • arm64: hw_breakpoint: Don't invoke overflow handler on uaccess watchpoints · 24ebec25
      Will Deacon authored
      Unprivileged memory accesses generated by the so-called "translated"
      instructions (e.g. STTR) at EL1 can cause EL0 watchpoints to fire
      unexpectedly if kernel debugging is enabled. In such cases, the
      hw_breakpoint logic will invoke the user overflow handler which will
      typically raise a SIGTRAP back to the current task. This is futile when
      returning back to the kernel because (a) the signal won't have been
      delivered and (b) userspace can't handle the thing anyway.
      
      Avoid invoking the user overflow handler for watchpoints triggered by
      kernel uaccess routines, and instead single-step over the faulting
      instruction as we would if no overflow handler had been installed.
      
      (Fixes tag identifies the introduction of unprivileged memory accesses,
       which exposed this latent bug in the hw_breakpoint code)
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Fixes: 57f4959b ("arm64: kernel: Add support for User Access Override")
      Reported-by: Luis Machado <luis.machado@linaro.org>
      Signed-off-by: Will Deacon <will@kernel.org>
    • arm64: kexec_file: Use struct_size() in kmalloc() · bf508ec9
      Gustavo A. R. Silva authored
      Make use of the struct_size() helper instead of an open-coded version
      in order to avoid any potential type mistakes.
      
      This code was detected with the help of Coccinelle, and audited and
      fixed manually.
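
To make the contrast with an open-coded size calculation concrete, here is a userspace sketch of the pattern. Everything in it is hypothetical: `struct seg_table`, `struct_size_sketch()` and `seg_table_alloc()` are illustration names, calloc() stands in for the kernel allocator, and a GCC- or clang-compatible compiler is assumed for the overflow builtins. The sizes come from the pointer's own type, so the element type cannot be mistyped, and an overflowing count saturates to SIZE_MAX instead of wrapping to a small allocation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical structure ending in a flexible array. */
struct seg_table {
	unsigned int nr;
	uint64_t seg[];
};

/* header + elem * count, saturating at SIZE_MAX on overflow. */
static size_t struct_size_sketch_bytes(size_t header, size_t elem, size_t count)
{
	size_t bytes;

	if (__builtin_mul_overflow(elem, count, &bytes) ||
	    __builtin_add_overflow(bytes, header, &bytes))
		return SIZE_MAX;
	return bytes;
}

/* Sizes are derived from '*p' itself; 'p' is never dereferenced. */
#define struct_size_sketch(p, member, count) \
	struct_size_sketch_bytes(sizeof(*(p)), sizeof(*(p)->member), (count))

/* Allocate a zeroed table with room for 'nr' segments, or return NULL. */
static struct seg_table *seg_table_alloc(unsigned int nr)
{
	struct seg_table *t = NULL;
	size_t bytes = struct_size_sketch(t, seg, nr);

	if (bytes == SIZE_MAX)
		return NULL;

	t = calloc(1, bytes);
	if (t)
		t->nr = nr;
	return t;
}
```

The open-coded equivalent, `sizeof(*t) + nr * sizeof(uint64_t)`, silently goes wrong if the element type of `seg` later changes or if the multiplication wraps.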

      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Link: https://lore.kernel.org/r/20200617213407.GA1385@embeddedor
      Signed-off-by: Will Deacon <will@kernel.org>
    • arm64: mm: reserve hugetlb CMA after numa_init · 618e0786
      Barry Song authored
      hugetlb_cma_reserve() is called in the wrong place: numa_init has not run
      yet, so all reserved memory ends up on node 0.
      
      Fixes: cf11e85f ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
      Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
      Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Acked-by: Roman Gushchin <guro@fb.com>
      Cc: Matthias Brugger <matthias.bgg@gmail.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20200617215828.25296-1-song.bao.hua@hisilicon.com
      Signed-off-by: Will Deacon <will@kernel.org>
  3. 17 Jun, 2020 15 commits
    • Merge tag 'dma-mapping-5.8-3' of git://git.infradead.org/users/hch/dma-mapping · 1b504402
      Linus Torvalds authored
      Pull dma-mapping fixes from Christoph Hellwig:
       "Fixes for the SEV atomic pool (Geert Uytterhoeven and David Rientjes)"
      
      * tag 'dma-mapping-5.8-3' of git://git.infradead.org/users/hch/dma-mapping:
        dma-pool: decouple DMA_REMAP from DMA_COHERENT_POOL
        dma-pool: fix too large DMA pools on medium memory size systems
    • maccess: rename probe_user_{read,write} to copy_{from,to}_user_nofault · c0ee37e8
      Christoph Hellwig authored
      Better describe what these functions do.

      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Christoph Hellwig's avatar
      fe557319
    • tools headers UAPI: Sync linux/fs.h with the kernel sources · 0e093c77
      Arnaldo Carvalho de Melo authored
      To pick the changes from:
      
        b383a73f ("fs/ext4: Introduce DAX inode flag")
      
      And silence this perf build warning:
      
        Warning: Kernel ABI header at 'tools/include/uapi/linux/fs.h' differs from latest version at 'include/uapi/linux/fs.h'
        diff -u tools/include/uapi/linux/fs.h include/uapi/linux/fs.h
      
      It causes various beautifiers for things like fspick, fsmount, etc (see
      below) to get rebuilt, but this specific change doesn't make 'perf
      trace' be capable of decoding anything new, as we still don't decode
      what comes from ioctls, just its cmds.
      
      Details about the update:
      
        $ cp include/uapi/linux/fs.h tools/include/uapi/linux/fs.h
        $ git diff
        diff --git a/tools/include/uapi/linux/fs.h b/tools/include/uapi/linux/fs.h
        index 379a612f8f1d..f44eb0a04afd 100644
        --- a/tools/include/uapi/linux/fs.h
        +++ b/tools/include/uapi/linux/fs.h
        @@ -262,6 +262,7 @@ struct fsxattr {
         #define FS_EA_INODE_FL                 0x00200000 /* Inode used for large EA */
         #define FS_EOFBLOCKS_FL                        0x00400000 /* Reserved for ext4 */
         #define FS_NOCOW_FL                    0x00800000 /* Do not cow file */
        +#define FS_DAX_FL                      0x02000000 /* Inode is DAX */
         #define FS_INLINE_DATA_FL              0x10000000 /* Reserved for ext4 */
         #define FS_PROJINHERIT_FL              0x20000000 /* Create with parents projid */
         #define FS_CASEFOLD_FL                 0x40000000 /* Folder is case insensitive */
        $ m
        make: Entering directory '/home/acme/git/perf/tools/perf'
          BUILD:   Doing 'make -j8' parallel build
          INSTALL  GTK UI
          CC       /tmp/build/perf/builtin-trace.o
          DESCEND  plugins
          CC       /tmp/build/perf/trace/beauty/fsmount.o
          CC       /tmp/build/perf/trace/beauty/fspick.o
          CC       /tmp/build/perf/trace/beauty/mount_flags.o
          CC       /tmp/build/perf/trace/beauty/move_mount.o
          CC       /tmp/build/perf/trace/beauty/renameat.o
          CC       /tmp/build/perf/trace/beauty/sync_file_range.o
          INSTALL  trace_plugins
          LD       /tmp/build/perf/trace/beauty/perf-in.o
          LD       /tmp/build/perf/perf-in.o
          LINK     /tmp/build/perf/perf
        <SNIP>
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ira Weiny <ira.weiny@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools include UAPI: Sync linux/vhost.h with the kernel sources · f64925c1
      Arnaldo Carvalho de Melo authored
      To get the changes in:
      
        776f3950 ("vhost_vdpa: Support config interrupt in vdpa")
      
      Silencing this perf build warning:
      
        Warning: Kernel ABI header at 'tools/include/uapi/linux/vhost.h' differs from latest version at 'include/uapi/linux/vhost.h'
        diff -u tools/include/uapi/linux/vhost.h include/uapi/linux/vhost.h
      
      This automatically picks the new ioctl introduced in the above patch,
      making tools such as 'perf trace' aware of them and possibly allowing to
      use the strings in filters, etc:
      
        # perf trace -e ioctl --pid 7951
        <SNIP>
           0.178 ( 0.010 ms): CPU 0/KVM/8023 ioctl(fd: 14, cmd: KVM_RUN) = 0
           0.194 ( 0.010 ms): CPU 0/KVM/8023 ioctl(fd: 14, cmd: KVM_RUN) = 0
           0.209 ( 0.010 ms): CPU 0/KVM/8023 ioctl(fd: 14, cmd: KVM_RUN) = 0
           0.224 (249.413 ms): CPU 0/KVM/8023 ioctl(fd: 14, cmd: KVM_RUN) = 0
         249.660 ( 0.011 ms): CPU 0/KVM/8023 ioctl(fd: 14, cmd: KVM_RUN) = 0
         249.675 ( 0.007 ms): CPU 0/KVM/8023 ioctl(fd: 14, cmd: KVM_RUN) = 0
         249.686 ( 0.007 ms): CPU 0/KVM/8023 ioctl(fd: 14, cmd: KVM_RUN) = 0
         249.697 ( 0.008 ms): CPU 0/KVM/8023 ioctl(fd: 14, cmd: KVM_RUN) = 0
         249.709 ( 0.007 ms): CPU 0/KVM/8023 ioctl(fd: 14, cmd: KVM_RUN) = 0
         249.720 ( 0.007 ms): CPU 0/KVM/8023 ioctl(fd: 14, cmd: KVM_RUN) = 0
         249.730 ( 0.007 ms): CPU 0/KVM/8023 ioctl(fd: 14, cmd: KVM_RUN) = 0
         249.740 ( 0.007 ms): CPU 0/KVM/8023 ioctl(fd: 14, cmd: KVM_RUN) = 0
         249.752 ( 0.007 ms): CPU 0/KVM/8023 ioctl(fd: 14, cmd: KVM_RUN) = 0
         249.762 ( 0.007 ms): CPU 0/KVM/8023 ioctl(fd: 14, cmd: KVM_RUN) = 0
         249.772 ( 0.007 ms): CPU 0/KVM/8023 ioctl(fd: 14, cmd: KVM_RUN) = 0
         249.782 (120.138 ms): CPU 0/KVM/8023 ioctl(fd: 14, cmd: KVM_RUN) = 0
         370.201 ( 0.039 ms): CPU 0/KVM/8023 ioctl(fd: 12, cmd: KVM_IRQ_LINE_STATUS, arg: 0x7f744f9e1420) = 0
         370.254 ( 0.052 ms): CPU 0/KVM/8023 ioctl(fd: 14, cmd: KVM_RUN) = 0
         370.575 ( 0.365 ms): CPU 0/KVM/8023 ioctl(fd: 14, cmd: KVM_RUN) = 0
         370.973 ( 0.028 ms): CPU 0/KVM/8023 ioctl(fd: 14, cmd: KVM_RUN) = 0
         371.015 ( 0.037 ms): CPU 0/KVM/8023 ioctl(fd: 14, cmd: KVM_RUN) = 0
         371.071 ( 0.009 ms): CPU 0/KVM/8023 ioctl(fd: 12, cmd: KVM_IRQ_LINE_STATUS, arg: 0x7f744f9e14b0) = 0
        <SNIP>
        #
      
      Details about the update:
      
        $ diff -u tools/include/uapi/linux/vhost.h include/uapi/linux/vhost.h
        --- tools/include/uapi/linux/vhost.h	2020-04-16 13:19:12.056763843 -0300
        +++ include/uapi/linux/vhost.h	2020-06-17 10:04:20.532056428 -0300
        @@ -15,6 +15,8 @@
         #include <linux/types.h>
         #include <linux/ioctl.h>
      
        +#define VHOST_FILE_UNBIND -1
        +
         /* ioctls */
      
         #define VHOST_VIRTIO 0xAF
        @@ -140,4 +142,6 @@
         /* Get the max ring size. */
         #define VHOST_VDPA_GET_VRING_NUM	_IOR(VHOST_VIRTIO, 0x76, __u16)
      
        +/* Set event fd for config interrupt*/
        +#define VHOST_VDPA_SET_CONFIG_CALL	_IOW(VHOST_VIRTIO, 0x77, int)
         #endif
        $
        $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > before
        $ cp include/uapi/linux/vhost.h tools/include/uapi/linux/vhost.h
        $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > after
        $ diff -u before after
        --- before	2020-06-17 10:15:35.123275966 -0300
        +++ after	2020-06-17 10:15:51.812482117 -0300
        @@ -27,6 +27,7 @@
         	[0x72] = "VDPA_SET_STATUS",
         	[0x74] = "VDPA_SET_CONFIG",
         	[0x75] = "VDPA_SET_VRING_ENABLE",
        +	[0x77] = "VDPA_SET_CONFIG_CALL",
         };
         static const char *vhost_virtio_ioctl_read_cmds[] = {
         	[0x00] = "GET_FEATURES",
        $
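
To illustrate why the generated table gains an entry at index 0x77, here is a minimal user-space sketch. It assumes a Linux system with the kernel UAPI headers installed. The SKETCH_ and sketch_ names are hypothetical, and the lookup is not the code perf actually uses; only the command number and the table entries are taken from the diffs above.

```c
#include <stddef.h>
#include <linux/ioctl.h>	/* _IOW(), _IOC_TYPE(), _IOC_NR() */

/* Values copied from the header diff; the SKETCH_ names are illustrative. */
#define SKETCH_VHOST_VIRTIO			0xAF
#define SKETCH_VHOST_VDPA_SET_CONFIG_CALL	_IOW(SKETCH_VHOST_VIRTIO, 0x77, int)

/* Same shape as the generated table: indexed by the ioctl sequence number. */
static const char *sketch_vhost_cmds[] = {
	[0x72] = "VDPA_SET_STATUS",
	[0x74] = "VDPA_SET_CONFIG",
	[0x75] = "VDPA_SET_VRING_ENABLE",
	[0x77] = "VDPA_SET_CONFIG_CALL",
};

/* Return the name for a vhost ioctl command, or NULL if it is not known. */
const char *sketch_vhost_cmd_name(unsigned long cmd)
{
	unsigned long nr = _IOC_NR(cmd);

	/* The type byte selects the ioctl family; 0xAF is VHOST_VIRTIO. */
	if (_IOC_TYPE(cmd) != SKETCH_VHOST_VIRTIO)
		return NULL;
	if (nr >= sizeof(sketch_vhost_cmds) / sizeof(sketch_vhost_cmds[0]))
		return NULL;
	/* Designated initializers leave gaps such as 0x73 as NULL. */
	return sketch_vhost_cmds[nr];
}
```

Passing SKETCH_VHOST_VDPA_SET_CONFIG_CALL to sketch_vhost_cmd_name() returns the new table entry, while a command with a different type byte, or one that falls in a gap of the table, returns NULL.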
      
      This causes these parts to get rebuilt:
      
        CC       /tmp/build/perf/trace/beauty/ioctl.o
        INSTALL  trace_plugins
        LD       /tmp/build/perf/trace/beauty/perf-in.o
        LD       /tmp/build/perf/perf-in.o
        LINK     /tmp/build/perf/perf
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Zhu Lingshan <lingshan.zhu@intel.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f64925c1
    • Arnaldo Carvalho de Melo's avatar
      tools arch x86: Sync the msr-index.h copy with the kernel sources · 25ca7e5c
      Arnaldo Carvalho de Melo authored
      To pick up the changes in:
      
        7e5b3c26 ("x86/speculation: Add Special Register Buffer Data Sampling (SRBDS) mitigation")
      
      Addressing these tools/perf build warnings:
      
        Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
        diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
        Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
        diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
      
      With this one will be able to use this new Intel MSR in filters, by
      name, e.g.:
      
        # perf trace -e msr:* --filter "msr==IA32_MCU_OPT_CTRL"
        ^C#
      
      Using -v we can see how it sets up the tracepoint filters, converting
      from the string in the filter to the numeric value:
      
        # perf trace -v -e msr:* --filter "msr==IA32_MCU_OPT_CTRL"
        Using CPUID GenuineIntel-6-8E-A
        0x123
        New filter for msr:read_msr: (msr==0x123) && (common_pid != 335 && common_pid != 30344)
        0x123
        New filter for msr:write_msr: (msr==0x123) && (common_pid != 335 && common_pid != 30344)
        0x123
        New filter for msr:rdpmc: (msr==0x123) && (common_pid != 335 && common_pid != 30344)
        mmap size 528384B
        ^C#
      
      The updating process shows how this affects tooling in more detail:
      
        $ diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
        --- tools/arch/x86/include/asm/msr-index.h	2020-06-03 10:36:09.959910238 -0300
        +++ arch/x86/include/asm/msr-index.h	2020-06-17 10:04:20.235052901 -0300
        @@ -128,6 +128,10 @@
         #define TSX_CTRL_RTM_DISABLE		BIT(0)	/* Disable RTM feature */
         #define TSX_CTRL_CPUID_CLEAR		BIT(1)	/* Disable TSX enumeration */
      
        +/* SRBDS support */
        +#define MSR_IA32_MCU_OPT_CTRL		0x00000123
        +#define RNGDS_MITG_DIS			BIT(0)
        +
         #define MSR_IA32_SYSENTER_CS		0x00000174
         #define MSR_IA32_SYSENTER_ESP		0x00000175
         #define MSR_IA32_SYSENTER_EIP		0x00000176
        $ set -o vi
        $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
        $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
        $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
        $ diff -u before after
        --- before	2020-06-17 10:05:49.653114752 -0300
        +++ after	2020-06-17 10:06:01.777258731 -0300
        @@ -51,6 +51,7 @@
         	[0x0000011e] = "IA32_BBL_CR_CTL3",
         	[0x00000120] = "IDT_MCR_CTRL",
         	[0x00000122] = "IA32_TSX_CTRL",
        +	[0x00000123] = "IA32_MCU_OPT_CTRL",
         	[0x00000140] = "MISC_FEATURES_ENABLES",
         	[0x00000174] = "IA32_SYSENTER_CS",
         	[0x00000175] = "IA32_SYSENTER_ESP",
        $
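
The string-to-number conversion seen in the -v output can be pictured with a small user-space sketch. This is only an illustration under stated assumptions: the sketch_ names are hypothetical and this is not perf's actual lookup code. The table entries are the ones visible in the generated-table diff above.

```c
#include <stddef.h>
#include <string.h>

/* Entries copied from the generated-table diff; gaps stay NULL. */
static const char *sketch_x86_msrs[] = {
	[0x0000011e] = "IA32_BBL_CR_CTL3",
	[0x00000120] = "IDT_MCR_CTRL",
	[0x00000122] = "IA32_TSX_CTRL",
	[0x00000123] = "IA32_MCU_OPT_CTRL",
	[0x00000140] = "MISC_FEATURES_ENABLES",
	[0x00000174] = "IA32_SYSENTER_CS",
	[0x00000175] = "IA32_SYSENTER_ESP",
};

/* Look up an MSR number by name: 0 on success with *msr set, -1 otherwise. */
int sketch_msr_from_name(const char *name, unsigned int *msr)
{
	size_t i, n = sizeof(sketch_x86_msrs) / sizeof(sketch_x86_msrs[0]);

	for (i = 0; i < n; i++) {
		if (sketch_x86_msrs[i] && strcmp(sketch_x86_msrs[i], name) == 0) {
			*msr = (unsigned int)i;
			return 0;
		}
	}
	return -1;
}
```

With the new table entry in place, looking up "IA32_MCU_OPT_CTRL" stores 0x123, which matches the value substituted into the tracepoint filters above. A name missing from the table makes the lookup fail, which is what would happen with the stale header copy.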
      
      The related change to cpufeatures.h affects this:
      
        CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
        CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o
      
      This shouldn't be affecting that 'perf bench' entry:
      
        $ find tools/perf/ -type f | xargs grep SRBDS
        $
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Gross <mgross@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      25ca7e5c
    • Arnaldo Carvalho de Melo's avatar
      Merge remote-tracking branch 'torvalds/master' into perf/urgent · 08a7c777
      Arnaldo Carvalho de Melo authored
      To get some newer headers that got out of sync with the copies in tools/
      so that we can try to have the tools/perf/ build clean for v5.8 with
      fewer pull requests.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      08a7c777
    • Milian Wolff's avatar
      perf script: Initialize zstd_data · b13b04d9
      Milian Wolff authored
      Fixes segmentation fault when trying to interpret zstd-compressed data
      with perf script:
      
      ```
        $ perf record -z ls
        ...
        [ perf record: Captured and wrote 0,010 MB perf.data, compressed (original 0,001 MB, ratio is 2,190) ]
        $ memcheck perf script
        ...
        ==67911== Invalid read of size 4
        ==67911==    at 0x5568188: ZSTD_decompressStream (in /usr/lib/libzstd.so.1.4.5)
        ==67911==    by 0x6E726B: zstd_decompress_stream (zstd.c:100)
        ==67911==    by 0x65729C: perf_session__process_compressed_event (session.c:72)
        ==67911==    by 0x6598E8: perf_session__process_user_event (session.c:1583)
        ==67911==    by 0x65BA59: reader__process_events (session.c:2177)
        ==67911==    by 0x65BA59: __perf_session__process_events (session.c:2234)
        ==67911==    by 0x65BA59: perf_session__process_events (session.c:2267)
        ==67911==    by 0x5A7397: __cmd_script (builtin-script.c:2447)
        ==67911==    by 0x5A7397: cmd_script (builtin-script.c:3840)
        ==67911==    by 0x5FE9D2: run_builtin (perf.c:312)
        ==67911==    by 0x711627: handle_internal_command (perf.c:364)
        ==67911==    by 0x711627: run_argv (perf.c:408)
        ==67911==    by 0x711627: main (perf.c:538)
        ==67911==  Address 0x71d8 is not stack'd, malloc'd or (recently) free'd
      ```
      Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
      Acked-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      LPU-Reference: 20200612230333.72140-1-milian.wolff@kdab.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b13b04d9
    • Will Deacon's avatar
      arm64: bti: Require clang >= 10.0.1 for in-kernel BTI support · b9249cba
      Will Deacon authored
      Unfortunately, most versions of clang that support BTI are capable of
      miscompiling the kernel when converting a switch statement into a jump
      table. As an example, attempting to spawn a KVM guest results in a panic:
      
      [   56.253312] Kernel panic - not syncing: bad mode
      [   56.253834] CPU: 0 PID: 279 Comm: lkvm Not tainted 5.8.0-rc1 #2
      [   56.254225] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
      [   56.254712] Call trace:
      [   56.254952]  dump_backtrace+0x0/0x1d4
      [   56.255305]  show_stack+0x1c/0x28
      [   56.255647]  dump_stack+0xc4/0x128
      [   56.255905]  panic+0x16c/0x35c
      [   56.256146]  bad_el0_sync+0x0/0x58
      [   56.256403]  el1_sync_handler+0xb4/0xe0
      [   56.256674]  el1_sync+0x7c/0x100
      [   56.256928]  kvm_vm_ioctl_check_extension_generic+0x74/0x98
      [   56.257286]  __arm64_sys_ioctl+0x94/0xcc
      [   56.257569]  el0_svc_common+0x9c/0x150
      [   56.257836]  do_el0_svc+0x84/0x90
      [   56.258083]  el0_sync_handler+0xf8/0x298
      [   56.258361]  el0_sync+0x158/0x180
      
      This is because the switch in kvm_vm_ioctl_check_extension_generic()
      is executed as an indirect branch to tail-call through a jump table:
      
      ffff800010032dc8:       3869694c        ldrb    w12, [x10, x9]
      ffff800010032dcc:       8b0c096b        add     x11, x11, x12, lsl #2
      ffff800010032dd0:       d61f0160        br      x11
      
      However, where the target case uses the stack, the landing pad is elided
      due to the presence of a paciasp instruction:
      
      ffff800010032e14:       d503233f        paciasp
      ffff800010032e18:       a9bf7bfd        stp     x29, x30, [sp, #-16]!
      ffff800010032e1c:       910003fd        mov     x29, sp
      ffff800010032e20:       aa0803e0        mov     x0, x8
      ffff800010032e24:       940017c0        bl      ffff800010038d24 <kvm_vm_ioctl_check_extension>
      ffff800010032e28:       93407c00        sxtw    x0, w0
      ffff800010032e2c:       a8c17bfd        ldp     x29, x30, [sp], #16
      ffff800010032e30:       d50323bf        autiasp
      ffff800010032e34:       d65f03c0        ret
      
      Unfortunately, this results in a fatal exception because paciasp is
      compatible only with branch-and-link (call) instructions and not simple
      indirect branches.
      
      A fix is being merged into Clang 10.0.1 so that a 'bti j' instruction is
      emitted as an explicit landing pad in this situation. Make in-kernel
      BTI depend on that compiler version when building with clang.
      
      Cc: Tom Stellard <tstellar@redhat.com>
      Cc: Daniel Kiss <daniel.kiss@arm.com>
      Reviewed-by: Mark Brown <broonie@kernel.org>
      Acked-by: Dave Martin <Dave.Martin@arm.com>
      Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
      Acked-by: Nick Desaulniers <ndesaulniers@google.com>
      Link: https://lore.kernel.org/r/20200615105524.GA2694@willie-the-truck
      Link: https://lore.kernel.org/r/20200616183630.2445-1-will@kernel.org
      Signed-off-by: Will Deacon <will@kernel.org>
      b9249cba
    • Gustavo A. R. Silva's avatar
      overflow.h: Add flex_array_size() helper · b19d57d0
      Gustavo A. R. Silva authored
      Add flex_array_size() helper for the calculation of the size, in bytes,
      of a flexible array member contained within an enclosing structure.
      
      Example of usage:
      
      struct something {
      	size_t count;
      	struct foo items[];
      };
      
      struct something *instance;
      
      instance = kmalloc(struct_size(instance, items, count), GFP_KERNEL);
      instance->count = count;
      memcpy(instance->items, src, flex_array_size(instance, items, instance->count));
      
      The helper returns SIZE_MAX on overflow instead of wrapping around.
      
      Additionally replaces parameter "n" with "count" in struct_size() helper
      for greater clarity and unification.
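
The saturating behaviour described above can be sketched in user space. This is a minimal illustration, not the kernel's overflow.h implementation: the sketch_ names are hypothetical, and the code assumes a GCC or Clang compiler that provides __builtin_mul_overflow() and __builtin_add_overflow().

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_foo {
	int a;
	int b;
};

struct sketch_something {
	size_t count;
	struct sketch_foo items[];	/* flexible array member */
};

/* Bytes occupied by 'count' trailing elements, saturating at SIZE_MAX. */
size_t sketch_flex_array_size(size_t elem_size, size_t count)
{
	size_t bytes;

	if (__builtin_mul_overflow(elem_size, count, &bytes))
		return SIZE_MAX;
	return bytes;
}

/* Header plus trailing elements, also saturating at SIZE_MAX. */
size_t sketch_struct_size(size_t header_size, size_t elem_size, size_t count)
{
	size_t array_bytes = sketch_flex_array_size(elem_size, count);
	size_t total;

	if (__builtin_add_overflow(header_size, array_bytes, &total))
		return SIZE_MAX;
	return total;
}
```

A caller would pass sizeof(*instance), sizeof(instance->items[0]) and the element count. Saturating to SIZE_MAX means an overflowing request turns into an allocation that fails, rather than a short buffer that later gets overrun.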
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Link: https://lore.kernel.org/r/20200609012233.GA3371@embeddedor
      Signed-off-by: Kees Cook <keescook@chromium.org>
      b19d57d0
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 69119673
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Don't get per-cpu pointer with preemption enabled in nft_set_pipapo,
          fix from Stefano Brivio.
      
       2) Fix memory leak in ctnetlink, from Pablo Neira Ayuso.
      
       3) Multiple definitions of MPTCP_PM_MAX_ADDR, from Geliang Tang.
      
       4) Accidentally disabling NAPI in non-error paths of macb_open(), from
          Charles Keepax.
      
       5) Fix races between alx_stop and alx_remove, from Zekun Shen.
      
       6) We forget to re-enable SRIOV during resume in bnxt_en driver, from
          Michael Chan.
      
       7) Fix memory leak in ipv6_mc_destroy_dev(), from Wang Hai.
      
       8) rxtx stats use wrong index in mvpp2 driver, from Sven Auhagen.
      
       9) Fix memory leak in mptcp_subflow_create_socket error path, from Wei
          Yongjun.
      
      10) We should not adjust the TCP window advertised when sending dup acks
          in non-SACK mode, because it won't be counted as a dup by the sender
          if the window size changes. From Eric Dumazet.
      
      11) Destroy the right number of queues during remove in mvpp2 driver,
          from Sven Auhagen.
      
      12) Various WOL and PM fixes to e1000 driver, from Chen Yu, Vaibhav
          Gupta, and Arnd Bergmann.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (35 commits)
        e1000e: fix unused-function warning
        e1000: use generic power management
        e1000e: Do not wake up the system via WOL if device wakeup is disabled
        lan743x: add MODULE_DEVICE_TABLE for module loading alias
        mlxsw: spectrum: Adjust headroom buffers for 8x ports
        bareudp: Fixed configuration to avoid having garbage values
        mvpp2: remove module bugfix
        tcp: grow window for OOO packets only for SACK flows
        mptcp: fix memory leak in mptcp_subflow_create_socket()
        netfilter: flowtable: Make nf_flow_table_offload_add/del_cb inline
        net/sched: act_ct: Make tcf_ct_flow_table_restore_skb inline
        net: dsa: sja1105: fix PTP timestamping with large tc-taprio cycles
        mvpp2: ethtool rxtx stats fix
        MAINTAINERS: switch to my private email for Renesas Ethernet drivers
        rocker: fix incorrect error handling in dma_rings_init
        test_objagg: Fix potential memory leak in error handling
        net: ethernet: mtk-star-emac: simplify interrupt handling
        mld: fix memory leak in ipv6_mc_destroy_dev()
        bnxt_en: Return from timer if interface is not in open state.
        bnxt_en: Fix AER reset logic on 57500 chips.
        ...
      69119673
    • Linus Torvalds's avatar
      Merge tag 'afs-fixes-20200616' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · 26c20ffc
      Linus Torvalds authored
      Pull AFS fixes from David Howells:
       "I've managed to get xfstests kind of working with afs. Here are a set
        of patches that fix most of the bugs found.
      
        There are a number of primary issues:
      
         - Incorrect handling of mtime and non-handling of ctime. It might be
           argued that the latter isn't a bug since the AFS protocol doesn't
           support ctime, but I should probably still update it locally.
      
         - Shared-write mmap, truncate and writeback bugs. This includes not
           changing i_size under the callback lock, overwriting local i_size
           with the reply from the server after a partial writeback, not
           limiting the writeback from an mmapped page to EOF.
      
         - Checks for an abort code indicating that the primary vnode in an
           operation was deleted by a third-party are done in the wrong place.
      
         - Silly rename bugs. This includes an incomplete conversion to the
           new operation handling, duplicate nlink handling, nlink changing
           not being done inside the callback lock and insufficient handling
           of third-party conflicting directory changes.
      
        And some secondary ones:
      
         - The UAEOVERFLOW abort code should map to EOVERFLOW not EREMOTEIO.
      
         - Remove a couple of unused or incompletely used bits.
      
         - Remove a couple of redundant success checks.
      
        These seem to fix all the data-corruption bugs found by
      
      	./check -afs -g quick
      
        along with the obvious silly rename bugs and time bugs.
      
        There are still some test failures, but they seem to fall into two
        classes: firstly, the authentication/security model is different to
        the standard UNIX model and permission is arbitrated by the server and
        cached locally; and secondly, there are a number of features that AFS
        does not support (such as mknod). But in these cases, the tests
        themselves need to be adapted or skipped.
      
        Using the in-kernel afs client with xfstests also found a bug in the
        AuriStor AFS server that has been fixed for a future release"
      
      * tag 'afs-fixes-20200616' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
        afs: Fix silly rename
        afs: afs_vnode_commit_status() doesn't need to check the RPC error
        afs: Fix use of afs_check_for_remote_deletion()
        afs: Remove afs_operation::abort_code
        afs: Fix yfs_fs_fetch_status() to honour vnode selector
        afs: Remove yfs_fs_fetch_file_status() as it's not used
        afs: Fix the mapping of the UAEOVERFLOW abort code
        afs: Fix truncation issues and mmap writeback size
        afs: Concoct ctimes
        afs: Fix EOF corruption
        afs: afs_write_end() should change i_size under the right lock
        afs: Fix non-setting of mtime when writing into mmap
      26c20ffc
    • Randy Dunlap's avatar
      Documentation: remove SH-5 index entries · f17957f7
      Randy Dunlap authored
      Remove SH-5 documentation index entries following the removal
      of SH-5 source code.
      
      Error: Cannot open file ../arch/sh/mm/tlb-sh5.c
      Error: Cannot open file ../arch/sh/mm/tlb-sh5.c
      Error: Cannot open file ../arch/sh/include/asm/tlb_64.h
      Error: Cannot open file ../arch/sh/include/asm/tlb_64.h
      
      Fixes: 3b69e8b4 ("Merge tag 'sh-for-5.8' of git://git.libc.org/linux-sh")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: ysato@users.sourceforge.jp
      Cc: linux-sh@vger.kernel.org
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f17957f7
    • Linus Torvalds's avatar
      Merge tag 'flex-array-conversions-5.8-rc2' of... · ffbc9376
      Linus Torvalds authored
      Merge tag 'flex-array-conversions-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux
      
      Pull flexible-array member conversions from Gustavo A. R. Silva:
       "Replace zero-length arrays with flexible-array members.
      
        Notice that all of these patches have been baking in linux-next for
        two development cycles now.
      
        There is a regular need in the kernel to provide a way to declare
        having a dynamically sized set of trailing elements in a structure.
        Kernel code should always use “flexible array members”[1] for these
        cases. The older style of one-element or zero-length arrays should no
        longer be used[2].
      
        C99 introduced “flexible array members”, which lacks a numeric size
        for the array declaration entirely:
      
              struct something {
                      size_t count;
                      struct foo items[];
              };
      
        This is the way the kernel expects dynamically sized trailing elements
        to be declared. It allows the compiler to generate errors when the
        flexible array does not occur last in the structure, which helps to
        prevent some kinds of undefined behavior[3] bugs from being
        inadvertently introduced to the codebase.
      
        It also allows the compiler to correctly analyze array sizes (via
        sizeof(), CONFIG_FORTIFY_SOURCE, and CONFIG_UBSAN_BOUNDS). For
        instance, there is no mechanism that warns us that the following
        application of the sizeof() operator to a zero-length array always
        results in zero:
      
              struct something {
                      size_t count;
                      struct foo items[0];
              };
      
              struct something *instance;
      
              instance = kmalloc(struct_size(instance, items, count), GFP_KERNEL);
              instance->count = count;
      
              size = sizeof(instance->items) * instance->count;
              memcpy(instance->items, source, size);
      
        At the last line of code above, size turns out to be zero, when one
        might have thought it represents the total size in bytes of the
        dynamic memory recently allocated for the trailing array items. Here
        are a couple examples of this issue[4][5].
      
        Instead, flexible array members have incomplete type, and so the
        sizeof() operator may not be applied[6], so any misuse of such
        operators will be immediately noticed at build time.
      
        The cleanest and least error-prone way to implement this is through
        the use of a flexible array member:
      
              struct something {
                      size_t count;
                      struct foo items[];
              };
      
              struct something *instance;
      
              instance = kmalloc(struct_size(instance, items, count), GFP_KERNEL);
              instance->count = count;
      
              size = sizeof(instance->items[0]) * instance->count;
              memcpy(instance->items, source, size);
      
        instead"
      
      [1] https://en.wikipedia.org/wiki/Flexible_array_member
      [2] https://github.com/KSPP/linux/issues/21
      [3] commit 76497732 ("cxgb3/l2t: Fix undefined behaviour")
      [4] commit f2cd32a4 ("rndis_wlan: Remove logically dead code")
      [5] commit ab91c2a8 ("tpm: eventlog: Replace zero-length array with flexible-array member")
      [6] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
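
The zero-length pitfall described in the pull message can be reproduced in user space. The sketch below assumes GCC or Clang, which accept zero-length arrays as a GNU extension and report their size as zero. The sketch_ names are hypothetical stand-ins for the "struct something" example.

```c
#include <stddef.h>

struct sketch_foo {
	int a;
	int b;
};

/* Old style: zero-length trailing array (GNU extension). */
struct sketch_legacy {
	size_t count;
	struct sketch_foo items[0];
};

/* Buggy: sizeof() of a zero-length array is 0, so this is always 0. */
size_t sketch_wrong_size(const struct sketch_legacy *p)
{
	return sizeof(p->items) * p->count;
}

/* Correct: take the size of one element and multiply by the count. */
size_t sketch_right_size(const struct sketch_legacy *p)
{
	return sizeof(p->items[0]) * p->count;
}
```

With count set to 5, sketch_wrong_size() returns 0 without any diagnostic, while sketch_right_size() returns five times the element size. Declaring the member as items[] instead makes the sizeof(p->items) form a build error, which is the point made in the pull message.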
      
      * tag 'flex-array-conversions-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (41 commits)
        w1: Replace zero-length array with flexible-array
        tracing/probe: Replace zero-length array with flexible-array
        soc: ti: Replace zero-length array with flexible-array
        tifm: Replace zero-length array with flexible-array
        dmaengine: tegra-apb: Replace zero-length array with flexible-array
        stm class: Replace zero-length array with flexible-array
        Squashfs: Replace zero-length array with flexible-array
        ASoC: SOF: Replace zero-length array with flexible-array
        ima: Replace zero-length array with flexible-array
        sctp: Replace zero-length array with flexible-array
        phy: samsung: Replace zero-length array with flexible-array
        RxRPC: Replace zero-length array with flexible-array
        rapidio: Replace zero-length array with flexible-array
        media: pwc: Replace zero-length array with flexible-array
        firmware: pcdp: Replace zero-length array with flexible-array
        oprofile: Replace zero-length array with flexible-array
        block: Replace zero-length array with flexible-array
        tools/testing/nvdimm: Replace zero-length array with flexible-array
        libata: Replace zero-length array with flexible-array
        kprobes: Replace zero-length array with flexible-array
        ...
      ffbc9376
    • Arvind Sankar's avatar
      x86/purgatory: Add -fno-stack-protector · ff58155c
      Arvind Sankar authored
      The purgatory Makefile removes -fstack-protector options if they were
      configured in, but does not currently add -fno-stack-protector.
      
      If gcc was configured with the --enable-default-ssp configure option,
      this results in the stack protector still being enabled for the
      purgatory (absent distro-specific specs files that might disable it
      again for freestanding compilations), if the main kernel is being
      compiled with stack protection enabled (if it's disabled for the main
      kernel, the top-level Makefile will add -fno-stack-protector).
      
      This will break the build since commit
        e4160b2e ("x86/purgatory: Fail the build if purgatory.ro has missing symbols")
      and prior to that would have caused runtime failure when trying to use
      kexec.
      
      Explicitly add -fno-stack-protector to avoid this, as done in other
      Makefiles that need to disable the stack protector.
      Reported-by: Gabriel C <nix.or.die@googlemail.com>
      Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ff58155c
  4. 16 Jun, 2020 11 commits
    • David S. Miller's avatar
      Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue · c9f66b43
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2020-06-16
      
      This series contains fixes to e1000 and e1000e.
      
      Chen fixes an e1000e issue where systems could be woken via WoL, even
      though the user has disabled the wakeup bit via sysfs.
      
      Vaibhav Gupta updates the e1000 driver to clean up the legacy Power
      Management hooks.
      
      Arnd Bergmann cleans up the inconsistent use of CONFIG_PM_SLEEP
      preprocessor tags, which also resolves the compiler warnings about a
      possibly unused structure.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c9f66b43
    • Arnd Bergmann's avatar
      e1000e: fix unused-function warning · 880e6269
      Arnd Bergmann authored
      The CONFIG_PM_SLEEP #ifdef checks in this file are inconsistent,
      leading to a warning about a sometimes-unused function:
      
      drivers/net/ethernet/intel/e1000e/netdev.c:137:13: error: unused function 'e1000e_check_me' [-Werror,-Wunused-function]
      
      Rather than adding more #ifdefs, just remove them completely
      and mark the PM functions as __maybe_unused to let the compiler
      work it out on its own.
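
The pattern can be sketched outside the kernel as follows. This is an illustration only: sketch_maybe_unused stands in for the kernel's __maybe_unused, SKETCH_CONFIG_PM_SLEEP is a hypothetical config switch, and the function names are made up rather than taken from the e1000e driver. It assumes GCC or Clang for the attribute syntax.

```c
#include <stddef.h>

/* User-space stand-in for the kernel's __maybe_unused annotation. */
#define sketch_maybe_unused __attribute__((__unused__))

/* Defined unconditionally: no #ifdef around the function body. */
static int sketch_maybe_unused sketch_suspend(int state)
{
	return state > 0 ? 0 : -1;
}

typedef int (*sketch_pm_fn)(int);

/* Only this reference depends on the (hypothetical) config switch. */
sketch_pm_fn sketch_pm_callback(void)
{
#ifdef SKETCH_CONFIG_PM_SLEEP
	return sketch_suspend;
#else
	return NULL;
#endif
}
```

Built without SKETCH_CONFIG_PM_SLEEP, nothing references sketch_suspend(), yet -Wunused-function stays quiet because of the attribute, and the compiler is free to discard the function. Defining the switch needs no change to the function definition.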
      
      Fixes: e086ba2f ("e1000e: disable s0ix entry and exit flows for ME systems")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      880e6269
    • Vaibhav Gupta's avatar
      e1000: use generic power management · eb6779d4
      Vaibhav Gupta authored
      With legacy PM hooks, it was the responsibility of a driver to manage PCI
      states and also the device's power state. The generic approach is to let PCI
      core handle the work.
      
      e1000_suspend() calls __e1000_shutdown() to perform intermediate tasks.
      __e1000_shutdown() modifies the value of "wake" (device should be wakeup
      enabled or not), responsible for controlling the flow of legacy PM.
      
      Since the PCI core has no idea about the value of "wake", new code for generic
      PM may produce unexpected results. Thus, use "device_set_wakeup_enable()"
      to wakeup-enable the device accordingly.
      Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      eb6779d4
    • Chen Yu's avatar
      e1000e: Do not wake up the system via WOL if device wakeup is disabled · 6bf6be11
      Chen Yu authored
      Currently the system will be woken up via WOL (Wake On LAN) even if the
      device wakeup ability has been disabled via sysfs:
       cat /sys/devices/pci0000:00/0000:00:1f.6/power/wakeup
       disabled
      
      The system should not be woken up if the user has explicitly
      disabled the wake up ability for this device.
      
      This patch clears the WOL ability of this network device if the
      user has disabled the wake up ability in sysfs.
      
      Fixes: bc7f75fa ("[E1000E]: New pci-express e1000 driver")
      Reported-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: <Stable@vger.kernel.org>
      Signed-off-by: Chen Yu <yu.c.chen@intel.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      6bf6be11
    • Tim Harvey's avatar
      lan743x: add MODULE_DEVICE_TABLE for module loading alias · ea12fe9d
      Tim Harvey authored
      Without a MODULE_DEVICE_TABLE, the attributes that create an alias for
      auto-loading the module in userspace via hotplug are missing.
      Signed-off-by: Tim Harvey <tharvey@gateworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ea12fe9d
    • David Howells's avatar
      afs: Fix silly rename · b6489a49
      David Howells authored
      Fix AFS's silly rename by the following means:
      
       (1) Set the destination directory in afs_do_silly_rename() so as to avoid
           misbehaviour and indicate that the directory data version will
           increment by 1 so as to avoid warnings about unexpected changes in the
           DV.  Also indicate that the ctime should be updated to avoid xfstest
           grumbling.
      
       (2) Note when the server indicates that a directory changed more than we
           expected (AFS_OPERATION_DIR_CONFLICT), indicating a conflict with a
           third party change, checking on successful completion of unlink and
           rename.
      
           The problem is that the FS.RemoveFile RPC op doesn't report the status
           of the unlinked file, though YFS.RemoveFile2 does.  This can be
           mitigated by the assumption that if the directory DV cranked by
           exactly 1, we can be sure we removed one link from the file; further,
           ordinarily in AFS, files cannot be hardlinked across directories, so
           if we reduce nlink to 0, the file is deleted.
      
           However, if the directory DV jumps by more than 1, we cannot know if a
           third party intervened by adding or removing a link on the file we
           just removed a link from.
      
           The same also goes for any vnode that is at the destination of the
           FS.Rename RPC op.
      
       (3) Make afs_vnode_commit_status() apply the nlink drop inside the cb_lock
           section along with the other attribute updates if ->op_unlinked is set
           on the descriptor for the appropriate vnode.
      
       (4) Issue a follow-up status fetch for the unlinked file in the event of
           a third party conflict that makes it impossible for us to know if we
           actually deleted the file or not.
      
       (5) Provide a flag, AFS_VNODE_SILLY_DELETED, to make afs_getattr() lie to
           the user about the nlink of a silly deleted file so that it appears as
           0, not 1.
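
The data-version reasoning in points (2), (4) and (5) can be sketched in Python as follows. This is a simplified model, not the AFS client code: `UnlinkOutcome`, `classify_unlink` and `reported_nlink` are hypothetical names, and treating any DV change other than exactly 1 as a conflict is an assumption made for the sketch.

```python
from enum import Enum


class UnlinkOutcome(Enum):
    LINK_REMOVED = "exactly one link removed; nlink can be dropped locally"
    NEEDS_STATUS_FETCH = "possible third-party change; refetch the file status"


def classify_unlink(dv_before: int, dv_after: int) -> UnlinkOutcome:
    """Decide what a client can conclude from the directory DV after an unlink."""
    # A DV bump of exactly 1 means only our own operation touched the directory.
    if dv_after - dv_before == 1:
        return UnlinkOutcome.LINK_REMOVED
    # Anything else (assumed here to include non-increments) is treated as a
    # conflict, so the file's link count has to be fetched from the server.
    return UnlinkOutcome.NEEDS_STATUS_FETCH


def reported_nlink(nlink: int, silly_deleted: bool) -> int:
    """Link count shown to the user; a silly-deleted file appears as 0."""
    return 0 if silly_deleted else nlink
```

For example, a DV going from 10 to 11 classifies as `LINK_REMOVED`, while 10 to 13 classifies as `NEEDS_STATUS_FETCH`.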
      
      Found with the generic/035 and generic/084 xfstests.
      
      Fixes: e49c7b2f ("afs: Build an abstraction around an "operation" concept")
      Reported-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      b6489a49
    • Ido Schimmel's avatar
      mlxsw: spectrum: Adjust headroom buffers for 8x ports · 60833d54
      Ido Schimmel authored
      The port's headroom buffers are used to store packets while they
      traverse the device's pipeline and also to store packets that are egress
      mirrored.
      
      On Spectrum-3, ports with eight lanes use two headroom buffers between
      which the configured headroom size is split.
      
      In order to prevent packet loss, multiply the calculated headroom size
      by two for 8x ports.
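
The arithmetic can be illustrated with a short Python sketch. The function names and the sample size are hypothetical, and the model assumes the configured size is split evenly between the two buffers.

```python
def headroom_size(calculated_size: int, lanes: int) -> int:
    """Total headroom to configure for a port, in the caller's units."""
    # 8x ports split the configured size across two buffers, so the total is
    # doubled to keep each buffer at the calculated size.
    return calculated_size * 2 if lanes == 8 else calculated_size


def per_buffer_size(configured_size: int, lanes: int) -> int:
    """Size each headroom buffer ends up with (even split assumed)."""
    buffers = 2 if lanes == 8 else 1
    return configured_size // buffers
```

Without the doubling, an 8x port configured with a calculated size of 1000 would leave each buffer with 500; with it, `per_buffer_size(headroom_size(1000, 8), 8)` is 1000 again.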
      
      Fixes: da382875 ("mlxsw: spectrum: Extend to support Spectrum-3 ASIC")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      60833d54
    • Martin's avatar
      bareudp: Fixed configuration to avoid having garbage values · b15bb881
      Martin authored
      Code to initialize the conf structure while gathering the configuration
      of the device was missing.
      
      Fixes: 571912c6 ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
      Signed-off-by: Martin <martin.varghese@nokia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b15bb881
    • Sven Auhagen's avatar
      mvpp2: remove module bugfix · 807eaf99
      Sven Auhagen authored
      The remove function does not destroy all
      BM pools when per-CPU pools are active.
      
      When mvpp2 is reloaded as a module, the BM pools
      are still active in hardware and, because of this bug,
      end up with twice the size (old + new).
      
      This eventually leads to a kernel crash.
      
      v2:
      * add Fixes tag
      
      Fixes: 7d04b0b1 ("mvpp2: percpu buffers")
      Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      807eaf99
    • Eric Dumazet's avatar
      tcp: grow window for OOO packets only for SACK flows · 66205121
      Eric Dumazet authored
      Back in 2013, we made a change that broke fast retransmit
      for non-SACK flows.
      
      Indeed, for these flows, a sender needs to receive three duplicate
      ACKs before starting fast retransmit. ACKs sent with a different
      receive window do not count.
      
      Even if enabling SACK is strongly recommended these days,
      there still are some cases where it has to be disabled.
      
      Not increasing the window seems better than having to
      rely on RTO.
      
      After the fix, the following packetdrill test gives:
      
      // Initialize connection
          0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
         +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
         +0 bind(3, ..., ...) = 0
         +0 listen(3, 1) = 0
      
         +0 < S 0:0(0) win 32792 <mss 1000,nop,wscale 7>
         +0 > S. 0:0(0) ack 1 <mss 1460,nop,wscale 8>
         +0 < . 1:1(0) ack 1 win 514
      
         +0 accept(3, ..., ...) = 4
      
         +0 < . 1:1001(1000) ack 1 win 514
      // Quick ack
         +0 > . 1:1(0) ack 1001 win 264
      
         +0 < . 2001:3001(1000) ack 1 win 514
      // DUPACK : Normally we should not change the window
         +0 > . 1:1(0) ack 1001 win 264
      
         +0 < . 3001:4001(1000) ack 1 win 514
      // DUPACK : Normally we should not change the window
         +0 > . 1:1(0) ack 1001 win 264
      
         +0 < . 4001:5001(1000) ack 1 win 514
      // DUPACK : Normally we should not change the window
         +0 > . 1:1(0) ack 1001 win 264
      
         +0 < . 1001:2001(1000) ack 1 win 514
      // Hole is repaired.
         +0 > . 1:1(0) ack 5001 win 272
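
The duplicate-ACK rule behind this fix can be illustrated from the sender's side with a minimal Python sketch. It is a simplified model, not kernel code: `fast_retransmit_triggered` and `DUPACK_THRESHOLD` are hypothetical names, and the growing window values mentioned afterwards are made up for illustration.

```python
from typing import Iterable, Optional, Tuple

DUPACK_THRESHOLD = 3  # three duplicate ACKs start fast retransmit


def fast_retransmit_triggered(acks: Iterable[Tuple[int, int]]) -> bool:
    """Return True if a non-SACK sender would start fast retransmit.

    Each item is (ack_number, advertised_window) as seen by the sender.
    """
    last_ack: Optional[int] = None
    last_win: Optional[int] = None
    dupacks = 0
    for ack, win in acks:
        if ack != last_ack:
            # New cumulative ACK: the duplicate count starts over.
            dupacks = 0
        elif win == last_win:
            # Same ACK number and same window: a true duplicate ACK.
            dupacks += 1
            if dupacks >= DUPACK_THRESHOLD:
                return True
        # Same ACK number with a different window is a window update and is
        # not counted as a duplicate.
        last_ack, last_win = ack, win
    return False
```

Fed the receiver's ACKs from the trace above, `[(1001, 264)] * 4`, the function returns `True`. If the window grew on every out-of-order segment instead (for example 264, 266, 268, 270), it returns `False`, and the modelled sender would have to wait for the RTO.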
      
      Fixes: 4e4f1fc2 ("tcp: properly increase rcv_ssthresh for ofo packets")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      66205121
    • Linus Torvalds's avatar
      Merge tag 'mfd-fixes-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd · 651220e2
      Linus Torvalds authored
      Pull MFD fix from Lee Jones:
       "Fix NULL pointer dereference in mt6360 driver"
      
      * tag 'mfd-fixes-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
        mfd: mt6360: Fix register driver NULL pointer by adding driver name
      651220e2