1. 25 Sep, 2021 13 commits
    • compiler_types.h: Remove __compiletime_object_size() · c80d92fb
      Kees Cook authored
      Since all compilers support __builtin_object_size(), and there is only
      one user of __compiletime_object_size, remove it to avoid the needless
      indirection. This lets Clang reason about check_copy_size() correctly.
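
      For illustration, a minimal userspace sketch of what
      __builtin_object_size() reports, assuming GCC or Clang. struct packet
      and the helper names are hypothetical and are not taken from the kernel:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure, used only for this illustration. */
struct packet {
	char header[8];
	char payload[24];
};

static struct packet pkt;

/*
 * Mode 0 asks for the bytes remaining in the whole enclosing object.
 * The builtin yields (size_t)-1 when the compiler cannot determine it.
 */
static size_t bytes_to_end_of_object(void)
{
	return __builtin_object_size(pkt.header, 0);
}

/* Mode 1 limits the answer to the closest enclosing subobject (the field). */
static size_t bytes_to_end_of_field(void)
{
	return __builtin_object_size(pkt.header, 1);
}

/* A check in the spirit of check_copy_size(): refuse copies that cannot fit. */
static int copy_fits(size_t object_size, size_t bytes)
{
	/* (size_t)-1 means "unknown", so no verdict is possible here. */
	if (object_size == (size_t)-1)
		return 1;
	return bytes <= object_size;
}
```

      Calling the builtin directly, rather than through a wrapper macro,
      keeps the size query visible to the compiler at the point of use.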
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/1179
      Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Sedat Dilek <sedat.dilek@gmail.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Marco Elver <elver@google.com>
      Cc: Arvind Sankar <nivedita@alum.mit.edu>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Sami Tolvanen <samitolvanen@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Gabriel Krisman Bertazi <krisman@collabora.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
      Signed-off-by: Kees Cook <keescook@chromium.org>
    • cm4000_cs: Use struct_group() to zero struct cm4000_dev region · 8610047c
      Kees Cook authored
      In preparation for FORTIFY_SOURCE performing compile-time and run-time
      field bounds checking for memset(), avoid intentionally writing across
      neighboring fields.
      
      Add struct_group() to mark region of struct cm4000_dev that should be
      initialized to zero.
      
      Cc: Harald Welte <laforge@gnumonks.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Link: https://lore.kernel.org/lkml/YQDvxAofJlI1JoGZ@kroah.com
    • can: flexcan: Use struct_group() to zero struct flexcan_regs regions · c92a08c1
      Kees Cook authored
      In preparation for FORTIFY_SOURCE performing compile-time and run-time
      field bounds checking for memset(), avoid intentionally writing across
      neighboring fields.
      
      Add struct_group() to mark both regions of struct flexcan_regs that get
      initialized to zero. Avoid the future warnings:
      
      In function 'fortify_memset_chk',
          inlined from 'memset_io' at ./include/asm-generic/io.h:1169:2,
          inlined from 'flexcan_ram_init' at drivers/net/can/flexcan.c:1403:2:
      ./include/linux/fortify-string.h:199:4: warning: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning]
        199 |    __write_overflow_field(p_size_field, size);
            |    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      In function 'fortify_memset_chk',
          inlined from 'memset_io' at ./include/asm-generic/io.h:1169:2,
          inlined from 'flexcan_ram_init' at drivers/net/can/flexcan.c:1408:3:
      ./include/linux/fortify-string.h:199:4: warning: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning]
        199 |    __write_overflow_field(p_size_field, size);
            |    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      Cc: Wolfgang Grandegger <wg@grandegger.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: linux-can@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: Kees Cook <keescook@chromium.org>
    • HID: roccat: Use struct_group() to zero kone_mouse_event · 69dae0fe
      Kees Cook authored
      In preparation for FORTIFY_SOURCE performing compile-time and run-time
      field bounds checking for memset(), avoid intentionally writing across
      neighboring fields.
      
      Add struct_group() to mark region of struct kone_mouse_event that should
      be initialized to zero.
      
      Cc: Stefan Achatz <erazor_de@users.sourceforge.net>
      Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
      Cc: linux-input@vger.kernel.org
      Acked-by: Jiri Kosina <jikos@kernel.org>
      Link: https://lore.kernel.org/lkml/nycvar.YFH.7.76.2108201810560.15313@cbobk.fhfr.pm
      Signed-off-by: Kees Cook <keescook@chromium.org>
    • HID: cp2112: Use struct_group() for memcpy() region · 5e423a0c
      Kees Cook authored
      In preparation for FORTIFY_SOURCE performing compile-time and run-time
      field bounds checking for memcpy(), memmove(), and memset(), avoid
      intentionally writing across neighboring fields.
      
      Use struct_group() in struct cp2112_string_report around members report,
      length, type, and string, so they can be referenced together. This will
      allow memcpy() and sizeof() to more easily reason about sizes, improve
      readability, and avoid future warnings about writing beyond the end of
      report.
      
      "pahole" shows no size nor member offset changes to struct
      cp2112_string_report.  "objdump -d" shows no meaningful object
      code changes (i.e. only source line number induced differences.)
      
      Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
      Cc: linux-input@vger.kernel.org
      Acked-by: Jiri Kosina <jikos@kernel.org>
      Link: https://lore.kernel.org/lkml/nycvar.YFH.7.76.2108201810560.15313@cbobk.fhfr.pm
      Signed-off-by: Kees Cook <keescook@chromium.org>
    • drm/mga/mga_ioc32: Use struct_group() for memcpy() region · 10579b75
      Kees Cook authored
      In preparation for FORTIFY_SOURCE performing compile-time and run-time
      field bounds checking for memcpy(), memmove(), and memset(), avoid
      intentionally writing across neighboring fields.
      
      Use struct_group() in struct drm32_mga_init around members chipset, sgram,
      maccess, fb_cpp, front_offset, front_pitch, back_offset, back_pitch,
      depth_cpp, depth_offset, depth_pitch, texture_offset, and texture_size,
      so they can be referenced together. This will allow memcpy() and sizeof()
      to more easily reason about sizes, improve readability, and avoid future
      warnings about writing beyond the end of chipset.
      
      "pahole" shows no size nor member offset changes to struct drm32_mga_init.
      "objdump -d" shows no meaningful object code changes (i.e. only source
      line number induced differences and optimizations).
      
      Note that since this is a UAPI header, __struct_group() is used
      directly.
      
      Cc: David Airlie <airlied@linux.ie>
      Cc: Lee Jones <lee.jones@linaro.org>
      Cc: dri-devel@lists.freedesktop.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Daniel Vetter <daniel@ffwll.ch>
      Link: https://lore.kernel.org/lkml/YQKa76A6XuFqgM03@phenom.ffwll.local
    • iommu/amd: Use struct_group() for memcpy() region · 43d83af8
      Kees Cook authored
      In preparation for FORTIFY_SOURCE performing compile-time and run-time
      field bounds checking for memcpy(), memmove(), and memset(), avoid
      intentionally writing across neighboring fields.
      
      Use struct_group() in struct ivhd_entry around members ext and hidh, so
      they can be referenced together. This will allow memcpy() and sizeof()
      to more easily reason about sizes, improve readability, and avoid future
      warnings about writing beyond the end of ext.
      
      "pahole" shows no size nor member offset changes to struct ivhd_entry.
      "objdump -d" shows no object code changes.
      
      Cc: Will Deacon <will@kernel.org>
      Cc: iommu@lists.linux-foundation.org
      Acked-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Kees Cook <keescook@chromium.org>
    • bnxt_en: Use struct_group_attr() for memcpy() region · 241fe395
      Kees Cook authored
      In preparation for FORTIFY_SOURCE performing compile-time and run-time
      field bounds checking for memcpy(), memmove(), and memset(), avoid
      intentionally writing across neighboring fields.
      
      Use struct_group() around members queue_id, min_bw, max_bw, tsa, pri_lvl,
      and bw_weight so they can be referenced together. This will allow memcpy()
      and sizeof() to more easily reason about sizes, improve readability,
      and avoid future warnings about writing beyond the end of queue_id.
      
      "pahole" shows no size nor member offset changes to struct bnxt_cos2bw_cfg.
      "objdump -d" shows no meaningful object code changes (i.e. only source
      line number induced differences and optimizations).
      
      Cc: Michael Chan <michael.chan@broadcom.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: netdev@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Michael Chan <michael.chan@broadcom.com>
      Link: https://lore.kernel.org/lkml/CACKFLinDc6Y+P8eZ=450yA1nMC7swTURLtcdyiNR=9J6dfFyBg@mail.gmail.com
      Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Link: https://lore.kernel.org/lkml/20210728044517.GE35706@embeddedor
    • cxl/core: Replace unions with struct_group() · 301e68dd
      Kees Cook authored
      Use the newly introduced struct_group_tagged() macro to clean up the
      declaration of struct cxl_regs.
      
      Cc: Alison Schofield <alison.schofield@intel.com>
      Cc: Vishal Verma <vishal.l.verma@intel.com>
      Cc: Ira Weiny <ira.weiny@intel.com>
      Cc: Ben Widawsky <ben.widawsky@intel.com>
      Cc: linux-cxl@vger.kernel.org
      Suggested-by: Dan Williams <dan.j.williams@intel.com>
      Link: https://lore.kernel.org/lkml/1d9a2e6df2a9a35b2cdd50a9a68cac5991e7e5f0.camel@intel.com
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
    • stddef: Introduce struct_group() helper macro · 50d7bd38
      Kees Cook authored
      Kernel code has a regular need to describe groups of members within a
      structure usually when they need to be copied or initialized separately
      from the rest of the surrounding structure. The generally accepted design
      pattern in C is to use a named sub-struct:
      
      	struct foo {
      		int one;
      		struct {
      			int two;
      			int three, four;
      		} thing;
      		int five;
      	};
      
      This would allow for traditional references and sizing:
      
      	memcpy(&dst.thing, &src.thing, sizeof(dst.thing));
      
      However, doing this would mean that referencing struct members enclosed
      by such named structs would always require including the sub-struct name
      in identifiers:
      
      	do_something(dst.thing.three);
      
      This has tended to be quite inflexible, especially when such groupings
      need to be added to established code, which causes huge naming churn.
      Three workarounds exist in the kernel for this problem, and each has
      other negative properties.
      
      To avoid the naming churn, there is a design pattern of adding macro
      aliases for the named struct:
      
      	#define f_three thing.three
      
      This ends up polluting the global namespace, and makes it difficult to
      search for identifiers.
      
      Another common workaround in kernel code avoids the pollution by avoiding
      the named struct entirely, instead identifying the group's boundaries using
      either a pair of empty anonymous structs or a pair of zero-element arrays:
      
      	struct foo {
      		int one;
      		struct { } start;
      		int two;
      		int three, four;
      		struct { } finish;
      		int five;
      	};
      
      	struct foo {
      		int one;
      		int start[0];
      		int two;
      		int three, four;
      		int finish[0];
      		int five;
      	};
      
      This allows code to avoid needing to use a sub-struct name for member
      references within the surrounding structure, but loses the benefits of
      being able to actually use such a struct, making it rather fragile. Using
      these requires open-coded calculation of sizes and offsets. The efforts
      made to avoid common mistakes include lots of comments, or adding various
      BUILD_BUG_ON()s. Such code is left with no way for the compiler to reason
      about the boundaries (e.g. the "start" object looks like it's 0 bytes
      in length), making bounds checking depend on open-coded calculations:
      
      	if (length > offsetof(struct foo, finish) -
      		     offsetof(struct foo, start))
      		return -EINVAL;
      	memcpy(&dst.start, &src.start, offsetof(struct foo, finish) -
      				       offsetof(struct foo, start));
      
      However, the vast majority of places in the kernel that operate on
      groups of members do so without any identification of the grouping,
      relying either on comments or implicit knowledge of the struct contents,
      which is even harder for the compiler to reason about, and results in
      even more fragile manual sizing, usually depending on member locations
      outside of the region (e.g. to copy "two" and "three", use the start of
      "four" to find the size):
      
      	BUILD_BUG_ON((offsetof(struct foo, four) <
      		      offsetof(struct foo, two)) ||
      		     (offsetof(struct foo, four) <
      		      offsetof(struct foo, three)));
      	if (length > offsetof(struct foo, four) -
      		     offsetof(struct foo, two))
      		return -EINVAL;
      	memcpy(&dst.two, &src.two, length);
      
      In order to have a regular programmatic way to describe a struct
      region that can be used for references and sizing, can be examined for
      bounds checking, avoids forcing the use of intermediate identifiers,
      and avoids polluting the global namespace, introduce the struct_group()
      macro. This macro wraps the member declarations to create an anonymous
      union of an anonymous struct (no intermediate name) and a named struct
      (for references and sizing):
      
      	struct foo {
      		int one;
      		struct_group(thing,
      			int two;
      			int three, four;
      		);
      		int five;
      	};
      
      	if (length > sizeof(src.thing))
      		return -EINVAL;
      	memcpy(&dst.thing, &src.thing, length);
      	do_something(dst.three);
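
      The union-of-two-structs idea described above can be tried in userspace
      with a simplified stand-in. STRUCT_GROUP_SKETCH and copy_thing() below
      are hypothetical names for illustration, not the kernel's definitions,
      and the sketch assumes a C11 compiler with anonymous struct/union
      support:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Simplified stand-in: declare the members twice inside an anonymous
 * union, once as an anonymous struct (so "two" is reachable directly)
 * and once as a named struct (so the region has an address and a size).
 */
#define STRUCT_GROUP_SKETCH(NAME, ...) \
	union { \
		struct { __VA_ARGS__ }; \
		struct { __VA_ARGS__ } NAME; \
	}

struct foo {
	int one;
	STRUCT_GROUP_SKETCH(thing,
		int two;
		int three, four;
	);
	int five;
};

/* Copy at most the grouped region; reject anything larger. */
static int copy_thing(struct foo *dst, const struct foo *src, size_t length)
{
	if (length > sizeof(src->thing))
		return -1;
	memcpy(&dst->thing, &src->thing, length);
	return 0;
}
```

      Because both structs share storage, dst->three and dst->thing.three
      name the same int, and sizeof(dst->thing) covers exactly the grouped
      members.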
      
      There are some rare cases where the resulting struct_group() needs
      attributes added, so struct_group_attr() is also introduced to allow
      for specifying struct attributes (e.g. __aligned(x) or __packed).
      Additionally, there are places where such declarations would like to
      have the struct be tagged, so struct_group_tagged() is added.
      
      Given there is a need for a handful of UAPI uses too, the underlying
      __struct_group() macro has been defined in UAPI so it can be used there
      too.
      
      To avoid confusing scripts/kernel-doc, hide the macro from its struct
      parsing.

      Co-developed-by: Keith Packard <keithp@keithp.com>
      Signed-off-by: Keith Packard <keithp@keithp.com>
      Acked-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Link: https://lore.kernel.org/lkml/20210728023217.GC35706@embeddedor
      Enhanced-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Link: https://lore.kernel.org/lkml/41183a98-bdb9-4ad6-7eab-5a7292a6df84@rasmusvillemoes.dk
      Enhanced-by: Dan Williams <dan.j.williams@intel.com>
      Link: https://lore.kernel.org/lkml/1d9a2e6df2a9a35b2cdd50a9a68cac5991e7e5f0.camel@intel.com
      Enhanced-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://lore.kernel.org/lkml/YQKa76A6XuFqgM03@phenom.ffwll.local
      Acked-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
    • stddef: Fix kerndoc for sizeof_field() and offsetofend() · e7f18c22
      Kees Cook authored
      Adjust the comment styles so these are correctly identified as valid
      kern-doc.

      Signed-off-by: Kees Cook <keescook@chromium.org>
    • powerpc: Split memset() to avoid multi-field overflow · 0e17ad87
      Kees Cook authored
      In preparation for FORTIFY_SOURCE performing compile-time and run-time
      field bounds checking for memset(), avoid intentionally writing across
      neighboring fields.
      
      Instead of writing across a field boundary with memset(), restrict the
      call to just the array and explicitly zero the prior field.
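
      A minimal sketch of this kind of split, using a hypothetical
      struct state_sketch rather than the powerpc structure the patch
      actually touches:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical layout: a scalar immediately followed by an array. */
struct state_sketch {
	int keep;
	unsigned long count;
	unsigned char slots[16];
};

static void reset_state(struct state_sketch *s)
{
	/*
	 * Avoided pattern:
	 *   memset(&s->count, 0, sizeof(s->count) + sizeof(s->slots));
	 * which deliberately writes past the end of "count".
	 */
	s->count = 0;				/* explicit zeroing of the scalar */
	memset(s->slots, 0, sizeof(s->slots));	/* memset() bounded by the array */
}
```

      Each write now stays inside the field it names, so a per-field bounds
      check has nothing to object to.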
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Qinglang Miao <miaoqinglang@huawei.com>
      Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
      Cc: Hulk Robot <hulkci@huawei.com>
      Cc: Wang Wensheng <wangwensheng4@huawei.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/lkml/87czqsnmw9.fsf@mpe.ellerman.id.au
    • scsi: ibmvscsi: Avoid multi-field memset() overflow by aiming at srp · 3d0107a7
      Kees Cook authored
      In preparation for FORTIFY_SOURCE performing compile-time and run-time
      field bounds checking for memset(), avoid intentionally writing across
      neighboring fields.
      
      Instead of writing beyond the end of evt_struct->iu.srp.cmd, target the
      upper union (evt_struct->iu.srp) instead, as that's what is being wiped.
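
      The same idea in a self-contained sketch. The types below are
      hypothetical stand-ins and do not reproduce the ibmvscsi or SRP
      definitions:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical members of differing sizes sharing one union. */
struct cmd_sketch { unsigned char opcode; unsigned char cdb[16]; };
struct rsp_sketch { unsigned char status; unsigned char data[63]; };

union srp_sketch {
	struct cmd_sketch cmd;
	struct rsp_sketch rsp;
};

struct event_sketch {
	int tag;
	union srp_sketch srp;
};

static void wipe_request(struct event_sketch *evt)
{
	/*
	 * Aim at the union itself: the destination object and the size
	 * now agree, instead of starting at the smaller "cmd" member and
	 * running past its end.
	 */
	memset(&evt->srp, 0, sizeof(evt->srp));
}
```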
      
      Cc: Tyrel Datwyler <tyreld@linux.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Cc: linux-scsi@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
      Link: https://lore.kernel.org/lkml/yq135rzp79c.fsf@ca-mkp.ca.oracle.com
      Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
      Link: https://lore.kernel.org/lkml/6eae8434-e9a7-aa74-628b-b515b3695359@linux.ibm.com
  2. 20 Sep, 2021 2 commits
    • Linux 5.15-rc2 · e4e737bb
      Linus Torvalds authored
    • pci_iounmap'2: Electric Boogaloo: try to make sense of it all · 316e8d79
      Linus Torvalds authored
      Nathan Chancellor reports that the recent change to pci_iounmap in
      commit 9caea000 ("parisc: Declare pci_iounmap() parisc version only
      when CONFIG_PCI enabled") causes build errors on arm64.
      
      It took me about two hours to convince myself that I think I know what
      the logic of that mess of #ifdef's in the <asm-generic/io.h> header file
      really aims to do, and to rewrite it to be easier to follow.
      
      Famous last words.
      
      Anyway, the code has now been lifted from that grotty header file into
      lib/pci_iomap.c, and has fairly extensive comments about what the logic
      is.  It also avoids indirecting through another confusing (and badly
      named) helper function that has other preprocessor config conditionals.
      
      Let's see what odd architecture did something else strange in this area
      to break things.  But my arm64 cross build is clean.
      
      Fixes: 9caea000 ("parisc: Declare pci_iounmap() parisc version only when CONFIG_PCI enabled")
      Reported-by: Nathan Chancellor <nathan@kernel.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: Ulrich Teichert <krypton@ulrich-teichert.org>
      Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 19 Sep, 2021 18 commits
  4. 18 Sep, 2021 7 commits
    • alpha: move __udiv_qrnnd library function to arch/alpha/lib/ · d4d016ca
      Linus Torvalds authored
      We already had the implementation for __udiv_qrnnd (unsigned divide for
      multi-precision arithmetic) as part of the alpha math emulation code.
      
      But you can disable the math emulation code - even if you shouldn't -
      and then the MPI code that actually wants this functionality (and is
      needed by various crypto functions) will fail to build.
      
      So move the extended-precision divide code to be a regular library
      function, just like all the regular division code is.  That way it is
      available regardless of math-emulation.

      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • alpha: mark 'Jensen' platform as no longer broken · ab41f75e
      Linus Torvalds authored
      Ok, it almost certainly is still broken on actual hardware, but the
      immediate reason for it having been marked BROKEN was a build error that
      is fixed by just making sure the low-level IO header file is included
      sufficiently early that the __EXTERN_INLINE hackery takes effect.
      
      This was marked broken back in 2017 by commit 1883c9f4 ("alpha: mark
      jensen as broken"), but Ulrich Teichert made me look at it as part of my
      cross-build work to make sure -Werror actually does the right thing.
      
      There are lots of alpha configurations that do not build cleanly, but
      now it's no longer because Jensen wouldn't be buildable.  That said,
      because the Jensen platform doesn't force PCI to be enabled (Jensen only
      had EISA), it ends up being somewhat interesting as a source of odd
      configs.

      Reported-by: Ulrich Teichert <krypton@ulrich-teichert.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • perf bpf: Ignore deprecation warning when using libbpf's btf__get_from_id() · 219d720e
      Andrii Nakryiko authored
      Perf code re-implements libbpf's btf__load_from_kernel_by_id() API as
      a weak function, presumably to dynamically link against an old version
      of the libbpf shared library. Unfortunately this causes a compilation
      warning when perf is compiled against libbpf v0.6+.
      
      For now, just ignore the deprecation warning, but there might be a
      better solution, depending on perf's needs.

      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: kernel-team@fb.com
      LPU-Reference: 20210914170004.4185659-1-andrii@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • libperf evsel: Make use of FD robust. · aba5daeb
      Ian Rogers authored
      FD uses xyarray__entry that may return NULL if an index is out of
      bounds. If NULL is returned then a segv happens as FD unconditionally
      dereferences the pointer. This was happening in a case with perf
      iostat as shown below. The fix is to make FD an "int*" rather than an
      int and handle the NULL case as either invalid input or a closed fd.
      
        $ sudo gdb --args perf stat --iostat  list
        ...
        Breakpoint 1, perf_evsel__alloc_fd (evsel=0x5555560951a0, ncpus=1, nthreads=1) at evsel.c:50
        50      {
        (gdb) bt
         #0  perf_evsel__alloc_fd (evsel=0x5555560951a0, ncpus=1, nthreads=1) at evsel.c:50
         #1  0x000055555585c188 in evsel__open_cpu (evsel=0x5555560951a0, cpus=0x555556093410,
            threads=0x555556086fb0, start_cpu=0, end_cpu=1) at util/evsel.c:1792
         #2  0x000055555585cfb2 in evsel__open (evsel=0x5555560951a0, cpus=0x0, threads=0x555556086fb0)
            at util/evsel.c:2045
         #3  0x000055555585d0db in evsel__open_per_thread (evsel=0x5555560951a0, threads=0x555556086fb0)
            at util/evsel.c:2065
         #4  0x00005555558ece64 in create_perf_stat_counter (evsel=0x5555560951a0,
            config=0x555555c34700 <stat_config>, target=0x555555c2f1c0 <target>, cpu=0) at util/stat.c:590
         #5  0x000055555578e927 in __run_perf_stat (argc=1, argv=0x7fffffffe4a0, run_idx=0)
            at builtin-stat.c:833
         #6  0x000055555578f3c6 in run_perf_stat (argc=1, argv=0x7fffffffe4a0, run_idx=0)
            at builtin-stat.c:1048
         #7  0x0000555555792ee5 in cmd_stat (argc=1, argv=0x7fffffffe4a0) at builtin-stat.c:2534
         #8  0x0000555555835ed3 in run_builtin (p=0x555555c3f540 <commands+288>, argc=3,
            argv=0x7fffffffe4a0) at perf.c:313
         #9  0x0000555555836154 in handle_internal_command (argc=3, argv=0x7fffffffe4a0) at perf.c:365
         #10 0x000055555583629f in run_argv (argcp=0x7fffffffe2ec, argv=0x7fffffffe2e0) at perf.c:409
         #11 0x0000555555836692 in main (argc=3, argv=0x7fffffffe4a0) at perf.c:539
        ...
        (gdb) c
        Continuing.
        Error:
        The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (uncore_iio_0/event=0x83,umask=0x04,ch_mask=0xF,fc_mask=0x07/).
        /bin/dmesg | grep -i perf may provide additional information.
      
        Program received signal SIGSEGV, Segmentation fault.
        0x00005555559b03ea in perf_evsel__close_fd_cpu (evsel=0x5555560951a0, cpu=1) at evsel.c:166
        166                     if (FD(evsel, cpu, thread) >= 0)
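
      A minimal sketch of the pointer-returning accessor pattern described
      in this message. struct fd_table, fd_entry() and close_fd_cpu() are
      hypothetical names and are not libperf's xyarray or evsel code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cpu x thread table of file descriptors. */
struct fd_table {
	size_t ncpus, nthreads;
	int *fds;
};

/* Return a pointer to the slot, or NULL when an index is out of bounds. */
static int *fd_entry(struct fd_table *t, size_t cpu, size_t thread)
{
	if (!t || !t->fds || cpu >= t->ncpus || thread >= t->nthreads)
		return NULL;
	return &t->fds[cpu * t->nthreads + thread];
}

/* Mark every open fd for one cpu as closed; a NULL slot counts as closed. */
static int close_fd_cpu(struct fd_table *t, size_t cpu)
{
	int closed = 0;

	for (size_t thread = 0; thread < t->nthreads; thread++) {
		int *fd = fd_entry(t, cpu, thread);

		if (fd && *fd >= 0) {
			/* A real implementation would call close(*fd) here. */
			*fd = -1;
			closed++;
		}
	}
	return closed;
}
```

      Returning a pointer lets each caller decide whether a missing slot
      means invalid input or an already closed descriptor.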
      
      v3. fixes a bug in perf_evsel__run_ioctl where the sense of a branch was
          backward.

      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20210918054440.2350466-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf machine: Initialize srcline string member in add_location struct · 57f0ff05
      Michael Petlan authored
      It's later supposed to be either a correct address or NULL. Without the
      initialization, it may contain an undefined value which results in the
      following segmentation fault:
      
        # perf top --sort comm -g --ignore-callees=do_idle
      
      terminates with:
      
        #0  0x00007ffff56b7685 in __strlen_avx2 () from /lib64/libc.so.6
        #1  0x00007ffff55e3802 in strdup () from /lib64/libc.so.6
        #2  0x00005555558cb139 in hist_entry__init (callchain_size=<optimized out>, sample_self=true, template=0x7fffde7fb110, he=0x7fffd801c250) at util/hist.c:489
        #3  hist_entry__new (template=template@entry=0x7fffde7fb110, sample_self=sample_self@entry=true) at util/hist.c:564
        #4  0x00005555558cb4ba in hists__findnew_entry (hists=hists@entry=0x5555561d9e38, entry=entry@entry=0x7fffde7fb110, al=al@entry=0x7fffde7fb420,
            sample_self=sample_self@entry=true) at util/hist.c:657
        #5  0x00005555558cba1b in __hists__add_entry (hists=hists@entry=0x5555561d9e38, al=0x7fffde7fb420, sym_parent=<optimized out>, bi=bi@entry=0x0, mi=mi@entry=0x0,
            sample=sample@entry=0x7fffde7fb4b0, sample_self=true, ops=0x0, block_info=0x0) at util/hist.c:288
        #6  0x00005555558cbb70 in hists__add_entry (sample_self=true, sample=0x7fffde7fb4b0, mi=0x0, bi=0x0, sym_parent=<optimized out>, al=<optimized out>, hists=0x5555561d9e38)
            at util/hist.c:1056
        #7  iter_add_single_cumulative_entry (iter=0x7fffde7fb460, al=<optimized out>) at util/hist.c:1056
        #8  0x00005555558cc8a4 in hist_entry_iter__add (iter=iter@entry=0x7fffde7fb460, al=al@entry=0x7fffde7fb420, max_stack_depth=<optimized out>, arg=arg@entry=0x7fffffff7db0)
            at util/hist.c:1231
        #9  0x00005555557cdc9a in perf_event__process_sample (machine=<optimized out>, sample=0x7fffde7fb4b0, evsel=<optimized out>, event=<optimized out>, tool=0x7fffffff7db0)
            at builtin-top.c:842
        #10 deliver_event (qe=<optimized out>, qevent=<optimized out>) at builtin-top.c:1202
        #11 0x00005555558a9318 in do_flush (show_progress=false, oe=0x7fffffff80e0) at util/ordered-events.c:244
        #12 __ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP, timestamp=timestamp@entry=0) at util/ordered-events.c:323
        #13 0x00005555558a9789 in __ordered_events__flush (timestamp=<optimized out>, how=<optimized out>, oe=<optimized out>) at util/ordered-events.c:339
        #14 ordered_events__flush (how=OE_FLUSH__TOP, oe=0x7fffffff80e0) at util/ordered-events.c:341
        #15 ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP) at util/ordered-events.c:339
        #16 0x00005555557cd631 in process_thread (arg=0x7fffffff7db0) at builtin-top.c:1114
        #17 0x00007ffff7bb817a in start_thread () from /lib64/libpthread.so.0
        #18 0x00007ffff5656dc3 in clone () from /lib64/libc.so.6
      
      If you look at the frame #2, the code is:
      
      488	 if (he->srcline) {
      489          he->srcline = strdup(he->srcline);
      490          if (he->srcline == NULL)
      491              goto err_rawdata;
      492	 }
      
      If he->srcline is not NULL (it is not NULL if it is uninitialized rubbish),
      it gets strdupped and strdupping a rubbish random string causes the problem.
      
      Also, if you look at the commit 1fb7d06a, it adds the srcline property
      into the struct, but does not initialize it everywhere needed.
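
      The failure mode and the fix can be sketched outside perf. struct
      location_sketch, location_init() and entry_init() are hypothetical
      names for illustration only:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct location_sketch {
	unsigned long addr;
	char *srcline;		/* must be a valid string or NULL */
};

/* Initialize every member so later copies never carry stack garbage. */
static void location_init(struct location_sketch *al, unsigned long addr)
{
	al->addr = addr;
	al->srcline = NULL;
}

/* Copy a template and give the copy its own duplicate of srcline. */
static int entry_init(struct location_sketch *he,
		      const struct location_sketch *template)
{
	*he = *template;
	if (he->srcline) {
		/* Safe only because srcline is either valid or NULL. */
		size_t len = strlen(he->srcline) + 1;
		char *copy = malloc(len);

		if (copy == NULL)
			return -1;
		memcpy(copy, he->srcline, len);
		he->srcline = copy;
	}
	return 0;
}
```

      Without the NULL assignment in location_init(), the strlen() call
      would run on whatever bytes happened to be in the uninitialized
      pointer.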
      
      Committer notes:
      
      Now I see, when using --ignore-callees=do_idle we end up here at line
      2189 in add_callchain_ip():
      
      2181         if (al.sym != NULL) {
      2182                 if (perf_hpp_list.parent && !*parent &&
      2183                     symbol__match_regex(al.sym, &parent_regex))
      2184                         *parent = al.sym;
      2185                 else if (have_ignore_callees && root_al &&
      2186                   symbol__match_regex(al.sym, &ignore_callees_regex)) {
      2187                         /* Treat this symbol as the root,
      2188                            forgetting its callees. */
      2189                         *root_al = al;
      2190                         callchain_cursor_reset(cursor);
      2191                 }
      2192         }
      
      And the al that doesn't have the ->srcline field initialized will be
      copied to the root_al, so then, back to:
      
      1211 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
      1212                          int max_stack_depth, void *arg)
      1213 {
      1214         int err, err2;
      1215         struct map *alm = NULL;
      1216
      1217         if (al)
      1218                 alm = map__get(al->map);
      1219
      1220         err = sample__resolve_callchain(iter->sample, &callchain_cursor, &iter->parent,
      1221                                         iter->evsel, al, max_stack_depth);
      1222         if (err) {
      1223                 map__put(alm);
      1224                 return err;
      1225         }
      1226
      1227         err = iter->ops->prepare_entry(iter, al);
      1228         if (err)
      1229                 goto out;
      1230
      1231         err = iter->ops->add_single_entry(iter, al);
      1232         if (err)
      1233                 goto out;
      1234
      
      That al at line 1221 is what add_callchain_ip() (reached from
      sample__resolve_callchain()) saw as 'root_al', and then:
      
              iter->ops->add_single_entry(iter, al);
      
      will go on with al->srcline holding a bogus value. I'll add the above
      sequence to the cset and apply, thanks!
      Signed-off-by: Michael Petlan <mpetlan@redhat.com>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Fixes: 1fb7d06a ("perf report Use srcline from callchain for hist entries")
      Link: https://lore.kernel.org/r/20210719145332.29747-1-mpetlan@redhat.com
      Reported-by: Juri Lelli <jlelli@redhat.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      57f0ff05
    • Adrian Hunter's avatar
      perf script: Fix ip display when type != attr->type · ff6f41fb
      Adrian Hunter authored
      set_print_ip_opts() was not being called when type != attr->type
      because there is not a one-to-one relationship between output types
      and attr->type. That resulted in ip not printing.
      
      The attr_type() function is removed, and the match of attr->type to
      output type is corrected.
      
      Example on ADL using taskset to select an atom cpu:
      
       # perf record -e cpu_atom/cpu-cycles/ taskset 0x1000 uname
       Linux
       [ perf record: Woken up 1 times to write data ]
       [ perf record: Captured and wrote 0.003 MB perf.data (7 samples) ]
      
       Before:
      
        # perf script | head
               taskset   428 [-01] 10394.179041:          1 cpu_atom/cpu-cycles/:
               taskset   428 [-01] 10394.179043:          1 cpu_atom/cpu-cycles/:
               taskset   428 [-01] 10394.179044:         11 cpu_atom/cpu-cycles/:
               taskset   428 [-01] 10394.179045:        407 cpu_atom/cpu-cycles/:
               taskset   428 [-01] 10394.179046:      16789 cpu_atom/cpu-cycles/:
               taskset   428 [-01] 10394.179052:     676300 cpu_atom/cpu-cycles/:
                 uname   428 [-01] 10394.179278:    4079859 cpu_atom/cpu-cycles/:
      
       After:
      
        # perf script | head
               taskset   428 10394.179041:          1 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
               taskset   428 10394.179043:          1 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
               taskset   428 10394.179044:         11 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
               taskset   428 10394.179045:        407 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
               taskset   428 10394.179046:      16789 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
               taskset   428 10394.179052:     676300 cpu_atom/cpu-cycles/:      7f829ef73800 cfree+0x0 (/lib/libc-2.32.so)
                 uname   428 10394.179278:    4079859 cpu_atom/cpu-cycles/:  ffffffff95bae912 vma_interval_tree_remove+0x1f2 ([kernel.kallsyms])
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lore.kernel.org/lkml/20210911133053.15682-1-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ff6f41fb
    • Ravi Bangoria's avatar
      perf annotate: Fix fused instr logic for assembly functions · 7efbcc8c
      Ravi Bangoria authored
      Some x86 microarchitectures fuse a subset of cmp/test/ALU instructions
      with branch instructions, and thus perf annotate highlights such valid
      pairs as fused.
      
      When annotated with source, perf uses struct disasm_line to contain
      either source or instruction line from objdump output. Usually, a C
      statement generates multiple instructions which include such
      cmp/test/ALU + branch instruction pairs. But in the case of an assembly
      function, each individual assembly source line generates one
      instruction.
      
      The 'perf annotate' instruction fusion logic assumes that the previous
      disasm_line is the previous instruction line, which is wrong because,
      for an assembly function, the previous disasm_line contains a source
      line. Thus perf fails to highlight valid fused instruction pairs for
      assembly functions.
      
      Fix it by searching backward until we find an instruction line and
      considering that disasm_line as fused with the current branch instruction.
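
The backward search can be sketched on a plain array. This is only an illustration of the idea, not the code from the patch: struct demo_line and demo_prev_insn() are hypothetical names, and the real struct disasm_line lives in a list and has more fields than the single flag assumed here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for perf's struct disasm_line. */
struct demo_line {
	int is_insn;		/* 1: instruction from objdump, 0: source line */
	const char *text;
};

/*
 * Walk backward from the branch at index 'branch' and return the index of
 * the nearest preceding instruction line, skipping source lines.
 * Returns -1 if no instruction line precedes the branch.
 */
static int demo_prev_insn(const struct demo_line *lines, int branch)
{
	int i;

	assert(lines != NULL);
	for (i = branch - 1; i >= 0; i--) {
		if (lines[i].is_insn)
			return i;
	}
	return -1;
}
```

Applied to the four lines of the example that follows (source cmpq, instruction cmp, source je, instruction je), the search starting at the je instruction skips the je source line and lands on the cmp instruction, which is the pair that should be highlighted as fused.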
      
      Before:
               │    cmpq    %rcx, RIP+8(%rsp)
          0.00 │      cmp    %rcx,0x88(%rsp)
               │    je      .Lerror_bad_iret      <--- Source line
          0.14 │   ┌──je     b4                   <--- Instruction line
               │   │movl    %ecx, %eax
      
      After:
               │    cmpq    %rcx, RIP+8(%rsp)
          0.00 │   ┌──cmp    %rcx,0x88(%rsp)
               │   │je      .Lerror_bad_iret
          0.14 │   ├──je     b4
               │   │movl    %ecx, %eax
      Reviewed-by: Jin Yao <yao.jin@linux.intel.com>
      Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20210911043854.8373-1-ravi.bangoria@amd.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7efbcc8c