1. 05 Apr, 2022 4 commits
    • libbpf: Wire up spec management and other arch-independent USDT logic · 999783c8
      Andrii Nakryiko authored
      Last part of architecture-agnostic user-space USDT handling logic is to
      set up BPF spec and, optionally, IP-to-ID maps from user-space.
      usdt_manager performs a compact spec ID allocation to utilize
      fixed-sized BPF maps as efficiently as possible. We also use hashmap to
      deduplicate USDT arg spec strings and map identical strings to single
      USDT spec, minimizing the necessary BPF map size. usdt_manager supports
      arbitrary sequences of attachment and detachment, both of the same USDT
      and multiple different USDTs and internally maintains a free list of
      unused spec IDs. bpf_link_usdt's logic is extended with proper setup and
      teardown of this spec ID free list and supporting BPF maps.
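
      The spec ID free list described above can be sketched roughly as
      follows. This is a minimal standalone illustration, not libbpf's
      actual code: struct spec_id_alloc, spec_id_get(), spec_id_put() and
      MAX_SPEC_CNT are hypothetical names, and the capacity of 256 is an
      assumption.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SPEC_CNT 256 /* assumed capacity of the fixed-size spec map */

/* Hypothetical allocator state; names are illustrative, not libbpf's. */
struct spec_id_alloc {
	int free_ids[MAX_SPEC_CNT]; /* stack of IDs released by detach */
	size_t free_cnt;            /* number of valid entries in free_ids */
	int next_free_id;           /* smallest ID that was never handed out */
};

/* Prefer a recycled ID so that IDs stay compact; otherwise hand out the
 * next fresh one. Returns -1 when the map would overflow. */
int spec_id_get(struct spec_id_alloc *a)
{
	if (a->free_cnt > 0)
		return a->free_ids[--a->free_cnt];
	if (a->next_free_id >= MAX_SPEC_CNT)
		return -1;
	return a->next_free_id++;
}

/* Return an ID to the free list when an attachment is torn down.
 * The sketch does not detect the same ID being released twice. */
void spec_id_put(struct spec_id_alloc *a, int id)
{
	assert(id >= 0 && id < a->next_free_id);
	assert(a->free_cnt < MAX_SPEC_CNT);
	a->free_ids[a->free_cnt++] = id;
}
```

      Recycling released IDs before allocating fresh ones keeps the set of
      live IDs dense, which is what lets a fixed-size map serve arbitrary
      attach/detach sequences.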
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
      Reviewed-by: Dave Marchevsky <davemarchevsky@fb.com>
      Link: https://lore.kernel.org/bpf/20220404234202.331384-5-andrii@kernel.org
      999783c8
    • libbpf: Add USDT notes parsing and resolution logic · 74cc6311
      Andrii Nakryiko authored
      Implement architecture-agnostic parts of USDT parsing logic. The code is
      the documentation in this case; it's futile to try to succinctly
      describe how USDT parsing is done with any sort of concreteness. But
      still, USDTs are recorded in a special ELF notes section (.note.stapsdt),
      where each USDT call site is described separately. Along with USDT
      provider and USDT name, each such note contains a USDT argument
      specification, which uses assembly-like syntax to describe how to fetch
      the value of each USDT argument. A USDT arg spec can be just a constant,
      a register, or a register dereference (the most common cases on x86_64),
      but it can technically be much more complicated, such as an offset
      relative to a global symbol. One of the later patches will implement the
      most common subset of this for x86 and x86-64 architectures, which seems
      to handle a lot of real-world production applications.
      
      USDT arg spec contains a compact encoding allowing usdt.bpf.h from
      previous patch to handle the above 3 cases. Instead of recording which
      register might be needed, we encode register's offset within struct
      pt_regs to simplify BPF-side implementation. USDT argument can be of
      different byte sizes (1, 2, 4, and 8) and signed or unsigned. To handle
      this, libbpf pre-calculates necessary bit shifts to do proper casting
      and sign-extension in a short sequence of left and right shifts.
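
      As an illustration of the shift-based casting, here is a minimal
      standalone sketch. The function names arg_bitshift() and cast_arg()
      are hypothetical, and the sketch assumes the shift amount is
      64 - 8 * size; it also relies on the compiler implementing right
      shift of a negative signed value as an arithmetic shift, which
      mainstream compilers do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of bits to shift so that an arg_sz-byte value occupies the top
 * of a 64-bit word. This is the part that can be pre-calculated once. */
int arg_bitshift(int arg_sz)
{
	assert(arg_sz == 1 || arg_sz == 2 || arg_sz == 4 || arg_sz == 8);
	return 64 - arg_sz * 8;
}

/* Narrow a raw 64-bit read to arg_sz bytes and extend it back to 64 bits,
 * replicating the sign bit for signed arguments and zero-filling otherwise. */
uint64_t cast_arg(uint64_t raw, int arg_sz, bool is_signed)
{
	int shift = arg_bitshift(arg_sz);
	uint64_t val = raw << shift; /* drop the bits above the argument */

	if (is_signed)
		return (uint64_t)((int64_t)val >> shift); /* arithmetic shift */
	return val >> shift;                              /* logical shift */
}
```

      With this scheme the runtime side needs no per-size branches: one
      left shift followed by one right shift covers all four sizes.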
      
      The rest is in the code with sometimes extensive comments and references
      to external "documentation" for USDTs.
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
      Reviewed-by: Dave Marchevsky <davemarchevsky@fb.com>
      Link: https://lore.kernel.org/bpf/20220404234202.331384-4-andrii@kernel.org
      74cc6311
    • libbpf: Wire up USDT API and bpf_link integration · 2e4913e0
      Andrii Nakryiko authored
      Wire up libbpf USDT support APIs without yet implementing all the
      nitty-gritty details of USDT discovery, spec parsing, and BPF map
      initialization.
      
      User-visible user-space API is simple and is conceptually very similar
      to uprobe API.
      
      bpf_program__attach_usdt() API allows programmatically attaching a given
      BPF program to a USDT, specified through binary path (executable or
      shared lib), USDT provider and name. Also, just like in uprobe case, PID
      filter is specified (0 - self, -1 - any process, or specific PID).
      Optionally, USDT cookie value can be specified. A single such API
      invocation will try to discover the given USDT in the specified binary
      and will use (potentially many) BPF uprobes to attach this program in
      the correct locations.
      
      Just like with any bpf_program__attach_xxx() API, a bpf_link is returned
      that represents this attachment. It is a virtual BPF link that doesn't
      have a direct kernel object, as it can consist of multiple underlying BPF
      uprobe links. As such, attachment is not an atomic operation and there
      can be a brief moment when some USDT call sites are attached while
      others are still in the process of attaching. This should be taken into
      consideration by the user. But bpf_program__attach_usdt() guarantees
      that on success all USDT call sites are attached, and that on failure
      all the successful attachments are detached as soon as some USDT call
      site fails to attach. So, in theory, there could be cases of a failed
      bpf_program__attach_usdt() call which did trigger a few USDT program
      invocations. This is unavoidable due to the multi-uprobe nature of USDT
      and has to be handled by the user, if it's important to create an
      illusion of atomicity.
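
      The all-or-nothing behaviour can be modelled with a small standalone
      sketch. It does not call libbpf; struct site and attach_all_sites()
      are hypothetical stand-ins for per-call-site uprobe attachments, and
      the can_attach flag merely simulates success or failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one uprobe attachment at a USDT call site. */
struct site {
	bool can_attach; /* simulates whether attaching this site succeeds */
	bool attached;   /* current attachment state */
};

/* Attach every call site in order. On the first failure, detach the sites
 * attached so far and report an error. Sites that were attached before the
 * rollback are live in the meantime and may already have fired. */
int attach_all_sites(struct site *sites, size_t cnt)
{
	size_t i;

	assert(sites != NULL || cnt == 0);

	for (i = 0; i < cnt; i++) {
		if (!sites[i].can_attach)
			goto rollback;
		sites[i].attached = true;
	}
	return 0;

rollback:
	while (i > 0)
		sites[--i].attached = false;
	return -1;
}
```

      A caller that needs the illusion of atomicity would have to ignore or
      discard events recorded before the call returned successfully.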
      
      USDT BPF programs themselves are marked in BPF source code as either
      SEC("usdt"), in which case they won't be auto-attached through
      skeleton's <skel>__attach() method, or it can have a full definition,
      which follows the spirit of fully-specified uprobes:
      SEC("usdt/<path>:<provider>:<name>"). In the latter case skeleton's
      attach method will attempt auto-attachment. Similarly, generic
      bpf_program__attach() will have enough information to go off of for
      parameterless attachment.
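
      To make the two section-name forms concrete, here is a minimal
      standalone parser sketch. It is not libbpf's implementation:
      struct usdt_target, parse_usdt_sec() and the buffer sizes are
      assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the three parts of a full USDT definition. */
struct usdt_target {
	char path[256];
	char provider[64];
	char name[64];
};

/* Returns 1 and fills *t for "usdt/<path>:<provider>:<name>",
 * 0 for bare "usdt" (nothing to auto-attach),
 * -1 for anything else. Text after a third ':' is ignored by this sketch. */
int parse_usdt_sec(const char *sec, struct usdt_target *t)
{
	assert(sec != NULL && t != NULL);

	if (strcmp(sec, "usdt") == 0)
		return 0;
	/* Each %N[^:] reads up to N characters that are not ':' */
	if (sscanf(sec, "usdt/%255[^:]:%63[^:]:%63[^:]",
		   t->path, t->provider, t->name) != 3)
		return -1;
	return 1;
}
```

      Because ':' is the separator, a binary path containing ':' cannot be
      expressed in this form.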
      
      USDT BPF programs are actually uprobes, and as such for kernel they are
      marked as BPF_PROG_TYPE_KPROBE.
      
      Another part of this patch is USDT-related feature probing:
        - BPF cookie support detection from user-space;
        - detection of kernel support for auto-refcounting of USDT semaphore.
      
      The latter is optional. If kernel doesn't support such feature and USDT
      doesn't rely on USDT semaphores, no error is returned. But if libbpf
      detects that USDT requires setting semaphores and kernel doesn't support
      this, libbpf errors out with explicit pr_warn() message. Libbpf doesn't
      support poking process's memory directly to increment semaphore value,
      like BCC does on legacy kernels, due to inherent raciness and danger of
      such process memory manipulation. Libbpf lets the kernel take care of
      this properly or gives up.
      
      Logistically, all the extra USDT-related infrastructure of libbpf is put
      into a separate usdt.c file and abstracted behind struct usdt_manager.
      Each bpf_object has lazily-initialized usdt_manager pointer, which is
      only instantiated if USDT programs are attempted to be attached. Closing
      BPF object frees up usdt_manager resources. usdt_manager keeps track of
      USDT spec ID assignment and few other small things.
      
      Subsequent patches will fill out remaining missing pieces of USDT
      initialization and setup logic.
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
      Link: https://lore.kernel.org/bpf/20220404234202.331384-3-andrii@kernel.org
      2e4913e0
    • libbpf: Add BPF-side of USDT support · d72e2968
      Andrii Nakryiko authored
      Add BPF-side implementation of libbpf-provided USDT support. This
      consists of single header library, usdt.bpf.h, which is meant to be used
      from user's BPF-side source code. This header is added to the list of
      installed libbpf headers, alongside bpf_helpers.h and others.
      
      BPF-side implementation consists of two BPF maps:
        - spec map, which contains "a USDT spec" which encodes information
          necessary to be able to fetch USDT arguments and other information
          (argument count, user-provided cookie value, etc) at runtime;
        - IP-to-spec-ID map, which is only used on kernels that don't support
          BPF cookie feature. It allows looking up the spec ID based on the
          place in the user application that triggers the USDT program.
      
      These maps have default sizes, 256 and 1024, which are chosen
      conservatively to not waste a lot of space while still handling a lot of
      common cases. But there could be cases when user application needs to either
      trace a lot of different USDTs, or USDTs are heavily inlined and their
      arguments are located in a lot of differing locations. For such cases it
      might be necessary to size those maps up, which libbpf allows to do by
      overriding BPF_USDT_MAX_SPEC_CNT and BPF_USDT_MAX_IP_CNT macros.
      
      It is an important aspect to keep in mind. Single USDT (user-space
      equivalent of kernel tracepoint) can have multiple USDT "call sites".
      That is, single logical USDT is triggered from multiple places in user
      application. This can happen due to function inlining. Each such inlined
      instance of USDT invocation can have its own unique USDT argument
      specification (instructions about the location of the value of each of
      USDT arguments). So while USDT looks very similar to usual uprobe or
      kernel tracepoint, under the hood it's actually a collection of uprobes,
      each potentially needing different spec to know how to fetch arguments.
      
      User-visible API consists of three helper functions:
        - bpf_usdt_arg_cnt(), which returns number of arguments of current USDT;
        - bpf_usdt_arg(), which reads value of specified USDT argument (by
          its zero-indexed position) and returns it as 64-bit value;
        - bpf_usdt_cookie(), which functions like BPF cookie for USDT
          programs; this is necessary as libbpf doesn't allow specifying actual
          BPF cookie and utilizes it internally for USDT support implementation.
      
      Each bpf_usdt_xxx() API expects the struct pt_regs * context that is
      passed into the BPF program. On kernels that don't support BPF cookie it
      is used to fetch the absolute IP address of the underlying uprobe.
      
      usdt.bpf.h also provides BPF_USDT() macro, which functions like
      BPF_PROG() and BPF_KPROBE() and allows much more user-friendly way to
      get access to USDT arguments, if USDT definition is static and known to
      the user. It is expected that majority of use cases won't have to use
      bpf_usdt_arg_cnt() and bpf_usdt_arg() directly and BPF_USDT() will cover
      all their needs.
      
      Last, usdt.bpf.h is utilizing BPF CO-RE for one single purpose: to
      detect kernel support for BPF cookie. If BPF CO-RE dependency is
      undesirable, user application can redefine BPF_USDT_HAS_BPF_COOKIE to
      either a boolean constant (or equivalently zero and non-zero), or even
      point it to its own .rodata variable that can be specified from user's
      application user-space code. It is important that
      BPF_USDT_HAS_BPF_COOKIE is known to BPF verifier as static value (thus
      .rodata and not just .data), as otherwise BPF code will still contain
      bpf_get_attach_cookie() BPF helper call and will fail validation at
      runtime, if not dead-code eliminated.
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
      Link: https://lore.kernel.org/bpf/20220404234202.331384-2-andrii@kernel.org
      d72e2968
  2. 04 Apr, 2022 19 commits
  3. 03 Apr, 2022 3 commits
  4. 01 Apr, 2022 4 commits
  5. 31 Mar, 2022 10 commits
    • bpf, tests: Add load store test case for tail call · 38608ee7
      Xu Kuohai authored
      Add test case to ensure that the caller and callee's fp offsets are
      correct during tail call (mainly asserting for arm64 JIT).
      
      Tested on both big-endian and little-endian arm64 qemu, result:
      
       test_bpf: Summary: 1026 PASSED, 0 FAILED, [1014/1014 JIT'ed]
       test_bpf: test_tail_calls: Summary: 10 PASSED, 0 FAILED, [10/10 JIT'ed]
       test_bpf: test_skb_segment: Summary: 2 PASSED, 0 FAILED
      Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20220321152852.2334294-6-xukuohai@huawei.com
      38608ee7
    • bpf, tests: Add tests for BPF_LDX/BPF_STX with different offsets · f516420f
      Xu Kuohai authored
      This patch adds tests to verify the behavior of BPF_LDX/BPF_STX +
      BPF_B/BPF_H/BPF_W/BPF_DW with negative offset, small positive offset,
      large positive offset, and misaligned offset.
      
      Tested on both big-endian and little-endian arm64 qemu, result:
      
       test_bpf: Summary: 1026 PASSED, 0 FAILED, [1014/1014 JIT'ed]
       test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [8/8 JIT'ed]
       test_bpf: test_skb_segment: Summary: 2 PASSED, 0 FAILED
      Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20220321152852.2334294-5-xukuohai@huawei.com
      f516420f
    • bpf, arm64: Adjust the offset of str/ldr(immediate) to positive number · 5b3d19b9
      Xu Kuohai authored
      The BPF STX/LDX instruction uses an offset relative to the FP to address
      stack space. Since the BPF_FP is located at the top of the frame, the
      offset is usually a negative number. However, the arm64 str/ldr immediate
      instruction requires that the offset be a non-negative number. Therefore,
      this patch tries to convert the offsets.
      
      The method is to first find the negative offset furthest from the FP.
      Adding it to the FP gives a bottom position, called FPB, and the offsets
      in the other STX/LDX instructions are then adjusted to be relative to FPB.
      
      FPB is saved using the callee-saved register x27 of arm64 which is not
      used yet.
      
      Before adjusting the offset, the patch checks every instruction to ensure
      that the FP does not change in run-time. If the FP may change, no offset
      is adjusted.
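
      The offset conversion can be illustrated with a minimal standalone
      sketch. It is not the JIT code: find_fpb_offset() and rebase_to_fpb()
      are hypothetical helpers, and the sketch ignores any alignment the
      real implementation may apply to FPB.

```c
#include <assert.h>
#include <stddef.h>

/* Scan the FP-relative offsets used by a program and return the most
 * negative one; returns 0 if no offset is negative. */
int find_fpb_offset(const int *offs, size_t cnt)
{
	int min_off = 0;

	for (size_t i = 0; i < cnt; i++)
		if (offs[i] < min_off)
			min_off = offs[i];
	return min_off;
}

/* Rebase an FP-relative offset so it is relative to FPB = FP + fpb_off.
 * The result is never negative for offsets at or above FPB. */
int rebase_to_fpb(int off, int fpb_off)
{
	assert(off >= fpb_off);
	return off - fpb_off;
}
```

      For FP-relative offsets ranging from -136 to -8, FPB ends up at
      FP - 136, so -136 becomes 0 and -8 becomes 128.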
      
      For example, for the following bpftrace command:
      
        bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }'
      
      Without this patch, jited code(fragment):
      
         0:   bti     c
         4:   stp     x29, x30, [sp, #-16]!
         8:   mov     x29, sp
         c:   stp     x19, x20, [sp, #-16]!
        10:   stp     x21, x22, [sp, #-16]!
        14:   stp     x25, x26, [sp, #-16]!
        18:   mov     x25, sp
        1c:   mov     x26, #0x0                       // #0
        20:   bti     j
        24:   sub     sp, sp, #0x90
        28:   add     x19, x0, #0x0
        2c:   mov     x0, #0x0                        // #0
        30:   mov     x10, #0xffffffffffffff78        // #-136
        34:   str     x0, [x25, x10]
        38:   mov     x10, #0xffffffffffffff80        // #-128
        3c:   str     x0, [x25, x10]
        40:   mov     x10, #0xffffffffffffff88        // #-120
        44:   str     x0, [x25, x10]
        48:   mov     x10, #0xffffffffffffff90        // #-112
        4c:   str     x0, [x25, x10]
        50:   mov     x10, #0xffffffffffffff98        // #-104
        54:   str     x0, [x25, x10]
        58:   mov     x10, #0xffffffffffffffa0        // #-96
        5c:   str     x0, [x25, x10]
        60:   mov     x10, #0xffffffffffffffa8        // #-88
        64:   str     x0, [x25, x10]
        68:   mov     x10, #0xffffffffffffffb0        // #-80
        6c:   str     x0, [x25, x10]
        70:   mov     x10, #0xffffffffffffffb8        // #-72
        74:   str     x0, [x25, x10]
        78:   mov     x10, #0xffffffffffffffc0        // #-64
        7c:   str     x0, [x25, x10]
        80:   mov     x10, #0xffffffffffffffc8        // #-56
        84:   str     x0, [x25, x10]
        88:   mov     x10, #0xffffffffffffffd0        // #-48
        8c:   str     x0, [x25, x10]
        90:   mov     x10, #0xffffffffffffffd8        // #-40
        94:   str     x0, [x25, x10]
        98:   mov     x10, #0xffffffffffffffe0        // #-32
        9c:   str     x0, [x25, x10]
        a0:   mov     x10, #0xffffffffffffffe8        // #-24
        a4:   str     x0, [x25, x10]
        a8:   mov     x10, #0xfffffffffffffff0        // #-16
        ac:   str     x0, [x25, x10]
        b0:   mov     x10, #0xfffffffffffffff8        // #-8
        b4:   str     x0, [x25, x10]
        b8:   mov     x10, #0x8                       // #8
        bc:   ldr     x2, [x19, x10]
        [...]
      
      With this patch, jited code(fragment):
      
         0:   bti     c
         4:   stp     x29, x30, [sp, #-16]!
         8:   mov     x29, sp
         c:   stp     x19, x20, [sp, #-16]!
        10:   stp     x21, x22, [sp, #-16]!
        14:   stp     x25, x26, [sp, #-16]!
        18:   stp     x27, x28, [sp, #-16]!
        1c:   mov     x25, sp
        20:   sub     x27, x25, #0x88
        24:   mov     x26, #0x0                       // #0
        28:   bti     j
        2c:   sub     sp, sp, #0x90
        30:   add     x19, x0, #0x0
        34:   mov     x0, #0x0                        // #0
        38:   str     x0, [x27]
        3c:   str     x0, [x27, #8]
        40:   str     x0, [x27, #16]
        44:   str     x0, [x27, #24]
        48:   str     x0, [x27, #32]
        4c:   str     x0, [x27, #40]
        50:   str     x0, [x27, #48]
        54:   str     x0, [x27, #56]
        58:   str     x0, [x27, #64]
        5c:   str     x0, [x27, #72]
        60:   str     x0, [x27, #80]
        64:   str     x0, [x27, #88]
        68:   str     x0, [x27, #96]
        6c:   str     x0, [x27, #104]
        70:   str     x0, [x27, #112]
        74:   str     x0, [x27, #120]
        78:   str     x0, [x27, #128]
        7c:   ldr     x2, [x19, #8]
        [...]
      Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20220321152852.2334294-4-xukuohai@huawei.com
      5b3d19b9
    • bpf, arm64: Optimize BPF store/load using arm64 str/ldr(immediate offset) · 7db6c0f1
      Xu Kuohai authored
      The current BPF store/load instruction is translated by the JIT into two
      instructions. The first instruction moves the immediate offset into a
      temporary register. The second instruction uses this temporary register
      to do the real store/load.
      
      In fact, arm64 supports addressing with immediate offsets. So this patch
      introduces an optimization that uses the arm64 str/ldr instruction with
      an immediate offset when the offset fits.
      
      Example of generated instruction for r2 = *(u64 *)(r1 + 0):
      
      without optimization:
      mov x10, 0
      ldr x1, [x0, x10]
      
      with optimization:
      ldr x1, [x0, 0]
      
      If the offset is negative, or is not aligned correctly, or exceeds the
      max value, fall back to the use of a temporary register.
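
      The three conditions can be sketched as a standalone check. This is an
      illustration rather than the JIT's code: fits_unsigned_imm() is a
      hypothetical name, and the limit assumes the 12-bit immediate, scaled
      by the access size, of the arm64 ldr/str unsigned offset form.

```c
#include <assert.h>
#include <stdbool.h>

/* scale is log2 of the access size in bytes:
 * 0 = byte, 1 = halfword, 2 = word, 3 = doubleword. */
bool fits_unsigned_imm(int offset, int scale)
{
	assert(scale >= 0 && scale <= 3);

	if (offset < 0)
		return false;                 /* negative offset */
	if (offset & ((1 << scale) - 1))
		return false;                 /* not aligned to access size */
	return (offset >> scale) <= 0xFFF;    /* must fit in 12 bits */
}
```

      When the check fails, the two-instruction sequence using a temporary
      register remains the safe choice.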
      Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20220321152852.2334294-3-xukuohai@huawei.com
      7db6c0f1
    • arm64, insn: Add ldr/str with immediate offset · 30c90f67
      Xu Kuohai authored
      This patch introduces ldr/str with immediate offset support to simplify
      the JIT implementation of BPF LDX/STX instructions on arm64. Although
      arm64 ldr/str immediate is available in pre-index, post-index and
      unsigned offset forms, the unsigned offset form is sufficient for BPF,
      so this patch only adds this type.
      Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20220321152852.2334294-2-xukuohai@huawei.com
      30c90f67
    • Merge tag 'net-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 2975dbdc
      Linus Torvalds authored
      Pull more networking updates from Jakub Kicinski:
       "Networking fixes and rethook patches.
      
        Features:
      
         - kprobes: rethook: x86: replace kretprobe trampoline with rethook
      
        Current release - regressions:
      
         - sfc: avoid null-deref on systems without NUMA awareness in the new
           queue sizing code
      
        Current release - new code bugs:
      
         - vxlan: do not feed vxlan_vnifilter_dump_dev with non-vxlan devices
      
         - eth: lan966x: fix null-deref on PHY pointer in timestamp ioctl when
           interface is down
      
        Previous releases - always broken:
      
         - openvswitch: correct neighbor discovery target mask field in the
           flow dump
      
         - wireguard: ignore v6 endpoints when ipv6 is disabled and fix a leak
      
         - rxrpc: fix call timer start racing with call destruction
      
         - rxrpc: fix null-deref when security type is rxrpc_no_security
      
         - can: fix UAF bugs around echo skbs in multiple drivers
      
        Misc:
      
         - docs: move netdev-FAQ to the 'process' section of the
           documentation"
      
      * tag 'net-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits)
        vxlan: do not feed vxlan_vnifilter_dump_dev with non vxlan devices
        openvswitch: Add recirc_id to recirc warning
        rxrpc: fix some null-ptr-deref bugs in server_key.c
        rxrpc: Fix call timer start racing with call destruction
        net: hns3: fix software vlan talbe of vlan 0 inconsistent with hardware
        net: hns3: fix the concurrency between functions reading debugfs
        docs: netdev: move the netdev-FAQ to the process pages
        docs: netdev: broaden the new vs old code formatting guidelines
        docs: netdev: call out the merge window in tag checking
        docs: netdev: add missing back ticks
        docs: netdev: make the testing requirement more stringent
        docs: netdev: add a question about re-posting frequency
        docs: netdev: rephrase the 'should I update patchwork' question
        docs: netdev: rephrase the 'Under review' question
        docs: netdev: shorten the name and mention msgid for patch status
        docs: netdev: note that RFC postings are allowed any time
        docs: netdev: turn the net-next closed into a Warning
        docs: netdev: move the patch marking section up
        docs: netdev: minor reword
        docs: netdev: replace references to old archives
        ...
      2975dbdc
    • Merge tag 'v5.18-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 93235e3d
      Linus Torvalds authored
      Pull crypto fixes from Herbert Xu:
      
       - Missing Kconfig dependency on arm that leads to boot failure
      
       - x86 SLS fixes
      
       - Reference leak in the stm32 driver
      
      * tag 'v5.18-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: x86/sm3 - Fixup SLS
        crypto: x86/poly1305 - Fixup SLS
        crypto: x86/chacha20 - Avoid spurious jumps to other functions
        crypto: stm32 - fix reference leak in stm32_crc_remove
        crypto: arm/aes-neonbs-cbc - Select generic cbc and aes
      93235e3d
    • vxlan: do not feed vxlan_vnifilter_dump_dev with non vxlan devices · 9d570741
      Eric Dumazet authored
      vxlan_vnifilter_dump_dev() assumes it is called only
      for vxlan devices. Make sure it is the case.
      
      BUG: KASAN: slab-out-of-bounds in vxlan_vnifilter_dump_dev+0x9a0/0xb40 drivers/net/vxlan/vxlan_vnifilter.c:349
      Read of size 4 at addr ffff888060d1ce70 by task syz-executor.3/17662
      
      CPU: 0 PID: 17662 Comm: syz-executor.3 Tainted: G        W         5.17.0-syzkaller-12888-g77c9387c #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
       print_address_description.constprop.0.cold+0xeb/0x495 mm/kasan/report.c:313
       print_report mm/kasan/report.c:429 [inline]
       kasan_report.cold+0xf4/0x1c6 mm/kasan/report.c:491
       vxlan_vnifilter_dump_dev+0x9a0/0xb40 drivers/net/vxlan/vxlan_vnifilter.c:349
       vxlan_vnifilter_dump+0x3ff/0x650 drivers/net/vxlan/vxlan_vnifilter.c:428
       netlink_dump+0x4b5/0xb70 net/netlink/af_netlink.c:2270
       __netlink_dump_start+0x647/0x900 net/netlink/af_netlink.c:2375
       netlink_dump_start include/linux/netlink.h:245 [inline]
       rtnetlink_rcv_msg+0x70c/0xb80 net/core/rtnetlink.c:5953
       netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2496
       netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
       netlink_unicast+0x543/0x7f0 net/netlink/af_netlink.c:1345
       netlink_sendmsg+0x904/0xe00 net/netlink/af_netlink.c:1921
       sock_sendmsg_nosec net/socket.c:705 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:725
       ____sys_sendmsg+0x6e2/0x800 net/socket.c:2413
       ___sys_sendmsg+0xf3/0x170 net/socket.c:2467
       __sys_sendmsg+0xe5/0x1b0 net/socket.c:2496
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0x80 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      RIP: 0033:0x7f87b8e89049
      
      Fixes: f9c4bb0b ("vxlan: vni filtering support on collect metadata device")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Acked-by: Roopa Prabhu <roopa@nvidia.com>
      Link: https://lore.kernel.org/r/20220330194643.2706132-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      9d570741
    • openvswitch: Add recirc_id to recirc warning · ea07af2e
      Stéphane Graber authored
      When hitting the recirculation limit, the kernel would currently log
      something like this:
      
      [   58.586597] openvswitch: ovs-system: deferred action limit reached, drop recirc action
      
      This isn't all that useful for debugging, as we only have the interface
      name to go on and can't track it down to a specific flow.
      
      With this change, we now instead get:
      
      [   58.586597] openvswitch: ovs-system: deferred action limit reached, drop recirc action (recirc_id=0x9e)
      
      Which can now be correlated with the flow entries from OVS.
      Suggested-by: Frode Nordahl <frode.nordahl@canonical.com>
      Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
      Tested-by: Stephane Graber <stgraber@ubuntu.com>
      Acked-by: Eelco Chaudron <echaudro@redhat.com>
      Link: https://lore.kernel.org/r/20220330194244.3476544-1-stgraber@ubuntu.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      ea07af2e
    • Merge tag 'linux-can-fixes-for-5.18-20220331' of... · 46b55620
      Jakub Kicinski authored
      Merge tag 'linux-can-fixes-for-5.18-20220331' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
      
      Marc Kleine-Budde says:
      
      ====================
      pull-request: can 2022-03-31
      
      The first patch is by Oliver Hartkopp and fixes MSG_PEEK feature in
      the CAN ISOTP protocol (broken in net-next for v5.18 only).
      
      Tom Rix's patch for the mcp251xfd driver fixes the propagation of an
      error value in case of an error.
      
      A patch by me for the m_can driver fixes a use-after-free in the xmit
      handler for m_can IP cores v3.0.x.
      
      Hangyu Hua contributes 3 patches fixing the same double free in the
      error path of the xmit handler in the ems_usb, usb_8dev and mcba_usb
      USB CAN driver.
      
      Pavel Skripkin contributes a patch for the mcba_usb driver to properly
      check the endpoint type.
      
      The last patch is by me and fixes a mem leak in the gs_usb, which was
      introduced in net-next for v5.18.
      
      * tag 'linux-can-fixes-for-5.18-20220331' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
        can: gs_usb: gs_make_candev(): fix memory leak for devices with extended bit timing configuration
        can: mcba_usb: properly check endpoint type
        can: mcba_usb: mcba_usb_start_xmit(): fix double dev_kfree_skb in error path
        can: usb_8dev: usb_8dev_start_xmit(): fix double dev_kfree_skb() in error path
        can: ems_usb: ems_usb_start_xmit(): fix double dev_kfree_skb() in error path
        can: m_can: m_can_tx_handler(): fix use after free of skb
        can: mcp251xfd: mcp251xfd_register_get_dev_id(): fix return of error value
        can: isotp: restore accidentally removed MSG_PEEK feature
      ====================
      
      Link: https://lore.kernel.org/r/
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      46b55620