1. 01 Mar, 2023 7 commits
  2. 28 Feb, 2023 4 commits
  3. 27 Feb, 2023 7 commits
  4. 25 Feb, 2023 3 commits
  5. 23 Feb, 2023 2 commits
  6. 22 Feb, 2023 17 commits
    • bpf, docs: Add explanation of endianness · 746ce767
      Dave Thaler authored
      Document the discussion from the email thread on the IETF bpf list,
      where it was explained that the raw format varies by endianness
      of the processor.
      Signed-off-by: Dave Thaler <dthaler@microsoft.com>
      Acked-by: David Vernet <void@manifault.com>
      Link: https://lore.kernel.org/r/20230220223742.1347-1-dthaler1968@googlemail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
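
A minimal sketch of what "the raw format varies by endianness" can mean for a single instruction. It assumes the layout of the uapi `struct bpf_insn` (opcode byte, two 4-bit register fields, 16-bit offset, 32-bit immediate) and assumes that the register nibbles swap places between little-endian and big-endian hosts. The names `encode_insn`, `byte_order`, and `example_mov_imm` are hypothetical and are not kernel or libbpf API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for this sketch; not kernel or libbpf API. */
enum byte_order { ORDER_LITTLE, ORDER_BIG };

/*
 * Serialize one basic 64-bit BPF instruction into out[8] the way a CPU of
 * the given byte order is assumed to store it in memory.
 */
void encode_insn(uint8_t out[8], enum byte_order order, uint8_t opcode,
                 uint8_t dst_reg, uint8_t src_reg, int16_t off, int32_t imm)
{
    uint16_t uoff = (uint16_t)off;
    uint32_t uimm = (uint32_t)imm;
    int i;

    out[0] = opcode; /* the opcode byte comes first in both layouts */

    if (order == ORDER_LITTLE) {
        /* Assumed: src_reg in the high nibble, dst_reg in the low nibble. */
        out[1] = (uint8_t)(((src_reg & 0x0f) << 4) | (dst_reg & 0x0f));
        out[2] = (uint8_t)(uoff & 0xff);
        out[3] = (uint8_t)(uoff >> 8);
        for (i = 0; i < 4; i++)
            out[4 + i] = (uint8_t)(uimm >> (8 * i));
    } else {
        /* Assumed: dst_reg in the high nibble, src_reg in the low nibble. */
        out[1] = (uint8_t)(((dst_reg & 0x0f) << 4) | (src_reg & 0x0f));
        out[2] = (uint8_t)(uoff >> 8);
        out[3] = (uint8_t)(uoff & 0xff);
        for (i = 0; i < 4; i++)
            out[4 + i] = (uint8_t)(uimm >> (8 * (3 - i)));
    }
}

/* Usage: encode "r1 = 0x12345678" (opcode 0xb7) for both byte orders. */
void example_mov_imm(void)
{
    uint8_t le[8], be[8];

    encode_insn(le, ORDER_LITTLE, 0xb7, 1, 0, 0, 0x12345678);
    encode_insn(be, ORDER_BIG, 0xb7, 1, 0, 0, 0x12345678);

    assert(le[1] == 0x01 && be[1] == 0x10); /* register nibbles differ */
    assert(le[4] == 0x78 && be[4] == 0x12); /* immediate byte order differs */
}
```

Only the opcode byte stays in the same position with the same value. A tool that reads raw instruction bytes produced on a host of the other byte order would need to account for both the multi-byte fields and the register nibbles.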
    • selftests/bpf: Fix BPF_FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL for empty flow label · 9fa02892
      Stanislav Fomichev authored
      Kernel's flow dissector continues to parse the packet when
      the (optional) IPv6 flow label is empty even when instructed
      to stop (via BPF_FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL). Do
      the same in our reference BPF reimplementation.
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Link: https://lore.kernel.org/r/20230221180518.2139026-1-sdf@google.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • riscv, bpf: Add kfunc support for RV64 · d40c3847
      Pu Lehui authored
      This patch adds kernel function call support for RV64. Since the offset
      from RV64 kernel and module functions to bpf programs is almost within
      the range of s32, the current infrastructure of RV64 is already
      sufficient for kfunc, so let's turn it on.
      Suggested-by: Björn Töpel <bjorn@rivosinc.com>
      Signed-off-by: Pu Lehui <pulehui@huawei.com>
      Acked-by: Björn Töpel <bjorn@rivosinc.com>
      Link: https://lore.kernel.org/r/20230221140656.3480496-1-pulehui@huaweicloud.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: Check for helper calls in check_subprogs() · df2ccc18
      Ilya Leoshkevich authored
      The condition src_reg != BPF_PSEUDO_CALL && imm == BPF_FUNC_tail_call
      may be satisfied by a kfunc call. This would lead to unnecessarily
      setting has_tail_call. Use src_reg == 0 instead.
      Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
      Acked-by: Stanislav Fomichev <sdf@google.com>
      Link: https://lore.kernel.org/r/20230220163756.753713-1-iii@linux.ibm.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
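
The difference between the two conditions in the check_subprogs() commit can be illustrated with a reduced model. The constants below carry the values from include/uapi/linux/bpf.h. The struct and function names are hypothetical, and the sketch leaves out the opcode test that the verifier performs before looking at these fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/bpf.h. */
#define BPF_PSEUDO_CALL       1
#define BPF_PSEUDO_KFUNC_CALL 2
#define BPF_FUNC_tail_call    12

/* Hypothetical, reduced view of a call instruction. */
struct toy_call_insn {
    uint8_t src_reg;
    int32_t imm;
};

/* Old condition: anything that is not a bpf2bpf call with imm == 12. */
bool sets_tail_call_old(const struct toy_call_insn *insn)
{
    return insn->src_reg != BPF_PSEUDO_CALL &&
           insn->imm == BPF_FUNC_tail_call;
}

/* New condition: only helper calls (src_reg == 0) with imm == 12. */
bool sets_tail_call_new(const struct toy_call_insn *insn)
{
    return insn->src_reg == 0 && insn->imm == BPF_FUNC_tail_call;
}

/* Usage: a kfunc call whose imm happens to equal 12. */
void example_kfunc_false_positive(void)
{
    struct toy_call_insn kfunc = { .src_reg = BPF_PSEUDO_KFUNC_CALL, .imm = 12 };
    struct toy_call_insn helper = { .src_reg = 0, .imm = BPF_FUNC_tail_call };

    assert(sets_tail_call_old(&kfunc));   /* false positive */
    assert(!sets_tail_call_new(&kfunc));  /* fixed */
    assert(sets_tail_call_old(&helper) && sets_tail_call_new(&helper));
}
```

For a kfunc call the imm field identifies the kernel function rather than a helper, so a value of 12 there says nothing about tail calls. Testing for src_reg == 0 restricts the match to helper calls.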
    • LoongArch: BPF: Support mixing bpf2bpf and tailcalls · bb035ef0
      Hengqi Chen authored
      The current implementation already allows such mixing.
      Let's enable it in JIT.
      Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
      Link: https://lore.kernel.org/r/20230218105317.4139666-1-hengqi.chen@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • selftests/bpf: Fix cross compilation with CLANG_CROSS_FLAGS · b539a287
      Florent Revest authored
      I cross-compile my BPF selftests with the following command:
      
      CLANG_CROSS_FLAGS="--target=aarch64-linux-gnu --sysroot=/sysroot/" \
        make LLVM=1 CC=clang CROSS_COMPILE=aarch64-linux-gnu- SRCARCH=arm64
      
      (Note the use of CLANG_CROSS_FLAGS to specify a custom sysroot instead
      of letting clang use gcc's default sysroot)
      
      However, CLANG_CROSS_FLAGS gets propagated to host tools builds (libbpf
      and bpftool) and because they reference it directly in their Makefiles,
      they end up cross-compiling host objects which results in linking
      errors.
      
      This patch ensures that CLANG_CROSS_FLAGS is reset if CROSS_COMPILE
      isn't set (for example when reaching a BPF host tool build).
      Signed-off-by: Florent Revest <revest@chromium.org>
      Link: https://lore.kernel.org/r/20230217151832.27784-1-revest@chromium.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • selftests/bpf: Remove not used headers · 1f265d2a
      Tiezhu Yang authored
      The following three uapi headers:
      
          tools/arch/arm64/include/uapi/asm/bpf_perf_event.h
          tools/arch/s390/include/uapi/asm/bpf_perf_event.h
          tools/arch/s390/include/uapi/asm/ptrace.h
      
      were introduced in commit 618e165b ("selftests/bpf: sync kernel headers
      and introduce arch support in Makefile"). They are no longer used after
      commit 720f228e ("bpf: fix broken BPF selftest build"), so remove them.
      Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
      Link: https://lore.kernel.org/r/1676533861-27508-1-git-send-email-yangtiezhu@loongson.cn
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: Only allocate one bpf_mem_cache for bpf_cpumask_ma · 5d5de3a4
      Hou Tao authored
      The size of bpf_cpumask is fixed, so there is no need to allocate many
      bpf_mem_caches for bpf_cpumask_ma; one bpf_mem_cache is enough.
      Also add comments for bpf_mem_alloc_init() in bpf_mem_alloc.h to prevent
      future misuse.
      Signed-off-by: Hou Tao <houtao1@huawei.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: https://lore.kernel.org/r/20230216024821.2202916-1-houtao@huaweicloud.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: Wrap register invalidation with a helper · dbd8d228
      Kumar Kartikeya Dwivedi authored
      Typically, the verifier should use env->allow_ptr_leaks when invalidating
      registers for users that don't have CAP_PERFMON or CAP_SYS_ADMIN to
      avoid leaking the pointer value. This is similar in spirit to
      c67cae55 ("bpf: Tighten ptr_to_btf_id checks."). In a lot of the
      existing checks, we know the capabilities are present, hence we don't do
      the check.
      
      Instead of being inconsistent in the application of the check, wrap the
      action of invalidating a register into a helper named 'mark_invalid_reg'
      and use it in a uniform fashion to replace open coded invalidation
      operations, so that the check is always made regardless of the call site
      and we don't have to remember whether it needs to be done or not for
      each case.
      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20230221200646.2500777-7-memxor@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
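
The idea of wrapping register invalidation in one helper can be sketched with toy types. Everything here is hypothetical: the `toy_` types and function are not the kernel's data structures, and the choice of what each branch does (an unreadable register for unprivileged users, an unknown scalar otherwise) is an assumption made for illustration rather than something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for verifier state; not the kernel's types. */
enum toy_reg_kind { TOY_NOT_INIT, TOY_UNKNOWN_SCALAR, TOY_POINTER };

struct toy_env {
    bool allow_ptr_leaks; /* set when the loader has the needed capability */
};

struct toy_reg {
    enum toy_reg_kind kind;
};

/*
 * Single place that decides how a register is invalidated, so call sites
 * do not have to remember the capability check themselves.
 */
void toy_mark_invalid_reg(const struct toy_env *env, struct toy_reg *reg)
{
    if (!env->allow_ptr_leaks)
        reg->kind = TOY_NOT_INIT;       /* assumed: register becomes unreadable */
    else
        reg->kind = TOY_UNKNOWN_SCALAR; /* assumed: readable, value unknown */
}

/* Usage: the same call site behaves correctly for both kinds of user. */
void example_invalidate(void)
{
    struct toy_env unpriv = { .allow_ptr_leaks = false };
    struct toy_env priv = { .allow_ptr_leaks = true };
    struct toy_reg a = { .kind = TOY_POINTER };
    struct toy_reg b = { .kind = TOY_POINTER };

    toy_mark_invalid_reg(&unpriv, &a);
    toy_mark_invalid_reg(&priv, &b);
    assert(a.kind == TOY_NOT_INIT);
    assert(b.kind == TOY_UNKNOWN_SCALAR);
}
```

Because the check lives inside the helper, a new call site cannot accidentally skip it, which is the consistency argument the commit message makes.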
    • bpf: Fix check_reg_type for PTR_TO_BTF_ID · da03e43a
      Kumar Kartikeya Dwivedi authored
      The current code does type matching for the case where reg->type is
      PTR_TO_BTF_ID or has the PTR_TRUSTED flag. However, this only needs to
      occur for non-MEM_ALLOC and non-MEM_PERCPU cases, but will include both
      as per the current code.
      
      The MEM_ALLOC case with or without PTR_TRUSTED needs to be handled
      specially by the code for type_is_alloc case, while MEM_PERCPU case must
      be ignored. Hence, to restore correct behavior and for clarity,
      explicitly list the PTR_TO_BTF_ID types handled in each case
      using a switch statement.
      
      Helpers currently only take:
      	PTR_TO_BTF_ID
      	PTR_TO_BTF_ID | PTR_TRUSTED
      	PTR_TO_BTF_ID | MEM_RCU
      	PTR_TO_BTF_ID | MEM_ALLOC
      	PTR_TO_BTF_ID | MEM_PERCPU
      	PTR_TO_BTF_ID | MEM_PERCPU | PTR_TRUSTED
      
      This fix was also described (for the MEM_ALLOC case) in [0].
      
        [0]: https://lore.kernel.org/bpf/20221121160657.h6z7xuvedybp5y7s@apollo

      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20230221200646.2500777-6-memxor@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: Remove unused MEM_ALLOC | PTR_TRUSTED checks · 521d3c0a
      Kumar Kartikeya Dwivedi authored
      The plan is supposedly to tag everything with PTR_TRUSTED eventually.
      However, those changes should bring in their respective code, instead
      of leaving it around right now. It is arguable whether PTR_TRUSTED is
      required for all types, when its only use case is making PTR_TO_BTF_ID
      a bit stronger, while all other types are trusted by default.
      
      Hence, just drop the two instances which do not occur in the verifier
      for now to avoid reader confusion.
      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20230221200646.2500777-5-memxor@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: Annotate data races in bpf_local_storage · 0a09a2f9
      Kumar Kartikeya Dwivedi authored
      There are a few cases where hlist_node is checked to be unhashed without
      holding the lock protecting its modification. In this case, one must use
      hlist_unhashed_lockless to avoid load tearing and KCSAN reports. Fix
      this by using the lockless variant in places not protected by the lock.
      
      Since this is not prompted by any actual KCSAN reports but only from
      code review, I have not included a fixes tag.
      
      Cc: Martin KaFai Lau <martin.lau@kernel.org>
      Cc: KP Singh <kpsingh@kernel.org>
      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20230221200646.2500777-4-memxor@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
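
A user-space sketch of the difference between a plain "unhashed" check and a lockless one. The `toy_` names are hypothetical, and the volatile access is only an approximation of the kernel's READ_ONCE(): it asks the compiler to perform exactly one load of the pointer, which is the property that matters when another thread may be modifying the node.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a hash-list node; not the kernel definition. */
struct toy_hlist_node {
    struct toy_hlist_node *next;
    struct toy_hlist_node **pprev;
};

/* Plain check: appropriate only while the protecting lock is held. */
int toy_hlist_unhashed(const struct toy_hlist_node *h)
{
    return h->pprev == NULL;
}

/*
 * Lockless check: read pprev once through a volatile-qualified lvalue so
 * the compiler cannot split or repeat the load.
 */
int toy_hlist_unhashed_lockless(const struct toy_hlist_node *h)
{
    struct toy_hlist_node **pprev =
        *(struct toy_hlist_node **const volatile *)&h->pprev;

    return pprev == NULL;
}

/* Usage: single-threaded demonstration that both checks agree. */
void example_unhashed(void)
{
    struct toy_hlist_node *head = NULL;
    struct toy_hlist_node node = { .next = NULL, .pprev = NULL };

    assert(toy_hlist_unhashed_lockless(&node));

    /* Link the node at the head of the list. */
    node.pprev = &head;
    head = &node;
    assert(!toy_hlist_unhashed(&node));
    assert(!toy_hlist_unhashed_lockless(&node));
}
```

The result of the lockless variant may be stale by the time it is used, so it suits fast-path hints that are rechecked under the lock, not decisions that must be exact.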
    • Merge branch 'bpf: Allow reads from uninit stack' · bf9bec4c
      Alexei Starovoitov authored
      Eduard Zingerman says:
      
      ====================
      
      This patch-set modifies BPF verifier to accept programs that read from
      uninitialized stack locations, but only if executed in privileged mode.
      This provides significant verification performance gains: 30% to 70% fewer
      processed states for a large number of test programs.
      
      The performance gains come from treating STACK_MISC and STACK_INVALID
      as compatible when the cached state is compared to the current state
      in verifier.c:stacksafe().
      
      The change should not affect safety, because any value read from STACK_MISC
      location has full binary range (e.g. 0x00-0xff for byte-sized reads).
      
      Details and measurements are provided in the description for the patch #1.
      
      The change was suggested by Andrii Nakryiko; the initial patch was created
      by Alexei Starovoitov. The discussion can be found at [1].
      
      Changes v1 -> v2 (v1 available at [2]):
      - Calls to helper functions now convert STACK_INVALID to STACK_MISC
        (suggested by Andrii);
      - The test case progs/test_global_func10.c is updated to expect new
        error message. Before recent commit [3] exact content of error
        messages was not verified for this test.
      - Replaced incorrect '//'-style comments in test case asm blocks by
        '/*...*/'-style comments in order to fix compilation issues;
      - Changed the tag from "Suggested-By" to "Co-developed-by" for Alexei
        on patch #1, please let me know if this is appropriate use of the tag.
      
      [1] https://lore.kernel.org/bpf/CAADnVQKs2i1iuZ5SUGuJtxWVfGYR9kDgYKhq3rNV+kBLQCu7rA@mail.gmail.com/
      [2] https://lore.kernel.org/bpf/20230216183606.2483834-1-eddyz87@gmail.com/
      [3] 95ebb376 ("selftests/bpf: Convert test_global_funcs test to test_loader framework")
      ====================
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • selftests/bpf: Tests for uninitialized stack reads · 6338a94d
      Eduard Zingerman authored
      Three testcases to make sure that stack reads from uninitialized
      locations are accepted by verifier when executed in privileged mode:
      - read from a fixed offset;
      - read from a variable offset;
      - passing a pointer to stack to a helper converts
        STACK_INVALID to STACK_MISC.
      Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20230219200427.606541-3-eddyz87@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: Allow reads from uninit stack · 6715df8d
      Eduard Zingerman authored
      This commit updates the following functions to allow reads from
      uninitialized stack locations when env->allow_uninit_stack option is
      enabled:
      - check_stack_read_fixed_off()
      - check_stack_range_initialized(), called from:
        - check_stack_read_var_off()
        - check_helper_mem_access()
      
      This change allows relaxing the logic in stacksafe() to treat STACK_MISC
      and STACK_INVALID the same way, making the following stack slot
      configurations equivalent:
      
        |  Cached state    |  Current state   |
        |   stack slot     |   stack slot     |
        |------------------+------------------|
        | STACK_INVALID or | STACK_INVALID or |
        | STACK_MISC       | STACK_SPILL   or |
        |                  | STACK_MISC    or |
        |                  | STACK_ZERO    or |
        |                  | STACK_DYNPTR     |
      
      This leads to significant verification speed gains (see below).
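
The slot equivalence in the table above can be modeled with a small predicate. This is a simplified sketch and not the code of stacksafe(): the function name is hypothetical, and the final equality test stands in for the more detailed comparisons the verifier performs for other slot combinations (spilled registers, for example).

```c
#include <assert.h>
#include <stdbool.h>

/* Slot type names taken from the commit message. */
enum slot_type {
    STACK_INVALID,
    STACK_SPILL,
    STACK_MISC,
    STACK_ZERO,
    STACK_DYNPTR,
};

/*
 * Return true if a cached state's slot can be considered to cover the
 * current state's slot, so that the cached verification result is reused.
 */
bool toy_slot_compatible(enum slot_type cached, enum slot_type current,
                         bool allow_uninit_stack)
{
    /* A cached STACK_INVALID slot places no requirement on the current one. */
    if (cached == STACK_INVALID)
        return true;

    /* With uninit reads allowed, STACK_MISC is treated like STACK_INVALID. */
    if (allow_uninit_stack && cached == STACK_MISC)
        return true;

    /* Simplification: otherwise require identical slot types. */
    return cached == current;
}

/* Usage: the case that used to split states in privileged mode. */
void example_misc_vs_invalid(void)
{
    assert(toy_slot_compatible(STACK_MISC, STACK_INVALID, true));
    assert(!toy_slot_compatible(STACK_MISC, STACK_INVALID, false));
}
```

With the relaxed rule a cached STACK_MISC slot matches a current STACK_INVALID slot, so more states are pruned as equivalent.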
      
      The idea was suggested by Andrii Nakryiko [1] and initial patch was
      created by Alexei Starovoitov [2].
      
      Currently, env->allow_uninit_stack is enabled for programs loaded
      by users with CAP_PERFMON or CAP_SYS_ADMIN capabilities.
      
      A number of test cases from verifier/*.c were expecting uninitialized
      stack access to be an error. These test cases were updated to execute
      in unprivileged mode (thus preserving the tests).
      
      The test progs/test_global_func10.c expected "invalid indirect read
      from stack" error message because of the access to uninitialized
      memory region. This error is no longer possible in privileged mode.
      The test is updated to provoke an error "invalid indirect access to
      stack" because of access to invalid stack address (such error is not
      verified by progs/test_global_func*.c series of tests).
      
      The following tests had to be removed because these can't be made
      unprivileged:
      - verifier/sock.c:
        - "sk_storage_get(map, skb->sk, &stack_value, 1): partially init
        stack_value"
        BPF_PROG_TYPE_SCHED_CLS programs are not executed in unprivileged mode.
      - verifier/var_off.c:
        - "indirect variable-offset stack access, max_off+size > max_initialized"
        - "indirect variable-offset stack access, uninitialized"
        These tests verify that access to uninitialized stack values is
        detected when stack offset is not a constant. However, variable
        stack access is prohibited in unprivileged mode, thus these tests
        are no longer valid.
      
       * * *
      
      Here is veristat log comparing this patch with current master on a
      set of selftest binaries listed in tools/testing/selftests/bpf/veristat.cfg
      and cilium BPF binaries (see [3]):
      
      $ ./veristat -e file,prog,states -C -f 'states_pct<-30' master.log current.log
      File                        Program                     States (A)  States (B)  States    (DIFF)
      --------------------------  --------------------------  ----------  ----------  ----------------
      bpf_host.o                  tail_handle_ipv6_from_host         349         244    -105 (-30.09%)
      bpf_host.o                  tail_handle_nat_fwd_ipv4          1320         895    -425 (-32.20%)
      bpf_lxc.o                   tail_handle_nat_fwd_ipv4          1320         895    -425 (-32.20%)
      bpf_sock.o                  cil_sock4_connect                   70          48     -22 (-31.43%)
      bpf_sock.o                  cil_sock4_sendmsg                   68          46     -22 (-32.35%)
      bpf_xdp.o                   tail_handle_nat_fwd_ipv4          1554         803    -751 (-48.33%)
      bpf_xdp.o                   tail_lb_ipv4                      6457        2473   -3984 (-61.70%)
      bpf_xdp.o                   tail_lb_ipv6                      7249        3908   -3341 (-46.09%)
      pyperf600_bpf_loop.bpf.o    on_event                           287         145    -142 (-49.48%)
      strobemeta.bpf.o            on_event                         15915        4772  -11143 (-70.02%)
      strobemeta_nounroll2.bpf.o  on_event                         17087        3820  -13267 (-77.64%)
      xdp_synproxy_kern.bpf.o     syncookie_tc                     21271        6635  -14636 (-68.81%)
      xdp_synproxy_kern.bpf.o     syncookie_xdp                    23122        6024  -17098 (-73.95%)
      --------------------------  --------------------------  ----------  ----------  ----------------
      
      Note: I limited selection by states_pct<-30%.
      
      Inspection of differences in pyperf600_bpf_loop behavior shows that
      the following patch for the test removes almost all differences:
      
          - a/tools/testing/selftests/bpf/progs/pyperf.h
          + b/tools/testing/selftests/bpf/progs/pyperf.h
          @ -266,8 +266,8 @ int __on_event(struct bpf_raw_tracepoint_args *ctx)
                  }
      
                  if (event->pthread_match || !pidData->use_tls) {
          -               void* frame_ptr;
          -               FrameData frame;
          +               void* frame_ptr = 0;
          +               FrameData frame = {};
                          Symbol sym = {};
                          int cur_cpu = bpf_get_smp_processor_id();
      
      W/o this patch the difference comes from the following pattern
      (for different variables):
      
          static bool get_frame_data(... FrameData *frame ...)
          {
              ...
              bpf_probe_read_user(&frame->f_code, ...);
              if (!frame->f_code)
                  return false;
              ...
              bpf_probe_read_user(&frame->co_name, ...);
              if (frame->co_name)
                  ...;
          }
      
          int __on_event(struct bpf_raw_tracepoint_args *ctx)
          {
              FrameData frame;
              ...
              get_frame_data(... &frame ...) // indirectly via a bpf_loop & callback
              ...
          }
      
          SEC("raw_tracepoint/kfree_skb")
          int on_event(struct bpf_raw_tracepoint_args* ctx)
          {
              ...
              ret |= __on_event(ctx);
              ret |= __on_event(ctx);
              ...
          }
      
      With regard to the value `frame->co_name`, the following is important:
      - Because of the conditional `if (!frame->f_code)` each call to
        __on_event() produces two states, one with `frame->co_name` marked
        as STACK_MISC, another with it as is (and marked STACK_INVALID on a
        first call).
      - The call to bpf_probe_read_user() does not mark stack slots
        corresponding to `&frame->co_name` as REG_LIVE_WRITTEN but it marks
        these slots as STACK_MISC; this happens because of the following loop
        in the check_helper_call():
      
      	for (i = 0; i < meta.access_size; i++) {
      		err = check_mem_access(env, insn_idx, meta.regno, i, BPF_B,
      				       BPF_WRITE, -1, false);
      		if (err)
      			return err;
      	}
      
        Note the size of the write, it is a one byte write for each byte
        touched by a helper. The BPF_B write does not lead to write marks
        for the target stack slot.
      - This means that without this patch, when the second __on_event() call is
        verified, `if (frame->co_name)` will propagate read marks first to a
        stack slot with STACK_MISC marks and then to a stack slot with
        STACK_INVALID marks, and these states would be considered different.
      
      [1] https://lore.kernel.org/bpf/CAEf4BzY3e+ZuC6HUa8dCiUovQRg2SzEk7M-dSkqNZyn=xEmnPA@mail.gmail.com/
      [2] https://lore.kernel.org/bpf/CAADnVQKs2i1iuZ5SUGuJtxWVfGYR9kDgYKhq3rNV+kBLQCu7rA@mail.gmail.com/
      [3] git@github.com:anakryiko/cilium.git
      Suggested-by: Andrii Nakryiko <andrii@kernel.org>
      Co-developed-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20230219200427.606541-2-eddyz87@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • Merge tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next · 5b7c4cab
      Linus Torvalds authored
      Pull networking updates from Jakub Kicinski:
       "Core:
      
         - Add dedicated kmem_cache for typical/small skb->head, avoid having
           to access struct page at kfree time, and improve memory use.
      
         - Introduce sysctl to set default RPS configuration for new netdevs.
      
         - Define Netlink protocol specification format which can be used to
           describe messages used by each family and auto-generate parsers.
           Add tools for generating kernel data structures and uAPI headers.
      
         - Expose all net/core sysctls inside netns.
      
         - Remove 4s sleep in netpoll if carrier is instantly detected on
           boot.
      
         - Add configurable limit of MDB entries per port, and port-vlan.
      
         - Continue populating drop reasons throughout the stack.
      
         - Retire a handful of legacy Qdiscs and classifiers.
      
        Protocols:
      
         - Support IPv4 big TCP (TSO frames larger than 64kB).
      
         - Add IP_LOCAL_PORT_RANGE socket option, to control local port range
           on socket by socket basis.
      
         - Track and report in procfs number of MPTCP sockets used.
      
         - Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path
           manager.
      
         - IPv6: don't check net.ipv6.route.max_size and rely on garbage
           collection to free memory (similarly to IPv4).
      
         - Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
      
         - ICMP: add per-rate limit counters.
      
         - Add support for user scanning requests in ieee802154.
      
         - Remove static WEP support.
      
         - Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
           reporting.
      
         - WiFi 7 EHT channel puncturing support (client & AP).
      
        BPF:
      
         - Add a rbtree data structure following the "next-gen data structure"
           precedent set by recently added linked list, that is, by using
           kfunc + kptr instead of adding a new BPF map type.
      
         - Expose XDP hints via kfuncs with initial support for RX hash and
           timestamp metadata.
      
         - Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to
           better support decap on GRE tunnel devices not operating in collect
           metadata.
      
         - Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
      
         - Remove the need for trace_printk_lock for bpf_trace_printk and
           bpf_trace_vprintk helpers.
      
         - Extend libbpf's bpf_tracing.h support for tracing arguments of
           kprobes/uprobes and syscall as a special case.
      
         - Significantly reduce the search time for module symbols by
           livepatch and BPF.
      
         - Enable cpumasks to be used as kptrs, which is useful for tracing
           programs tracking which tasks end up running on which CPUs in
           different time intervals.
      
         - Add support for BPF trampoline on s390x and riscv64.
      
         - Add capability to export the XDP features supported by the NIC.
      
         - Add __bpf_kfunc tag for marking kernel functions as kfuncs.
      
         - Add cgroup.memory=nobpf kernel parameter option to disable BPF
           memory accounting for container environments.
      
        Netfilter:
      
         - Remove the CLUSTERIP target. It has been marked as obsolete for
           years, and we still have WARN splats wrt races of the out-of-band
           /proc interface installed by this target.
      
         - Add 'destroy' commands to nf_tables. They are identical to the
           existing 'delete' commands, but do not return an error if the
           referenced object (set, chain, rule...) did not exist.
      
        Driver API:
      
         - Improve cpumask_local_spread() locality to help NICs set the right
           IRQ affinity on AMD platforms.
      
         - Separate C22 and C45 MDIO bus transactions more clearly.
      
         - Introduce new DCB table to control DSCP rewrite on egress.
      
         - Support configuration of Physical Layer Collision Avoidance (PLCA)
           Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of
           shared medium Ethernet.
      
         - Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing
           preemption of low priority frames by high priority frames.
      
         - Add support for controlling MACSec offload using netlink SET.
      
         - Rework devlink instance refcounts to allow registration and
           de-registration under the instance lock. Split the code into
           multiple files, drop some of the unnecessarily granular locks and
           factor out common parts of netlink operation handling.
      
         - Add TX frame aggregation parameters (for USB drivers).
      
         - Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
           messages with notifications for debug.
      
         - Allow offloading of UDP NEW connections via act_ct.
      
         - Add support for per action HW stats in TC.
      
         - Support hardware miss to TC action (continue processing in SW from
           a specific point in the action chain).
      
         - Warn if old Wireless Extension user space interface is used with
           modern cfg80211/mac80211 drivers. Do not support Wireless
           Extensions for Wi-Fi 7 devices at all. Everyone should switch to
           using nl80211 interface instead.
      
         - Improve the CAN bit timing configuration. Use extack to return
           error messages directly to user space, update the SJW handling,
           including the definition of a new default value that will benefit
           CAN-FD controllers, by increasing their oscillator tolerance.
      
        New hardware / drivers:
      
         - Ethernet:
            - nVidia BlueField-3 support (control traffic driver)
            - Ethernet support for imx93 SoCs
            - Motorcomm yt8531 gigabit Ethernet PHY
            - onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
            - Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
            - Amlogic gxl MDIO mux
      
         - WiFi:
            - RealTek RTL8188EU (rtl8xxxu)
            - Qualcomm Wi-Fi 7 devices (ath12k)
      
         - CAN:
            - Renesas R-Car V4H
      
        Drivers:
      
         - Bluetooth:
            - Set Per Platform Antenna Gain (PPAG) for Intel controllers.
      
         - Ethernet NICs:
            - Intel (1G, igc):
               - support TSN / Qbv / packet scheduling features of i226 model
            - Intel (100G, ice):
               - use GNSS subsystem instead of TTY
               - multi-buffer XDP support
               - extend support for GPIO pins to E823 devices
            - nVidia/Mellanox:
               - update the shared buffer configuration on PFC commands
               - implement PTP adjphase function for HW offset control
               - TC support for Geneve and GRE with VF tunnel offload
               - more efficient crypto key management method
               - multi-port eswitch support
            - Netronome/Corigine:
               - add DCB IEEE support
               - support IPsec offloading for NFP3800
            - Freescale/NXP (enetc):
               - support XDP_REDIRECT for XDP non-linear buffers
               - improve reconfig, avoid link flap and waiting for idle
               - support MAC Merge layer
            - Other NICs:
               - sfc/ef100: add basic devlink support for ef100
               - ionic: rx_push mode operation (writing descriptors via MMIO)
               - bnxt: use the auxiliary bus abstraction for RDMA
               - r8169: disable ASPM and reset bus in case of tx timeout
               - cpsw: support QSGMII mode for J721e CPSW9G
               - cpts: support pulse-per-second output
               - ngbe: add an mdio bus driver
               - usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
               - r8152: handle devices with FW with NCM support
               - amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
               - virtio-net: support multi buffer XDP
               - virtio/vsock: replace virtio_vsock_pkt with sk_buff
               - tsnep: XDP support
      
         - Ethernet high-speed switches:
            - nVidia/Mellanox (mlxsw):
               - add support for latency TLV (in FW control messages)
            - Microchip (sparx5):
               - separate explicit and implicit traffic forwarding rules, make
                 the implicit rules always active
               - add support for egress DSCP rewrite
               - IS0 VCAP support (Ingress Classification)
               - IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS
                 etc.)
               - ES2 VCAP support (Egress Access Control)
               - support for Per-Stream Filtering and Policing (802.1Q,
                 8.6.5.1)
      
         - Ethernet embedded switches:
            - Marvell (mv88e6xxx):
               - add MAB (port auth) offload support
               - enable PTP receive for mv88e6390
            - NXP (ocelot):
               - support MAC Merge layer
               - support for the vsc7512 internal copper phys
            - Microchip:
               - lan9303: convert to PHYLINK
               - lan966x: support TC flower filter statistics
               - lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
               - lan937x: support Credit Based Shaper configuration
               - ksz9477: support Energy Efficient Ethernet
            - other:
               - qca8k: convert to regmap read/write API, use bulk operations
               - rswitch: Improve TX timestamp accuracy
      
         - Intel WiFi (iwlwifi):
            - EHT (Wi-Fi 7) rate reporting
            - STEP equalizer support: transfer some STEP (connection to radio
              on platforms with integrated wifi) related parameters from the
              BIOS to the firmware.
      
         - Qualcomm 802.11ax WiFi (ath11k):
            - IPQ5018 support
            - Fine Timing Measurement (FTM) responder role support
            - channel 177 support
      
         - MediaTek WiFi (mt76):
            - per-PHY LED support
            - mt7996: EHT (Wi-Fi 7) support
            - Wireless Ethernet Dispatch (WED) reset support
            - switch to using page pool allocator
      
         - RealTek WiFi (rtw89):
            - support new version of Bluetooth coexistence
      
         - Mobile:
            - rmnet: support TX aggregation"
      
      * tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits)
        page_pool: add a comment explaining the fragment counter usage
        net: ethtool: fix __ethtool_dev_mm_supported() implementation
        ethtool: pse-pd: Fix double word in comments
        xsk: add linux/vmalloc.h to xsk.c
        sefltests: netdevsim: wait for devlink instance after netns removal
        selftest: fib_tests: Always cleanup before exit
        net/mlx5e: Align IPsec ASO result memory to be as required by hardware
        net/mlx5e: TC, Set CT miss to the specific ct action instance
        net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG
        net/mlx5: Refactor tc miss handling to a single function
        net/mlx5: Kconfig: Make tc offload depend on tc skb extension
        net/sched: flower: Support hardware miss to tc action
        net/sched: flower: Move filter handle initialization earlier
        net/sched: cls_api: Support hardware miss to tc action
        net/sched: Rename user cookie and act cookie
        sfc: fix builds without CONFIG_RTC_LIB
        sfc: clean up some inconsistent indentings
        net/mlx4_en: Introduce flexible array to silence overflow warning
        net: lan966x: Fix possible deadlock inside PTP
        net/ulp: Remove redundant ->clone() test in inet_clone_ulp().
        ...
    • Merge tag 'v6.3-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 36289a03
      Linus Torvalds authored
      Pull crypto update from Herbert Xu:
       "API:
         - Use kmap_local instead of kmap_atomic
         - Change request callback to take void pointer
         - Print FIPS status in /proc/crypto (when enabled)
      
        Algorithms:
         - Add rfc4106/gcm support on arm64
         - Add ARIA AVX2/512 support on x86
      
        Drivers:
         - Add TRNG driver for StarFive SoC
         - Delete ux500/hash driver (subsumed by stm32/hash)
         - Add zlib support in qat
         - Add RSA support in aspeed"
      
      * tag 'v6.3-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (156 commits)
        crypto: x86/aria-avx - Do not use avx2 instructions
        crypto: aspeed - Fix modular aspeed-acry
        crypto: hisilicon/qm - fix coding style issues
        crypto: hisilicon/qm - update comments to match function
        crypto: hisilicon/qm - change function names
        crypto: hisilicon/qm - use min() instead of min_t()
        crypto: hisilicon/qm - remove some unused defines
        crypto: proc - Print fips status
        crypto: crypto4xx - Call dma_unmap_page when done
        crypto: octeontx2 - Fix objects shared between several modules
        crypto: nx - Fix sparse warnings
        crypto: ecc - Silence sparse warning
        tls: Pass rec instead of aead_req into tls_encrypt_done
        crypto: api - Remove completion function scaffolding
        tls: Remove completion function scaffolding
        tipc: Remove completion function scaffolding
        net: ipv6: Remove completion function scaffolding
        net: ipv4: Remove completion function scaffolding
        net: macsec: Remove completion function scaffolding
        dm: Remove completion function scaffolding
        ...