1. 07 Sep, 2022 (31 commits)
  2. 06 Sep, 2022 (4 commits)
    • Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · 2786bcff
      Paolo Abeni authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf-next 2022-09-05
      
      The following pull-request contains BPF updates for your *net-next* tree.
      
      We've added 106 non-merge commits during the last 18 day(s) which contain
      a total of 159 files changed, 5225 insertions(+), 1358 deletions(-).
      
      There are two small merge conflicts, resolve them as follows:
      
      1) tools/testing/selftests/bpf/DENYLIST.s390x
      
        Commit 27e23836 ("selftests/bpf: Add lru_bug to s390x deny list") in
        bpf tree was needed to get BPF CI green on s390x, but it conflicted with
        newly added tests on bpf-next. Resolve by adding both hunks, result:
      
        [...]
        lru_bug                                  # prog 'printk': failed to auto-attach: -524
        setget_sockopt                           # attach unexpected error: -524                                               (trampoline)
        cb_refs                                  # expected error message unexpected error: -524                               (trampoline)
        cgroup_hierarchical_stats                # JIT does not support calling kernel function                                (kfunc)
        htab_update                              # failed to attach: ERROR: strerror_r(-524)=22                                (trampoline)
        [...]
      
      2) net/core/filter.c
      
        Commit 1227c177 ("net: Fix data-races around sysctl_[rw]mem_(max|default).")
        from net tree conflicts with commit 29003875 ("bpf: Change bpf_setsockopt(SOL_SOCKET)
        to reuse sk_setsockopt()") from bpf-next tree. Take the code as it is from
        bpf-next tree, result:
      
        [...]
      	if (getopt) {
      		if (optname == SO_BINDTODEVICE)
      			return -EINVAL;
      		return sk_getsockopt(sk, SOL_SOCKET, optname,
      				     KERNEL_SOCKPTR(optval),
      				     KERNEL_SOCKPTR(optlen));
      	}
      
      	return sk_setsockopt(sk, SOL_SOCKET, optname,
      			     KERNEL_SOCKPTR(optval), *optlen);
        [...]
      
      The main changes are:
      
      1) Add an any-context, BPF-specific memory allocator, useful in particular for
         BPF tracing, with the bonus of performance equal to full prealloc, from
         Alexei Starovoitov. (A minimal usage sketch follows this list.)
      
      2) Big batch to remove duplicated code from bpf_{get,set}sockopt() helpers as an effort
         to reuse the existing core socket code as much as possible, from Martin KaFai Lau.
      
      3) Extend BPF flow dissector for BPF programs to just augment the in-kernel dissector
         with custom logic. In other words, allow for partial replacement, from Shmulik Ladkani.
      
      4) Add a new cgroup iterator to BPF with different traversal options, from Hao Luo.
      
      5) Support for BPF to collect hierarchical cgroup statistics efficiently through BPF
         integration with the rstat framework, from Yosry Ahmed.
      
      6) Support bpf_{g,s}et_retval() under more BPF cgroup hooks, from Stanislav Fomichev.
      
      7) BPF hash table and local storage fixes under fully preemptible kernels, from Hou Tao.
      
      8) Add various improvements to BPF selftests and libbpf for compilation with gcc BPF
         backend, from James Hilliard.
      
      9) Fix verifier helper permissions and reference state management for synchronous
         callbacks, from Kumar Kartikeya Dwivedi.
      
      10) Add support for BPF selftest's xskxceiver to also be used against real devices that
          support MAC loopback, from Maciej Fijalkowski.
      
      11) Various fixes to the bpf-helpers(7) man page generation script, from Quentin Monnet.
      
      12) Document BPF verifier's tnum_in(tnum_range(), ...) gotchas, from Shung-Hsi Yu.
      
      13) Various minor misc improvements all over the place.
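      For item 1, a minimal kernel-side sketch of the new allocator's API as
      added by this series (signatures follow include/linux/bpf_mem_alloc.h at
      the time of this pull request and may differ in later kernels; error
      handling is abbreviated):

        #include <linux/bpf_mem_alloc.h>

        static struct bpf_mem_alloc ma;

        static int example_init(void)
        {
                /* Pre-create per-cpu caches for fixed-size 64-byte elements. */
                return bpf_mem_alloc_init(&ma, 64, false /* !percpu */);
        }

        static void example_use(void)
        {
                /* Safe in any context (kprobe, fentry, even NMI): elements come
                 * from a per-cpu freelist refilled asynchronously via irq_work.
                 */
                void *elem = bpf_mem_cache_alloc(&ma);

                if (!elem)
                        return;
                /* ... use elem ... */
                bpf_mem_cache_free(&ma, elem);
        }

        static void example_exit(void)
        {
                bpf_mem_alloc_destroy(&ma);
        }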
      
      * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (106 commits)
        bpf: Optimize rcu_barrier usage between hash map and bpf_mem_alloc.
        bpf: Remove usage of kmem_cache from bpf_mem_cache.
        bpf: Remove prealloc-only restriction for sleepable bpf programs.
        bpf: Prepare bpf_mem_alloc to be used by sleepable bpf programs.
        bpf: Remove tracing program restriction on map types
        bpf: Convert percpu hash map to per-cpu bpf_mem_alloc.
        bpf: Add percpu allocation support to bpf_mem_alloc.
        bpf: Batch call_rcu callbacks instead of SLAB_TYPESAFE_BY_RCU.
        bpf: Adjust low/high watermarks in bpf_mem_cache
        bpf: Optimize call_rcu in non-preallocated hash map.
        bpf: Optimize element count in non-preallocated hash map.
        bpf: Relax the requirement to use preallocated hash maps in tracing progs.
        samples/bpf: Reduce syscall overhead in map_perf_test.
        selftests/bpf: Improve test coverage of test_maps
        bpf: Convert hash map to bpf_mem_alloc.
        bpf: Introduce any context BPF specific memory allocator.
        selftest/bpf: Add test for bpf_getsockopt()
        bpf: Change bpf_getsockopt(SOL_IPV6) to reuse do_ipv6_getsockopt()
        bpf: Change bpf_getsockopt(SOL_IP) to reuse do_ip_getsockopt()
        bpf: Change bpf_getsockopt(SOL_TCP) to reuse do_tcp_getsockopt()
        ...
      ====================
      
      Link: https://lore.kernel.org/r/20220905161136.9150-1-daniel@iogearbox.net
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: moxa: fix endianness-related issues from 'sparse' · 03fdb11d
      Sergei Antonov authored
      The sparse checker found two endianness-related issues:
      
      .../moxart_ether.c:34:15: warning: incorrect type in assignment (different base types)
      .../moxart_ether.c:34:15:    expected unsigned int [usertype]
      .../moxart_ether.c:34:15:    got restricted __le32 [usertype]
      
      .../moxart_ether.c:39:16: warning: cast to restricted __le32
      
      Fix them by using __le32 type instead of u32.
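      For illustration, a hedged sketch of the pattern behind such fixes (the
      struct and names here are hypothetical, not the actual moxart_ether.c
      code): values shared with little-endian hardware are declared __le32 and
      converted explicitly, so sparse can type-check every crossing:

        #include <linux/types.h>
        #include <asm/byteorder.h>

        /* Hypothetical DMA descriptor field read by little-endian hardware. */
        struct desc {
                __le32 buf_addr;
        };

        static void desc_set(struct desc *d, u32 addr)
        {
                /* Plain "d->buf_addr = addr;" is what sparse warned about. */
                d->buf_addr = cpu_to_le32(addr);
        }

        static u32 desc_get(const struct desc *d)
        {
                /* Convert back to CPU byte order when reading. */
                return le32_to_cpu(d->buf_addr);
        }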
      Signed-off-by: Sergei Antonov <saproj@gmail.com>
      Link: https://lore.kernel.org/r/20220902125037.1480268-1-saproj@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: ftmac100: fix endianness-related issues from 'sparse' · 9df696b3
      Sergei Antonov authored
      Sparse found a number of endianness-related issues of these kinds:
      
      .../ftmac100.c:192:32: warning: restricted __le32 degrades to integer
      
      .../ftmac100.c:208:23: warning: incorrect type in assignment (different base types)
      .../ftmac100.c:208:23:    expected unsigned int rxdes0
      .../ftmac100.c:208:23:    got restricted __le32 [usertype]
      
      .../ftmac100.c:249:23: warning: invalid assignment: &=
      .../ftmac100.c:249:23:    left side has type unsigned int
      .../ftmac100.c:249:23:    right side has type restricted __le32
      
      .../ftmac100.c:527:16: warning: cast to restricted __le32
      
      Change the type of some fields from 'unsigned int' to '__le32' to fix them.
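      As a hedged illustration of the '&=' warning above (field and bit names
      are hypothetical, not the actual ftmac100.c code): once a descriptor
      field is typed __le32, masks applied to it must be converted too, so
      read-modify-write cycles stay in the device's byte order:

        #include <linux/bits.h>
        #include <linux/types.h>
        #include <asm/byteorder.h>

        #define RXDES0_DMA_OWN  BIT(31)         /* hypothetical ownership bit */

        struct rxdes {
                __le32 rxdes0;                  /* was 'unsigned int rxdes0' */
        };

        static void rxdes_clear_own(struct rxdes *d)
        {
                /* '&=' between __le32 and a plain integer mask is what sparse
                 * flagged; convert the mask instead of the field.
                 */
                d->rxdes0 &= ~cpu_to_le32(RXDES0_DMA_OWN);
        }

        static bool rxdes_owned_by_dma(const struct rxdes *d)
        {
                /* Tests happen in __le32 space as well. */
                return d->rxdes0 & cpu_to_le32(RXDES0_DMA_OWN);
        }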
      Signed-off-by: Sergei Antonov <saproj@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20220902113749.1408562-1-saproj@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: lan966x: Extend lan966x with RGMII support · d5edc797
      Horatiu Vultur authored
      Extend lan966x with RGMII support. The MAC supports all RGMII_* modes.
      Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
      Link: https://lore.kernel.org/r/20220902111548.614525-1-horatiu.vultur@microchip.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
  3. 05 Sep, 2022 (5 commits)
    • r8169: remove not needed net_ratelimit() check · 96efd6d0
      Heiner Kallweit authored
      We're not in a hot path and don't want to miss this message,
      therefore remove the net_ratelimit() check.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netlink: Bounds-check struct nlmsgerr creation · 710d21fd
      Kees Cook authored
      In preparation for FORTIFY_SOURCE doing bounds checking on memcpy(),
      switch from __nlmsg_put() to nlmsg_put(), and explain the bounds check
      for dealing with the memcpy() across a composite flexible-array struct.
      This avoids the following future run-time warning:
      
        memcpy: detected field-spanning write (size 32) of single field "&errmsg->msg" at net/netlink/af_netlink.c:2447 (size 16)
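      
      For background, a hedged userspace-style sketch of why this copy spans
      fields (it paraphrases the ack path rather than quoting the patch;
      struct nlmsgerr comes from include/uapi/linux/netlink.h):

        #include <linux/netlink.h>
        #include <stdbool.h>
        #include <stdlib.h>
        #include <string.h>

        /* struct nlmsgerr {
         *         int             error;
         *         struct nlmsghdr msg;    // echoed request header ...
         * };                              // ... optionally followed by payload
         */
        static struct nlmsgerr *build_ack(const struct nlmsghdr *req, int err,
                                          bool cap_ack)
        {
                size_t alloc = sizeof(struct nlmsgerr);
                struct nlmsgerr *errmsg;

                /* Without NETLINK_F_CAP_ACK the full request is echoed back. */
                if (!cap_ack)
                        alloc += req->nlmsg_len - sizeof(*req);

                errmsg = malloc(alloc);
                if (!errmsg)
                        return NULL;
                errmsg->error = err;
                /* The write starts at the 16-byte 'msg' field but may extend
                 * past it into the trailing payload: the "field-spanning
                 * write" FORTIFY_SOURCE flags unless the destination bound is
                 * the whole message buffer (hence nlmsg_put() in the patch).
                 */
                memcpy(&errmsg->msg, req,
                       cap_ack ? sizeof(*req) : req->nlmsg_len);
                return errmsg;
        }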
      
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Pablo Neira Ayuso <pablo@netfilter.org>
      Cc: Jozsef Kadlecsik <kadlec@netfilter.org>
      Cc: Florian Westphal <fw@strlen.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: syzbot <syzkaller@googlegroups.com>
      Cc: netfilter-devel@vger.kernel.org
      Cc: coreteam@netfilter.org
      Cc: netdev@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20220901071336.1418572-1-keescook@chromium.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bpf-allocator' · 274052a2
      Daniel Borkmann authored
      Alexei Starovoitov says:
      
      ====================
      Introduce any context BPF specific memory allocator.
      
      Tracing BPF programs can attach to kprobe and fentry. Hence they run in an
      unknown context where calling plain kmalloc() might not be safe. The
      allocator front-ends kmalloc() with a per-cpu cache of free elements and
      refills that cache asynchronously from irq_work, as sketched below.
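      
      A simplified, hedged sketch of that mechanism (names are illustrative;
      the real bpf_mem_cache additionally tracks element counts, watermarks,
      per-cpu instances, and RCU deferral):

        #include <linux/irq_work.h>
        #include <linux/kernel.h>
        #include <linux/llist.h>
        #include <linux/slab.h>

        struct obj_cache {
                struct llist_head free;  /* lock-free freelist of elements */
                struct irq_work refill;
                int unit_size;           /* >= sizeof(struct llist_node) */
        };

        static void cache_refill(struct irq_work *work)
        {
                struct obj_cache *c = container_of(work, struct obj_cache, refill);
                int i;

                /* irq_work runs in a context where kmalloc() is allowed. */
                for (i = 0; i < 16; i++) {
                        struct llist_node *obj = kmalloc(c->unit_size, GFP_NOWAIT);

                        if (!obj)
                                break;
                        llist_add(obj, &c->free);
                }
        }

        static void cache_init(struct obj_cache *c, int unit_size)
        {
                init_llist_head(&c->free);
                init_irq_work(&c->refill, cache_refill);
                c->unit_size = unit_size;
        }

        static void *cache_alloc(struct obj_cache *c)
        {
                /* Never kmalloc() here: this may run in NMI/kprobe context.
                 * Pop a cached element and ask irq_work for an async refill.
                 */
                struct llist_node *n = llist_del_first(&c->free);

                irq_work_queue(&c->refill);
                return n;       /* NULL if the cache was empty */
        }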
      
      Major achievements enabled by bpf_mem_alloc:
      - Dynamically allocated hash maps used to be 10 times slower than fully
        preallocated ones. With bpf_mem_alloc and subsequent optimizations the
        speed of dynamic maps is equal to full prealloc.
      - Tracing bpf programs can use dynamically allocated hash maps, potentially
        saving lots of memory, since a typical hash map is sparsely populated.
      - Sleepable bpf programs can use dynamically allocated hash maps.
      
      Future work:
      - Expose bpf_mem_alloc as uapi FD to be used in dynptr_alloc, kptr_alloc
      - Convert lru map to bpf_mem_alloc
      - Further clean up htab code; for example, htab_use_raw_lock can be removed.
      
      Changelog:
      
      v5->v6:
      - Debugged the reason for selftests/bpf/test_maps OOMing in a small VM that BPF CI is using.
        Added patch 16 that optimizes the usage of rcu_barrier-s between bpf_mem_alloc and
        hash map. It drastically improved the speed of htab destruction.
      
      v4->v5:
      - Fixed missing migrate_disable in hash tab free path (Daniel)
      - Replaced impossible "memory leak" with WARN_ON_ONCE (Martin)
      - Dropped sysctl kernel.bpf_force_dyn_alloc patch (Daniel)
      - Added Andrii's ack
      - Added new patch 15 that removes kmem_cache usage from bpf_mem_alloc.
        It saves memory and speeds up map create/destroy operations
        while maintaining hash map update/delete performance.
      
      v3->v4:
      - fix build issue due to missing local.h on 32-bit arch
      - add Kumar's ack
      - proposal for next steps from Delyan:
      https://lore.kernel.org/bpf/d3f76b27f4e55ec9e400ae8dcaecbb702a4932e8.camel@fb.com/
      
      v2->v3:
      - Rewrote the free_list algorithm based on discussions with Kumar. Patch 1.
      - Allowed sleepable bpf progs to use dynamically allocated maps. Patches 13 and 14.
      - Added sysctl to force bpf_mem_alloc in hash map even if pre-alloc is
        requested to reduce memory consumption. Patch 15.
      - Fix: zero-fill percpu allocation
      - Single rcu_barrier at the end instead of each cpu during bpf_mem_alloc destruction
      
      v2 thread:
      https://lore.kernel.org/bpf/20220817210419.95560-1-alexei.starovoitov@gmail.com/
      
      v1->v2:
      - Moved unsafe direct call_rcu() from hash map into safe place inside bpf_mem_alloc. Patches 7 and 9.
      - Optimized atomic_inc/dec in hash map with percpu_counter. Patch 6.
      - Tuned watermarks per allocation size. Patch 8.
      - Adopted this approach to per-cpu allocation. Patch 10.
      - Fully converted hash map to bpf_mem_alloc. Patch 11.
      - Removed tracing prog restriction on map types. Combination of all patches and final patch 12.
      
      v1 thread:
      https://lore.kernel.org/bpf/20220623003230.37497-1-alexei.starovoitov@gmail.com/
      
      LWN article:
      https://lwn.net/Articles/899274/
      ====================
      
      Link: https://lore.kernel.org/r/
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: Optimize rcu_barrier usage between hash map and bpf_mem_alloc. · 9f2c6e96
      Alexei Starovoitov authored
      User space might be creating and destroying a lot of hash maps. Synchronous
      rcu_barrier-s in the destruction path of a hash map delay the freeing of hash
      buckets and other map memory and may cause an artificial OOM situation under
      stress.
      Optimize rcu_barrier usage between bpf hash map and bpf_mem_alloc:
      - remove rcu_barrier from hash map, since htab doesn't use call_rcu
        directly and there are no callbacks to wait for.
      - bpf_mem_alloc has call_rcu_in_progress flag that indicates pending callbacks.
        Use it to avoid barriers in fast path.
      - When barriers are needed, copy bpf_mem_alloc into a temporary structure
        and wait for the rcu barrier-s in the worker to let the rest of the
        hash map freeing proceed (see the sketch below).
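      
      A hedged sketch of that deferral pattern (illustrative names, not the
      actual htab code):

        #include <linux/rcupdate.h>
        #include <linux/slab.h>
        #include <linux/workqueue.h>

        struct free_deferred {
                struct work_struct work;
                bool callbacks_pending; /* mirrors call_rcu_in_progress */
                /* ... copied allocator state to be freed ... */
        };

        static void free_deferred_workfn(struct work_struct *work)
        {
                struct free_deferred *fd =
                        container_of(work, struct free_deferred, work);

                /* Pay for the barrier only when call_rcu callbacks may still
                 * be in flight; map destruction has already returned.
                 */
                if (fd->callbacks_pending)
                        rcu_barrier();
                /* ... free the copied state here ... */
                kfree(fd);
        }

        static void destroy_async(struct free_deferred *fd)
        {
                INIT_WORK(&fd->work, free_deferred_workfn);
                schedule_work(&fd->work);
        }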
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20220902211058.60789-17-alexei.starovoitov@gmail.com
    • bpf: Remove usage of kmem_cache from bpf_mem_cache. · bfc03c15
      Alexei Starovoitov authored
      For bpf_mem_cache based hash maps the following stress test:
      for (i = 1; i <= 512; i <<= 1)
        for (j = 1; j <= 1 << 18; j <<= 1)
          fd = bpf_map_create(BPF_MAP_TYPE_HASH, NULL, i, j, 2, 0);
      creates many kmem_cache-s that are not mergeable in debug kernels
      and consume an unnecessary amount of memory.
      It turned out that bpf_mem_cache's free_list logic does batching well,
      so using kmem_cache for fixed-size allocations doesn't bring
      any performance benefit over normal kmalloc.
      Hence get rid of kmem_cache in bpf_mem_cache.
      That saves memory and speeds up map create/destroy operations
      while maintaining hash map update/delete performance.
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20220902211058.60789-16-alexei.starovoitov@gmail.com