1. 18 Aug, 2023 20 commits
  2. 17 Aug, 2023 2 commits
    • Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · f54a2a13
      Jakub Kicinski authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf-next 2023-08-16
      
      We've added 17 non-merge commits during the last 6 day(s) which contain
      a total of 20 files changed, 1179 insertions(+), 37 deletions(-).
      
      The main changes are:
      
      1) Add a BPF hook in sys_socket() to change the protocol ID
         from IPPROTO_TCP to IPPROTO_MPTCP to cover migration for legacy
         applications, from Geliang Tang.
      
      2) Follow-up/fallout fix from the SO_REUSEPORT + bpf_sk_assign work
         to fix a splat on non-fullsock sks in inet[6]_steal_sock,
         from Lorenz Bauer.
      
      3) Improvements to struct_ops links to avoid forcing presence of
         update/validate callbacks. Also add bpf_struct_ops fields documentation,
         from David Vernet.
      
      4) Ensure libbpf sets close-on-exec flag on gzopen, from Marco Vedovati.
      
      5) Several new tcx selftest additions and bpftool link show support for
         tcx and xdp links, from Daniel Borkmann.
      
      6) Fix a smatch warning on uninitialized symbol in
         bpf_perf_link_fill_kprobe, from Yafang Shao.
      
      7) BPF selftest fixes e.g. misplaced break in kfunc_call test,
         from Yipeng Zou.
      
      8) Small cleanup to remove unused declaration bpf_link_new_file,
         from Yue Haibing.
      
      9) Small typo fix to bpftool's perf help message, from Daniel T. Lee.
      
      * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
        selftests/bpf: Add mptcpify test
        selftests/bpf: Fix error checks of mptcp open_and_load
        selftests/bpf: Add two mptcp netns helpers
        bpf: Add update_socket_protocol hook
        bpftool: Implement link show support for xdp
        bpftool: Implement link show support for tcx
        selftests/bpf: Add selftest for fill_link_info
        bpf: Fix uninitialized symbol in bpf_perf_link_fill_kprobe()
        net: Fix slab-out-of-bounds in inet[6]_steal_sock
        bpf: Document struct bpf_struct_ops fields
        bpf: Support default .validate() and .update() behavior for struct_ops links
        selftests/bpf: Add various more tcx test cases
        selftests/bpf: Clean up fmod_ret in bench_rename test script
        selftests/bpf: Fix repeat option when kfunc_call verification fails
        libbpf: Set close-on-exec flag on gzopen
        bpftool: fix perf help message
        bpf: Remove unused declaration bpf_link_new_file()
      ====================
      
      Link: https://lore.kernel.org/r/20230816212840.1539-1-daniel@iogearbox.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Revert "net: ethernet: ti: am65-cpsw: add mqprio qdisc offload in channel mode" · 42b118c9
      Jakub Kicinski authored
      This reverts commit 90bc21aa.
      
      Patch was merged too hastily, Vladimir requested changes in:
      https://lore.kernel.org/all/20230816121305.5dio5tk3chge2ndh@skbuf/

      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  3. 16 Aug, 2023 18 commits
    • Merge branch 'bpf: Force to MPTCP' · de405373
      Martin KaFai Lau authored
      Geliang Tang says:
      
      ====================
      As is described in the "How to use MPTCP?" section in MPTCP wiki [1]:
      
      "Your app should create sockets with IPPROTO_MPTCP as the proto:
      ( socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP); ). Legacy apps can be
      forced to create and use MPTCP sockets instead of TCP ones via the
      mptcpize command bundled with the mptcpd daemon."
      
      But the mptcpize (LD_PRELOAD technique) command has some limitations
      [2]:
      
       - it doesn't work if the application is not using libc (e.g. GoLang
      apps)
       - in some envs, it might not be easy to set env vars / change the way
      apps are launched, e.g. on Android
       - mptcpize needs to be launched with all apps that want MPTCP: we could
      have more control from BPF to enable MPTCP only for some apps or all the
      ones of a netns or a cgroup, etc.
       - it is not in BPF, we cannot talk about it at netdev conf.
      
      So this patchset attempts to use BPF to implement functions similar to
      mptcpize.
      
      The main idea is to add a hook in sys_socket() to change the protocol id
      from IPPROTO_TCP (or 0) to IPPROTO_MPTCP.
      
      [1]
      https://github.com/multipath-tcp/mptcp_net-next/wiki
      [2]
      https://github.com/multipath-tcp/mptcp_net-next/issues/79
      
      v14:
       - Use getsockopt(MPTCP_INFO) to verify mptcp protocol instead of using
      nstat command.
      
      v13:
       - drop "Use random netns name for mptcp" patch.
      
      v12:
       - update diag_* log of update_socket_protocol.
       - add 'ip netns show' after 'ip netns del' to check whether a test
      did not clean up its netns.
       - return libbpf_get_error() instead of -EIO for the error from
      open_and_load().
       - Use getsockopt(SOL_PROTOCOL) to verify mptcp protocol instead of
      using 'ss -tOni'.
      
      v11:
       - add comments about outputs of 'ss' and 'nstat'.
       - use "err = verify_mptcpify()" instead of using =+.
      
      v10:
       - drop "#ifdef CONFIG_BPF_JIT".
       - include vmlinux.h and bpf_tracing_net.h to avoid defining some
      macros.
       - drop unneeded checks for mptcp.
      
      v9:
       - update comment for 'update_socket_protocol'.
      
      v8:
       - drop the additional checks on the 'protocol' value after the
      'update_socket_protocol()' call.
      
      v7:
       - add __weak and __diag_* for update_socket_protocol.
      
      v6:
       - add update_socket_protocol.
      
      v5:
       - add bpf_mptcpify helper.
      
      v4:
       - use lsm_cgroup/socket_create
      
      v3:
       - patch 8: char cmd[128]; -> char cmd[256];
      
      v2:
       - Fix build selftests errors reported by CI
      
      Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/79
      ====================
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • selftests/bpf: Add mptcpify test · ddba1224
      Geliang Tang authored
      Implement a new test program mptcpify: if the family is AF_INET or
      AF_INET6, the type is SOCK_STREAM, and the protocol ID is 0 or
      IPPROTO_TCP, set it to IPPROTO_MPTCP. It will be hooked in
      update_socket_protocol().
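
      The decision rule described above can be sketched as a plain C function.
      This is a userspace model of the rule only, not the BPF program from the
      selftest; the function name mptcpify_protocol() is hypothetical, and the
      fallback value 262 for IPPROTO_MPTCP is only used when the libc headers
      do not already define it.

```c
#include <sys/socket.h>
#include <netinet/in.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262 /* fallback for older libc headers */
#endif

/*
 * Hypothetical helper: return the protocol a socket() call should use
 * after applying the mptcpify rule. Only IPv4/IPv6 stream sockets that
 * ask for TCP (explicitly or via protocol 0) are rewritten.
 */
static int mptcpify_protocol(int family, int type, int protocol)
{
	if ((family == AF_INET || family == AF_INET6) &&
	    type == SOCK_STREAM &&
	    (protocol == 0 || protocol == IPPROTO_TCP))
		return IPPROTO_MPTCP;

	/* Everything else keeps the protocol the caller asked for. */
	return protocol;
}
```

      Any other combination, for example a SOCK_DGRAM socket or an AF_UNIX
      socket, is returned unchanged.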
      
      Extend the MPTCP test base, add a selftest test_mptcpify() for the
      mptcpify case. Open and load the mptcpify test prog to mptcpify the
      TCP sockets dynamically, then use start_server() and connect_to_fd()
      to create a TCP socket, but actually what's created is an MPTCP
      socket, which can be verified through 'getsockopt(SOL_PROTOCOL)'
      and 'getsockopt(MPTCP_INFO)'.
      Acked-by: Yonghong Song <yonghong.song@linux.dev>
      Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Geliang Tang <geliang.tang@suse.com>
      Link: https://lore.kernel.org/r/364e72f307e7bb38382ec7442c182d76298a9c41.1692147782.git.geliang.tang@suse.com
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • selftests/bpf: Fix error checks of mptcp open_and_load · 20774655
      Geliang Tang authored
      Return libbpf_get_error(), instead of -EIO, for the error from
      mptcp_sock__open_and_load().
      
      Load success means prog_fd and map_fd are always valid. So drop these
      unneeded ASSERT_GE checks for them in mptcp run_test().
      Acked-by: Yonghong Song <yonghong.song@linux.dev>
      Signed-off-by: Geliang Tang <geliang.tang@suse.com>
      Link: https://lore.kernel.org/r/db5fcb93293df9ab173edcbaf8252465b80da6f2.1692147782.git.geliang.tang@suse.com
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • selftests/bpf: Add two mptcp netns helpers · 97c9c652
      Geliang Tang authored
      Add two netns helpers for mptcp tests: create_netns() and
      cleanup_netns(). Use them in test_base().
      
      These new helpers will be re-used in the following commits
      introducing new tests.
      Acked-by: Yonghong Song <yonghong.song@linux.dev>
      Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Geliang Tang <geliang.tang@suse.com>
      Link: https://lore.kernel.org/r/7506371fb6c417b401cc9d7365fe455754f4ba3f.1692147782.git.geliang.tang@suse.com
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • bpf: Add update_socket_protocol hook · 0dd061a6
      Geliang Tang authored
      Add a hook named update_socket_protocol in __sys_socket(), for bpf
      progs to attach to and update socket protocol. One use case is to
      force legacy TCP apps to create and use MPTCP sockets instead of
      TCP ones.
      
      Define a fmod_ret set named bpf_mptcp_fmodret_ids, add the hook
      update_socket_protocol into this set, and register it in
      bpf_mptcp_kfunc_init().
      
      Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/79
      Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Acked-by: Yonghong Song <yonghong.song@linux.dev>
      Signed-off-by: Geliang Tang <geliang.tang@suse.com>
      Link: https://lore.kernel.org/r/ac84be00f97072a46f8a72b4e2be46cbb7fa5053.1692147782.git.geliang.tang@suse.com
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • bpftool: Implement link show support for xdp · 053bbf9b
      Daniel Borkmann authored
      Add support to dump XDP link information to bpftool. This reuses the
      recently added show_link_ifindex_{plain,json}(). The XDP link info only
      exposes the ifindex.
      
      Below shows an example link dump output, and a cgroup link is included
      for comparison, too:
      
        # bpftool link
        [...]
        10: cgroup  prog 2466
              cgroup_id 1  attach_type cgroup_inet6_post_bind
        [...]
        16: xdp  prog 2477
              ifindex enp5s0(3)
        [...]
      
      Equivalent json output:
      
        # bpftool link --json
        [...]
        {
          "id": 10,
          "type": "cgroup",
          "prog_id": 2466,
          "cgroup_id": 1,
          "attach_type": "cgroup_inet6_post_bind"
        },
        [...]
        {
          "id": 16,
          "type": "xdp",
          "prog_id": 2477,
          "devname": "enp5s0",
          "ifindex": 3
        }
        [...]
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Quentin Monnet <quentin@isovalent.com>
      Link: https://lore.kernel.org/r/20230816095651.10014-2-daniel@iogearbox.net
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • bpftool: Implement link show support for tcx · e16e6c6d
      Daniel Borkmann authored
      Add support to dump tcx link information to bpftool. This adds a
      common helper show_link_ifindex_{plain,json}() which can be reused
      also for other link types. The plain text and json device output is
      the same format as in bpftool net dump.
      
      Below shows an example link dump output along with a cgroup link
      for comparison:
      
        # bpftool link
        [...]
        10: cgroup  prog 1977
              cgroup_id 1  attach_type cgroup_inet6_post_bind
        [...]
        13: tcx  prog 2053
              ifindex enp5s0(3)  attach_type tcx_ingress
        14: tcx  prog 2080
              ifindex enp5s0(3)  attach_type tcx_egress
        [...]
      
      Equivalent json output:
      
        # bpftool link --json
        [...]
        {
          "id": 10,
          "type": "cgroup",
          "prog_id": 1977,
          "cgroup_id": 1,
          "attach_type": "cgroup_inet6_post_bind"
        },
        [...]
        {
          "id": 13,
          "type": "tcx",
          "prog_id": 2053,
          "devname": "enp5s0",
          "ifindex": 3,
          "attach_type": "tcx_ingress"
        },
        {
          "id": 14,
          "type": "tcx",
          "prog_id": 2080,
          "devname": "enp5s0",
          "ifindex": 3,
          "attach_type": "tcx_egress"
        }
        [...]
      Suggested-by: Yafang Shao <laoar.shao@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Quentin Monnet <quentin@isovalent.com>
      Acked-by: Yafang Shao <laoar.shao@gmail.com>
      Link: https://lore.kernel.org/r/20230816095651.10014-1-daniel@iogearbox.net
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • selftests/bpf: Add selftest for fill_link_info · 23cf7aa5
      Yafang Shao authored
      Add selftest for the fill_link_info of uprobe, kprobe and tracepoint.
      The result:
      
        $ tools/testing/selftests/bpf/test_progs --name=fill_link_info
        #79/1    fill_link_info/kprobe_link_info:OK
        #79/2    fill_link_info/kretprobe_link_info:OK
        #79/3    fill_link_info/kprobe_invalid_ubuff:OK
        #79/4    fill_link_info/tracepoint_link_info:OK
        #79/5    fill_link_info/uprobe_link_info:OK
        #79/6    fill_link_info/uretprobe_link_info:OK
        #79/7    fill_link_info/kprobe_multi_link_info:OK
        #79/8    fill_link_info/kretprobe_multi_link_info:OK
        #79/9    fill_link_info/kprobe_multi_invalid_ubuff:OK
        #79      fill_link_info:OK
        Summary: 1/9 PASSED, 0 SKIPPED, 0 FAILED
      
      The test case for kprobe_multi won't be run on aarch64, as it is not
      supported.
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Yonghong Song <yonghong.song@linux.dev>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: https://lore.kernel.org/bpf/20230813141900.1268-3-laoar.shao@gmail.com
    • bpf: Fix uninitialized symbol in bpf_perf_link_fill_kprobe() · 0aa35162
      Yafang Shao authored
      The commit 1b715e1b ("bpf: Support ->fill_link_info for perf_event") leads
      to the following Smatch static checker warning:
      
          kernel/bpf/syscall.c:3416 bpf_perf_link_fill_kprobe()
          error: uninitialized symbol 'type'.
      
      That can happen when uname is NULL. So fix it by verifying the uname when we
      really need to fill it.
      
      Fixes: 1b715e1b ("bpf: Support ->fill_link_info for perf_event")
      Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Yonghong Song <yonghong.song@linux.dev>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Closes: https://lore.kernel.org/bpf/85697a7e-f897-4f74-8b43-82721bebc462@kili.mountain
      Link: https://lore.kernel.org/bpf/20230813141900.1268-2-laoar.shao@gmail.com
    • Merge branch 'ipv6-expired-routes' · 950fe358
      David S. Miller authored
      Kui-Feng Lee says:
      
      ====================
      Remove expired routes with a separated list of routes.
      
      FIB6 GC walks trees of fib6_tables to remove expired routes. Walking a tree
      can be expensive if the number of routes in a table is big, even if most of
      them are permanent. Checking routes in a separated list of routes having
      expiration will avoid this potential issue.
      
      Background
      ==========
      
      The size of a Linux IPv6 routing table can become a big problem if not
      managed appropriately.  Now, Linux has a garbage collector to remove
      expired routes periodically.  However, this may lead to a situation in
      which the routing path is blocked for a long period due to an
      excessive number of routes.
      
      For example, years ago, there was a commit c7bb4b89 ("ipv6: tcp:
      drop silly ICMPv6 packet too big messages").  The root cause was that
      malicious ICMPv6 packets were sent back for every small packet sent to
      them. These packets add routes with an expiration time that prompts
      the GC to periodically check all routes in the tables, including
      permanent ones.
      
      Why Route Expires
      =================
      
      Users can add IPv6 routes with an expiration time manually. However,
      the Neighbor Discovery protocol may also generate routes that can
      expire.  For example, Router Advertisement (RA) messages may create a
      default route with an expiration time. [RFC 4861] For IPv4, it is not
      possible to set an expiration time for a route, and there is no RA, so
      there is no need to worry about such issues.
      
      Create Routes with Expires
      ==========================
      
      You can create routes with an expiration time using the ip command.
      
      For example,
      
          ip -6 route add 2001:b000:591::3 via fe80::5054:ff:fe12:3457 \
              dev enp0s3 expires 30
      
      The route that has been generated will be deleted automatically in 30
      seconds.
      
      GC of FIB6
      ==========
      
      The function called fib6_run_gc() is responsible for performing
      garbage collection (GC) for the Linux IPv6 stack. It checks for the
      expiration of every route by traversing the trees of routing
      tables. The time taken to traverse a routing table increases with its
      size. Holding the routing table lock during traversal is particularly
      undesirable. Therefore, it is preferable to keep the lock for the
      shortest possible duration.
      
      Solution
      ========
      
      The cause of the issue is keeping the routing table locked during the
      traversal of large trees. To solve this problem, we can create a separate
      list of routes that have expiration. This will prevent GC from checking
      permanent routes.
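
      The idea can be modelled in a few lines of C. This is a simplified
      userspace sketch, not the kernel code: struct route, struct table,
      table_add() and table_gc() are hypothetical names, and a singly linked
      list stands in for whatever list type the patch uses. The gc_next
      pointer plays the role of the gc_link member mentioned in the v1
      changelog.

```c
#include <stdbool.h>
#include <stddef.h>

struct route {
	unsigned long expires;  /* 0 means the route is permanent */
	bool alive;             /* false once GC has removed the route */
	struct route *gc_next;  /* link in the list of expiring routes */
};

struct table {
	struct route *gc_head;  /* only routes with an expiration time */
};

/* Add a route; only routes that can expire are put on the GC list. */
static void table_add(struct table *t, struct route *r, unsigned long expires)
{
	r->expires = expires;
	r->alive = true;
	r->gc_next = NULL;
	if (expires) {
		r->gc_next = t->gc_head;
		t->gc_head = r;
	}
}

/*
 * Remove every route whose expiration time has passed. Returns the
 * number of routes visited, which depends only on the number of
 * expiring routes, not on the number of permanent ones.
 */
static size_t table_gc(struct table *t, unsigned long now)
{
	struct route **pp = &t->gc_head;
	size_t visited = 0;

	while (*pp) {
		struct route *r = *pp;

		visited++;
		if (r->expires <= now) {
			r->alive = false;
			*pp = r->gc_next;   /* unlink from the GC list */
			r->gc_next = NULL;
		} else {
			pp = &r->gc_next;
		}
	}
	return visited;
}
```

      In this model, adding more permanent routes does not change the number
      of entries table_gc() has to visit.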
      
      Result
      ======
      
      We conducted a test to measure the execution times of fib6_gc_timer_cb()
      and observed that the patch speeds up the GC of FIB6. During the test, we
      added 1000, 3000, 6000, and 9000 permanent routes. Additionally, we
      added a route with an expiration time.
      
      Here are the average execution times for the kernel without the patch.
       - 120020 ns with 1000 permanent routes
       - 308920 ns with 3000 ...
       - 581470 ns with 6000 ...
       - 855310 ns with 9000 ...
      
      The kernel with the patch consistently takes around 14000 ns to execute,
      regardless of the number of permanent routes that are installed.
      
      Major changes from v7:
      
       - Fix warnings raised by patchwork.
      
      Major changes from v6:
      
       - Remove unnecessary check of tb6 in fib6_clean_expires_locked().
      
       - Use fib6_clean_expires_locked() instead in fib6_purge_rt().
      
      Major changes from v5:
      
       - Change the order of adding new routes to the GC list and starting
         GC timer.
      
       - Remove time measurements from the test case.
      
       - Stop forcing GC flush.
      
      Major changes from v4:
      
       - Detect existence of 'strace' in the test case.
      
      Major changes from v3:
      
       - Fix the type of arg according to feedback.
      
       - Add 1K temporary routes and 5K permanent routes in the test case.
         Measure time spent on GC with strace.
      
      Major changes from v2:
      
       - Remove unnecessary and incorrect sysctl restoring in the test case.
      
      Major changes from v1:
      
       - Moved gc_link to avoid creating a hole in fib6_info.
      
       - Moved fib6_set_expires*() and fib6_clean_expires*() to the header
         file and inlined. And removed duplicated lines.
      
       - Added a test case.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: fib_tests: Add a test case for IPv6 garbage collection · a63e10da
      Kui-Feng Lee authored
      Add 1000 IPv6 routes with expiration time (with and without an additional
      5000 permanent routes in the background). Wait for a few seconds to make
      sure they are removed correctly.
      
      The expected output of the test looks like the following example.
      
      > Fib6 garbage collection test
      >     TEST: ipv6 route garbage collection [ OK ]
      Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/ipv6: Remove expired routes with a separated list of routes. · 3dec89b1
      Kui-Feng Lee authored
      FIB6 GC walks trees of fib6_tables to remove expired routes. Walking a tree
      can be expensive if the number of routes in a table is big, even if most of
      them are permanent. Checking routes in a separated list of routes having
      expiration will avoid this potential issue.
      Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • e1000e: Use PME poll to circumvent unreliable ACPI wake · d1470851
      Kai-Heng Feng authored
      On some I219 devices, ethernet cable plugging detection only works once
      from PCI D3 state. Subsequent cable plugging does set PME bit correctly,
      but device still doesn't get woken up.
      
      Since I219 connects to the root complex directly, it relies on platform
      firmware (ACPI) to wake it up. In this case, the GPE from _PRW only
      works for first cable plugging but fails to notify the driver for
      subsequent plugging events.
      
      The issue was originally found on CNP, but the same issue can be found
      on ADL too. So work around the issue by continuing to use PME poll after
      the first ACPI wake. As PME poll is always used, the runtime suspend
      restriction for CNP can also be removed.
      Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
      Tested-by: Naama Meir <naamax.meir@linux.intel.com>
      Acked-by: Sasha Neftin <sasha.neftin@intel.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-memcg: Fix scope of sockmem pressure indicators · ac8a5296
      Abel Wu authored
      Now there are two indicators of socket memory pressure that sit inside
      struct mem_cgroup, socket_pressure and tcpmem_pressure, indicating
      memory reclaim pressure in memcg->memory and ->tcpmem respectively.
      
      When in legacy mode (cgroupv1), the socket memory is charged into
      ->tcpmem which is independent of ->memory, so socket_pressure has
      nothing to do with socket's pressure at all. Things could be worse
      by taking socket_pressure into consideration in legacy mode, as a
      pressure in ->memory can lead to premature reclamation/throttling
      in socket.
      
      While for the default mode (cgroupv2), the socket memory is charged
      into ->memory, and ->tcpmem/->tcpmem_pressure are simply not used.
      
      So {socket,tcpmem}_pressure are only used in default/legacy mode
      respectively for indicating socket memory pressure. This patch fixes
      the pieces of code that make mixed use of both.
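
      The intended split can be illustrated with a small model. This is only
      a sketch of the rule stated above, not the kernel implementation:
      struct memcg_model and under_socket_pressure() are hypothetical, and
      socket_pressure is assumed here to be a timestamp until which pressure
      is considered active. Any walk over parent cgroups is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two indicators in struct mem_cgroup. */
struct memcg_model {
	bool tcpmem_pressure;          /* legacy mode (cgroup v1) indicator */
	unsigned long socket_pressure; /* default mode (cgroup v2): assumed
					* to be the time pressure ends */
};

/*
 * Report socket memory pressure using exactly one indicator per mode:
 * tcpmem_pressure in legacy mode, socket_pressure in default mode.
 */
static bool under_socket_pressure(const struct memcg_model *m,
				  bool legacy_mode, unsigned long now)
{
	if (legacy_mode)
		return m->tcpmem_pressure;

	return now < m->socket_pressure;
}
```

      In legacy mode a pending socket_pressure value is ignored, so pressure
      in ->memory no longer throttles sockets charged to ->tcpmem.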
      
      Fixes: 8e8ae645 ("mm: memcontrol: hook up vmpressure to socket pressure")
      Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
      Acked-by: Shakeel Butt <shakeelb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: update maintainer · 7fd034bc
      Louis Peens authored
      Take over maintainership of the nfp driver from Simon as he
      is moving away from Corigine.
      Signed-off-by: Louis Peens <louis.peens@corigine.com>
      Acked-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ti: am65-cpsw: add mqprio qdisc offload in channel mode · 90bc21aa
      Grygorii Strashko authored
      This patch adds MQPRIO Qdisc offload in full 'channel' mode which allows
      not only setting up pri:tc mapping, but also configuring TX shapers on
      external port FIFOs. The K3 CPSW MQPRIO Qdisc offload is expected to work
      with VLAN/priority tagged packets. Non-tagged packets have to be mapped
      only to TC0.
      
      - TX traffic classes must be rated starting from TC that has highest
      priority and with no gaps
      - Traffic classes are used starting from 0, that has highest priority
      - min_rate defines Committed Information Rate (guaranteed)
      - max_rate defines Excess Information Rate (non guaranteed) and offloaded
      as (max_rate[i] - tcX_min_rate[i])
      - VLAN/priority tagged packets mapped to TC0 will exit switch with VLAN tag
      priority 0
      
      The configuration example:
       ethtool -L eth1 tx 5
       ethtool --set-priv-flags eth1 p0-rx-ptype-rrobin off
      
       tc qdisc add dev eth1 parent root handle 100: mqprio num_tc 3 \
       map 0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 \
       queues 1@0 1@1 1@2 hw 1 mode channel \
       shaper bw_rlimit min_rate 0 100mbit 200mbit max_rate 0 101mbit 202mbit
      
       tc qdisc replace dev eth2 handle 100: parent root mqprio num_tc 1 \
       map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 queues 1@0 hw 1
      
       ip link add link eth1 name eth1.100 type vlan id 100
       ip link set eth1.100 type vlan egress 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
      
      In the above example two ports share the same TX CPPI queue 0 for low
      priority traffic. 3 traffic classes are defined for eth1 and mapped to:
      TC0 - low priority, TX CPPI queue 0 -> ext Port 1 fifo0, no rate limit
      TC1 - prio 2, TX CPPI queue 1 -> ext Port 1 fifo1, CIR=100Mbit/s, EIR=1Mbit/s
      TC2 - prio 3, TX CPPI queue 2 -> ext Port 1 fifo2, CIR=200Mbit/s, EIR=2Mbit/s
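
      The EIR values in this mapping follow from the rule that the offloaded
      excess rate is max_rate minus min_rate. A minimal sketch of that
      arithmetic is shown below; the function name eir_from_rates() is
      hypothetical, and treating a max_rate below min_rate as "no excess
      rate" is an assumption, not documented driver behaviour.

```c
#include <stdint.h>

/*
 * Hypothetical helper: Excess Information Rate for one traffic class,
 * in the same unit as the inputs (Mbit/s in the example above).
 * min_rate is the Committed Information Rate (CIR).
 */
static uint64_t eir_from_rates(uint64_t min_rate, uint64_t max_rate)
{
	/* Assumption: a max_rate below min_rate yields no excess rate. */
	return max_rate > min_rate ? max_rate - min_rate : 0;
}
```

      For TC1, min_rate 100 and max_rate 101 give an EIR of 1 Mbit/s; for
      TC2, 200 and 202 give 2 Mbit/s.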
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: Roger Quadros <rogerq@kernel.org>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'inet-data-races' · 569dce3f
      David S. Miller authored
      Eric Dumazet says:
      
      ====================
      inet: socket lock and data-races avoidance
      
      In this series, I converted 20 bits in "struct inet_sock" and made
      them truly atomic.
      
      This allows implementing many IP_ socket options in a lockless
      fashion (no need to acquire socket lock), and fixes data-races
      that were showing up in various KCSAN reports.
      
      I also took care of IP_TTL/IP_MINTTL, but left a few other options
      for another series.
      
      v4: Rebased after recent mptcp changes.
        Added Reviewed-by: tags from Simon (thanks !)
      
      v3: fixed patch 7, feedback from build bot about ipvs set_mcast_loop()
      
      v2: addressed a feedback from a build bot in patch 9 by removing
       unused issk variable in mptcp_setsockopt_sol_ip_set_transparent()
       Added Acked-by: tags from Soheil (thanks !)
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • inet: implement lockless IP_MINTTL · 12af7326
      Eric Dumazet authored
      inet->min_ttl is already read with READ_ONCE().
      
      Implementing IP_MINTTL socket option set/read
      without holding the socket lock is easy.
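
      A userspace analogue of this pattern is sketched below. It is not the
      kernel code: READ_ONCE() and WRITE_ONCE() are kernel macros, and the
      sketch substitutes C11 relaxed atomics to show a single-byte option
      being set and read without a lock. struct inet_model, set_min_ttl()
      and get_min_ttl() are hypothetical names.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the min_ttl field of struct inet_sock. */
struct inet_model {
	_Atomic unsigned char min_ttl;
};

/* Setter side (setsockopt path): one untorn store, no lock taken. */
static void set_min_ttl(struct inet_model *inet, unsigned char ttl)
{
	atomic_store_explicit(&inet->min_ttl, ttl, memory_order_relaxed);
}

/* Reader side (getsockopt or packet path): one untorn load, no lock. */
static unsigned char get_min_ttl(struct inet_model *inet)
{
	return atomic_load_explicit(&inet->min_ttl, memory_order_relaxed);
}
```

      Because the option is a single independent value, no ordering with
      other fields is needed, which is why relaxed accesses suffice in this
      sketch.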
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>