1. 17 May, 2024 14 commits
    • Tom Parkin's avatar
      l2tp: fix ICMP error handling for UDP-encap sockets · 6e828dc6
      Tom Parkin authored
      Since commit a36e185e
      ("udp: Handle ICMP errors for tunnels with same destination port on both endpoints")
      UDP's handling of ICMP errors has allowed for UDP-encap tunnels to
      determine socket associations in scenarios where the UDP hash lookup
      could not.
      
      Subsequently, commit d26796ae
      ("udp: check udp sock encap_type in __udp_lib_err")
      subtly tweaked the approach such that UDP ICMP error handling would be
      skipped for any UDP socket which has encapsulation enabled.
      
      In the case of L2TP tunnel sockets using UDP-encap, this latter
      modification effectively broke ICMP error reporting for the L2TP
      control plane.
      
      To a degree this isn't catastrophic inasmuch as the L2TP control
      protocol defines a reliable transport on top of the underlying packet
      switching network which will eventually detect errors and time out.
      
      However, paying attention to the ICMP error reporting allows for more
      timely detection of errors in L2TP userspace, and aids in debugging
      connectivity issues.
      
      Reinstate ICMP error handling for UDP encap L2TP tunnels:
      
       * implement struct udp_tunnel_sock_cfg .encap_err_rcv in order to allow
         the L2TP code to handle ICMP errors;
      
       * only implement error-handling for tunnels which have a managed
         socket: unmanaged tunnels using a kernel socket have no userspace to
         report errors back to;
      
       * flag the error on the socket, which allows for userspace to get an
         error such as -ECONNREFUSED back from sendmsg/recvmsg;
      
       * pass the error into ip[v6]_icmp_error(), which allows userspace to
         get extended error information via MSG_ERRQUEUE.
      
      Fixes: d26796ae ("udp: check udp sock encap_type in __udp_lib_err")
      Signed-off-by: Tom Parkin <tparkin@katalix.com>
      Link: https://lore.kernel.org/r/20240513172248.623261-1-tparkin@katalix.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      6e828dc6
    • David S. Miller's avatar
      Merge branch 'wangxun-fixes' · f6f25eeb
      David S. Miller authored
      Jiawen Wu says:
      
      ====================
      Wangxun fixes
      
      Fixed some bugs when using ethtool to operate network devices.
      
      v4 -> v5:
      - Simplify if...else... to fix features.
      
      v3 -> v4:
      - Require both ctag and stag to be enabled or disabled.
      
      v2 -> v3:
      - Drop the first patch.
      
      v1 -> v2:
      - Factor out the same code.
      - Remove statistics printing with more than 64 queues.
      - Detail the commit logs to describe issues.
      - Remove reset flag check in wx_update_stats().
      - Change to set VLAN CTAG and STAG to be consistent.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f6f25eeb
    • Jiawen Wu's avatar
      net: txgbe: fix to control VLAN strip · 1d3c6414
      Jiawen Wu authored
      When VLAN tag strip is changed to enable or disable, the hardware requires
      the Rx ring to be in a disabled state, otherwise the feature cannot be
      changed.
      
      Fixes: f3b03c65 ("net: wangxun: Implement vlan add and kill functions")
      Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1d3c6414
    • Jiawen Wu's avatar
      net: wangxun: match VLAN CTAG and STAG features · ac71ab78
      Jiawen Wu authored
      The hardware requires that the VLAN CTAG and STAG configurations always
      match. Whichever of the two changes, the other must be changed to match
      as well.
      
      Fixes: 6670f1ec ("net: txgbe: Add netdev features support")
      Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
      Reviewed-by: Sai Krishna <saikrishnag@marvell.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ac71ab78
    • Jiawen Wu's avatar
      net: wangxun: fix to change Rx features · 68067f06
      Jiawen Wu authored
      Fix the issue where some Rx features cannot be changed.
      
      When using ethtool -K to turn off an Rx offload, it returns an error and
      displays "Could not change any device features", and netdev->features
      is not assigned the new value needed to actually configure the hardware.
      
      Fixes: 6dbedcff ("net: libwx: Implement xx_set_features ops")
      Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      68067f06
    • Eric Dumazet's avatar
      af_packet: do not call packet_read_pending() from tpacket_destruct_skb() · 581073f6
      Eric Dumazet authored
      trafgen performance dropped considerably on hosts with many cores
      after the blamed commit.
      
      packet_read_pending() is very expensive, and calling it
      in the af_packet fast path defeats Daniel's intent in commit
      b0138408 ("packet: use percpu mmap tx frame pending refcount").
      
      tpacket_destruct_skb() makes room for one packet, so we can immediately
      wake up a producer; there is no need to completely drain the tx ring.
      
      Fixes: 89ed5b51 ("af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Link: https://lore.kernel.org/r/20240515163358.4105915-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      581073f6
    • Daniel Jurgens's avatar
      virtio_net: Fix missed rtnl_unlock · fa033def
      Daniel Jurgens authored
      The rtnl_lock would stay locked if allocating promisc_allmulti failed.
      Also changed the allocation to GFP_KERNEL.
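
The class of bug described here (an early return that skips the unlock) is usually avoided with a single exit label. The following is a userspace sketch of that pattern, not the driver code: the lock is modelled by a counter, and update_rx_mode(), fake_lock() and fake_unlock() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

static int lock_depth;          /* stand-in for the rtnl lock */

static void fake_lock(void)
{
    lock_depth++;
}

static void fake_unlock(void)
{
    assert(lock_depth > 0);
    lock_depth--;
}

/* Lets a caller check that the lock is balanced after each call. */
int fake_lock_depth(void)
{
    return lock_depth;
}

/* Every path after fake_lock() leaves through out_unlock. */
int update_rx_mode(bool simulate_alloc_failure)
{
    unsigned char *flags;
    int ret = 0;

    fake_lock();

    flags = simulate_alloc_failure ? NULL : malloc(1);
    if (!flags) {
        ret = -ENOMEM;
        goto out_unlock;        /* a bare "return" here would leak the lock */
    }

    *flags = 1;                 /* placeholder for the real work */
    free(flags);

out_unlock:
    fake_unlock();
    return ret;
}
```

With this shape, a failed allocation and a successful call both leave the lock depth at zero.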
      
      Fixes: ff7c7d9f ("virtio_net: Remove command data from control_buf")
      Reported-by: Eric Dumazet <edumaset@google.com>
      Link: https://lore.kernel.org/netdev/CANn89iLazVaUCvhPm6RPJJ0owra_oFnx7Fhc8d60gV-65ad3WQ@mail.gmail.com/
      Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
      Reviewed-by: Brett Creeley <brett.creeley@amd.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Link: https://lore.kernel.org/r/20240515163125.569743-1-danielj@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      fa033def
    • Eric Dumazet's avatar
      netrom: fix possible dead-lock in nr_rt_ioctl() · e03e7f20
      Eric Dumazet authored
      syzbot loves netrom, and found a possible deadlock in nr_rt_ioctl [1]
      
      Make sure we always acquire nr_node_list_lock before nr_node_lock(nr_node)
      
      [1]
      WARNING: possible circular locking dependency detected
      6.9.0-rc7-syzkaller-02147-g654de42f #0 Not tainted
      ------------------------------------------------------
      syz-executor350/5129 is trying to acquire lock:
       ffff8880186e2070 (&nr_node->node_lock){+...}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:356 [inline]
       ffff8880186e2070 (&nr_node->node_lock){+...}-{2:2}, at: nr_node_lock include/net/netrom.h:152 [inline]
       ffff8880186e2070 (&nr_node->node_lock){+...}-{2:2}, at: nr_dec_obs net/netrom/nr_route.c:464 [inline]
       ffff8880186e2070 (&nr_node->node_lock){+...}-{2:2}, at: nr_rt_ioctl+0x1bb/0x1090 net/netrom/nr_route.c:697
      
      but task is already holding lock:
       ffffffff8f7053b8 (nr_node_list_lock){+...}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:356 [inline]
       ffffffff8f7053b8 (nr_node_list_lock){+...}-{2:2}, at: nr_dec_obs net/netrom/nr_route.c:462 [inline]
       ffffffff8f7053b8 (nr_node_list_lock){+...}-{2:2}, at: nr_rt_ioctl+0x10a/0x1090 net/netrom/nr_route.c:697
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #1 (nr_node_list_lock){+...}-{2:2}:
              lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
              __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
              _raw_spin_lock_bh+0x35/0x50 kernel/locking/spinlock.c:178
              spin_lock_bh include/linux/spinlock.h:356 [inline]
              nr_remove_node net/netrom/nr_route.c:299 [inline]
              nr_del_node+0x4b4/0x820 net/netrom/nr_route.c:355
              nr_rt_ioctl+0xa95/0x1090 net/netrom/nr_route.c:683
              sock_do_ioctl+0x158/0x460 net/socket.c:1222
              sock_ioctl+0x629/0x8e0 net/socket.c:1341
              vfs_ioctl fs/ioctl.c:51 [inline]
              __do_sys_ioctl fs/ioctl.c:904 [inline]
              __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:890
              do_syscall_x64 arch/x86/entry/common.c:52 [inline]
              do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
             entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
      -> #0 (&nr_node->node_lock){+...}-{2:2}:
              check_prev_add kernel/locking/lockdep.c:3134 [inline]
              check_prevs_add kernel/locking/lockdep.c:3253 [inline]
              validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
              __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
              lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
              __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
              _raw_spin_lock_bh+0x35/0x50 kernel/locking/spinlock.c:178
              spin_lock_bh include/linux/spinlock.h:356 [inline]
              nr_node_lock include/net/netrom.h:152 [inline]
              nr_dec_obs net/netrom/nr_route.c:464 [inline]
              nr_rt_ioctl+0x1bb/0x1090 net/netrom/nr_route.c:697
              sock_do_ioctl+0x158/0x460 net/socket.c:1222
              sock_ioctl+0x629/0x8e0 net/socket.c:1341
              vfs_ioctl fs/ioctl.c:51 [inline]
              __do_sys_ioctl fs/ioctl.c:904 [inline]
              __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:890
              do_syscall_x64 arch/x86/entry/common.c:52 [inline]
              do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
             entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
      other info that might help us debug this:
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(nr_node_list_lock);
                                     lock(&nr_node->node_lock);
                                     lock(nr_node_list_lock);
        lock(&nr_node->node_lock);
      
       *** DEADLOCK ***
      
      1 lock held by syz-executor350/5129:
        #0: ffffffff8f7053b8 (nr_node_list_lock){+...}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:356 [inline]
        #0: ffffffff8f7053b8 (nr_node_list_lock){+...}-{2:2}, at: nr_dec_obs net/netrom/nr_route.c:462 [inline]
        #0: ffffffff8f7053b8 (nr_node_list_lock){+...}-{2:2}, at: nr_rt_ioctl+0x10a/0x1090 net/netrom/nr_route.c:697
      
      stack backtrace:
      CPU: 0 PID: 5129 Comm: syz-executor350 Not tainted 6.9.0-rc7-syzkaller-02147-g654de42f #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
      Call Trace:
       <TASK>
        __dump_stack lib/dump_stack.c:88 [inline]
        dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
        check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187
        check_prev_add kernel/locking/lockdep.c:3134 [inline]
        check_prevs_add kernel/locking/lockdep.c:3253 [inline]
        validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
        __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
        lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
        __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
        _raw_spin_lock_bh+0x35/0x50 kernel/locking/spinlock.c:178
        spin_lock_bh include/linux/spinlock.h:356 [inline]
        nr_node_lock include/net/netrom.h:152 [inline]
        nr_dec_obs net/netrom/nr_route.c:464 [inline]
        nr_rt_ioctl+0x1bb/0x1090 net/netrom/nr_route.c:697
        sock_do_ioctl+0x158/0x460 net/socket.c:1222
        sock_ioctl+0x629/0x8e0 net/socket.c:1341
        vfs_ioctl fs/ioctl.c:51 [inline]
        __do_sys_ioctl fs/ioctl.c:904 [inline]
        __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:890
        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
        do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20240515142934.3708038-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      e03e7f20
    • Michal Schmidt's avatar
      idpf: don't skip over ethtool tcp-data-split setting · 67708158
      Michal Schmidt authored
      Disabling tcp-data-split on idpf silently fails:
        # ethtool -G $NETDEV tcp-data-split off
        # ethtool -g $NETDEV | grep 'TCP data split'
        TCP data split:        on
      
      But it works if you also change 'tx' or 'rx':
        # ethtool -G $NETDEV tcp-data-split off tx 256
        # ethtool -g $NETDEV | grep 'TCP data split'
        TCP data split:        off
      
      The bug is in idpf_set_ringparam, where it takes a shortcut out if the
      TX and RX sizes are not changing. Fix it by checking also if the
      tcp-data-split setting remains unchanged. Only then can the soft reset
      be skipped.
      
      Fixes: 9b1aa3ef ("idpf: add get/set for Ethtool's header split ringparam")
      Reported-by: Xu Du <xudu@redhat.com>
      Closes: https://issues.redhat.com/browse/RHEL-36182
      Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
      Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
      Link: https://lore.kernel.org/r/20240515092414.158079-1-mschmidt@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      67708158
    • Sagar Cheluvegowda's avatar
    • Tony Battersby's avatar
      bonding: fix oops during rmmod · a45835a0
      Tony Battersby authored
      "rmmod bonding" causes an oops ever since commit cc317ea3 ("bonding:
      remove redundant NULL check in debugfs function").  Here are the relevant
      functions being called:
      
      bonding_exit()
        bond_destroy_debugfs()
          debugfs_remove_recursive(bonding_debug_root);
          bonding_debug_root = NULL; <--------- SET TO NULL HERE
        bond_netlink_fini()
          rtnl_link_unregister()
            __rtnl_link_unregister()
              unregister_netdevice_many_notify()
                bond_uninit()
                  bond_debug_unregister()
                    (commit removed check for bonding_debug_root == NULL)
                    debugfs_remove()
                    simple_recursive_removal()
                      down_write() -> OOPS
      
      However, reverting the bad commit does not solve the problem completely
      because the original code contains a race that could cause the same
      oops, although it was much less likely to be triggered unintentionally:
      
      CPU1
        rmmod bonding
          bonding_exit()
            bond_destroy_debugfs()
              debugfs_remove_recursive(bonding_debug_root);
      
      CPU2
        echo -bond0 > /sys/class/net/bonding_masters
          bond_uninit()
            bond_debug_unregister()
              if (!bonding_debug_root)
      
      CPU1
              bonding_debug_root = NULL;
      
      So do NOT revert the bad commit (since the removed checks were racy
      anyway), and instead change the order of actions taken during module
      removal.  The same oops can also happen if there is an error during
      module init, so apply the same fix there.
      
      Fixes: cc317ea3 ("bonding: remove redundant NULL check in debugfs function")
      Cc: stable@vger.kernel.org
      Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
      Link: https://lore.kernel.org/r/641f914f-3216-4eeb-87dd-91b78aa97773@cybernetics.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      a45835a0
    • xu xin's avatar
      net/ipv6: Fix route deleting failure when metric equals 0 · bb487272
      xu xin authored
      Problem
      =========
      After commit 67f69513 ("ipv6: Move setting default metric for routes"),
      we noticed that the logic for assigning the default value of fc_metric
      changed in the ioctl path. That is, when a user adds a route with a
      non-zero metric via ioctl(fd, SIOCADDRT, rt), a later attempt to delete
      that route by passing a metric of 0 to ioctl(fd, SIOCDELRT, rt) may
      fail, although iproute can still delete it.
      
      For reference, when the iproute tools delete a route over netlink with
      a metric of 0, as in the following command:
      
      	ip -6 route del fe80::/64 via fe81::5054:ff:fe11:3451 dev eth0 metric 0
      
      the user can still succeed in deleting the route entry with the smallest
      metric.
      
      Root Cause
      ==========
      After commit 67f69513 ("ipv6: Move setting default metric for routes"),
      when ioctl() passes in SIOCDELRT with a zero metric, rtmsg_to_fib6_config()
      sets cfg->fc_metric to a default value (1024) in the kernel, and
      ip6_route_del(), at line 4074 of net/ipv6/route.c, then checks
      
      	if (cfg->fc_metric && cfg->fc_metric != rt->fib6_metric)
      		continue;
      
      The condition is true, so the rest of the procedure (deleting the route)
      is skipped because cfg->fc_metric != rt->fib6_metric. Before that commit,
      cfg->fc_metric was still zero at this point, so the condition was false
      and the deletion went ahead.
      
      Solution
      ========
      To keep the behaviour consistent between netlink and ioctl(), we
      should allow a route to be deleted with a metric value of 0. So we only
      apply the default fc_metric when adding a route.
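
The effect of the change can be traced with a small standalone model of the metric check quoted above. This is a sketch rather than kernel code: the function names are hypothetical, and only the default value of 1024 and the shape of the comparison come from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEFAULT_ROUTE_METRIC 1024u   /* default named in the description */

/* Metric used when adding a route: zero means "use the default". */
uint32_t metric_for_add(uint32_t requested)
{
    return requested ? requested : DEFAULT_ROUTE_METRIC;
}

/* Behaviour described as broken: the default was also applied on delete. */
uint32_t metric_for_del_before_fix(uint32_t requested)
{
    return requested ? requested : DEFAULT_ROUTE_METRIC;
}

/* Behaviour after the fix: a zero metric is passed through on delete. */
uint32_t metric_for_del_after_fix(uint32_t requested)
{
    return requested;
}

/* Mirrors the quoted check: a zero fc_metric matches any route metric. */
bool delete_matches(uint32_t fc_metric, uint32_t fib6_metric)
{
    return !(fc_metric && fc_metric != fib6_metric);
}
```

For a route added with metric 100, a delete request with metric 0 is turned into 1024 by the old path and does not match, while the fixed path keeps 0 and matches.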
      
      CC: stable@vger.kernel.org # 5.4+
      Fixes: 67f69513 ("ipv6: Move setting default metric for routes")
      Co-developed-by: Fan Yu <fan.yu9@zte.com.cn>
      Signed-off-by: Fan Yu <fan.yu9@zte.com.cn>
      Signed-off-by: xu xin <xu.xin16@zte.com.cn>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20240514201102055dD2Ba45qKbLlUMxu_DTHP@zte.com.cn
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      bb487272
    • Hangbin Liu's avatar
      selftests/net: reduce xfrm_policy test time · 988af276
      Hangbin Liu authored
      The check_random_order test adds and gets a large number of xfrm rules,
      which takes a long time on debug kernels and always hits TIMEOUT. Let's
      reduce the test loop and see if it works.
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Link: https://lore.kernel.org/r/20240514095227.2597730-1-liuhangbin@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      988af276
    • Jakub Kicinski's avatar
      Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 52d94c18
      Jakub Kicinski authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2024-05-17
      
      We've added 7 non-merge commits during the last 2 day(s) which contain
      a total of 8 files changed, 20 insertions(+), 9 deletions(-).
      
      The main changes are:
      
      1) Fix KASAN slab-out-of-bounds in percpu_array_map_gen_lookup and add
         BPF selftests to cover this case, from Andrii Nakryiko.
         (Report https://lore.kernel.org/bpf/20240514231155.1004295-1-kuba@kernel.org/)
      
      2) Fix two BPF selftests to adjust for kernel changes after fast-forwarding
         Linus' tree to make BPF CI all green again, from Martin KaFai Lau.
      
      3) Fix libbpf feature detectors when using token_fd by adjusting the
         attribute size for memset to cover the former, also from Andrii Nakryiko.
      
      4) Fix the description of 'src' in ALU instructions for the BPF ISA
         standardization doc, from Puranjay Mohan.
      
      * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        selftests/bpf: Adjust btf_dump test to reflect recent change in file_operations
        selftests/bpf: Adjust test_access_variable_array after a kernel function name change
        selftests/bpf: add more variations of map-in-map situations
        bpf: save extended inner map info for percpu array maps as well
        MAINTAINERS: Update ARM64 BPF JIT maintainer
        bpf, docs: Fix the description of 'src' in ALU instructions
        libbpf: fix feature detectors when using token_fd
      ====================
      
      Link: https://lore.kernel.org/r/20240517001600.23703-1-daniel@iogearbox.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      52d94c18
  2. 16 May, 2024 5 commits
  3. 15 May, 2024 21 commits
    • Andrii Nakryiko's avatar
      selftests/bpf: add more variations of map-in-map situations · 2322113a
      Andrii Nakryiko authored
      Add test cases validating usage of PERCPU_ARRAY and PERCPU_HASH maps as
      inner maps.
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20240515062440.846086-2-andrii@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      2322113a
    • Andrii Nakryiko's avatar
      bpf: save extended inner map info for percpu array maps as well · 9ee98229
      Andrii Nakryiko authored
      ARRAY_OF_MAPS and HASH_OF_MAPS map types have special logic to save
      a few extra fields required for correct operations of ARRAY maps, when
      they are used as inner maps. PERCPU_ARRAY maps have similar
      requirements as they now support generating inline element lookup
      logic. So make sure that both classes of maps are handled correctly.
      Reported-by: Jakub Kicinski <kuba@kernel.org>
      Fixes: db69718b ("bpf: inline bpf_map_lookup_elem() for PERCPU_ARRAY maps")
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20240515062440.846086-1-andrii@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      9ee98229
    • Puranjay Mohan's avatar
      MAINTAINERS: Update ARM64 BPF JIT maintainer · 325423ca
      Puranjay Mohan authored
      Zi Shen Lim is not actively doing kernel development and has decided to
      transfer the responsibility of maintaining the JIT to me.
      
      Add myself as the maintainer for BPF JIT for ARM64 and remove Zi Shen
      Lim.
      Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
      Acked-by: Zi Shen Lim <zlim.lnx@gmail.com>
      Link: https://lore.kernel.org/r/20240514183914.27737-1-puranjay@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      325423ca
    • Puranjay Mohan's avatar
      bpf, docs: Fix the description of 'src' in ALU instructions · 7a803005
      Puranjay Mohan authored
      An ALU instruction's source operand can be the value in the source
      register or the 32-bit immediate value encoded in the instruction. This
      is controlled by the 's' bit of the 'opcode'.
      
      The current description explicitly uses the phrase 'value of the source
      register' when defining the meaning of 'src'.
      
      Change the description to use 'source operand' in place of 'value of the
      source register'.
      Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
      Acked-by: Dave Thaler <dthaler1968@gmail.com>
      Link: https://lore.kernel.org/r/20240514130303.113607-1-puranjay@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      7a803005
    • Andrii Nakryiko's avatar
      libbpf: fix feature detectors when using token_fd · 1de27bba
      Andrii Nakryiko authored
      Adjust `union bpf_attr` size passed to kernel in two feature-detecting
      functions to take into account prog_token_fd field.
      
      Libbpf is avoiding memset()'ing entire `union bpf_attr` by only using
      minimal set of bpf_attr's fields. Two places have been missed when
      wiring BPF token support in libbpf's feature detection logic.
      
      Fix them trivially.
      
      Fixes: f3dcee93 ("libbpf: Wire up token_fd into feature probing logic")
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20240513180804.403775-1-andrii@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      1de27bba
    • Jakub Kicinski's avatar
      621cde16
    • Ronald Wahl's avatar
      net: ks8851: Fix another TX stall caused by wrong ISR flag handling · 317a215d
      Ronald Wahl authored
      Under some circumstances it may happen that the ks8851 Ethernet driver
      stops sending data.
      
      Currently the interrupt handler resets the interrupt status flags in the
      hardware after handling TX. With this approach we may lose interrupts in
      the time window between handling the TX interrupt and resetting the TX
      interrupt status bit.
      
      When all of the three following conditions are true then transmitting
      data stops:
      
        - TX queue is stopped to wait for room in the hardware TX buffer
        - no queued SKBs in the driver (txq) waiting to be written to hw
        - hardware TX buffer is empty and the last TX interrupt was lost
      
      This is because reenabling the TX queue happens when handling the TX
      interrupt status but if the TX status bit has already been cleared then
      this interrupt will never come.
      
      With this commit the interrupt status flags will be cleared before they
      are handled. That way we stop losing interrupts.
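
The lost-interrupt window can be reproduced in a toy model. The sketch below assumes a status register whose bits are cleared by acknowledging them, and uses a hook to inject a TX completion in the middle of the handler; struct fake_hw, IRQ_TX and the function names are illustrative and are not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

#define IRQ_TX 0x1u

/* Toy device: one pending-interrupt status word. */
struct fake_hw {
    unsigned int isr;
};

typedef void (*race_hook)(struct fake_hw *hw);

/* Simulates the hardware completing another TX mid-handler. */
void tx_done_event(struct fake_hw *hw)
{
    hw->isr |= IRQ_TX;
}

/* Old order: handle first, acknowledge afterwards. */
unsigned int handle_irq_ack_last(struct fake_hw *hw, race_hook hook)
{
    unsigned int status = hw->isr;

    assert(hw != NULL);
    if (hook)
        hook(hw);               /* new event lands before the ack */
    hw->isr &= ~status;         /* the ack wipes the new event as well */
    return status & IRQ_TX;
}

/* New order: acknowledge first, then handle. */
unsigned int handle_irq_ack_first(struct fake_hw *hw, race_hook hook)
{
    unsigned int status = hw->isr;

    assert(hw != NULL);
    hw->isr &= ~status;         /* only the events seen so far are cleared */
    if (hook)
        hook(hw);               /* new event stays pending */
    return status & IRQ_TX;
}
```

In the first variant the status word ends up empty even though a second completion occurred, so no further interrupt would fire; in the second variant IRQ_TX is still pending after the handler returns.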
      
      The wrong handling of the ISR flags was there from the beginning but
      with commit 3dc5d445 ("net: ks8851: Fix TX stall caused by TX
      buffer overrun") the issue becomes apparent.
      
      Fixes: 3dc5d445 ("net: ks8851: Fix TX stall caused by TX buffer overrun")
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Simon Horman <horms@kernel.org>
      Cc: netdev@vger.kernel.org
      Cc: stable@vger.kernel.org # 5.10+
      Signed-off-by: Ronald Wahl <ronald.wahl@raritan.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      317a215d
    • Nikolay Aleksandrov's avatar
      net: bridge: mst: fix vlan use-after-free · 3a7c1661
      Nikolay Aleksandrov authored
      syzbot reported a suspicious rcu usage[1] in bridge's mst code. While
      fixing it I noticed that nothing prevents a vlan from being freed while
      the list is walked from the same path (br forward delay timer). Fix the
      rcu usage and also make sure we are not accessing freed memory by making
      br_mst_vlan_set_state use rcu read lock.
      
      [1]
       WARNING: suspicious RCU usage
       6.9.0-rc6-syzkaller #0 Not tainted
       -----------------------------
       net/bridge/br_private.h:1599 suspicious rcu_dereference_protected() usage!
       ...
       stack backtrace:
       CPU: 1 PID: 8017 Comm: syz-executor.1 Not tainted 6.9.0-rc6-syzkaller #0
       Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
       Call Trace:
        <IRQ>
        __dump_stack lib/dump_stack.c:88 [inline]
        dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
        lockdep_rcu_suspicious+0x221/0x340 kernel/locking/lockdep.c:6712
        nbp_vlan_group net/bridge/br_private.h:1599 [inline]
        br_mst_set_state+0x1ea/0x650 net/bridge/br_mst.c:105
        br_set_state+0x28a/0x7b0 net/bridge/br_stp.c:47
        br_forward_delay_timer_expired+0x176/0x440 net/bridge/br_stp_timer.c:88
        call_timer_fn+0x18e/0x650 kernel/time/timer.c:1793
        expire_timers kernel/time/timer.c:1844 [inline]
        __run_timers kernel/time/timer.c:2418 [inline]
        __run_timer_base+0x66a/0x8e0 kernel/time/timer.c:2429
        run_timer_base kernel/time/timer.c:2438 [inline]
        run_timer_softirq+0xb7/0x170 kernel/time/timer.c:2448
        __do_softirq+0x2c6/0x980 kernel/softirq.c:554
        invoke_softirq kernel/softirq.c:428 [inline]
        __irq_exit_rcu+0xf2/0x1c0 kernel/softirq.c:633
        irq_exit_rcu+0x9/0x30 kernel/softirq.c:645
        instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1043 [inline]
        sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1043
        </IRQ>
        <TASK>
       asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
       RIP: 0010:lock_acquire+0x264/0x550 kernel/locking/lockdep.c:5758
       Code: 2b 00 74 08 4c 89 f7 e8 ba d1 84 00 f6 44 24 61 02 0f 85 85 01 00 00 41 f7 c7 00 02 00 00 74 01 fb 48 c7 44 24 40 0e 36 e0 45 <4b> c7 44 25 00 00 00 00 00 43 c7 44 25 09 00 00 00 00 43 c7 44 25
       RSP: 0018:ffffc90013657100 EFLAGS: 00000206
       RAX: 0000000000000001 RBX: 1ffff920026cae2c RCX: 0000000000000001
       RDX: dffffc0000000000 RSI: ffffffff8bcaca00 RDI: ffffffff8c1eaa60
       RBP: ffffc90013657260 R08: ffffffff92efe507 R09: 1ffffffff25dfca0
       R10: dffffc0000000000 R11: fffffbfff25dfca1 R12: 1ffff920026cae28
       R13: dffffc0000000000 R14: ffffc90013657160 R15: 0000000000000246
      
      Fixes: ec7328b5 ("net: bridge: mst: Multiple Spanning Tree (MST) mode")
      Reported-by: syzbot+fa04eb8a56fd923fc5d8@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=fa04eb8a56fd923fc5d8
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3a7c1661
    • Nikolay Aleksandrov's avatar
      selftests: net: bridge: increase IGMP/MLD exclude timeout membership interval · 06080ea2
      Nikolay Aleksandrov authored
      When running the bridge IGMP/MLD selftests on debug kernels we can get
      spurious errors when setting up the IGMP/MLD exclude timeout tests.
      The membership interval is just 3 seconds and the setup has 2
      seconds of sleep plus various validations, so the one second that is left
      is not enough. Increase the membership interval from 3 to 5 seconds to
      make room for the setup validation and 2 seconds of sleep.
      
      Fixes: 34d7ecb3 ("selftests: net: bridge: update IGMP/MLD membership interval value")
      Reported-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      06080ea2
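
The timing argument in the commit message above can be checked with a little arithmetic. This is only an illustrative sketch: `SLEEP_SECONDS` and `validation_budget` are hypothetical names, and the 2-second sleep and the 3- and 5-second intervals are taken from the commit message, not from the selftest source.

```python
SLEEP_SECONDS = 2  # sleep performed by the test setup, per the commit message


def validation_budget(membership_interval):
    """Seconds left for setup validation before the membership expires."""
    return membership_interval - SLEEP_SECONDS


print(validation_budget(3))  # budget with the old 3-second interval
print(validation_budget(5))  # budget with the new 5-second interval
```

With the old interval only one second remains for the validations, which a slow debug kernel can easily exceed; the new interval leaves three.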
    • net: bridge: xmit: make sure we have at least eth header len bytes · 8bd67ebb
      Nikolay Aleksandrov authored
      syzbot triggered an uninit-value [1] error in the bridge device's xmit
      path by sending a short (less than ETH_HLEN bytes) skb. To fix it, check
      whether we can actually pull that many bytes instead of assuming we can.
      
      Tested with dropwatch:
       drop at: br_dev_xmit+0xb93/0x12d0 [bridge] (0xffffffffc06739b3)
       origin: software
       timestamp: Mon May 13 11:31:53 2024 778214037 nsec
       protocol: 0x88a8
       length: 2
       original length: 2
       drop reason: PKT_TOO_SMALL
      
      [1]
      BUG: KMSAN: uninit-value in br_dev_xmit+0x61d/0x1cb0 net/bridge/br_device.c:65
       br_dev_xmit+0x61d/0x1cb0 net/bridge/br_device.c:65
       __netdev_start_xmit include/linux/netdevice.h:4903 [inline]
       netdev_start_xmit include/linux/netdevice.h:4917 [inline]
       xmit_one net/core/dev.c:3531 [inline]
       dev_hard_start_xmit+0x247/0xa20 net/core/dev.c:3547
       __dev_queue_xmit+0x34db/0x5350 net/core/dev.c:4341
       dev_queue_xmit include/linux/netdevice.h:3091 [inline]
       __bpf_tx_skb net/core/filter.c:2136 [inline]
       __bpf_redirect_common net/core/filter.c:2180 [inline]
       __bpf_redirect+0x14a6/0x1620 net/core/filter.c:2187
       ____bpf_clone_redirect net/core/filter.c:2460 [inline]
       bpf_clone_redirect+0x328/0x470 net/core/filter.c:2432
       ___bpf_prog_run+0x13fe/0xe0f0 kernel/bpf/core.c:1997
       __bpf_prog_run512+0xb5/0xe0 kernel/bpf/core.c:2238
       bpf_dispatcher_nop_func include/linux/bpf.h:1234 [inline]
       __bpf_prog_run include/linux/filter.h:657 [inline]
       bpf_prog_run include/linux/filter.h:664 [inline]
       bpf_test_run+0x499/0xc30 net/bpf/test_run.c:425
       bpf_prog_test_run_skb+0x14ea/0x1f20 net/bpf/test_run.c:1058
       bpf_prog_test_run+0x6b7/0xad0 kernel/bpf/syscall.c:4269
       __sys_bpf+0x6aa/0xd90 kernel/bpf/syscall.c:5678
       __do_sys_bpf kernel/bpf/syscall.c:5767 [inline]
       __se_sys_bpf kernel/bpf/syscall.c:5765 [inline]
       __x64_sys_bpf+0xa0/0xe0 kernel/bpf/syscall.c:5765
       x64_sys_call+0x96b/0x3b50 arch/x86/include/generated/asm/syscalls_64.h:322
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xcf/0x1e0 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzbot+a63a1f6a062033cf0f40@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=a63a1f6a062033cf0f40
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8bd67ebb
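
The idea behind the fix above (verify that a full Ethernet header is present before reading it) can be illustrated outside the kernel. The following is a minimal userspace sketch, not the bridge code: `parse_eth_header` is a hypothetical helper, and the only values borrowed from the commit message are the 14-byte header length (`ETH_HLEN`), the 2-byte frame length and the 0x88a8 protocol shown in the dropwatch output.

```python
import struct

ETH_HLEN = 14  # destination MAC (6) + source MAC (6) + EtherType (2)


def parse_eth_header(frame: bytes):
    """Return (dst, src, ethertype), or None if the frame is too short."""
    # Check the length first instead of assuming a full header is present.
    if len(frame) < ETH_HLEN:
        return None  # comparable to dropping the packet as PKT_TOO_SMALL
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:ETH_HLEN])
    return dst, src, ethertype


# A 2-byte frame, like the one in the dropwatch output, is rejected.
short_frame = bytes.fromhex("88a8")
# A frame with zeroed MAC addresses and EtherType 0x88a8 is accepted.
full_frame = bytes(6) + bytes(6) + bytes.fromhex("88a8") + b"payload"

print(parse_eth_header(short_frame))
print(parse_eth_header(full_frame))
```

Returning `None` for short input keeps the parser from reading bytes that were never supplied, which is the userspace analogue of the uninitialized read that KMSAN reported.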
    • Merge tag 'net-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next · 1b294a1f
      Linus Torvalds authored
      Pull networking updates from Jakub Kicinski:
       "Core & protocols:
      
         - Complete rework of garbage collection of AF_UNIX sockets.
      
           AF_UNIX is prone to forming reference count cycles due to fd
           passing functionality. New method based on Tarjan's Strongly
           Connected Components algorithm should be both faster and remove a
           lot of workarounds we accumulated over the years.
      
         - Add TCP fraglist GRO support, allowing chaining multiple TCP
           packets and forwarding them together. Useful for small switches /
           routers which lack basic checksum offload in some scenarios (e.g.
           PPPoE).
      
         - Support using SMP threads for handling packet backlog i.e. packet
           processing from software interfaces and old drivers which don't use
           NAPI. This helps move the processing out of the softirq jumble.
      
         - Continue work of converting from rtnl lock to RCU protection.
      
           Don't require rtnl lock when reading: IPv6 routing FIB, IPv6
           address labels, netdev threaded NAPI sysfs files, bonding driver's
           sysfs files, MPLS devconf, IPv4 FIB rules, netns IDs, tcp metrics,
           TC Qdiscs, neighbor entries, ARP entries via ioctl(SIOCGARP), a lot
           of the link information available via rtnetlink.
      
         - Small optimizations from Eric to UDP wake up handling, memory
           accounting, RPS/RFS implementation, TCP packet sizing etc.
      
         - Allow direct page recycling in the bulk API used by XDP, for +2%
           PPS.
      
         - Support peek with an offset on TCP sockets.
      
         - Add MPTCP APIs for querying last time packets were received/sent/acked
           and whether MPTCP "upgrade" succeeded on a TCP socket.
      
         - Add intra-node communication shortcut to improve SMC performance.
      
         - Add IPv6 (and IPv{4,6}-over-IPv{4,6}) support to the GTP protocol
           driver.
      
         - Add HSR-SAN (RedBOX) mode of operation to the HSR protocol driver.
      
         - Add reset reasons for tracing what caused a TCP reset to be sent.
      
         - Introduce direction attribute for xfrm (IPSec) states. State can be
           used either for input or output packet processing.
      
        Things we sprinkled into general kernel code:
      
         - Add bitmap_{read,write}(), bitmap_size(), expose BYTES_TO_BITS().
      
           This required touch-ups and renaming of a few existing users.
      
         - Add Endian-dependent __counted_by_{le,be} annotations.
      
         - Make building selftests "quieter" by printing summaries like
           "CC object.o" rather than full commands with all the arguments.
      
        Netfilter:
      
         - Use GFP_KERNEL to clone elements, to deal better with OOM
           situations and avoid failures in the .commit step.
      
        BPF:
      
         - Add eBPF JIT for ARCv2 CPUs.
      
         - Support attaching kprobe BPF programs through kprobe_multi link in
           a session mode, meaning, a BPF program is attached to both function
           entry and return, the entry program can decide if the return
           program gets executed and the entry program can share u64 cookie
           value with return program. "Session mode" is a common use-case for
           tetragon and bpftrace.
      
         - Add the ability to specify and retrieve BPF cookie for raw
           tracepoint programs in order to ease migration from classic to raw
           tracepoints.
      
         - Add an internal-only BPF per-CPU instruction for resolving per-CPU
           memory addresses and implement support in x86, ARM64 and RISC-V
           JITs. This allows inlining functions which need to access per-CPU
           state.
      
         - Optimize x86 BPF JIT's emit_mov_imm64, and add support for various
           atomics in bpf_arena which can be JITed as a single x86
           instruction. Support BPF arena on ARM64.
      
         - Add a new bpf_wq API for deferring events and refactor
           process-context bpf_timer code to keep common code where possible.
      
         - Harden the BPF verifier's and/or/xor value tracking.
      
         - Introduce crypto kfuncs to let BPF programs call kernel crypto
           APIs.
      
         - Support bpf_tail_call_static() helper for BPF programs with GCC 13.
      
         - Add bpf_preempt_{disable,enable}() kfuncs in order to allow a BPF
           program to have code sections where preemption is disabled.
      
        Driver API:
      
         - Skip software TC processing completely if all installed rules are
           marked as HW-only, instead of checking the HW-only flag rule by
           rule.
      
         - Add support for configuring PoE (Power over Ethernet), similar to
           the already existing support for PoDL (Power over Data Line)
           config.
      
         - Initial bits of a queue control API, for now allowing a single
           queue to be reset without disturbing packet flow to other queues.
      
         - Common (ethtool) statistics for hardware timestamping.
      
        Tests and tooling:
      
         - Remove the need to create a config file to run the net forwarding
           tests so that a naive "make run_tests" can exercise them.
      
         - Define a method of writing tests which require an external endpoint
           to communicate with (to send/receive data towards the test
           machine). Add a few such tests.
      
         - Create a shared code library for writing Python tests. Expose the
           YAML Netlink library from tools/ to the tests for easy Netlink
           access.
      
         - Move netfilter tests under net/, extend them, separate performance
           tests from correctness tests, and iron out issues found by running
           them "on every commit".
      
         - Refactor BPF selftests to use common network helpers.
      
         - Further work filling in YAML definitions of Netlink messages for:
           nftables, team driver, bonding interfaces, vlan interfaces, VF
           info, TC u32 mark, TC police action.
      
         - Teach Python YAML Netlink to decode attribute policies.
      
         - Extend the definition of the "indexed array" construct in the specs
           to cover arrays of scalars rather than just nests.
      
         - Add hyperlinks between definitions in generated Netlink docs.
      
        Drivers:
      
         - Make sure unsupported flower control flags are rejected by drivers,
           and make more drivers report errors directly to the application
           rather than dmesg (large number of driver changes from Asbjørn
           Sloth Tønnesen).
      
         - Ethernet high-speed NICs:
            - Broadcom (bnxt):
               - support multiple RSS contexts and steering traffic to them
               - support XDP metadata
               - make page pool allocations more NUMA aware
            - Intel (100G, ice, idpf):
               - extract datapath code common among Intel drivers into a library
               - use fewer resources in switchdev by sharing queues with the PF
               - add PFCP filter support
               - add Ethernet filter support
               - use a spinlock instead of HW lock in PTP clock ops
               - support 5 layer Tx scheduler topology
            - nVidia/Mellanox:
               - 800G link modes and 100G SerDes speeds
               - per-queue IRQ coalescing configuration
            - Marvell Octeon:
               - support offloading TC packet mark action
      
         - Ethernet NICs consumer, embedded and virtual:
            - stop lying about skb->truesize in USB Ethernet drivers, it
              messes up TCP memory calculations
            - Google cloud vNIC:
               - support changing ring size via ethtool
               - support ring reset using the queue control API
            - VirtIO net:
               - expose flow hash from RSS to XDP
               - per-queue statistics
               - add selftests
            - Synopsys (stmmac):
               - support controllers which require an RX clock signal from the
                 MII bus to perform their hardware initialization
            - TI:
               - icssg_prueth: support ICSSG-based Ethernet on AM65x SR1.0 devices
               - icssg_prueth: add SW TX / RX Coalescing based on hrtimers
               - cpsw: minimal XDP support
            - Renesas (ravb):
               - support describing the MDIO bus
            - Realtek (r8169):
               - add support for RTL8168M
            - Microchip Sparx5:
               - matchall and flower actions mirred and redirect
      
         - Ethernet switches:
            - nVidia/Mellanox:
               - improve events processing performance
            - Marvell:
               - add support for MV88E6250 family internal PHYs
            - Microchip:
               - add DCB and DSCP mapping support for KSZ switches
               - vsc73xx: convert to PHYLINK
            - Realtek:
               - rtl8226b/rtl8221b: add C45 instances and SerDes switching
      
         - Many driver changes related to PHYLIB and PHYLINK deprecated API
           cleanup
      
         - Ethernet PHYs:
            - Add a new driver for Airoha EN8811H 2.5 Gigabit PHY.
            - micrel: lan8814: add support for PPS out and external timestamp trigger
      
         - WiFi:
            - Disable Wireless Extensions (WEXT) in all Wi-Fi 7 devices
              drivers. Modern devices can only be configured using nl80211.
            - mac80211/cfg80211
               - handle color change per link for WiFi 7 Multi-Link Operation
            - Intel (iwlwifi):
               - don't support puncturing in 5 GHz
               - support monitor mode on passive channels
               - BZ-W device support
               - P2P with HE/EHT support
               - re-add support for firmware API 90
               - provide channel survey information for Automatic Channel Selection
            - MediaTek (mt76):
               - mt7921 LED control
               - mt7925 EHT radiotap support
               - mt7920e PCI support
            - Qualcomm (ath11k):
               - P2P support for QCA6390, WCN6855 and QCA2066
               - support hibernation
               - ieee80211-freq-limit Device Tree property support
            - Qualcomm (ath12k):
               - refactoring in preparation of multi-link support
               - suspend and hibernation support
               - ACPI support
               - debugfs support, including dfs_simulate_radar support
            - RealTek:
               - rtw88: RTL8723CS SDIO device support
               - rtw89: RTL8922AE Wi-Fi 7 PCI device support
               - rtw89: complete features of new WiFi 7 chip 8922AE including
                 BT-coexistence and Wake-on-WLAN
               - rtw89: use BIOS ACPI settings to set TX power and channels
               - rtl8xxxu: enable Management Frame Protection (MFP) support
      
         - Bluetooth:
            - support for Intel BlazarI and Filmore Peak2 (BE201)
            - support for MediaTek MT7921S SDIO
            - initial support for Intel PCIe BT driver
            - remove HCI_AMP support"
      
      * tag 'net-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1827 commits)
        selftests: netfilter: fix packetdrill conntrack testcase
        net: gro: fix napi_gro_cb zeroed alignment
        Bluetooth: btintel_pcie: Refactor and code cleanup
        Bluetooth: btintel_pcie: Fix warning reported by sparse
        Bluetooth: hci_core: Fix not handling hdev->le_num_of_adv_sets=1
        Bluetooth: btintel: Fix compiler warning for multi_v7_defconfig config
        Bluetooth: btintel_pcie: Fix compiler warnings
        Bluetooth: btintel_pcie: Add *setup* function to download firmware
        Bluetooth: btintel_pcie: Add support for PCIe transport
        Bluetooth: btintel: Export few static functions
        Bluetooth: HCI: Remove HCI_AMP support
        Bluetooth: L2CAP: Fix div-by-zero in l2cap_le_flowctl_init()
        Bluetooth: qca: Fix error code in qca_read_fw_build_info()
        Bluetooth: hci_conn: Use __counted_by() and avoid -Wfamnae warning
        Bluetooth: btintel: Add support for Filmore Peak2 (BE201)
        Bluetooth: btintel: Add support for BlazarI
        LE Create Connection command timeout increased to 20 secs
        dt-bindings: net: bluetooth: Add MediaTek MT7921S SDIO Bluetooth
        Bluetooth: compute LE flow credits based on recvbuf space
        Bluetooth: hci_sync: Use cmd->num_cis instead of magic number
        ...
      1b294a1f
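
The pull request above mentions that the new AF_UNIX garbage collector is based on Tarjan's Strongly Connected Components algorithm. As a rough illustration of that algorithm only, here is a small recursive sketch on a plain dictionary graph. It is not the kernel implementation: the graph representation and the function name are assumptions, and the real collector must additionally decide whether a cycle is still referenced from outside, which this sketch omits.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm. `graph` maps each node to a list of successors."""
    index = {}        # discovery order of each visited node
    lowlink = {}      # smallest index reachable from the node's DFS subtree
    stack = []        # nodes whose component is not yet complete
    on_stack = set()
    components = []

    def visit(node):
        index[node] = lowlink[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        # The node is the root of a component: pop the component's members.
        if lowlink[node] == index[node]:
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(component)

    for node in graph:
        if node not in index:
            visit(node)
    return components


# Sockets "a" and "b" reference each other (a cycle); "c" only references "a".
refs = {"a": ["b"], "b": ["a"], "c": ["a"]}
cycles = [c for c in strongly_connected_components(refs) if len(c) > 1]
print(cycles)
```

Components with more than one member are the reference cycles that plain reference counting cannot free on its own. A self-referencing node would need a separate check, since it forms a component of size one.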
    • Merge tag 'firewire-updates-6.10' of... · b850dc20
      Linus Torvalds authored
      Merge tag 'firewire-updates-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
      
      Pull firewire updates from Takashi Sakamoto:
       "During the development period of the v6.8 kernel, it became evident
        that there was a lack of helper utilities to trace the initial state
        of the bus, while investigating certain PHYs compliant with different
        versions of the IEEE 1394 specification.
      
        This series of changes includes the addition of tracepoints events,
        provided by 'firewire' subsystem. These events enable tracing of how
        firewire core functions during bus reset and asynchronous
        communication over IEEE 1394 bus.
      
        When implementing the tracepoints events, it was found that the
        existing serialization and deserialization helpers for several types
        of asynchronous packets are scattered across both firewire-core and
        firewire-ohci kernel modules. A set of inline functions is newly added
        to address it, along with some KUnit tests, serving as the foundation
        for the tracepoints events. This renders the dispersed code obsolete.
      
        The remaining changes constitute the final steps in phasing out the
        usage of deprecated PCI MSI APIs, in continuation from the previous
        version"
      
      * tag 'firewire-updates-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: (29 commits)
        firewire: obsolete usage of *-objs in Makefile for KUnit test
        firewire: core: remove flag and width from u64 formats of tracepoints events
        firewire: core: fix type of timestamp for async_inbound_template tracepoints events
        firewire: core: add tracepoint event for handling bus reset
        Revert "firewire: core: option to log bus reset initiation"
        firewire: core: add tracepoints events for initiating bus reset
        firewire: ohci: obsolete OHCI_PARAM_DEBUG_BUSRESETS from debug module parameter
        firewire: ohci: add bus-reset event for initial set of handled irq
        firewire: core: add tracepoints event for asynchronous inbound phy packet
        firewire: core/cdev: add tracepoints events for asynchronous phy packet
        firewire: core: add tracepoints events for asynchronous outbound response
        firewire: core: add tracepoint event for asynchronous inbound request
        firewire: core: add tracepoints event for asynchronous inbound response
        firewire: core: add tracepoints events for asynchronous outbound request
        firewire: core: add support for Linux kernel tracepoints
        firewire: core: replace local macros with common inline functions for isochronous packet header
        firewire: core: add common macro to serialize/deserialize isochronous packet header
        firewire: core: obsolete tcode check macros with inline functions
        firewire: ohci: replace hard-coded values with common macros
        firewire: ohci: replace hard-coded values with inline functions for asynchronous packet header
        ...
      b850dc20
    • Merge tag 'for-6.10/dm-changes' of... · 4f8b6f25
      Linus Torvalds authored
      Merge tag 'for-6.10/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      
      Pull device mapper updates from Mike Snitzer:
      
       - Add a dm-crypt optional "high_priority" flag that enables the crypt
         workqueues to use WQ_HIGHPRI.
      
       - Export dm-crypt workqueues via sysfs (by enabling WQ_SYSFS) to allow
         for improved visibility and controls over IO and crypt workqueues.
      
       - Fix dm-crypt to no longer constrain max_segment_size to PAGE_SIZE.
         This limit isn't needed given that the block core provides late bio
         splitting if bio exceeds underlying limits (e.g. max_segment_size).
      
       - Fix dm-crypt crypt_queue's use of WQ_UNBOUND to not use
         WQ_CPU_INTENSIVE because it is meaningless with WQ_UNBOUND.
      
       - Fix various issues with dm-delay target (ranging from a resource
         teardown fix, a fix for hung task when using kthread mode, and other
         improvements that followed from code inspection).
      
      * tag 'for-6.10/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm-delay: remove timer_lock
        dm-delay: change locking to avoid contention
        dm-delay: fix max_delay calculations
        dm-delay: fix hung task introduced by kthread mode
        dm-delay: fix workqueue delay_timer race
        dm-crypt: don't set WQ_CPU_INTENSIVE for WQ_UNBOUND crypt_queue
        dm: use queue_limits_set
        dm-crypt: stop constraining max_segment_size to PAGE_SIZE
        dm-crypt: export sysfs of all workqueues
        dm-crypt: add the optional "high_priority" flag
      4f8b6f25
    • Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 113d1dd9
      Linus Torvalds authored
      Pull SCSI updates from James Bottomley:
       "Updates to the usual drivers (ufs, lpfc, qla2xxx, mpi3mr, libsas).
      
        The major update (which causes a conflict with block, see below) is
        Christoph removing the queue limits and their associated block
        helpers.
      
        The remaining patches are assorted minor fixes and deprecated function
        updates plus a bit of constification"
      
      * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (141 commits)
        scsi: mpi3mr: Sanitise num_phys
        scsi: lpfc: Copyright updates for 14.4.0.2 patches
        scsi: lpfc: Update lpfc version to 14.4.0.2
        scsi: lpfc: Add support for 32 byte CDBs
        scsi: lpfc: Change lpfc_hba hba_flag member into a bitmask
        scsi: lpfc: Introduce rrq_list_lock to protect active_rrq_list
        scsi: lpfc: Clear deferred RSCN processing flag when driver is unloading
        scsi: lpfc: Update logging of protection type for T10 DIF I/O
        scsi: lpfc: Change default logging level for unsolicited CT MIB commands
        scsi: target: Remove unused list 'device_list'
        scsi: iscsi: Remove unused list 'connlist_err'
        scsi: ufs: exynos: Add support for Tensor gs101 SoC
        scsi: ufs: exynos: Add some pa_dbg_ register offsets into drvdata
        scsi: ufs: exynos: Allow max frequencies up to 267Mhz
        scsi: ufs: exynos: Add EXYNOS_UFS_OPT_TIMER_TICK_SELECT option
        scsi: ufs: exynos: Add EXYNOS_UFS_OPT_UFSPR_SECURE option
        scsi: ufs: dt-bindings: exynos: Add gs101 compatible
        scsi: qla2xxx: Fix debugfs output for fw_resource_count
        scsi: qedf: Ensure the copied buf is NUL terminated
        scsi: bfa: Ensure the copied buf is NUL terminated
        ...
      113d1dd9
    • Merge tag 'ata-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux · b2665fe6
      Linus Torvalds authored
      Pull ata updates from Damien Le Moal:
      
       - Convert the bindings for the imx-pata and ahci-da850 drivers to DT
         schemas (from Animesh)
      
       - Correct the code to handle HAS_IOPORT dependencies and conditionally
         compile drivers as needed (from Niklas)
      
       - Correct the legacy_exit() function in the pata_legacy driver to
         properly handle cleanups on driver exit (from Sergey)
      
       - Small code simplification removing the ata_exec_internal_sg()
         function and folding it into its only caller (from me)
      
      * tag 'ata-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
        ata: pata_legacy: make legacy_exit() work again
        ata: libata-core: Remove ata_exec_internal_sg()
        ata: add HAS_IOPORT dependencies
        dt-bindings: ata: ahci-da850: Convert to dtschema
        dt-bindings: ata: imx-pata: Convert to dtschema
      b2665fe6
    • Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux · b47c1823
      Linus Torvalds authored
      Pull fsverity update from Eric Biggers:
       "Fix a false positive kmemleak warning"
      
      * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux:
        fsverity: use register_sysctl_init() to avoid kmemleak warning
      b47c1823
    • Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux · fc883e7a
      Linus Torvalds authored
      Pull fscrypt update from Eric Biggers:
       "Improve the performance of opening unencrypted files on filesystems
        that support fscrypt"
      
      * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux:
        fscrypt: try to avoid refing parent dentry in fscrypt_file_open
      fc883e7a
    • Merge tag 'for-linus-6.10-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux · eafb55a3
      Linus Torvalds authored
      Pull orangefs update from Mike Marshall:
       "Fix out-of-bounds fsid access.
      
        Small fix to quiet warnings from string fortification helpers,
        suggested by Arnd Bergmann"
      
      * tag 'for-linus-6.10-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
        orangefs: fix out-of-bounds fsid access
      eafb55a3
    • Merge tag 'gfs2-for-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 · 9518ae6e
      Linus Torvalds authored
      Pull gfs2 updates from Andreas Gruenbacher:
      
       - Properly fix the glock shrinker this time: it broke in commit "gfs2:
         Make glock lru list scanning safer" and commit "gfs2: fix glock
         shrinker ref issues" wasn't actually enough to fix it
      
       - On unmount, keep glocks around long enough that no more dlm callbacks
         can occur on them
      
       - Some more folio conversion patches from Matthew Wilcox
      
       - Lots of other smaller fixes and cleanups
      
      * tag 'gfs2-for-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: (27 commits)
        gfs2: make timeout values more explicit
        gfs2: Convert gfs2_aspace_writepage() to use a folio
        gfs2: Add a migrate_folio operation for journalled files
        gfs2: Simplify gfs2_read_super
        gfs2: Convert gfs2_page_mkwrite() to use a folio
        gfs2: gfs2_freeze_unlock cleanup
        gfs2: Remove and replace gfs2_glock_queue_work
        gfs2: do_xmote fixes
        gfs2: finish_xmote cleanup
        gfs2: Unlock fewer glocks on unmount
        gfs2: Fix potential glock use-after-free on unmount
        gfs2: Remove ill-placed consistency check
        gfs2: Fix lru_count accounting
        gfs2: Fix "Make glock lru list scanning safer"
        Revert "gfs2: fix glock shrinker ref issues"
        gfs2: Fix "ignore unlock failures after withdraw"
        gfs2: Get rid of unnecessary test_and_set_bit
        gfs2: Don't set GLF_LOCK in gfs2_dispose_glock_lru
        gfs2: Replace gfs2_glock_queue_put with gfs2_glock_put_async
        gfs2: Get rid of gfs2_glock_queue_put in signal_our_withdraw
        ...
      9518ae6e
    • Merge tag 'dlm-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm · 6fffab66
      Linus Torvalds authored
      Pull dlm updates from David Teigland:
       "This set includes some small fixes, and some big internal changes:
      
         - Fix a long standing race between the unlock callback for the last
           lkb struct, and removing the rsb that became unused after the final
           unlock. This could lead different nodes to inconsistent info about
           the rsb master node.
      
         - Remove unnecessary refcounting on callback structs, returning to
           the way things were done in the past.
      
         - Do message processing in softirq context. This allows dlm messages
           to be cleared more quickly and efficiently, reducing long lists of
           incomplete requests. A future change to run callbacks directly from
           this context will make this more effective.
      
         - The softirq message processing involved a number of patches
           changing mutexes to spinlocks and rwlocks, and a fair amount of
           code re-org in preparation.
      
         - Use an rhashtable for rsb structs, rather than our old internal
           hash table implementation. This also required some re-org of lists
           and locks in preparation for the change.
      
         - Drop the dlm_scand kthread, and use timers to clear unused rsb
           structs. Scanning all rsb's periodically was a lot of wasted work.
      
         - Fix recent regression in logic for copying LVB data in user space
           lock requests"
      
      * tag 'dlm-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: (34 commits)
        dlm: return -ENOMEM if ls_recover_buf fails
        dlm: fix sleep in atomic context
        dlm: use rwlock for lkbidr
        dlm: use rwlock for rsb hash table
        dlm: drop dlm_scand kthread and use timers
        dlm: do not use ref counts for rsb in the toss state
        dlm: switch to use rhashtable for rsbs
        dlm: add rsb lists for iteration
        dlm: merge toss and keep hash table lists into one list
        dlm: change to single hashtable lock
        dlm: increment ls_count for dlm_scand
        dlm: do message processing in softirq context
        dlm: use spin_lock_bh for message processing
        dlm: remove schedule in receive path
        dlm: convert ls_recv_active from rw_semaphore to rwlock
        dlm: avoid blocking receive at the end of recovery
        dlm: convert res_lock to spinlock
        dlm: convert ls_waiters_mutex to spinlock
        dlm: drop mutex use in waiters recovery
        dlm: add new struct to save position in dlm_copy_master_names
        ...
      6fffab66
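
The dlm summary above replaces a periodic scan of all rsb structs with timers that fire only for unused ones. The general trade-off can be sketched with a deadline-ordered heap: expiry work then depends on how many entries are due rather than on the total number of entries. This is a generic illustration under assumed names (`ExpiryTimers`, `mark_unused`, `mark_used`, `expire`), not the dlm code or its data structures.

```python
import heapq


class ExpiryTimers:
    """Expire unused entries by deadline instead of scanning every entry."""

    def __init__(self):
        self._deadline = {}  # key -> current deadline of an unused entry
        self._heap = []      # (deadline, key) pairs; may contain stale items

    def mark_unused(self, key, now, timeout):
        """Arm a timer for an entry that just became unused."""
        self._deadline[key] = now + timeout
        heapq.heappush(self._heap, (now + timeout, key))

    def mark_used(self, key):
        """The entry is in use again; its pending heap item becomes stale."""
        self._deadline.pop(key, None)

    def expire(self, now):
        """Return keys whose deadline has passed, touching only due items."""
        expired = []
        while self._heap and self._heap[0][0] <= now:
            deadline, key = heapq.heappop(self._heap)
            # Skip items that were cancelled or re-armed with a new deadline.
            if self._deadline.get(key) == deadline:
                del self._deadline[key]
                expired.append(key)
        return expired


timers = ExpiryTimers()
timers.mark_unused("rsb-a", now=0, timeout=5)
timers.mark_unused("rsb-b", now=0, timeout=10)
timers.mark_used("rsb-b")          # reused before its deadline
print(timers.expire(now=5))        # only the entry that is due
print(timers.expire(now=20))       # the cancelled entry never expires
```

Cancelled timers are discarded lazily when they reach the top of the heap, so marking an entry as used again stays cheap.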
    • Merge tag 'for-6.10-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · a3d1f54d
      Linus Torvalds authored
      Pull btrfs updates from David Sterba:
       "This update brings a few minor performance improvements; otherwise
        there's a lot of refactoring, cleanups and other changes that are
        not user visible.
      
        Performance improvements:
      
         - inline b-tree locking functions, improvement in metadata-heavy
           changes
      
         - relax locking on a range that's being reflinked, allows read
           operations to run in parallel
      
         - speed up NOCOW write checks (throughput +9% on a sample test)
      
         - extent locking ranges have been reduced in several places, namely
           around delayed ref processing
      
        Core:
      
         - more page to folio conversions:
            - relocation
            - send
            - compression
            - inline extent handling
            - super block write and wait
      
         - extent_map structure optimizations:
            - reduced structure size
            - code simplifications
            - add shrinker for allocated objects, the numbers can go high and
              could exhaust memory on smaller systems (reported) as they may
              not get an opportunity to be freed fast enough
      
         - extent locking optimizations:
            - reduce locking ranges where it does not seem to be necessary and
              are safe due to other means of synchronization
            - potential improvements due to lower contention,
              allocation/freeing and state management operations of extent
              state tracking structures
      
         - delayed ref cleanups and simplifications
      
         - updated trace points
      
         - improved error handling, warnings and assertions
      
         - cleanups and refactoring, unification of error handling paths"
      
      * tag 'for-6.10-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (122 commits)
        btrfs: qgroup: fix initialization of auto inherit array
        btrfs: count super block write errors in device instead of tracking folio error state
        btrfs: use the folio iterator in btrfs_end_super_write()
        btrfs: convert super block writes to folio in write_dev_supers()
        btrfs: convert super block writes to folio in wait_dev_supers()
        bio: Export bio_add_folio_nofail to modules
        btrfs: remove duplicate included header from fs.h
        btrfs: add a cached state to extent_clear_unlock_delalloc
        btrfs: push extent lock down in submit_one_async_extent
        btrfs: push lock_extent down in cow_file_range()
        btrfs: move can_cow_file_range_inline() outside of the extent lock
        btrfs: push lock_extent into cow_file_range_inline
        btrfs: push extent lock into cow_file_range
        btrfs: push extent lock into run_delalloc_cow
        btrfs: remove unlock_extent from run_delalloc_compressed
        btrfs: push extent lock down in run_delalloc_nocow
        btrfs: adjust while loop condition in run_delalloc_nocow
        btrfs: push extent lock into run_delalloc_nocow
        btrfs: push the extent lock into btrfs_run_delalloc_range
        btrfs: lock extent when doing inline extent in compression
        ...
      a3d1f54d