1. 11 Jul, 2024 14 commits
    • Paolo Abeni's avatar
      Merge tag 'nf-24-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · d7c199e7
      Paolo Abeni authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following batch contains Netfilter fixes for net:
      
      Patch #1 fixes a bogus WARN_ON splat in nfnetlink_queue.
      
      Patch #2 fixes a crash due to stack overflow in chain loop detection
      	 by using the existing chain validation routines
      
      Both patches from Florian Westphal.
      
      netfilter pull request 24-07-11
      
      * tag 'nf-24-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        netfilter: nf_tables: prefer nft_chain_validate
        netfilter: nfnetlink_queue: drop bogus WARN_ON
      ====================
      
      Link: https://patch.msgid.link/20240711093948.3816-1-pablo@netfilter.orgSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      d7c199e7
    • Paolo Abeni's avatar
      Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · a819ff0c
      Paolo Abeni authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2024-07-11
      
      The following pull-request contains BPF updates for your *net* tree.
      
      We've added 4 non-merge commits during the last 2 day(s) which contain
      a total of 4 files changed, 262 insertions(+), 19 deletions(-).
      
      The main changes are:
      
      1) Fixes for a BPF timer lockup and a use-after-free scenario when timers
         are used concurrently, from Kumar Kartikeya Dwivedi.
      
      2) Fix the argument order in the call to bpf_map_kvcalloc() which could
         otherwise lead to a compilation error, from Mohammad Shehar Yaar Tausif.
      
      bpf-for-netdev
      
      * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        selftests/bpf: Add timer lockup selftest
        bpf: Defer work in bpf_timer_cancel_and_free
        bpf: Fail bpf_timer_cancel when callback is being cancelled
        bpf: fix order of args in call to bpf_map_kvcalloc
      ====================
      
      Link: https://patch.msgid.link/20240711084016.25757-1-daniel@iogearbox.netSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      a819ff0c
    • Daniel Borkmann's avatar
      net, sunrpc: Remap EPERM in case of connection failure in xs_tcp_setup_socket · 626dfed5
      Daniel Borkmann authored
      When using a BPF program on kernel_connect(), the call can return -EPERM. This
      causes xs_tcp_setup_socket() to loop forever, filling up the syslog and causing
      the kernel to potentially freeze up.
      
      Neil suggested:
      
        This will propagate -EPERM up into other layers which might not be ready
        to handle it. It might be safer to map EPERM to an error we would be more
        likely to expect from the network system - such as ECONNREFUSED or ENETDOWN.
      
      ECONNREFUSED as error seems reasonable. For programs setting a different error
      can be out of reach (see handling in 4fbac77d) in particular on kernels
      which do not have f10d0596 ("bpf: Make BPF_PROG_RUN_ARRAY return -err
      instead of allow boolean"), thus given that it is better to simply remap for
      consistent behavior. UDP does handle EPERM in xs_udp_send_request().
      
      Fixes: d74bad4e ("bpf: Hooks for sys_connect")
      Fixes: 4fbac77d ("bpf: Hooks for sys_bind")
      Co-developed-by: default avatarLex Siegel <usiegl00@gmail.com>
      Signed-off-by: default avatarLex Siegel <usiegl00@gmail.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Trond Myklebust <trondmy@kernel.org>
      Cc: Anna Schumaker <anna@kernel.org>
      Link: https://github.com/cilium/cilium/issues/33395
      Link: https://lore.kernel.org/bpf/171374175513.12877.8993642908082014881@noble.neil.brown.name
      Link: https://patch.msgid.link/9069ec1d59e4b2129fc23433349fd5580ad43921.1720075070.git.daniel@iogearbox.netSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      626dfed5
    • Chengen Du's avatar
      net/sched: Fix UAF when resolving a clash · 26488172
      Chengen Du authored
      KASAN reports the following UAF:
      
       BUG: KASAN: slab-use-after-free in tcf_ct_flow_table_process_conn+0x12b/0x380 [act_ct]
       Read of size 1 at addr ffff888c07603600 by task handler130/6469
      
       Call Trace:
        <IRQ>
        dump_stack_lvl+0x48/0x70
        print_address_description.constprop.0+0x33/0x3d0
        print_report+0xc0/0x2b0
        kasan_report+0xd0/0x120
        __asan_load1+0x6c/0x80
        tcf_ct_flow_table_process_conn+0x12b/0x380 [act_ct]
        tcf_ct_act+0x886/0x1350 [act_ct]
        tcf_action_exec+0xf8/0x1f0
        fl_classify+0x355/0x360 [cls_flower]
        __tcf_classify+0x1fd/0x330
        tcf_classify+0x21c/0x3c0
        sch_handle_ingress.constprop.0+0x2c5/0x500
        __netif_receive_skb_core.constprop.0+0xb25/0x1510
        __netif_receive_skb_list_core+0x220/0x4c0
        netif_receive_skb_list_internal+0x446/0x620
        napi_complete_done+0x157/0x3d0
        gro_cell_poll+0xcf/0x100
        __napi_poll+0x65/0x310
        net_rx_action+0x30c/0x5c0
        __do_softirq+0x14f/0x491
        __irq_exit_rcu+0x82/0xc0
        irq_exit_rcu+0xe/0x20
        common_interrupt+0xa1/0xb0
        </IRQ>
        <TASK>
        asm_common_interrupt+0x27/0x40
      
       Allocated by task 6469:
        kasan_save_stack+0x38/0x70
        kasan_set_track+0x25/0x40
        kasan_save_alloc_info+0x1e/0x40
        __kasan_krealloc+0x133/0x190
        krealloc+0xaa/0x130
        nf_ct_ext_add+0xed/0x230 [nf_conntrack]
        tcf_ct_act+0x1095/0x1350 [act_ct]
        tcf_action_exec+0xf8/0x1f0
        fl_classify+0x355/0x360 [cls_flower]
        __tcf_classify+0x1fd/0x330
        tcf_classify+0x21c/0x3c0
        sch_handle_ingress.constprop.0+0x2c5/0x500
        __netif_receive_skb_core.constprop.0+0xb25/0x1510
        __netif_receive_skb_list_core+0x220/0x4c0
        netif_receive_skb_list_internal+0x446/0x620
        napi_complete_done+0x157/0x3d0
        gro_cell_poll+0xcf/0x100
        __napi_poll+0x65/0x310
        net_rx_action+0x30c/0x5c0
        __do_softirq+0x14f/0x491
      
       Freed by task 6469:
        kasan_save_stack+0x38/0x70
        kasan_set_track+0x25/0x40
        kasan_save_free_info+0x2b/0x60
        ____kasan_slab_free+0x180/0x1f0
        __kasan_slab_free+0x12/0x30
        slab_free_freelist_hook+0xd2/0x1a0
        __kmem_cache_free+0x1a2/0x2f0
        kfree+0x78/0x120
        nf_conntrack_free+0x74/0x130 [nf_conntrack]
        nf_ct_destroy+0xb2/0x140 [nf_conntrack]
        __nf_ct_resolve_clash+0x529/0x5d0 [nf_conntrack]
        nf_ct_resolve_clash+0xf6/0x490 [nf_conntrack]
        __nf_conntrack_confirm+0x2c6/0x770 [nf_conntrack]
        tcf_ct_act+0x12ad/0x1350 [act_ct]
        tcf_action_exec+0xf8/0x1f0
        fl_classify+0x355/0x360 [cls_flower]
        __tcf_classify+0x1fd/0x330
        tcf_classify+0x21c/0x3c0
        sch_handle_ingress.constprop.0+0x2c5/0x500
        __netif_receive_skb_core.constprop.0+0xb25/0x1510
        __netif_receive_skb_list_core+0x220/0x4c0
        netif_receive_skb_list_internal+0x446/0x620
        napi_complete_done+0x157/0x3d0
        gro_cell_poll+0xcf/0x100
        __napi_poll+0x65/0x310
        net_rx_action+0x30c/0x5c0
        __do_softirq+0x14f/0x491
      
      The ct may be dropped if a clash has been resolved but is still passed to
      the tcf_ct_flow_table_process_conn function for further usage. This issue
      can be fixed by retrieving ct from skb again after confirming conntrack.
      
      Fixes: 0cc254e5 ("net/sched: act_ct: Offload connections with commit action")
      Co-developed-by: default avatarGerald Yang <gerald.yang@canonical.com>
      Signed-off-by: default avatarGerald Yang <gerald.yang@canonical.com>
      Signed-off-by: default avatarChengen Du <chengen.du@canonical.com>
      Link: https://patch.msgid.link/20240710053747.13223-1-chengen.du@canonical.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      26488172
    • Ronald Wahl's avatar
      net: ks8851: Fix potential TX stall after interface reopen · 7a99afef
      Ronald Wahl authored
      The amount of TX space in the hardware buffer is tracked in the tx_space
      variable. The initial value is currently only set during driver probing.
      
      After closing the interface and reopening it the tx_space variable has
      the last value it had before close. If it is smaller than the size of
      the first send packet after reopeing the interface the queue will be
      stopped. The queue is woken up after receiving a TX interrupt but this
      will never happen since we did not send anything.
      
      This commit moves the initialization of the tx_space variable to the
      ks8851_net_open function right before starting the TX queue. Also query
      the value from the hardware instead of using a hard coded value.
      
      Only the SPI chip variant is affected by this issue because only this
      driver variant actually depends on the tx_space variable in the xmit
      function.
      
      Fixes: 3dc5d445 ("net: ks8851: Fix TX stall caused by TX buffer overrun")
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Simon Horman <horms@kernel.org>
      Cc: netdev@vger.kernel.org
      Cc: stable@vger.kernel.org # 5.10+
      Signed-off-by: default avatarRonald Wahl <ronald.wahl@raritan.com>
      Reviewed-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Link: https://patch.msgid.link/20240709195845.9089-1-rwahl@gmx.deSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      7a99afef
    • Kuniyuki Iwashima's avatar
      udp: Set SOCK_RCU_FREE earlier in udp_lib_get_port(). · 5c0b485a
      Kuniyuki Iwashima authored
      syzkaller triggered the warning [0] in udp_v4_early_demux().
      
      In udp_v[46]_early_demux() and sk_lookup(), we do not touch the refcount
      of the looked-up sk and use sock_pfree() as skb->destructor, so we check
      SOCK_RCU_FREE to ensure that the sk is safe to access during the RCU grace
      period.
      
      Currently, SOCK_RCU_FREE is flagged for a bound socket after being put
      into the hash table.  Moreover, the SOCK_RCU_FREE check is done too early
      in udp_v[46]_early_demux() and sk_lookup(), so there could be a small race
      window:
      
        CPU1                                 CPU2
        ----                                 ----
        udp_v4_early_demux()                 udp_lib_get_port()
        |                                    |- hlist_add_head_rcu()
        |- sk = __udp4_lib_demux_lookup()    |
        |- DEBUG_NET_WARN_ON_ONCE(sk_is_refcounted(sk));
                                             `- sock_set_flag(sk, SOCK_RCU_FREE)
      
      We had the same bug in TCP and fixed it in commit 871019b2 ("net:
      set SOCK_RCU_FREE before inserting socket into hashtable").
      
      Let's apply the same fix for UDP.
      
      [0]:
      WARNING: CPU: 0 PID: 11198 at net/ipv4/udp.c:2599 udp_v4_early_demux+0x481/0xb70 net/ipv4/udp.c:2599
      Modules linked in:
      CPU: 0 PID: 11198 Comm: syz-executor.1 Not tainted 6.9.0-g93bda330 #13
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      RIP: 0010:udp_v4_early_demux+0x481/0xb70 net/ipv4/udp.c:2599
      Code: c5 7a 15 fe bb 01 00 00 00 44 89 e9 31 ff d3 e3 81 e3 bf ef ff ff 89 de e8 2c 74 15 fe 85 db 0f 85 02 06 00 00 e8 9f 7a 15 fe <0f> 0b e8 98 7a 15 fe 49 8d 7e 60 e8 4f 39 2f fe 49 c7 46 60 20 52
      RSP: 0018:ffffc9000ce3fa58 EFLAGS: 00010293
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff8318c92c
      RDX: ffff888036ccde00 RSI: ffffffff8318c2f1 RDI: 0000000000000001
      RBP: ffff88805a2dd6e0 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0001ffffffffffff R12: ffff88805a2dd680
      R13: 0000000000000007 R14: ffff88800923f900 R15: ffff88805456004e
      FS:  00007fc449127640(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fc449126e38 CR3: 000000003de4b002 CR4: 0000000000770ef0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
      PKRU: 55555554
      Call Trace:
       <TASK>
       ip_rcv_finish_core.constprop.0+0xbdd/0xd20 net/ipv4/ip_input.c:349
       ip_rcv_finish+0xda/0x150 net/ipv4/ip_input.c:447
       NF_HOOK include/linux/netfilter.h:314 [inline]
       NF_HOOK include/linux/netfilter.h:308 [inline]
       ip_rcv+0x16c/0x180 net/ipv4/ip_input.c:569
       __netif_receive_skb_one_core+0xb3/0xe0 net/core/dev.c:5624
       __netif_receive_skb+0x21/0xd0 net/core/dev.c:5738
       netif_receive_skb_internal net/core/dev.c:5824 [inline]
       netif_receive_skb+0x271/0x300 net/core/dev.c:5884
       tun_rx_batched drivers/net/tun.c:1549 [inline]
       tun_get_user+0x24db/0x2c50 drivers/net/tun.c:2002
       tun_chr_write_iter+0x107/0x1a0 drivers/net/tun.c:2048
       new_sync_write fs/read_write.c:497 [inline]
       vfs_write+0x76f/0x8d0 fs/read_write.c:590
       ksys_write+0xbf/0x190 fs/read_write.c:643
       __do_sys_write fs/read_write.c:655 [inline]
       __se_sys_write fs/read_write.c:652 [inline]
       __x64_sys_write+0x41/0x50 fs/read_write.c:652
       x64_sys_call+0xe66/0x1990 arch/x86/include/generated/asm/syscalls_64.h:2
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0x4b/0x110 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x4b/0x53
      RIP: 0033:0x7fc44a68bc1f
      Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 e9 cf f5 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 3c d0 f5 ff 48
      RSP: 002b:00007fc449126c90 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 00000000004bc050 RCX: 00007fc44a68bc1f
      RDX: 0000000000000032 RSI: 00000000200000c0 RDI: 00000000000000c8
      RBP: 00000000004bc050 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000032 R11: 0000000000000293 R12: 0000000000000000
      R13: 000000000000000b R14: 00007fc44a5ec530 R15: 0000000000000000
       </TASK>
      
      Fixes: 6acc9b43 ("bpf: Add helper to retrieve socket in BPF")
      Reported-by: default avatarsyzkaller <syzkaller@googlegroups.com>
      Signed-off-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Link: https://patch.msgid.link/20240709191356.24010-1-kuniyu@amazon.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      5c0b485a
    • Florian Westphal's avatar
      netfilter: nf_tables: prefer nft_chain_validate · cff3bd01
      Florian Westphal authored
      nft_chain_validate already performs loop detection because a cycle will
      result in a call stack overflow (ctx->level >= NFT_JUMP_STACK_SIZE).
      
      It also follows maps via ->validate callback in nft_lookup, so there
      appears no reason to iterate the maps again.
      
      nf_tables_check_loops() and all its helper functions can be removed.
      This improves ruleset load time significantly, from 23s down to 12s.
      
      This also fixes a crash bug. Old loop detection code can result in
      unbounded recursion:
      
      BUG: TASK stack guard page was hit at ....
      Oops: stack guard page: 0000 [#1] PREEMPT SMP KASAN
      CPU: 4 PID: 1539 Comm: nft Not tainted 6.10.0-rc5+ #1
      [..]
      
      with a suitable ruleset during validation of register stores.
      
      I can't see any actual reason to attempt to check for this from
      nft_validate_register_store(), at this point the transaction is still in
      progress, so we don't have a full picture of the rule graph.
      
      For nf-next it might make sense to either remove it or make this depend
      on table->validate_state in case we could catch an error earlier
      (for improved error reporting to userspace).
      
      Fixes: 20a69341 ("netfilter: nf_tables: add netlink set API")
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      cff3bd01
    • Florian Westphal's avatar
      netfilter: nfnetlink_queue: drop bogus WARN_ON · 631a4b3d
      Florian Westphal authored
      Happens when rules get flushed/deleted while packet is out, so remove
      this WARN_ON.
      
      This WARN exists in one form or another since v4.14, no need to backport
      this to older releases, hence use a more recent fixes tag.
      
      Fixes: 3f801968 ("netfilter: move nf_reinject into nfnetlink_queue modules")
      Reported-by: default avatarkernel test robot <oliver.sang@intel.com>
      Closes: https://lore.kernel.org/oe-lkp/202407081453.11ac0f63-lkp@intel.comSigned-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      631a4b3d
    • Oleksij Rempel's avatar
      ethtool: netlink: do not return SQI value if link is down · c184cf94
      Oleksij Rempel authored
      Do not attach SQI value if link is down. "SQI values are only valid if
      link-up condition is present" per OpenAlliance specification of
      100Base-T1 Interoperability Test suite [1]. The same rule would apply
      for other link types.
      
      [1] https://opensig.org/automotive-ethernet-specifications/#
      
      Fixes: 80660219 ("ethtool: provide UAPI for PHY Signal Quality Index (SQI)")
      Signed-off-by: default avatarOleksij Rempel <o.rempel@pengutronix.de>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Reviewed-by: default avatarWoojung Huh <woojung.huh@microchip.com>
      Link: https://patch.msgid.link/20240709061943.729381-1-o.rempel@pengutronix.deSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      c184cf94
    • Dmitry Antipov's avatar
      ppp: reject claimed-as-LCP but actually malformed packets · f2aeb730
      Dmitry Antipov authored
      Since 'ppp_async_encode()' assumes valid LCP packets (with code
      from 1 to 7 inclusive), add 'ppp_check_packet()' to ensure that
      LCP packet has an actual body beyond PPP_LCP header bytes, and
      reject claimed-as-LCP but actually malformed data otherwise.
      
      Reported-by: syzbot+ec0723ba9605678b14bf@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=ec0723ba9605678b14bf
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarDmitry Antipov <dmantipov@yandex.ru>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      f2aeb730
    • Kumar Kartikeya Dwivedi's avatar
      selftests/bpf: Add timer lockup selftest · 50bd5a0c
      Kumar Kartikeya Dwivedi authored
      Add a selftest that tries to trigger a situation where two timer callbacks
      are attempting to cancel each other's timer. By running them continuously,
      we hit a condition where both run in parallel and cancel each other.
      
      Without the fix in the previous patch, this would cause a lockup as
      hrtimer_cancel on either side will wait for forward progress from the
      callback.
      
      Ensure that this situation leads to a EDEADLK error.
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20240711052709.2148616-1-memxor@gmail.com
      50bd5a0c
    • Jian Hui Lee's avatar
      net: ethernet: mtk-star-emac: set mac_managed_pm when probing · 8c6790b5
      Jian Hui Lee authored
      The below commit introduced a warning message when phy state is not in
      the states: PHY_HALTED, PHY_READY, and PHY_UP.
      commit 744d23c7 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state")
      
      mtk-star-emac doesn't need mdiobus suspend/resume. To fix the warning
      message during resume, indicate the phy resume/suspend is managed by the
      mac when probing.
      
      Fixes: 744d23c7 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state")
      Signed-off-by: default avatarJian Hui Lee <jianhui.lee@canonical.com>
      Reviewed-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Link: https://patch.msgid.link/20240708065210.4178980-1-jianhui.lee@canonical.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      8c6790b5
    • Vitaly Lifshits's avatar
      e1000e: fix force smbus during suspend flow · 76a0a3f9
      Vitaly Lifshits authored
      Commit 861e8086 ("e1000e: move force SMBUS from enable ulp function
      to avoid PHY loss issue") resolved a PHY access loss during suspend on
      Meteor Lake consumer platforms, but it affected corporate systems
      incorrectly.
      
      A better fix, working for both consumer and corporate systems, was
      proposed in commit bfd546a5 ("e1000e: move force SMBUS near the end
      of enable_ulp function"). However, it introduced a regression on older
      devices, such as [8086:15B8], [8086:15F9], [8086:15BE].
      
      This patch aims to fix the secondary regression, by limiting the scope of
      the changes to Meteor Lake platforms only.
      
      Fixes: bfd546a5 ("e1000e: move force SMBUS near the end of enable_ulp function")
      Reported-by: default avatarTodd Brandt <todd.e.brandt@intel.com>
      Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218940Reported-by: default avatarDieter Mummenschanz <dmummenschanz@web.de>
      Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218936Signed-off-by: default avatarVitaly Lifshits <vitaly.lifshits@intel.com>
      Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> (A Contingent Worker at Intel)
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      Reviewed-by: default avatarSimon Horman <horms@kernel.org>
      Link: https://patch.msgid.link/20240709203123.2103296-1-anthony.l.nguyen@intel.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      76a0a3f9
    • Eric Dumazet's avatar
      tcp: avoid too many retransmit packets · 97a90635
      Eric Dumazet authored
      If a TCP socket is using TCP_USER_TIMEOUT, and the other peer
      retracted its window to zero, tcp_retransmit_timer() can
      retransmit a packet every two jiffies (2 ms for HZ=1000),
      for about 4 minutes after TCP_USER_TIMEOUT has 'expired'.
      
      The fix is to make sure tcp_rtx_probe0_timed_out() takes
      icsk->icsk_user_timeout into account.
      
      Before blamed commit, the socket would not timeout after
      icsk->icsk_user_timeout, but would use standard exponential
      backoff for the retransmits.
      
      Also worth noting that before commit e89688e3 ("net: tcp:
      fix unexcepted socket die when snd_wnd is 0"), the issue
      would last 2 minutes instead of 4.
      
      Fixes: b701a99e ("tcp: Add tcp_clamp_rto_to_user_timeout() helper to improve accuracy")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Reviewed-by: default avatarJason Xing <kerneljasonxing@gmail.com>
      Reviewed-by: default avatarJon Maxwell <jmaxwell37@gmail.com>
      Reviewed-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://patch.msgid.link/20240710001402.2758273-1-edumazet@google.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      97a90635
  2. 10 Jul, 2024 6 commits
    • Alexei Starovoitov's avatar
      Merge branch 'fixes-for-bpf-timer-lockup-and-uaf' · 0c237341
      Alexei Starovoitov authored
      Kumar Kartikeya Dwivedi says:
      
      ====================
      Fixes for BPF timer lockup and UAF
      
      The following patches contain fixes for timer lockups and a
      use-after-free scenario.
      
      This set proposes to fix the following lockup situation for BPF timers.
      
      CPU 1					CPU 2
      
      bpf_timer_cb				bpf_timer_cb
        timer_cb1				  timer_cb2
          bpf_timer_cancel(timer_cb2)		    bpf_timer_cancel(timer_cb1)
            hrtimer_cancel			      hrtimer_cancel
      
      In this case, both callbacks will continue waiting for each other to
      finish synchronously, causing a lockup.
      
      The proposed fix adds support for tracking in-flight cancellations
      *begun by other timer callbacks* for a particular BPF timer.  Whenever
      preparing to call hrtimer_cancel, a callback will increment the target
      timer's counter, then inspect its in-flight cancellations, and if
      non-zero, return -EDEADLK to avoid situations where the target timer's
      callback is waiting for its completion.
      
      This does mean that in cases where a callback is fired and cancelled, it
      will be unable to cancel any timers in that execution. This can be
      alleviated by maintaining the list of waiting callbacks in bpf_hrtimer
      and searching through it to avoid interdependencies, but this may
      introduce additional delays in bpf_timer_cancel, in addition to
      requiring extra state at runtime which may need to be allocated or
      reused from bpf_hrtimer storage. Moreover, extra synchronization is
      needed to delete these elements from the list of waiting callbacks once
      hrtimer_cancel has finished.
      
      The second patch is for a deadlock situation similar to above in
      bpf_timer_cancel_and_free, but also a UAF scenario that can occur if
      timer is armed before entering it, if hrtimer_running check causes the
      hrtimer_cancel call to be skipped.
      
      As seen above, synchronous hrtimer_cancel would lead to deadlock (if
      same callback tries to free its timer, or two timers free each other),
      therefore we queue work onto the global workqueue to ensure outstanding
      timers are cancelled before bpf_hrtimer state is freed.
      
      Further details are in the patches.
      ====================
      
      Link: https://lore.kernel.org/r/20240709185440.1104957-1-memxor@gmail.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      0c237341
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Defer work in bpf_timer_cancel_and_free · a6fcd19d
      Kumar Kartikeya Dwivedi authored
      Currently, the same case as previous patch (two timer callbacks trying
      to cancel each other) can be invoked through bpf_map_update_elem as
      well, or more precisely, freeing map elements containing timers. Since
      this relies on hrtimer_cancel as well, it is prone to the same deadlock
      situation as the previous patch.
      
      It would be sufficient to use hrtimer_try_to_cancel to fix this problem,
      as the timer cannot be enqueued after async_cancel_and_free. Once
      async_cancel_and_free has been done, the timer must be reinitialized
      before it can be armed again. The callback running in parallel trying to
      arm the timer will fail, and freeing bpf_hrtimer without waiting is
      sufficient (given kfree_rcu), and bpf_timer_cb will return
      HRTIMER_NORESTART, preventing the timer from being rearmed again.
      
      However, there exists a UAF scenario where the callback arms the timer
      before entering this function, such that if cancellation fails (due to
      timer callback invoking this routine, or the target timer callback
      running concurrently). In such a case, if the timer expiration is
      significantly far in the future, the RCU grace period expiration
      happening before it will free the bpf_hrtimer state and along with it
      the struct hrtimer, that is enqueued.
      
      Hence, it is clear cancellation needs to occur after
      async_cancel_and_free, and yet it cannot be done inline due to deadlock
      issues. We thus modify bpf_timer_cancel_and_free to defer work to the
      global workqueue, adding a work_struct alongside rcu_head (both used at
      _different_ points of time, so can share space).
      
      Update existing code comments to reflect the new state of affairs.
      
      Fixes: b00628b1 ("bpf: Introduce bpf timers.")
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20240709185440.1104957-3-memxor@gmail.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      a6fcd19d
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Fail bpf_timer_cancel when callback is being cancelled · d4523831
      Kumar Kartikeya Dwivedi authored
      Given a schedule:
      
      timer1 cb			timer2 cb
      
      bpf_timer_cancel(timer2);	bpf_timer_cancel(timer1);
      
      Both bpf_timer_cancel calls would wait for the other callback to finish
      executing, introducing a lockup.
      
      Add an atomic_t count named 'cancelling' in bpf_hrtimer. This keeps
      track of all in-flight cancellation requests for a given BPF timer.
      Whenever cancelling a BPF timer, we must check if we have outstanding
      cancellation requests, and if so, we must fail the operation with an
      error (-EDEADLK) since cancellation is synchronous and waits for the
      callback to finish executing. This implies that we can enter a deadlock
      situation involving two or more timer callbacks executing in parallel
      and attempting to cancel one another.
      
      Note that we avoid incrementing the cancelling counter for the target
      timer (the one being cancelled) if bpf_timer_cancel is not invoked from
      a callback, to avoid spurious errors. The whole point of detecting
      cur->cancelling and returning -EDEADLK is to not enter a busy wait loop
      (which may or may not lead to a lockup). This does not apply in case the
      caller is in a non-callback context, the other side can continue to
      cancel as it sees fit without running into errors.
      
      Background on prior attempts:
      
      Earlier versions of this patch used a bool 'cancelling' bit and used the
      following pattern under timer->lock to publish cancellation status.
      
      lock(t->lock);
      t->cancelling = true;
      mb();
      if (cur->cancelling)
      	return -EDEADLK;
      unlock(t->lock);
      hrtimer_cancel(t->timer);
      t->cancelling = false;
      
      The store outside the critical section could overwrite a parallel
      requests t->cancelling assignment to true, to ensure the parallely
      executing callback observes its cancellation status.
      
      It would be necessary to clear this cancelling bit once hrtimer_cancel
      is done, but lack of serialization introduced races. Another option was
      explored where bpf_timer_start would clear the bit when (re)starting the
      timer under timer->lock. This would ensure serialized access to the
      cancelling bit, but may allow it to be cleared before in-flight
      hrtimer_cancel has finished executing, such that lockups can occur
      again.
      
      Thus, we choose an atomic counter to keep track of all outstanding
      cancellation requests and use it to prevent lockups in case callbacks
      attempt to cancel each other while executing in parallel.
      Reported-by: default avatarDohyun Kim <dohyunkim@google.com>
      Reported-by: default avatarNeel Natu <neelnatu@google.com>
      Fixes: b00628b1 ("bpf: Introduce bpf timers.")
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20240709185440.1104957-2-memxor@gmail.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      d4523831
    • Mohammad Shehar Yaar Tausif's avatar
      bpf: fix order of args in call to bpf_map_kvcalloc · af253aef
      Mohammad Shehar Yaar Tausif authored
      The original function call passed size of smap->bucket before the number of
      buckets which raises the error 'calloc-transposed-args' on compilation.
      
      Vlastimil Babka added:
      
      The order of parameters can be traced back all the way to 6ac99e8f
      ("bpf: Introduce bpf sk local storage") accross several refactorings,
      and that's why the commit is used as a Fixes: tag.
      
      In v6.10-rc1, a different commit 2c321f3f ("mm: change inlined
      allocation helpers to account at the call site") however exposed the
      order of args in a way that gcc-14 has enough visibility to start
      warning about it, because (in !CONFIG_MEMCG case) bpf_map_kvcalloc is
      then a macro alias for kvcalloc instead of a static inline wrapper.
      
      To sum up the warning happens when the following conditions are all met:
      
      - gcc-14 is used (didn't see it with gcc-13)
      - commit 2c321f3f is present
      - CONFIG_MEMCG is not enabled in .config
      - CONFIG_WERROR turns this from a compiler warning to error
      
      Fixes: 6ac99e8f ("bpf: Introduce bpf sk local storage")
      Reviewed-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Tested-by: default avatarChristian Kujau <lists@nerdbynature.de>
      Signed-off-by: default avatarMohammad Shehar Yaar Tausif <sheharyaar48@gmail.com>
      Signed-off-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Link: https://lore.kernel.org/r/20240710100521.15061-2-vbabka@suse.czSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      af253aef
    • Aleksander Jan Bajkowski's avatar
      net: ethernet: lantiq_etop: fix double free in detach · e1533b63
      Aleksander Jan Bajkowski authored
      The number of the currently released descriptor is never incremented
      which results in the same skb being released multiple times.
      
      Fixes: 504d4721 ("MIPS: Lantiq: Add ethernet driver")
      Reported-by: default avatarJoe Perches <joe@perches.com>
      Closes: https://lore.kernel.org/all/fc1bf93d92bb5b2f99c6c62745507cc22f3a7b2d.camel@perches.com/Signed-off-by: default avatarAleksander Jan Bajkowski <olek2@wp.pl>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Link: https://patch.msgid.link/20240708205826.5176-1-olek2@wp.plSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      e1533b63
    • Michal Kubiak's avatar
      i40e: Fix XDP program unloading while removing the driver · 01fc5142
      Michal Kubiak authored
      The commit 6533e558 ("i40e: Fix reset path while removing
      the driver") introduced a new PF state "__I40E_IN_REMOVE" to block
      modifying the XDP program while the driver is being removed.
      Unfortunately, such a change is useful only if the ".ndo_bpf()"
      callback was called out of the rmmod context because unloading the
      existing XDP program is also a part of driver removing procedure.
      In other words, from the rmmod context the driver is expected to
      unload the XDP program without reporting any errors. Otherwise,
      the kernel warning with callstack is printed out to dmesg.
      
      Example failing scenario:
       1. Load the i40e driver.
       2. Load the XDP program.
       3. Unload the i40e driver (using "rmmod" command).
      
      The example kernel warning log:
      
      [  +0.004646] WARNING: CPU: 94 PID: 10395 at net/core/dev.c:9290 unregister_netdevice_many_notify+0x7a9/0x870
      [...]
      [  +0.010959] RIP: 0010:unregister_netdevice_many_notify+0x7a9/0x870
      [...]
      [  +0.002726] Call Trace:
      [  +0.002457]  <TASK>
      [  +0.002119]  ? __warn+0x80/0x120
      [  +0.003245]  ? unregister_netdevice_many_notify+0x7a9/0x870
      [  +0.005586]  ? report_bug+0x164/0x190
      [  +0.003678]  ? handle_bug+0x3c/0x80
      [  +0.003503]  ? exc_invalid_op+0x17/0x70
      [  +0.003846]  ? asm_exc_invalid_op+0x1a/0x20
      [  +0.004200]  ? unregister_netdevice_many_notify+0x7a9/0x870
      [  +0.005579]  ? unregister_netdevice_many_notify+0x3cc/0x870
      [  +0.005586]  unregister_netdevice_queue+0xf7/0x140
      [  +0.004806]  unregister_netdev+0x1c/0x30
      [  +0.003933]  i40e_vsi_release+0x87/0x2f0 [i40e]
      [  +0.004604]  i40e_remove+0x1a1/0x420 [i40e]
      [  +0.004220]  pci_device_remove+0x3f/0xb0
      [  +0.003943]  device_release_driver_internal+0x19f/0x200
      [  +0.005243]  driver_detach+0x48/0x90
      [  +0.003586]  bus_remove_driver+0x6d/0xf0
      [  +0.003939]  pci_unregister_driver+0x2e/0xb0
      [  +0.004278]  i40e_exit_module+0x10/0x5f0 [i40e]
      [  +0.004570]  __do_sys_delete_module.isra.0+0x197/0x310
      [  +0.005153]  do_syscall_64+0x85/0x170
      [  +0.003684]  ? syscall_exit_to_user_mode+0x69/0x220
      [  +0.004886]  ? do_syscall_64+0x95/0x170
      [  +0.003851]  ? exc_page_fault+0x7e/0x180
      [  +0.003932]  entry_SYSCALL_64_after_hwframe+0x71/0x79
      [  +0.005064] RIP: 0033:0x7f59dc9347cb
      [  +0.003648] Code: 73 01 c3 48 8b 0d 65 16 0c 00 f7 d8 64 89 01 48 83
      c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f
      05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 35 16 0c 00 f7 d8 64 89 01 48
      [  +0.018753] RSP: 002b:00007ffffac99048 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
      [  +0.007577] RAX: ffffffffffffffda RBX: 0000559b9bb2f6e0 RCX: 00007f59dc9347cb
      [  +0.007140] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000559b9bb2f748
      [  +0.007146] RBP: 00007ffffac99070 R08: 1999999999999999 R09: 0000000000000000
      [  +0.007133] R10: 00007f59dc9a5ac0 R11: 0000000000000206 R12: 0000000000000000
      [  +0.007141] R13: 00007ffffac992d8 R14: 0000559b9bb2f6e0 R15: 0000000000000000
      [  +0.007151]  </TASK>
      [  +0.002204] ---[ end trace 0000000000000000 ]---
      
      Fix this by checking if the XDP program is being loaded or unloaded.
      Then, block only loading a new program while "__I40E_IN_REMOVE" is set.
      Also, move testing "__I40E_IN_REMOVE" flag to the beginning of XDP_SETUP
      callback to avoid unnecessary operations and checks.
      
      Fixes: 6533e558 ("i40e: Fix reset path while removing the driver")
      Signed-off-by: default avatarMichal Kubiak <michal.kubiak@intel.com>
      Reviewed-by: default avatarMaciej Fijalkowski <maciej.fijalkowski@intel.com>
      Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      Link: https://patch.msgid.link/20240708230750.625986-1-anthony.l.nguyen@intel.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      01fc5142
  3. 09 Jul, 2024 7 commits
    • Hugh Dickins's avatar
      net: fix rc7's __skb_datagram_iter() · f1538310
      Hugh Dickins authored
      X would not start in my old 32-bit partition (and the "n"-handling looks
      just as wrong on 64-bit, but for whatever reason did not show up there):
      "n" must be accumulated over all pages before it's added to "offset" and
      compared with "copy", immediately after the skb_frag_foreach_page() loop.
      
      Fixes: d2d30a37 ("net: allow skb_datagram_iter to be called from any context")
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Reviewed-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Link: https://patch.msgid.link/fef352e8-b89a-da51-f8ce-04bc39ee6481@google.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      f1538310
    • Paolo Abeni's avatar
      Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 528269fe
      Paolo Abeni authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2024-07-09
      
      The following pull-request contains BPF updates for your *net* tree.
      
      We've added 3 non-merge commits during the last 1 day(s) which contain
      a total of 5 files changed, 81 insertions(+), 11 deletions(-).
      
      The main changes are:
      
      1) Fix a use-after-free in a corner case where tcx_entry got released too
         early. Also add BPF test coverage along with the fix, from Daniel Borkmann.
      
      2) Fix a kernel panic on Loongarch in sk_msg_recvmsg() which got triggered
         by running BPF sockmap selftests, from Geliang Tang.
      
      bpf-for-netdev
      
      * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        skmsg: Skip zero length skb in sk_msg_recvmsg
        selftests/bpf: Extend tcx tests to cover late tcx_entry release
        bpf: Fix too early release of tcx_entry
      ====================
      
      Link: https://patch.msgid.link/20240709091452.27840-1-daniel@iogearbox.netSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      528269fe
    • Ronald Wahl's avatar
      net: ks8851: Fix deadlock with the SPI chip variant · 0913ec33
      Ronald Wahl authored
      When SMP is enabled and spinlocks are actually functional then there is
      a deadlock with the 'statelock' spinlock between ks8851_start_xmit_spi
      and ks8851_irq:
      
          watchdog: BUG: soft lockup - CPU#0 stuck for 27s!
          call trace:
            queued_spin_lock_slowpath+0x100/0x284
            do_raw_spin_lock+0x34/0x44
            ks8851_start_xmit_spi+0x30/0xb8
            ks8851_start_xmit+0x14/0x20
            netdev_start_xmit+0x40/0x6c
            dev_hard_start_xmit+0x6c/0xbc
            sch_direct_xmit+0xa4/0x22c
            __qdisc_run+0x138/0x3fc
            qdisc_run+0x24/0x3c
            net_tx_action+0xf8/0x130
            handle_softirqs+0x1ac/0x1f0
            __do_softirq+0x14/0x20
            ____do_softirq+0x10/0x1c
            call_on_irq_stack+0x3c/0x58
            do_softirq_own_stack+0x1c/0x28
            __irq_exit_rcu+0x54/0x9c
            irq_exit_rcu+0x10/0x1c
            el1_interrupt+0x38/0x50
            el1h_64_irq_handler+0x18/0x24
            el1h_64_irq+0x64/0x68
            __netif_schedule+0x6c/0x80
            netif_tx_wake_queue+0x38/0x48
            ks8851_irq+0xb8/0x2c8
            irq_thread_fn+0x2c/0x74
            irq_thread+0x10c/0x1b0
            kthread+0xc8/0xd8
            ret_from_fork+0x10/0x20
      
      This issue has not been identified earlier because tests were done on
      a device with SMP disabled and so spinlocks were actually NOPs.
      
      Now use spin_(un)lock_bh for TX queue related locking to avoid execution
      of softirq work synchronously that would lead to a deadlock.
      
      Fixes: 3dc5d445 ("net: ks8851: Fix TX stall caused by TX buffer overrun")
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Simon Horman <horms@kernel.org>
      Cc: netdev@vger.kernel.org
      Cc: stable@vger.kernel.org # 5.10+
      Signed-off-by: default avatarRonald Wahl <ronald.wahl@raritan.com>
      Reviewed-by: default avatarSimon Horman <horms@kernel.org>
      Link: https://patch.msgid.link/20240706101337.854474-1-rwahl@gmx.deSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      0913ec33
    • Aleksandr Mishin's avatar
      octeontx2-af: Fix incorrect value output on error path in rvu_check_rsrc_availability() · 442e26af
      Aleksandr Mishin authored
      In rvu_check_rsrc_availability() in case of invalid SSOW req, an incorrect
      data is printed to error log. 'req->sso' value is printed instead of
      'req->ssow'. Looks like "copy-paste" mistake.
      
      Fix this mistake by replacing 'req->sso' with 'req->ssow'.
      
      Found by Linux Verification Center (linuxtesting.org) with SVACE.
      
      Fixes: 746ea742 ("octeontx2-af: Add RVU block LF provisioning support")
      Signed-off-by: default avatarAleksandr Mishin <amishin@t-argos.ru>
      Reviewed-by: default avatarSimon Horman <horms@kernel.org>
      Link: https://patch.msgid.link/20240705095317.12640-1-amishin@t-argos.ruSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      442e26af
    • Jakub Kicinski's avatar
      bnxt: fix crashes when reducing ring count with active RSS contexts · 0d1b7d6c
      Jakub Kicinski authored
      bnxt doesn't check if a ring is used by RSS contexts when reducing
      ring count. Core performs a similar check for the drivers for
      the main context, but core doesn't know about additional contexts,
      so it can't validate them. bnxt_fill_hw_rss_tbl_p5() uses ring
      id to index bp->rx_ring[], which without the check may end up
      being out of bounds.
      
        BUG: KASAN: slab-out-of-bounds in __bnxt_hwrm_vnic_set_rss+0xb79/0xe40
        Read of size 2 at addr ffff8881c5809618 by task ethtool/31525
        Call Trace:
        __bnxt_hwrm_vnic_set_rss+0xb79/0xe40
         bnxt_hwrm_vnic_rss_cfg_p5+0xf7/0x460
         __bnxt_setup_vnic_p5+0x12e/0x270
         __bnxt_open_nic+0x2262/0x2f30
         bnxt_open_nic+0x5d/0xf0
         ethnl_set_channels+0x5d4/0xb30
         ethnl_default_set_doit+0x2f1/0x620
      
      Core does track the additional contexts in net-next, so we can
      move this validation out of the driver as a follow up there.
      
      Fixes: b3d0083c ("bnxt_en: Support RSS contexts in ethtool .{get|set}_rxfh()")
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      Reviewed-by: default avatarPavan Chebbi <pavan.chebbi@broadcom.com>
      Link: https://patch.msgid.link/20240705020005.681746-1-kuba@kernel.orgSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      0d1b7d6c
    • Geliang Tang's avatar
      skmsg: Skip zero length skb in sk_msg_recvmsg · f0c18025
      Geliang Tang authored
      When running BPF selftests (./test_progs -t sockmap_basic) on a Loongarch
      platform, the following kernel panic occurs:
      
        [...]
        Oops[#1]:
        CPU: 22 PID: 2824 Comm: test_progs Tainted: G           OE  6.10.0-rc2+ #18
        Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018
           ... ...
           ra: 90000000048bf6c0 sk_msg_recvmsg+0x120/0x560
          ERA: 9000000004162774 copy_page_to_iter+0x74/0x1c0
         CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
         PRMD: 0000000c (PPLV0 +PIE +PWE)
         EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
         ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
        ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
         BADV: 0000000000000040
         PRID: 0014c011 (Loongson-64bit, Loongson-3C5000)
        Modules linked in: bpf_testmod(OE) xt_CHECKSUM xt_MASQUERADE xt_conntrack
        Process test_progs (pid: 2824, threadinfo=0000000000863a31, task=...)
        Stack : ...
        Call Trace:
        [<9000000004162774>] copy_page_to_iter+0x74/0x1c0
        [<90000000048bf6c0>] sk_msg_recvmsg+0x120/0x560
        [<90000000049f2b90>] tcp_bpf_recvmsg_parser+0x170/0x4e0
        [<90000000049aae34>] inet_recvmsg+0x54/0x100
        [<900000000481ad5c>] sock_recvmsg+0x7c/0xe0
        [<900000000481e1a8>] __sys_recvfrom+0x108/0x1c0
        [<900000000481e27c>] sys_recvfrom+0x1c/0x40
        [<9000000004c076ec>] do_syscall+0x8c/0xc0
        [<9000000003731da4>] handle_syscall+0xc4/0x160
        Code: ...
        ---[ end trace 0000000000000000 ]---
        Kernel panic - not syncing: Fatal exception
        Kernel relocated by 0x3510000
         .text @ 0x9000000003710000
         .data @ 0x9000000004d70000
         .bss  @ 0x9000000006469400
        ---[ end Kernel panic - not syncing: Fatal exception ]---
        [...]
      
      This crash happens every time when running sockmap_skb_verdict_shutdown
      subtest in sockmap_basic.
      
      This crash is because a NULL pointer is passed to page_address() in the
      sk_msg_recvmsg(). Due to the different implementations depending on the
      architecture, page_address(NULL) will trigger a panic on Loongarch
      platform but not on x86 platform. So this bug was hidden on x86 platform
      for a while, but now it is exposed on Loongarch platform. The root cause
      is that a zero length skb (skb->len == 0) was put on the queue.
      
      This zero length skb is a TCP FIN packet, which was sent by shutdown(),
      invoked in test_sockmap_skb_verdict_shutdown():
      
      	shutdown(p1, SHUT_WR);
      
      In this case, in sk_psock_skb_ingress_enqueue(), num_sge is zero, and no
      page is put to this sge (see sg_set_page in sg_set_page), but this empty
      sge is queued into ingress_msg list.
      
      And in sk_msg_recvmsg(), this empty sge is used, and a NULL page is got by
      sg_page(sge). Pass this NULL page to copy_page_to_iter(), which passes it
      to kmap_local_page() and to page_address(), then kernel panics.
      
      To solve this, we should skip this zero length skb. So in sk_msg_recvmsg(),
      if copy is zero, that means it's a zero length skb, skip invoking
      copy_page_to_iter(). We are using the EFAULT return triggered by
      copy_page_to_iter to check for is_fin in tcp_bpf.c.
      
      Fixes: 604326b4 ("bpf, sockmap: convert to generic sk_msg interface")
      Suggested-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: default avatarGeliang Tang <tanggeliang@kylinos.cn>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/e3a16eacdc6740658ee02a33489b1b9d4912f378.1719992715.git.tanggeliang@kylinos.cn
      f0c18025
    • Oleksij Rempel's avatar
      net: phy: microchip: lan87xx: reinit PHY after cable test · 30f747b8
      Oleksij Rempel authored
      Reinit PHY after cable test, otherwise link can't be established on
      tested port. This issue is reproducible on LAN9372 switches with
      integrated 100BaseT1 PHYs.
      
      Fixes: 78805025 ("net: phy: microchip_t1: add cable test support for lan87xx phy")
      Signed-off-by: default avatarOleksij Rempel <o.rempel@pengutronix.de>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Reviewed-by: default avatarMichal Kubiak <michal.kubiak@intel.com>
      Reviewed-by: default avatarFlorian Fainelli <florian.fainelli@broadcom.com>
      Link: https://patch.msgid.link/20240705084954.83048-1-o.rempel@pengutronix.deSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      30f747b8
  4. 08 Jul, 2024 3 commits
    • Daniel Borkmann's avatar
      selftests/bpf: Extend tcx tests to cover late tcx_entry release · 5f1d18de
      Daniel Borkmann authored
      Add a test case which replaces an active ingress qdisc while keeping the
      miniq in-tact during the transition period to the new clsact qdisc.
      
        # ./vmtest.sh -- ./test_progs -t tc_link
        [...]
        ./test_progs -t tc_link
        [    3.412871] bpf_testmod: loading out-of-tree module taints kernel.
        [    3.413343] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
        #332     tc_links_after:OK
        #333     tc_links_append:OK
        #334     tc_links_basic:OK
        #335     tc_links_before:OK
        #336     tc_links_chain_classic:OK
        #337     tc_links_chain_mixed:OK
        #338     tc_links_dev_chain0:OK
        #339     tc_links_dev_cleanup:OK
        #340     tc_links_dev_mixed:OK
        #341     tc_links_ingress:OK
        #342     tc_links_invalid:OK
        #343     tc_links_prepend:OK
        #344     tc_links_replace:OK
        #345     tc_links_revision:OK
        Summary: 14/0 PASSED, 0 SKIPPED, 0 FAILED
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Cc: Martin KaFai Lau <martin.lau@kernel.org>
      Link: https://lore.kernel.org/r/20240708133130.11609-2-daniel@iogearbox.netSigned-off-by: default avatarMartin KaFai Lau <martin.lau@kernel.org>
      5f1d18de
    • Daniel Borkmann's avatar
      bpf: Fix too early release of tcx_entry · 1cb6f0ba
      Daniel Borkmann authored
      Pedro Pinto and later independently also Hyunwoo Kim and Wongi Lee reported
      an issue that the tcx_entry can be released too early leading to a use
      after free (UAF) when an active old-style ingress or clsact qdisc with a
      shared tc block is later replaced by another ingress or clsact instance.
      
      Essentially, the sequence to trigger the UAF (one example) can be as follows:
      
        1. A network namespace is created
        2. An ingress qdisc is created. This allocates a tcx_entry, and
           &tcx_entry->miniq is stored in the qdisc's miniqp->p_miniq. At the
           same time, a tcf block with index 1 is created.
        3. chain0 is attached to the tcf block. chain0 must be connected to
           the block linked to the ingress qdisc to later reach the function
           tcf_chain0_head_change_cb_del() which triggers the UAF.
        4. Create and graft a clsact qdisc. This causes the ingress qdisc
           created in step 1 to be removed, thus freeing the previously linked
           tcx_entry:
      
           rtnetlink_rcv_msg()
             => tc_modify_qdisc()
               => qdisc_create()
                 => clsact_init() [a]
               => qdisc_graft()
                 => qdisc_destroy()
                   => __qdisc_destroy()
                     => ingress_destroy() [b]
                       => tcx_entry_free()
                         => kfree_rcu() // tcx_entry freed
      
        5. Finally, the network namespace is closed. This registers the
           cleanup_net worker, and during the process of releasing the
           remaining clsact qdisc, it accesses the tcx_entry that was
           already freed in step 4, causing the UAF to occur:
      
           cleanup_net()
             => ops_exit_list()
               => default_device_exit_batch()
                 => unregister_netdevice_many()
                   => unregister_netdevice_many_notify()
                     => dev_shutdown()
                       => qdisc_put()
                         => clsact_destroy() [c]
                           => tcf_block_put_ext()
                             => tcf_chain0_head_change_cb_del()
                               => tcf_chain_head_change_item()
                                 => clsact_chain_head_change()
                                   => mini_qdisc_pair_swap() // UAF
      
      There are also other variants, the gist is to add an ingress (or clsact)
      qdisc with a specific shared block, then to replace that qdisc, waiting
      for the tcx_entry kfree_rcu() to be executed and subsequently accessing
      the current active qdisc's miniq one way or another.
      
      The correct fix is to turn the miniq_active boolean into a counter. What
      can be observed, at step 2 above, the counter transitions from 0->1, at
      step [a] from 1->2 (in order for the miniq object to remain active during
      the replacement), then in [b] from 2->1 and finally [c] 1->0 with the
      eventual release. The reference counter in general ranges from [0,2] and
      it does not need to be atomic since all access to the counter is protected
      by the rtnl mutex. With this in place, there is no longer a UAF happening
      and the tcx_entry is freed at the correct time.
      
      Fixes: e420bed0 ("bpf: Add fd-based tcx multi-prog infra with link support")
      Reported-by: default avatarPedro Pinto <xten@osec.io>
      Co-developed-by: default avatarPedro Pinto <xten@osec.io>
      Signed-off-by: default avatarPedro Pinto <xten@osec.io>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Cc: Hyunwoo Kim <v4bel@theori.io>
      Cc: Wongi Lee <qwerty@theori.io>
      Cc: Martin KaFai Lau <martin.lau@kernel.org>
      Link: https://lore.kernel.org/r/20240708133130.11609-1-daniel@iogearbox.netSigned-off-by: default avatarMartin KaFai Lau <martin.lau@kernel.org>
      1cb6f0ba
    • Chris Packham's avatar
      docs: networking: devlink: capitalise length value · 83c36e7c
      Chris Packham authored
      Correct the example to match the help text from the devlink utility.
      Signed-off-by: default avatarChris Packham <chris.packham@alliedtelesis.co.nz>
      Reviewed-by: default avatarPrzemek Kitszel <przemyslaw.kitszel@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      83c36e7c
  5. 06 Jul, 2024 7 commits
    • Neal Cardwell's avatar
      tcp: fix incorrect undo caused by DSACK of TLP retransmit · 0ec986ed
      Neal Cardwell authored
      Loss recovery undo_retrans bookkeeping had a long-standing bug where a
      DSACK from a spurious TLP retransmit packet could cause an erroneous
      undo of a fast recovery or RTO recovery that repaired a single
      really-lost packet (in a sequence range outside that of the TLP
      retransmit). Basically, because the loss recovery state machine didn't
      account for the fact that it sent a TLP retransmit, the DSACK for the
      TLP retransmit could erroneously be implicitly be interpreted as
      corresponding to the normal fast recovery or RTO recovery retransmit
      that plugged a real hole, thus resulting in an improper undo.
      
      For example, consider the following buggy scenario where there is a
      real packet loss but the congestion control response is improperly
      undone because of this bug:
      
      + send packets P1, P2, P3, P4
      + P1 is really lost
      + send TLP retransmit of P4
      + receive SACK for original P2, P3, P4
      + enter fast recovery, fast-retransmit P1, increment undo_retrans to 1
      + receive DSACK for TLP P4, decrement undo_retrans to 0, undo (bug!)
      + receive cumulative ACK for P1-P4 (fast retransmit plugged real hole)
      
      The fix: when we initialize undo machinery in tcp_init_undo(), if
      there is a TLP retransmit in flight, then increment tp->undo_retrans
      so that we make sure that we receive a DSACK corresponding to the TLP
      retransmit, as well as DSACKs for all later normal retransmits, before
      triggering a loss recovery undo. Note that we also have to move the
      line that clears tp->tlp_high_seq for RTO recovery, so that upon RTO
      we remember the tp->tlp_high_seq value until tcp_init_undo() and clear
      it only afterward.
      
      Also note that the bug dates back to the original 2013 TLP
      implementation, commit 6ba8a3b1 ("tcp: Tail loss probe (TLP)").
      
      However, this patch will only compile and work correctly with kernels
      that have tp->tlp_retrans, which was added only in v5.8 in 2020 in
      commit 76be93fc ("tcp: allow at most one TLP probe per flight").
      So we associate this fix with that later commit.
      
      Fixes: 76be93fc ("tcp: allow at most one TLP probe per flight")
      Signed-off-by: default avatarNeal Cardwell <ncardwell@google.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Kevin Yang <yyd@google.com>
      Link: https://patch.msgid.link/20240703171246.1739561-1-ncardwell.sw@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      0ec986ed
    • Jakub Kicinski's avatar
      Merge branch 'wireguard-fixes-for-6-10-rc7' · 842c361b
      Jakub Kicinski authored
      Jason A. Donenfeld says:
      
      ====================
      wireguard fixes for 6.10-rc7
      
      These are four small fixes for WireGuard, which are all marked for
      stable:
      
      1) A QEMU command line fix to remove deprecated flags.
      
      2) Use of proper unaligned helpers to avoid unaligned memory access on
         some systems, from Helge.
      
      3) Two patches to annotate intentional data races, so KCSAN and syzbot
         don't get upset.
      ====================
      
      Link: https://patch.msgid.link/20240704154517.1572127-1-Jason@zx2c4.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      842c361b
    • Jason A. Donenfeld's avatar
      wireguard: send: annotate intentional data race in checking empty queue · 381a7d45
      Jason A. Donenfeld authored
      KCSAN reports a race in wg_packet_send_keepalive, which is intentional:
      
          BUG: KCSAN: data-race in wg_packet_send_keepalive / wg_packet_send_staged_packets
      
          write to 0xffff88814cd91280 of 8 bytes by task 3194 on cpu 0:
           __skb_queue_head_init include/linux/skbuff.h:2162 [inline]
           skb_queue_splice_init include/linux/skbuff.h:2248 [inline]
           wg_packet_send_staged_packets+0xe5/0xad0 drivers/net/wireguard/send.c:351
           wg_xmit+0x5b8/0x660 drivers/net/wireguard/device.c:218
           __netdev_start_xmit include/linux/netdevice.h:4940 [inline]
           netdev_start_xmit include/linux/netdevice.h:4954 [inline]
           xmit_one net/core/dev.c:3548 [inline]
           dev_hard_start_xmit+0x11b/0x3f0 net/core/dev.c:3564
           __dev_queue_xmit+0xeff/0x1d80 net/core/dev.c:4349
           dev_queue_xmit include/linux/netdevice.h:3134 [inline]
           neigh_connected_output+0x231/0x2a0 net/core/neighbour.c:1592
           neigh_output include/net/neighbour.h:542 [inline]
           ip6_finish_output2+0xa66/0xce0 net/ipv6/ip6_output.c:137
           ip6_finish_output+0x1a5/0x490 net/ipv6/ip6_output.c:222
           NF_HOOK_COND include/linux/netfilter.h:303 [inline]
           ip6_output+0xeb/0x220 net/ipv6/ip6_output.c:243
           dst_output include/net/dst.h:451 [inline]
           NF_HOOK include/linux/netfilter.h:314 [inline]
           ndisc_send_skb+0x4a2/0x670 net/ipv6/ndisc.c:509
           ndisc_send_rs+0x3ab/0x3e0 net/ipv6/ndisc.c:719
           addrconf_dad_completed+0x640/0x8e0 net/ipv6/addrconf.c:4295
           addrconf_dad_work+0x891/0xbc0
           process_one_work kernel/workqueue.c:2633 [inline]
           process_scheduled_works+0x5b8/0xa30 kernel/workqueue.c:2706
           worker_thread+0x525/0x730 kernel/workqueue.c:2787
           kthread+0x1d7/0x210 kernel/kthread.c:388
           ret_from_fork+0x48/0x60 arch/x86/kernel/process.c:147
           ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
      
          read to 0xffff88814cd91280 of 8 bytes by task 3202 on cpu 1:
           skb_queue_empty include/linux/skbuff.h:1798 [inline]
           wg_packet_send_keepalive+0x20/0x100 drivers/net/wireguard/send.c:225
           wg_receive_handshake_packet drivers/net/wireguard/receive.c:186 [inline]
           wg_packet_handshake_receive_worker+0x445/0x5e0 drivers/net/wireguard/receive.c:213
           process_one_work kernel/workqueue.c:2633 [inline]
           process_scheduled_works+0x5b8/0xa30 kernel/workqueue.c:2706
           worker_thread+0x525/0x730 kernel/workqueue.c:2787
           kthread+0x1d7/0x210 kernel/kthread.c:388
           ret_from_fork+0x48/0x60 arch/x86/kernel/process.c:147
           ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
      
          value changed: 0xffff888148fef200 -> 0xffff88814cd91280
      
      Mark this race as intentional by using the skb_queue_empty_lockless()
      function rather than skb_queue_empty(), which uses READ_ONCE()
      internally to annotate the race.
      
      Cc: stable@vger.kernel.org
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Link: https://patch.msgid.link/20240704154517.1572127-5-Jason@zx2c4.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      381a7d45
    • Jason A. Donenfeld's avatar
      wireguard: queueing: annotate intentional data race in cpu round robin · 2fe3d6d2
      Jason A. Donenfeld authored
      KCSAN reports a race in the CPU round robin function, which, as the
      comment points out, is intentional:
      
          BUG: KCSAN: data-race in wg_packet_send_staged_packets / wg_packet_send_staged_packets
      
          read to 0xffff88811254eb28 of 4 bytes by task 3160 on cpu 1:
           wg_cpumask_next_online drivers/net/wireguard/queueing.h:127 [inline]
           wg_queue_enqueue_per_device_and_peer drivers/net/wireguard/queueing.h:173 [inline]
           wg_packet_create_data drivers/net/wireguard/send.c:320 [inline]
           wg_packet_send_staged_packets+0x60e/0xac0 drivers/net/wireguard/send.c:388
           wg_packet_send_keepalive+0xe2/0x100 drivers/net/wireguard/send.c:239
           wg_receive_handshake_packet drivers/net/wireguard/receive.c:186 [inline]
           wg_packet_handshake_receive_worker+0x449/0x5f0 drivers/net/wireguard/receive.c:213
           process_one_work kernel/workqueue.c:3248 [inline]
           process_scheduled_works+0x483/0x9a0 kernel/workqueue.c:3329
           worker_thread+0x526/0x720 kernel/workqueue.c:3409
           kthread+0x1d1/0x210 kernel/kthread.c:389
           ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:147
           ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
      
          write to 0xffff88811254eb28 of 4 bytes by task 3158 on cpu 0:
           wg_cpumask_next_online drivers/net/wireguard/queueing.h:130 [inline]
           wg_queue_enqueue_per_device_and_peer drivers/net/wireguard/queueing.h:173 [inline]
           wg_packet_create_data drivers/net/wireguard/send.c:320 [inline]
           wg_packet_send_staged_packets+0x6e5/0xac0 drivers/net/wireguard/send.c:388
           wg_packet_send_keepalive+0xe2/0x100 drivers/net/wireguard/send.c:239
           wg_receive_handshake_packet drivers/net/wireguard/receive.c:186 [inline]
           wg_packet_handshake_receive_worker+0x449/0x5f0 drivers/net/wireguard/receive.c:213
           process_one_work kernel/workqueue.c:3248 [inline]
           process_scheduled_works+0x483/0x9a0 kernel/workqueue.c:3329
           worker_thread+0x526/0x720 kernel/workqueue.c:3409
           kthread+0x1d1/0x210 kernel/kthread.c:389
           ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:147
           ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
      
          value changed: 0xffffffff -> 0x00000000
      
      Mark this race as intentional by using READ/WRITE_ONCE().
      
      Cc: stable@vger.kernel.org
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Link: https://patch.msgid.link/20240704154517.1572127-4-Jason@zx2c4.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      2fe3d6d2
    • Helge Deller's avatar
      wireguard: allowedips: avoid unaligned 64-bit memory accesses · 948f991c
      Helge Deller authored
      On the parisc platform, the kernel issues kernel warnings because
      swap_endian() tries to load a 128-bit IPv6 address from an unaligned
      memory location:
      
       Kernel: unaligned access to 0x55f4688c in wg_allowedips_insert_v6+0x2c/0x80 [wireguard] (iir 0xf3010df)
       Kernel: unaligned access to 0x55f46884 in wg_allowedips_insert_v6+0x38/0x80 [wireguard] (iir 0xf2010dc)
      
      Avoid such unaligned memory accesses by instead using the
      get_unaligned_be64() helper macro.
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      [Jason: replace src[8] in original patch with src+8]
      Cc: stable@vger.kernel.org
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Link: https://patch.msgid.link/20240704154517.1572127-3-Jason@zx2c4.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      948f991c
    • Jason A. Donenfeld's avatar
      wireguard: selftests: use acpi=off instead of -no-acpi for recent QEMU · 2cb489eb
      Jason A. Donenfeld authored
      QEMU 9.0 removed -no-acpi, in favor of machine properties, so update the
      Makefile to use the correct QEMU invocation.
      
      Cc: stable@vger.kernel.org
      Fixes: b83fdcd9 ("wireguard: selftests: use microvm on x86")
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Link: https://patch.msgid.link/20240704154517.1572127-2-Jason@zx2c4.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      2cb489eb
    • Dan Carpenter's avatar
      net: bcmasp: Fix error code in probe() · 0c754d9d
      Dan Carpenter authored
      Return an error code if bcmasp_interface_create() fails.  Don't return
      success.
      
      Fixes: 490cb412 ("net: bcmasp: Add support for ASP2.0 Ethernet controller")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@linaro.org>
      Reviewed-by: default avatarMichal Kubiak <michal.kubiak@intel.com>
      Reviewed-by: default avatarJustin Chen <justin.chen@broadcom.com>
      Link: https://patch.msgid.link/ZoWKBkHH9D1fqV4r@stanley.mountainSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      0c754d9d
  6. 05 Jul, 2024 1 commit
  7. 04 Jul, 2024 2 commits
    • Linus Torvalds's avatar
      Merge tag 'net-6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 033771c0
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from bluetooth, wireless and netfilter.
      
        There's one fix for power management with Intel's e1000e here,
        Thorsten tells us there's another problem that started in v6.9. We're
        trying to wrap that up but I don't think it's blocking.
      
        Current release - new code bugs:
      
         - wifi: mac80211: disable softirqs for queued frame handling
      
         - af_unix: fix uninit-value in __unix_walk_scc(), with the new
           garbage collection algo
      
        Previous releases - regressions:
      
         - Bluetooth:
            - qca: fix BT enable failure for QCA6390 after warm reboot
            - add quirk to ignore reserved PHY bits in LE Extended Adv Report,
              abused by some Broadcom controllers found on Apple machines
      
         - wifi: wilc1000: fix ies_len type in connect path
      
        Previous releases - always broken:
      
         - tcp: fix DSACK undo in fast recovery to call tcp_try_to_open(),
           avoid premature timeouts
      
         - net: make sure skb_datagram_iter maps fragments page by page, in
           case we somehow get compound highmem mixed in
      
         - eth: bnx2x: fix multiple UBSAN array-index-out-of-bounds when more
           queues are used
      
        Misc:
      
         - MAINTAINERS: Remembering Larry Finger"
      
      * tag 'net-6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (62 commits)
        bnxt_en: Fix the resource check condition for RSS contexts
        mlxsw: core_linecards: Fix double memory deallocation in case of invalid INI file
        inet_diag: Initialize pad field in struct inet_diag_req_v2
        tcp: Don't flag tcp_sk(sk)->rx_opt.saw_unknown for TCP AO.
        selftests: make order checking verbose in msg_zerocopy selftest
        selftests: fix OOM in msg_zerocopy selftest
        ice: use proper macro for testing bit
        ice: Reject pin requests with unsupported flags
        ice: Don't process extts if PTP is disabled
        ice: Fix improper extts handling
        selftest: af_unix: Add test case for backtrack after finalising SCC.
        af_unix: Fix uninit-value in __unix_walk_scc()
        bonding: Fix out-of-bounds read in bond_option_arp_ip_targets_set()
        net: rswitch: Avoid use-after-free in rswitch_poll()
        netfilter: nf_tables: unconditionally flush pending work before notifier
        wifi: iwlwifi: mvm: check vif for NULL/ERR_PTR before dereference
        wifi: iwlwifi: mvm: avoid link lookup in statistics
        wifi: iwlwifi: mvm: don't wake up rx_sync_waitq upon RFKILL
        wifi: iwlwifi: properly set WIPHY_FLAG_SUPPORTS_EXT_KEK_KCK
        wifi: wilc1000: fix ies_len type in connect path
        ...
      033771c0
    • Linus Torvalds's avatar
      Merge tag 's390-6.10-8' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · d470e9f5
      Linus Torvalds authored
      Pull s390 fixes from Heiko Carstens:
      
       - Fix and add physical to virtual address translations in dasd and
         virtio_ccw drivers. For virtio_ccw this is just a minimal fix.
         More code cleanup will follow.
      
       - Small defconfig updates
      
      * tag 's390-6.10-8' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/dasd: Fix invalid dereferencing of indirect CCW data pointer
        s390/vfio_ccw: Fix target addresses of TIC CCWs
        s390: Update defconfigs
      d470e9f5