- 08 Aug, 2024 17 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queueJakub Kicinski authored
Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-08-07 (ice) This series contains updates to ice driver only. Grzegorz adds IRQ synchronization call before performing reset and prevents writing to hardware when it is resetting. Mateusz swaps incorrect assignment of FEC statistics. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: Fix incorrect assigns of FEC counts ice: Skip PTP HW writes during PTP reset procedure ice: Fix reset handler ==================== Link: https://patch.msgid.link/20240807224521.3819189-1-anthony.l.nguyen@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Martin Whitaker authored
As noted in the device errata [1-8], EEE support is not fully operational in the KSZ8567, KSZ9477, KSZ9567, KSZ9896, and KSZ9897 devices, causing link drops when connected to another device that supports EEE. The patch series "net: add EEE support for KSZ9477 switch family" merged in commit 9b0bf4f7 caused EEE support to be enabled in these devices. A fix for this regression for the KSZ9477 alone was merged in commit 08c6d8ba. This patch extends this fix to the other affected devices. [1] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/Errata/KSZ8567R-Errata-DS80000752.pdf [2] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/Errata/KSZ8567S-Errata-DS80000753.pdf [3] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/Errata/KSZ9477S-Errata-DS80000754.pdf [4] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/Errata/KSZ9567R-Errata-DS80000755.pdf [5] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/Errata/KSZ9567S-Errata-DS80000756.pdf [6] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/Errata/KSZ9896C-Errata-DS80000757.pdf [7] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/Errata/KSZ9897R-Errata-DS80000758.pdf [8] https://ww1.microchip.com/downloads/aemDocuments/documents/UNG/ProductDocuments/Errata/KSZ9897S-Errata-DS80000759.pdf Fixes: 69d3b36c ("net: dsa: microchip: enable EEE support") # for KSZ8567/KSZ9567/KSZ9896/KSZ9897 Link: https://lore.kernel.org/netdev/137ce1ee-0b68-4c96-a717-c8164b514eec@martin-whitaker.me.uk/Signed-off-by: Martin Whitaker <foss@martin-whitaker.me.uk> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Lukasz Majewski <lukma@denx.de> Link: https://patch.msgid.link/20240807205209.21464-1-foss@martin-whitaker.me.ukSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Gal Pressman authored
The 'at least one change' requirement is not applicable for context creation, skip the check in such case. This allows a command such as 'ethtool -X eth0 context new' to work. The command works by mistake when using older versions of userspace ethtool due to an incompatibility issue where rxfh.input_xfrm is passed as zero (unset) instead of RXH_XFRM_NO_CHANGE as done with recent userspace. This patch does not try to solve the incompatibility issue. Link: https://lore.kernel.org/netdev/05ae8316-d3aa-4356-98c6-55ed4253c8a7@nvidia.com/ Fixes: 84a1d9c4 ("net: ethtool: extend RXNFC API to support RSS spreading of filter matches") Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/20240807173352.3501746-1-gal@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Edward Cree authored
Both ethtool_ops.rxfh_max_context_id and the default value used when it's not specified are supposed to be exclusive maxima (the former is documented as such; the latter, U32_MAX, cannot be used as an ID since it equals ETH_RXFH_CONTEXT_ALLOC), but xa_alloc() expects an inclusive maximum. Subtract one from 'limit' to produce an inclusive maximum, and pass that to xa_alloc(). Increase bnxt's max by one to prevent a (very minor) regression, as BNXT_MAX_ETH_RSS_CTX is an inclusive max. This is safe since bnxt is not actually hard-limited; BNXT_MAX_ETH_RSS_CTX is just a leftover from old driver code that managed context IDs itself. Rename rxfh_max_context_id to rxfh_max_num_contexts to make its semantics (hopefully) more obvious. Fixes: 847a8ab1 ("net: ethtool: let the core choose RSS context IDs") Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/5a2d11a599aa5b0cc6141072c01accfb7758650c.1723045898.git.ecree.xilinx@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Arnd Bergmann authored
Using FIELD_GET() fails in configurations that don't already include the header file indirectly: drivers/net/pse-pd/tps23881.c: In function 'tps23881_i2c_probe': drivers/net/pse-pd/tps23881.c:755:13: error: implicit declaration of function 'FIELD_GET' [-Wimplicit-function-declaration] 755 | if (FIELD_GET(TPS23881_REG_DEVID_MASK, ret) != TPS23881_DEVICE_ID) { | ^~~~~~~~~ Fixes: 89108cb5 ("net: pse-pd: tps23881: Fix the device ID check") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20240807075455.2055224-1-arnd@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Csókás, Bence authored
PPS was not stopped in `fec_ptp_stop()`, called when the adapter was removed. Consequentially, you couldn't safely reload the driver with the PPS signal on. Fixes: 32cba57b ("net: fec: introduce fec_ptp_stop and use in probe fail path") Reviewed-by: Fabio Estevam <festevam@gmail.com> Link: https://lore.kernel.org/netdev/CAOMZO5BzcZR8PwKKwBssQq_wAGzVgf1ffwe_nhpQJjviTdxy-w@mail.gmail.com/T/#m01dcb810bfc451a492140f6797ca77443d0cb79fSigned-off-by: Csókás, Bence <csokas.bence@prolan.hu> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20240807080956.2556602-1-csokas.bence@prolan.huSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Florian Fainelli authored
Some Wake-on-LAN modes such as WAKE_FILTER may only be supported by the MAC, while others might be only supported by the PHY. Make sure that the .get_wol() returns the union of both rather than only that of the PHY if the PHY supports Wake-on-LAN. Fixes: 7e400ff3 ("net: bcmgenet: Add support for PHY-based Wake-on-LAN") Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20240806175659.3232204-1-florian.fainelli@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
James Chapman authored
When l2tp tunnels use a socket provided by userspace, we can hit lockdep splats like the below when data is transmitted through another (unrelated) userspace socket which then gets routed over l2tp. This issue was previously discussed here: https://lore.kernel.org/netdev/87sfialu2n.fsf@cloudflare.com/ The solution is to have lockdep treat socket locks of l2tp tunnel sockets separately than those of standard INET sockets. To do so, use a different lockdep subclass where lock nesting is possible. ============================================ WARNING: possible recursive locking detected 6.10.0+ #34 Not tainted -------------------------------------------- iperf3/771 is trying to acquire lock: ffff8881027601d8 (slock-AF_INET/1){+.-.}-{2:2}, at: l2tp_xmit_skb+0x243/0x9d0 but task is already holding lock: ffff888102650d98 (slock-AF_INET/1){+.-.}-{2:2}, at: tcp_v4_rcv+0x1848/0x1e10 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(slock-AF_INET/1); lock(slock-AF_INET/1); *** DEADLOCK *** May be due to missing lock nesting notation 10 locks held by iperf3/771: #0: ffff888102650258 (sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_sendmsg+0x1a/0x40 #1: ffffffff822ac220 (rcu_read_lock){....}-{1:2}, at: __ip_queue_xmit+0x4b/0xbc0 #2: ffffffff822ac220 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x17a/0x1130 #3: ffffffff822ac220 (rcu_read_lock){....}-{1:2}, at: process_backlog+0x28b/0x9f0 #4: ffffffff822ac220 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0xf9/0x260 #5: ffff888102650d98 (slock-AF_INET/1){+.-.}-{2:2}, at: tcp_v4_rcv+0x1848/0x1e10 #6: ffffffff822ac220 (rcu_read_lock){....}-{1:2}, at: __ip_queue_xmit+0x4b/0xbc0 #7: ffffffff822ac220 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x17a/0x1130 #8: ffffffff822ac1e0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0xcc/0x1450 #9: ffff888101f33258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock#2){+...}-{2:2}, at: __dev_queue_xmit+0x513/0x1450 stack backtrace: CPU: 2 UID: 0 PID: 771 Comm: iperf3 Not tainted 6.10.0+ #34 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Call Trace: <IRQ> dump_stack_lvl+0x69/0xa0 dump_stack+0xc/0x20 __lock_acquire+0x135d/0x2600 ? srso_alias_return_thunk+0x5/0xfbef5 lock_acquire+0xc4/0x2a0 ? l2tp_xmit_skb+0x243/0x9d0 ? __skb_checksum+0xa3/0x540 _raw_spin_lock_nested+0x35/0x50 ? l2tp_xmit_skb+0x243/0x9d0 l2tp_xmit_skb+0x243/0x9d0 l2tp_eth_dev_xmit+0x3c/0xc0 dev_hard_start_xmit+0x11e/0x420 sch_direct_xmit+0xc3/0x640 __dev_queue_xmit+0x61c/0x1450 ? ip_finish_output2+0xf4c/0x1130 ip_finish_output2+0x6b6/0x1130 ? srso_alias_return_thunk+0x5/0xfbef5 ? __ip_finish_output+0x217/0x380 ? srso_alias_return_thunk+0x5/0xfbef5 __ip_finish_output+0x217/0x380 ip_output+0x99/0x120 __ip_queue_xmit+0xae4/0xbc0 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? tcp_options_write.constprop.0+0xcb/0x3e0 ip_queue_xmit+0x34/0x40 __tcp_transmit_skb+0x1625/0x1890 __tcp_send_ack+0x1b8/0x340 tcp_send_ack+0x23/0x30 __tcp_ack_snd_check+0xa8/0x530 ? srso_alias_return_thunk+0x5/0xfbef5 tcp_rcv_established+0x412/0xd70 tcp_v4_do_rcv+0x299/0x420 tcp_v4_rcv+0x1991/0x1e10 ip_protocol_deliver_rcu+0x50/0x220 ip_local_deliver_finish+0x158/0x260 ip_local_deliver+0xc8/0xe0 ip_rcv+0xe5/0x1d0 ? __pfx_ip_rcv+0x10/0x10 __netif_receive_skb_one_core+0xce/0xe0 ? process_backlog+0x28b/0x9f0 __netif_receive_skb+0x34/0xd0 ? process_backlog+0x28b/0x9f0 process_backlog+0x2cb/0x9f0 __napi_poll.constprop.0+0x61/0x280 net_rx_action+0x332/0x670 ? srso_alias_return_thunk+0x5/0xfbef5 ? find_held_lock+0x2b/0x80 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 handle_softirqs+0xda/0x480 ? __dev_queue_xmit+0xa2c/0x1450 do_softirq+0xa1/0xd0 </IRQ> <TASK> __local_bh_enable_ip+0xc8/0xe0 ? __dev_queue_xmit+0xa2c/0x1450 __dev_queue_xmit+0xa48/0x1450 ? ip_finish_output2+0xf4c/0x1130 ip_finish_output2+0x6b6/0x1130 ? srso_alias_return_thunk+0x5/0xfbef5 ? __ip_finish_output+0x217/0x380 ? srso_alias_return_thunk+0x5/0xfbef5 __ip_finish_output+0x217/0x380 ip_output+0x99/0x120 __ip_queue_xmit+0xae4/0xbc0 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? tcp_options_write.constprop.0+0xcb/0x3e0 ip_queue_xmit+0x34/0x40 __tcp_transmit_skb+0x1625/0x1890 tcp_write_xmit+0x766/0x2fb0 ? __entry_text_end+0x102ba9/0x102bad ? srso_alias_return_thunk+0x5/0xfbef5 ? __might_fault+0x74/0xc0 ? srso_alias_return_thunk+0x5/0xfbef5 __tcp_push_pending_frames+0x56/0x190 tcp_push+0x117/0x310 tcp_sendmsg_locked+0x14c1/0x1740 tcp_sendmsg+0x28/0x40 inet_sendmsg+0x5d/0x90 sock_write_iter+0x242/0x2b0 vfs_write+0x68d/0x800 ? __pfx_sock_write_iter+0x10/0x10 ksys_write+0xc8/0xf0 __x64_sys_write+0x3d/0x50 x64_sys_call+0xfaf/0x1f50 do_syscall_64+0x6d/0x140 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f4d143af992 Code: c3 8b 07 85 c0 75 24 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 89 5c 24 08 0f 05 <c3> e9 01 cc ff ff 41 54 b8 02 00 00 0 RSP: 002b:00007ffd65032058 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f4d143af992 RDX: 0000000000000025 RSI: 00007f4d143f3bcc RDI: 0000000000000005 RBP: 00007f4d143f2b28 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f4d143f3bcc R13: 0000000000000005 R14: 0000000000000000 R15: 00007ffd650323f0 </TASK> Fixes: 0b2c5972 ("l2tp: close all race conditions in l2tp_tunnel_register()") Suggested-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot+6acef9e0a4d1f46c83d4@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6acef9e0a4d1f46c83d4 CC: gnault@redhat.com CC: cong.wang@bytedance.com Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: Tom Parkin <tparkin@katalix.com> Link: https://patch.msgid.link/20240806160626.1248317-1-jchapman@katalix.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Russell King (Oracle) authored
dwmac4 was decoding the duplex mode from the GMAC_PHYIF_CONTROL_STATUS register incorrectly, using GMAC_PHYIF_CTRLSTATUS_LNKMOD_MASK (value 1) rather than GMAC_PHYIF_CTRLSTATUS_LNKMOD (bit 16). Fix this. Fixes: 70523e63 ("drivers: net: stmmac: reworking the PCS code.") Reviewed-by: Andrew Halaney <ahalaney@redhat.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1sbJvd-001rGD-E3@rmk-PC.armlinux.org.ukSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetoothJakub Kicinski authored
Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - hci_sync: avoid dup filtering when passive scanning with adv monitor - hci_qca: don't call pwrseq_power_off() twice for QCA6390 - hci_qca: fix QCA6390 support on non-DT platforms - hci_qca: fix a NULL-pointer derefence at shutdown - l2cap: always unlock channel in l2cap_conless_channel() * tag 'for-net-2024-08-07' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: hci_sync: avoid dup filtering when passive scanning with adv monitor Bluetooth: l2cap: always unlock channel in l2cap_conless_channel() Bluetooth: hci_qca: fix a NULL-pointer derefence at shutdown Bluetooth: hci_qca: fix QCA6390 support on non-DT platforms Bluetooth: hci_qca: don't call pwrseq_power_off() twice for QCA6390 ==================== Link: https://patch.msgid.link/20240807210103.142483-1-luiz.dentz@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Tony Nguyen says: ==================== idpf: fix 3 bugs revealed by the Chapter I Alexander Lobakin says: The libeth conversion revealed 2 serious issues which lead to sporadic crashes or WARNs under certain configurations. Additional one was found while debugging these two with kmemleak. This one is targeted stable, the rest can be backported manually later if needed. They can be reproduced only after the conversion is applied anyway. ==================== Link: https://patch.msgid.link/20240806220923.3359860-1-anthony.l.nguyen@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Alexander Lobakin authored
The second tagged commit started sometimes (very rarely, but possible) throwing WARNs from net/core/page_pool.c:page_pool_disable_direct_recycling(). Turned out idpf frees interrupt vectors with embedded NAPIs *before* freeing the queues making page_pools' NAPI pointers lead to freed memory before these pools are destroyed by libeth. It's not clear whether there are other accesses to the freed vectors when destroying the queues, but anyway, we usually free queue/interrupt vectors only when the queues are destroyed and the NAPIs are guaranteed to not be referenced anywhere. Invert the allocation and freeing logic making queue/interrupt vectors be allocated first and freed last. Vectors don't require queues to be present, so this is safe. Additionally, this change allows to remove that useless queue->q_vector pointer cleanup, as vectors are still valid when freeing the queues (+ both are freed within one function, so it's not clear why nullify the pointers at all). Fixes: 1c325aac ("idpf: configure resources for TX queues") Fixes: 90912f9f ("idpf: convert header split mode to libeth + napi_build_skb()") Reported-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20240806220923.3359860-4-anthony.l.nguyen@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michal Kubiak authored
The initialization of vport interrupt consists of two functions: 1) idpf_vport_intr_init() where a generic configuration is done 2) idpf_vport_intr_req_irq() where the irq for each q_vector is requested. The first function used to create a base name for each interrupt using "kasprintf()" call. Unfortunately, although that call allocated memory for a text buffer, that memory was never released. Fix this by removing creating the interrupt base name in 1). Instead, always create a full interrupt name in the function 2), because there is no need to create a base name separately, considering that the function 2) is never called out of idpf_vport_intr_init() context. Fixes: d4d55871 ("idpf: initialize interrupts and enable vport") Cc: stable@vger.kernel.org # 6.7 Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20240806220923.3359860-3-anthony.l.nguyen@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Alexander Lobakin authored
The second tagged commit introduced a UAF, as it removed restoring q_vector->vport pointers after reinitializating the structures. This is due to that all queue allocation functions are performed here with the new temporary vport structure and those functions rewrite the backpointers to the vport. Then, this new struct is freed and the pointers start leading to nowhere. But generally speaking, the current logic is very fragile. It claims to be more reliable when the system is low on memory, but in fact, it consumes two times more memory as at the moment of running this function, there are two vports allocated with their queues and vectors. Moreover, it claims to prevent the driver from running into "bad state", but in fact, any error during the rebuild leaves the old vport in the partially allocated state. Finally, if the interface is down when the function is called, it always allocates a new queue set, but when the user decides to enable the interface later on, vport_open() allocates them once again, IOW there's a clear memory leak here. Just don't allocate a new queue set when performing a reset, that solves crashes and memory leaks. Readd the old queue number and reopen the interface on rollback - that solves limbo states when the device is left disabled and/or without HW queues enabled. Fixes: 02cbfba1 ("idpf: add ethtool callbacks") Fixes: e4891e46 ("idpf: split &idpf_queue into 4 strictly-typed queue structures") Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20240806220923.3359860-2-anthony.l.nguyen@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
A recent commit has modified the code in __bnxt_reserve_rings() to set the default RSS indirection table to default only when the number of RX rings is changing. While this works for newer firmware that requires RX ring reservations, it causes the regression on older firmware not requiring RX ring resrvations (BNXT_NEW_RM() returns false). With older firmware, RX ring reservations are not required and so hw_resc->resv_rx_rings is not always set to the proper value. The comparison: if (old_rx_rings != bp->hw_resc.resv_rx_rings) in __bnxt_reserve_rings() may be false even when the RX rings are changing. This will cause __bnxt_reserve_rings() to skip setting the default RSS indirection table to default to match the current number of RX rings. This may later cause bnxt_fill_hw_rss_tbl() to use an out-of-range index. We already have bnxt_check_rss_tbl_no_rmgr() to handle exactly this scenario. We just need to move it up in bnxt_need_reserve_rings() to be called unconditionally when using older firmware. Without the fix, if the TX rings are changing, we'll skip the bnxt_check_rss_tbl_no_rmgr() call and __bnxt_reserve_rings() may also skip the bnxt_set_dflt_rss_indir_tbl() call for the reason explained in the last paragraph. Without setting the default RSS indirection table to default, it causes the regression: BUG: KASAN: slab-out-of-bounds in __bnxt_hwrm_vnic_set_rss+0xb79/0xe40 Read of size 2 at addr ffff8881c5809618 by task ethtool/31525 Call Trace: __bnxt_hwrm_vnic_set_rss+0xb79/0xe40 bnxt_hwrm_vnic_rss_cfg_p5+0xf7/0x460 __bnxt_setup_vnic_p5+0x12e/0x270 __bnxt_open_nic+0x2262/0x2f30 bnxt_open_nic+0x5d/0xf0 ethnl_set_channels+0x5d4/0xb30 ethnl_default_set_doit+0x2f1/0x620 Reported-by: Breno Leitao <leitao@debian.org> Closes: https://lore.kernel.org/netdev/ZrC6jpghA3PWVWSB@gmail.com/ Fixes: 98ba1d93 ("bnxt_en: Fix RSS logic in __bnxt_reserve_rings()") Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Tested-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20240806053742.140304-1-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Joe Hattori authored
bcm_sf2_mdio_register() calls of_phy_find_device() and then phy_device_remove() in a loop to remove existing PHY devices. of_phy_find_device() eventually calls bus_find_device(), which calls get_device() on the returned struct device * to increment the refcount. The current implementation does not decrement the refcount, which causes memory leak. This commit adds the missing phy_device_free() call to decrement the refcount via put_device() to balance the refcount. Fixes: 771089c2 ("net: dsa: bcm_sf2: Ensure that MDIO diversion is used") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20240806011327.3817861-1-joe@pf.is.s.u-tokyo.ac.jpSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Zhengchao Shao authored
The number of fallback reasons defined in the smc_clc.h file has reached 36. For historical reasons, some are no longer quoted, and there's 33 actually in use. So, add the max value of fallback reason count to 36. Fixes: 6ac1e656 ("net/smc: support smc v2.x features validate") Fixes: 7f0620b9 ("net/smc: support max connections per lgr negotiation") Fixes: 69b888e3 ("net/smc: support max links per lgr negotiation in clc handshake") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: D. Wythe <alibuda@linux.alibaba.com> Link: https://patch.msgid.link/20240805043856.565677-1-shaozhengchao@huawei.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 07 Aug, 2024 10 commits
-
-
Anton Khirnov authored
This restores behaviour (including the comment) from now-removed hci_request.c, and also matches existing code for active scanning. Without this, the duplicates filter is always active when passive scanning, which makes it impossible to work with devices that send nontrivial dynamic data in their advertisement reports. Fixes: abfeea47 ("Bluetooth: hci_sync: Convert MGMT_OP_START_DISCOVERY") Signed-off-by: Anton Khirnov <anton@khirnov.net> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
-
Dmitry Antipov authored
Add missing call to 'l2cap_chan_unlock()' on receive error handling path in 'l2cap_conless_channel()'. Fixes: a24cce14 ("Bluetooth: Fix reference counting of global L2CAP channels") Reported-by: syzbot+45ac74737e866894acb0@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=45ac74737e866894acb0Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
-
Bartosz Golaszewski authored
Unlike qca_regulator_init(), qca_power_shutdown() may be called for QCA_ROME which does not have qcadev->bt_power assigned. Add a NULL-pointer check before dereferencing the struct qca_power pointer. Fixes: eba17187 ("Bluetooth: hci_qca: make pwrseq calls the default if available") Reported-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Closes: https://lore.kernel.org/linux-bluetooth/su3wp6s44hrxf4ijvsdfzbvv4unu4ycb7kkvwbx6ltdafkldir@4g7ydqm2ap5j/Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
-
Bartosz Golaszewski authored
QCA6390 can albo be used on non-DT systems so we must not make the power sequencing the only option. Check if the serdev device consumes an OF node. If so: honor the new contract as per the DT bindings. If not: fall back to the previous behavior by falling through to the existing default label. Fixes: 9a15ce68 ("Bluetooth: qca: use the power sequencer for QCA6390") Reported-by: Wren Turkal <wt@penguintechs.org> Closes: https://lore.kernel.org/linux-bluetooth/27e6a6c5-fb63-4219-be0b-eefa2c116e06@penguintechs.org/Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
-
Bartosz Golaszewski authored
Now that we call pwrseq_power_off() for all models that hold a valid power sequencing handle, we can remove the switch case for QCA_6390. The switch will now use the default label for this model but that's fine: if it has the BT-enable GPIO than we should use it. Fixes: eba17187 ("Bluetooth: hci_qca: make pwrseq calls the default if available") Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
-
Mateusz Polchlopek authored
Commit ac21add2 ("ice: Implement driver functionality to dump fec statistics") introduces obtaining FEC correctable and uncorrectable stats per netdev in ICE driver. Unfortunately the assignment of values to fec_stats structure has been done incorrectly. This commit fixes the assignments. Fixes: ac21add2 ("ice: Implement driver functionality to dump fec statistics") Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
Grzegorz Nitka authored
Block HW write access for the driver while the device is in reset to avoid potential race condition and access to the PTP HW in non-nominal state which could lead to undesired effects Fixes: 4aad5335 ("ice: add individual interrupt allocation") Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Co-developed-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Sergey Temerkhanov <sergey.temerkhanov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
Grzegorz Nitka authored
Synchronize OICR IRQ when preparing for reset to avoid potential race conditions between the reset procedure and OICR Fixes: 4aad5335 ("ice: add individual interrupt allocation") Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Signed-off-by: Sergey Temerkhanov <sergey.temerkhanov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
ZHANG Yuntian authored
Add support for MeiG Smart SRM825L which is based on Qualcomm 315 chip. T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=5000 MxCh= 0 D: Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1 P: Vendor=2dee ProdID=4d22 Rev= 4.14 S: Manufacturer=MEIG S: Product=LTE-A Module S: SerialNumber=6f345e48 C:* #Ifs= 6 Cfg#= 1 Atr=80 MxPwr=896mA I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=83(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=82(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option E: Ad=87(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=86(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=05(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=88(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=89(I) Atr=03(Int.) MxPS= 8 Ivl=32ms E: Ad=8e(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=0f(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms Signed-off-by: ZHANG Yuntian <yt@radxa.com> Link: https://patch.msgid.link/D1EB81385E405DFE+20240803074656.567061-1-yt@radxa.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Tristram Ha authored
The wol variable in ksz_port_set_mac_address() is declared with random data, but the code in ksz_get_wol call may not be executed so the WAKE_MAGIC check may be invalid resulting in an error message when setting a MAC address after starting the DSA driver. Fixes: 3b454b63 ("net: dsa: microchip: ksz9477: Add Wake on Magic Packet support") Signed-off-by: Tristram Ha <tristram.ha@microchip.com> Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20240805235200.24982-1-Tristram.Ha@microchip.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 06 Aug, 2024 1 commit
-
-
Eric Dumazet authored
linkwatch_event() grabs possibly very contended RTNL mutex. system_wq is not suitable for such work. Inspired by many noisy syzbot reports. 3 locks held by kworker/0:7/5266: #0: ffff888015480948 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3206 [inline] #0: ffff888015480948 ((wq_completion)events){+.+.}-{0:0}, at: process_scheduled_works+0x90a/0x1830 kernel/workqueue.c:3312 #1: ffffc90003f6fd00 ((linkwatch_work).work){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3207 [inline] , at: process_scheduled_works+0x945/0x1830 kernel/workqueue.c:3312 #2: ffffffff8fa6f208 (rtnl_mutex){+.+.}-{3:3}, at: linkwatch_event+0xe/0x60 net/core/link_watch.c:276 Reported-by: syzbot <syzkaller@googlegroups.com> Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20240805085821.1616528-1-edumazet@google.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 05 Aug, 2024 5 commits
-
-
Nikolay Aleksandrov authored
syzbot hit a use-after-free[1] which is caused because the bridge doesn't make sure that all previous garbage has been collected when removing a port. What happens is: CPU 1 CPU 2 start gc cycle remove port acquire gc lock first wait for lock call br_multicasg_gc() directly acquire lock now but free port the port can be freed while grp timers still running Make sure all previous gc cycles have finished by using flush_work before freeing the port. [1] BUG: KASAN: slab-use-after-free in br_multicast_port_group_expired+0x4c0/0x550 net/bridge/br_multicast.c:861 Read of size 8 at addr ffff888071d6d000 by task syz.5.1232/9699 CPU: 1 PID: 9699 Comm: syz.5.1232 Not tainted 6.10.0-rc5-syzkaller-00021-g24ca36a5 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:114 print_address_description mm/kasan/report.c:377 [inline] print_report+0xc3/0x620 mm/kasan/report.c:488 kasan_report+0xd9/0x110 mm/kasan/report.c:601 br_multicast_port_group_expired+0x4c0/0x550 net/bridge/br_multicast.c:861 call_timer_fn+0x1a3/0x610 kernel/time/timer.c:1792 expire_timers kernel/time/timer.c:1843 [inline] __run_timers+0x74b/0xaf0 kernel/time/timer.c:2417 __run_timer_base kernel/time/timer.c:2428 [inline] __run_timer_base kernel/time/timer.c:2421 [inline] run_timer_base+0x111/0x190 kernel/time/timer.c:2437 Reported-by: syzbot+263426984509be19c9a0@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=263426984509be19c9a0 Fixes: e12cec65 ("net: bridge: mcast: destroy all entries via gc") Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20240802080730.3206303-1-razor@blackwall.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Daniele Palmas authored
Free the unused skb when not ip packets arrive. Fixes: c6adf779 ("net: usb: qmi_wwan: add qmap mux protocol support") Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Heng Qi says: ==================== virtio-net: unbreak vq resizing if vq coalescing is not supported Currently, if the driver does not negotiate the vq coalescing feature but supports vq resize, the vq resize action, which could have been successfully executed, is interrupted due to the failure in configuring the vq coalescing parameters. This issue needs to be fixed. Changelog ========= v3->v4: - Add a comment for patch[2/2]. v2->v3: - Break out the feature check and the fix into separate patches. v1->v2: - Rephrase the subject. - Put the feature check inside the virtnet_send_{r,t}x_ctrl_coal_vq_cmd. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heng Qi authored
Don't break the resize action if the vq coalescing feature named VIRTIO_NET_F_VQ_NOTF_COAL is not negotiated. Fixes: f61fe5f0 ("virtio-net: fix the vq coalescing setting for vq resize") Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Eugenio Pé rez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heng Qi authored
Virtio spec says: The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL feature when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET. So we add the feature negotiation check to virtnet_send_{r,t}x_ctrl_coal_vq_cmd as a basis for the next bugfix patch. Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 04 Aug, 2024 1 commit
-
-
Dmitry Safonov authored
The lifetime of TCP-AO static_key is the same as the last tcp_ao_info. On the socket destruction tcp_ao_info ceases to be with RCU grace period, while tcp-ao static branch is currently deferred destructed. The static key definition is : DEFINE_STATIC_KEY_DEFERRED_FALSE(tcp_ao_needed, HZ); which means that if RCU grace period is delayed by more than a second and tcp_ao_needed is in the process of disablement, other CPUs may yet see tcp_ao_info which atent dead, but soon-to-be. And that breaks the assumption of static_key_fast_inc_not_disabled(). See the comment near the definition: > * The caller must make sure that the static key can't get disabled while > * in this function. It doesn't patch jump labels, only adds a user to > * an already enabled static key. Originally it was introduced in commit eb8c5072 ("jump_label: Prevent key->enabled int overflow"), which is needed for the atomic contexts, one of which would be the creation of a full socket from a request socket. In that atomic context, it's known by the presence of the key (md5/ao) that the static branch is already enabled. So, the ref counter for that static branch is just incremented instead of holding the proper mutex. static_key_fast_inc_not_disabled() is just a helper for such usage case. But it must not be used if the static branch could get disabled in parallel as it's not protected by jump_label_mutex and as a result, races with jump_label_update() implementation details. Happened on netdev test-bot[1], so not a theoretical issue: [] jump_label: Fatal kernel bug, unexpected op at tcp_inbound_hash+0x1a7/0x870 [ffffffffa8c4e9b7] (eb 50 0f 1f 44 != 66 90 0f 1f 00)) size:2 type:1 [] ------------[ cut here ]------------ [] kernel BUG at arch/x86/kernel/jump_label.c:73! [] Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI [] CPU: 3 PID: 243 Comm: kworker/3:3 Not tainted 6.10.0-virtme #1 [] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 [] Workqueue: events jump_label_update_timeout [] RIP: 0010:__jump_label_patch+0x2f6/0x350 ... [] Call Trace: [] <TASK> [] arch_jump_label_transform_queue+0x6c/0x110 [] __jump_label_update+0xef/0x350 [] __static_key_slow_dec_cpuslocked.part.0+0x3c/0x60 [] jump_label_update_timeout+0x2c/0x40 [] process_one_work+0xe3b/0x1670 [] worker_thread+0x587/0xce0 [] kthread+0x28a/0x350 [] ret_from_fork+0x31/0x70 [] ret_from_fork_asm+0x1a/0x30 [] </TASK> [] Modules linked in: veth [] ---[ end trace 0000000000000000 ]--- [] RIP: 0010:__jump_label_patch+0x2f6/0x350 [1]: https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/696681/5-connect-deny-ipv6/stderr Cc: stable@kernel.org Fixes: 67fa83f7 ("net/tcp: Add static_key for TCP-AO") Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 02 Aug, 2024 6 commits
-
-
Praveen Kaligineedi authored
GVE driver wrongly relies on netif_carrier_ok() to check the interface administrative state when resources are being allocated/deallocated for queue(s). netif_carrier_ok() needs to be replaced with netif_running() for all such cases. Administrative state is the result of "ip link set dev <dev> up/down". It reflects whether the administrator wants to use the device for traffic and the corresponding resources have been allocated. Fixes: 5f08cd3d ("gve: Alloc before freeing when adjusting queues") Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Shailend Chand <shailend@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20240801205619.987396-1-pkaligineedi@google.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Kyle Swenson authored
The DEVID register contains two pieces of information: the device ID in the upper nibble, and the silicon revision number in the lower nibble. The driver should work fine with any silicon revision, so let's mask that out in the device ID check. Fixes: 20e6d190 ("net: pse-pd: Add TI TPS23881 PSE controller driver") Signed-off-by: Kyle Swenson <kyle.swenson@est.tech> Reviewed-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20240731154152.4020668-1-kyle.swenson@est.techSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Kuniyuki Iwashima authored
syzbot reported a null-ptr-deref while accessing sk2->sk_reuseport_cb in reuseport_add_sock(). [0] The repro first creates a listener with SO_REUSEPORT. Then, it creates another listener on the same port and concurrently closes the first listener. The second listen() calls reuseport_add_sock() with the first listener as sk2, where sk2->sk_reuseport_cb is not expected to be cleared concurrently, but the close() does clear it by reuseport_detach_sock(). The problem is SCTP does not properly synchronise reuseport_alloc(), reuseport_add_sock(), and reuseport_detach_sock(). The caller of reuseport_alloc() and reuseport_{add,detach}_sock() must provide synchronisation for sockets that are classified into the same reuseport group. Otherwise, such sockets form multiple identical reuseport groups, and all groups except one would be silently dead. 1. Two sockets call listen() concurrently 2. No socket in the same group found in sctp_ep_hashtable[] 3. Two sockets call reuseport_alloc() and form two reuseport groups 4. Only one group hit first in __sctp_rcv_lookup_endpoint() receives incoming packets Also, the reported null-ptr-deref could occur. TCP/UDP guarantees that would not happen by holding the hash bucket lock. Let's apply the locking strategy to __sctp_hash_endpoint() and __sctp_unhash_endpoint(). [0]: Oops: general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] CPU: 1 UID: 0 PID: 10230 Comm: syz-executor119 Not tainted 6.10.0-syzkaller-12585-g301927d2 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/27/2024 RIP: 0010:reuseport_add_sock+0x27e/0x5e0 net/core/sock_reuseport.c:350 Code: 00 0f b7 5d 00 bf 01 00 00 00 89 de e8 1b a4 ff f7 83 fb 01 0f 85 a3 01 00 00 e8 6d a0 ff f7 49 8d 7e 12 48 89 f8 48 c1 e8 03 <42> 0f b6 04 28 84 c0 0f 85 4b 02 00 00 41 0f b7 5e 12 49 8d 7e 14 RSP: 0018:ffffc9000b947c98 EFLAGS: 00010202 RAX: 0000000000000002 RBX: ffff8880252ddf98 RCX: ffff888079478000 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000012 RBP: 0000000000000001 R08: ffffffff8993e18d R09: 1ffffffff1fef385 R10: dffffc0000000000 R11: fffffbfff1fef386 R12: ffff8880252ddac0 R13: dffffc0000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f24e45b96c0(0000) GS:ffff8880b9300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffcced5f7b8 CR3: 00000000241be000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __sctp_hash_endpoint net/sctp/input.c:762 [inline] sctp_hash_endpoint+0x52a/0x600 net/sctp/input.c:790 sctp_listen_start net/sctp/socket.c:8570 [inline] sctp_inet_listen+0x767/0xa20 net/sctp/socket.c:8625 __sys_listen_socket net/socket.c:1883 [inline] __sys_listen+0x1b7/0x230 net/socket.c:1894 __do_sys_listen net/socket.c:1902 [inline] __se_sys_listen net/socket.c:1900 [inline] __x64_sys_listen+0x5a/0x70 net/socket.c:1900 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f24e46039b9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 91 1a 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f24e45b9228 EFLAGS: 00000246 ORIG_RAX: 0000000000000032 RAX: ffffffffffffffda RBX: 00007f24e468e428 RCX: 00007f24e46039b9 RDX: 00007f24e46039b9 RSI: 0000000000000003 RDI: 0000000000000004 RBP: 00007f24e468e420 R08: 00007f24e45b96c0 R09: 00007f24e45b96c0 R10: 00007f24e45b96c0 R11: 0000000000000246 R12: 00007f24e468e42c R13: 00007f24e465a5dc R14: 0020000000000001 R15: 00007ffcced5f7d8 </TASK> Modules linked in: Fixes: 6ba84574 ("sctp: process sk_reuseport in sctp_get_port_local") Reported-by: syzbot+e6979a5d2f10ecb700e4@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=e6979a5d2f10ecb700e4 Tested-by: syzbot+e6979a5d2f10ecb700e4@syzkaller.appspotmail.com Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Xin Long <lucien.xin@gmail.com> Link: https://patch.msgid.link/20240731234624.94055-1-kuniyu@amazon.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Stephen Hemminger authored
The old SysKonnect NIc's are not used or actively maintained anymore. My sky2 NIC's are all in box in back corner of attic. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Link: https://patch.msgid.link/20240801162930.212299-1-stephen@networkplumber.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Matthieu Baerts says: ==================== mptcp: fix endpoints with 'signal' and 'subflow' flags When looking at improving the user experience around the MPTCP endpoints setup, I noticed that setting an endpoint with both the 'signal' and the 'subflow' flags -- as it has been done in the past by users according to bug reports we got -- was resulting on only announcing the endpoint, but not using it to create subflows: the 'subflow' flag was then ignored. My initial thought was to modify IPRoute2 to warn the user when the two flags were set, but it doesn't sound normal to ignore one of them. I then looked at modifying the kernel not to allow having the two flags set, but when discussing about that with Mat, we thought it was maybe not ideal to do that, as there might be use-cases, we might break some configs. Then I saw it was working before v5.17. So instead, I fixed the support on the kernel side (patch 5) using Paolo's suggestion. This also includes a fix on the options side (patch 1: for v5.11+), an explicit deny of some options combinations (patch 2: for v5.18+), and some refactoring (patches 3 and 4) to ease the inclusion of the patch 5. While at it, I added a new selftest (patch 7) to validate this case -- including a modification of the chk_add_nr helper to inverse the sides were the counters are checked (patch 6) -- and allowed ADD_ADDR echo just after the MP_JOIN 3WHS. The selftests modification have the same Fixes tag as the previous commit, but no 'Cc: Stable': if the backport can work, that's good -- but it still need to be verified by running the selftests -- if not, no need to worry, many CIs will use the selftests from the last stable version to validate previous stable releases. ==================== Link: https://patch.msgid.link/20240731-upstream-net-20240731-mptcp-endp-subflow-signal-v1-0-c8a9b036493b@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Matthieu Baerts (NGI0) authored
It should be quite uncommon to set both the subflow and the signal flags: the initiator of the connection is typically the one creating new subflows, not the other peer, then no need to announce additional local addresses, and use it to create subflows. But some people might be confused about the flags, and set both "just to be sure at least the right one is set". To verify the previous fix, and avoid future regressions, this specific case is now validated: the client announces a new address, and initiates a new subflow from the same address. While working on this, another bug has been noticed, where the client reset the new subflow because an ADD_ADDR echo got received as the 3rd ACK: this new test also explicitly checks that no RST have been sent by the client and server. The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: 86e39e04 ("mptcp: keep track of local endpoint still available for each msk") Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240731-upstream-net-20240731-mptcp-endp-subflow-signal-v1-7-c8a9b036493b@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-