- 13 Feb, 2024 14 commits
-
-
Paolo Abeni authored
The netdev CI is reporting failures for the pmtu test: [ 115.929264] br0: port 2(vxlan_a) entered forwarding state # 2024/02/08 17:33:22 socat[7871] E bind(7, {AF=10 [0000:0000:0000:0000:0000:0000:0000:0000]:50000}, 28): Address already in use # 2024/02/08 17:33:22 socat[7877] E write(7, 0x5598fb6ff000, 8192): Connection refused # TEST: IPv6, bridged vxlan4: PMTU exceptions [FAIL] # File size 0 mismatches exepcted value in locally bridged vxlan test The root cause is apparently a socket created by a previous iteration of the relevant loop still lasting in LAST_ACK state. Note that even the file size check is racy, the receiver process dumping the file could still be running in background Allow the listener to bound on the same local port via SO_REUSEADDR and collect file output file size only after the listener completion. Fixes: 136a1b43 ("selftests: net: test vxlan pmtu exceptions with tcp") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/4f51c11a1ce7ca7a4dabd926cffff63dadac9ba1.1707731086.git.pabeni@redhat.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
The helper waiting for a listener port can match any socket whose hexadecimal representation of source or destination addresses matches that of the given port. Additionally, any socket state is accepted. All the above can let the helper return successfully before the relevant listener is actually ready, with unexpected results. So far I could not find any related failure in the netdev CI, but the next patch is going to make the critical event more easily reproducible. Address the issue matching the port hex only vs the relevant socket field and additionally checking the socket state for TCP sockets. Fixes: 3bdd9fd2 ("selftests/net: synchronize udpgro tests' tx and rx connection") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/192b3dbc443d953be32991d1b0ca432bd4c65008.1707731086.git.pabeni@redhat.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
The mentioned test is failing in slow environments: # SO_TXTIME ipv4 clock monotonic # ./so_txtime: recv: timeout: Resource temporarily unavailable not ok 1 selftests: net: so_txtime.sh # exit=1 Tuning the tolerance in the test binary is error-prone and doomed to failures is slow-enough environment. Just resort to suppress any error in such cases. Note to suppress them we need first to refactor a bit the code moving it to explicit error handling. Fixes: af5136f9 ("selftests/net: SO_TXTIME with ETF and FQ") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/2142d9ed4b5c5aa07dd1b455779625d91b175373.1707730902.git.pabeni@redhat.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
The gro self-tests sends the packets to be aggregated with multiple write operations. When running is slow environment, it's hard to guarantee that the GRO engine will wait for the last packet in an intended train. The above causes almost deterministic failures in our CI for the 'large' test-case. Address the issue explicitly ignoring failures for such case in slow environments (KSFT_MACHINE_SLOW==true). Fixes: 7d157501 ("selftests/net: GRO coalesce test") Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/97d3ba83f5a2bfeb36f6bc0fb76724eb3dafb608.1707729403.git.pabeni@redhat.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Randy Dunlap authored
When CONFIG_PTP_1588_CLOCK=m and CONFIG_TI_ICSSG_PRUETH=y, there are kconfig dependency warnings and build errors referencing PTP functions. Fix these by making TI_ICSSG_PRUETH depend on PTP_1588_CLOCK_OPTIONAL. Fixes these build errors and warnings: WARNING: unmet direct dependencies detected for TI_ICSS_IEP Depends on [m]: NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_TI [=y] && PTP_1588_CLOCK_OPTIONAL [=m] && TI_PRUSS [=y] Selected by [y]: - TI_ICSSG_PRUETH [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_TI [=y] && PRU_REMOTEPROC [=y] && ARCH_K3 [=y] && OF [=y] && TI_K3_UDMA_GLUE_LAYER [=y] aarch64-linux-ld: drivers/net/ethernet/ti/icssg/icss_iep.o: in function `icss_iep_get_ptp_clock_idx': icss_iep.c:(.text+0x1d4): undefined reference to `ptp_clock_index' aarch64-linux-ld: drivers/net/ethernet/ti/icssg/icss_iep.o: in function `icss_iep_exit': icss_iep.c:(.text+0xde8): undefined reference to `ptp_clock_unregister' aarch64-linux-ld: drivers/net/ethernet/ti/icssg/icss_iep.o: in function `icss_iep_init': icss_iep.c:(.text+0x176c): undefined reference to `ptp_clock_register' Fixes: 186734c1 ("net: ti: icssg-prueth: add packet timestamping and ptp support") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Roger Quadros <rogerq@ti.com> Cc: Md Danish Anwar <danishanwar@ti.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: netdev@vger.kernel.org Reviewed-by: MD Danish Anwar <danishanwar@ti.com> Link: https://lore.kernel.org/r/20240211061152.14696-1-rdunlap@infradead.orgSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Kuniyuki Iwashima authored
syzbot reported a task hung; at the same time, GC was looping infinitely in list_for_each_entry_safe() for OOB skb. [0] syzbot demonstrated that the list_for_each_entry_safe() was not actually safe in this case. A single skb could have references for multiple sockets. If we free such a skb in the list_for_each_entry_safe(), the current and next sockets could be unlinked in a single iteration. unix_notinflight() uses list_del_init() to unlink the socket, so the prefetched next socket forms a loop itself and list_for_each_entry_safe() never stops. Here, we must use while() and make sure we always fetch the first socket. [0]: Sending NMI from CPU 0 to CPUs 1: NMI backtrace for cpu 1 CPU: 1 PID: 5065 Comm: syz-executor236 Not tainted 6.8.0-rc3-syzkaller-00136-g1f719a2f #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024 RIP: 0010:preempt_count arch/x86/include/asm/preempt.h:26 [inline] RIP: 0010:check_kcov_mode kernel/kcov.c:173 [inline] RIP: 0010:__sanitizer_cov_trace_pc+0xd/0x60 kernel/kcov.c:207 Code: cc cc cc cc 66 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 65 48 8b 14 25 40 c2 03 00 <65> 8b 05 b4 7c 78 7e a9 00 01 ff 00 48 8b 34 24 74 0f f6 c4 01 74 RSP: 0018:ffffc900033efa58 EFLAGS: 00000283 RAX: ffff88807b077800 RBX: ffff88807b077800 RCX: 1ffffffff27b1189 RDX: ffff88802a5a3b80 RSI: ffffffff8968488d RDI: ffff88807b077f70 RBP: ffffc900033efbb0 R08: 0000000000000001 R09: fffffbfff27a900c R10: ffffffff93d48067 R11: ffffffff8ae000eb R12: ffff88807b077800 R13: dffffc0000000000 R14: ffff88807b077e40 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000564f4fc1e3a8 CR3: 000000000d57a000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <NMI> </NMI> <TASK> unix_gc+0x563/0x13b0 net/unix/garbage.c:319 unix_release_sock+0xa93/0xf80 net/unix/af_unix.c:683 unix_release+0x91/0xf0 net/unix/af_unix.c:1064 __sock_release+0xb0/0x270 net/socket.c:659 sock_close+0x1c/0x30 net/socket.c:1421 __fput+0x270/0xb80 fs/file_table.c:376 task_work_run+0x14f/0x250 kernel/task_work.c:180 exit_task_work include/linux/task_work.h:38 [inline] do_exit+0xa8a/0x2ad0 kernel/exit.c:871 do_group_exit+0xd4/0x2a0 kernel/exit.c:1020 __do_sys_exit_group kernel/exit.c:1031 [inline] __se_sys_exit_group kernel/exit.c:1029 [inline] __x64_sys_exit_group+0x3e/0x50 kernel/exit.c:1029 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xd5/0x270 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x6f/0x77 RIP: 0033:0x7f9d6cbdac09 Code: Unable to access opcode bytes at 0x7f9d6cbdabdf. RSP: 002b:00007fff5952feb8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9d6cbdac09 RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000 RBP: 00007f9d6cc552b0 R08: ffffffffffffffb8 R09: 0000000000000006 R10: 0000000000000006 R11: 0000000000000246 R12: 00007f9d6cc552b0 R13: 0000000000000000 R14: 00007f9d6cc55d00 R15: 00007f9d6cbabe70 </TASK> Reported-by: syzbot+4fa4a2d1f5a5ee06f006@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=4fa4a2d1f5a5ee06f006 Fixes: 1279f9d9 ("af_unix: Call kfree_skb() for dead unix_(sk)->oob_skb in GC.") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20240209220453.96053-1-kuniyu@amazon.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Keqi Wang authored
This reverts commit c46bfba1 ("connector: Fix proc_event_num_listeners count not cleared"). It is not accurate to reset proc_event_num_listeners according to cn_netlink_send_mult() return value -ESRCH. In the case of stress-ng netlink-proc, -ESRCH will always be returned, because netlink_broadcast_filtered will return -ESRCH, which may cause stress-ng netlink-proc performance degradation. Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202401112259.b23a1567-oliver.sang@intel.com Fixes: c46bfba1 ("connector: Fix proc_event_num_listeners count not cleared") Signed-off-by: Keqi Wang <wangkeqi_chris@163.com> Link: https://lore.kernel.org/r/20240209091659.68723-1-wangkeqi_chris@163.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Allison Henderson authored
Functions rds_still_queued and rds_clear_recv_queue lock a given socket in order to safely iterate over the incoming rds messages. However calling rds_inc_put while under this lock creates a potential deadlock. rds_inc_put may eventually call rds_message_purge, which will lock m_rs_lock. This is the incorrect locking order since m_rs_lock is meant to be locked before the socket. To fix this, we move the message item to a local list or variable that wont need rs_recv_lock protection. Then we can safely call rds_inc_put on any item stored locally after rs_recv_lock is released. Fixes: bdbe6fbc ("RDS: recv.c") Reported-by: syzbot+f9db6ff27b9bfdcfeca0@syzkaller.appspotmail.com Reported-by: syzbot+dcd73ff9291e6d34b3ab@syzkaller.appspotmail.com Signed-off-by: Allison Henderson <allison.henderson@oracle.com> Link: https://lore.kernel.org/r/20240209022854.200292-1-allison.henderson@oracle.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Eric Dumazet authored
rtnl_prop_list_size() can be called while alternative names are added or removed concurrently. if_nlmsg_size() / rtnl_calcit() can indeed be called without RTNL held. Use explicit RCU protection to avoid UAF. Fixes: 88f4fb0c ("net: rtnetlink: put alternative names to getlink message") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240209181248.96637-1-edumazet@google.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Shannon Nelson authored
The VFs don't run the health thread, so don't try to stop or restart the non-existent timer or work item. Fixes: d9407ff1 ("pds_core: Prevent health thread from running during reset/remove") Reviewed-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20240210002002.49483-1-shannon.nelson@amd.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Shannon Nelson authored
We should be doing as little as possible besides freeing Tx space when our napi routines are called with budget of 0, so jump out before doing anything besides Tx cleaning. See commit afbed3f7 ("net/mlx5e: do as little as possible in napi poll when budget is 0") for more info. Fixes: fe8c30b5 ("ionic: separate interrupt for Tx and Rx") Reviewed-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20240210001307.48450-1-shannon.nelson@amd.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Simon Horman authored
The cited commit introduces and uses the string constants dpp_tx_err and dpp_rx_err. These are assigned to constant fields of the array dwxgmac3_error_desc. It has been reported that on GCC 6 and 7.5.0 this results in warnings such as: .../dwxgmac2_core.c:836:20: error: initialiser element is not constant { true, "TDPES0", dpp_tx_err }, I have been able to reproduce this using: GCC 7.5.0, 8.4.0, 9.4.0 and 10.5.0. But not GCC 13.2.0. So it seems this effects older compilers but not newer ones. As Jon points out in his report, the minimum compiler supported by the kernel is GCC 5.1, so it does seem that this ought to be fixed. It is not clear to me what combination of 'const', if any, would address this problem. So this patch takes of using #defines for the string constants Compile tested only. Fixes: 46eba193 ("net: stmmac: xgmac: fix handling of DPP safety error for DMA channels") Reported-by: Jon Hunter <jonathanh@nvidia.com> Closes: https://lore.kernel.org/netdev/c25eb595-8d91-40ea-9f52-efa15ebafdbc@nvidia.com/Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202402081135.lAxxBXHk-lkp@intel.com/Signed-off-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240208-xgmac-const-v1-1-e69a1eeabfc8@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Maxim Galaganov authored
Older glibc's netinet/in.h may leave IPPROTO_MPTCP undefined when building ip_local_port_range.c, that leads to "error: use of undeclared identifier 'IPPROTO_MPTCP'". Define IPPROTO_MPTCP in such cases, just like in other MPTCP selftests. Fixes: 122db5e3 ("selftests/net: add MPTCP coverage for IP_LOCAL_PORT_RANGE") Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Closes: https://lore.kernel.org/netdev/CA+G9fYvGO5q4o_Td_kyQgYieXWKw6ktMa-Q0sBu6S-0y3w2aEQ@mail.gmail.com/Signed-off-by: Maxim Galaganov <max@internet.ru> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Link: https://lore.kernel.org/r/20240209132512.254520-1-max@internet.ruSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Ivan Vecera authored
Currently when PF administratively sets VF's MAC address and the VF is put down (VF tries to delete all MACs) then the MAC is removed from MAC filters and primary VF MAC is zeroed. Do not allow untrusted VF to remove primary MAC when it was set administratively by PF. Reproducer: 1) Create VF 2) Set VF interface up 3) Administratively set the VF's MAC 4) Put VF interface down [root@host ~]# echo 1 > /sys/class/net/enp2s0f0/device/sriov_numvfs [root@host ~]# ip link set enp2s0f0v0 up [root@host ~]# ip link set enp2s0f0 vf 0 mac fe:6c:b5:da:c7:7d [root@host ~]# ip link show enp2s0f0 23: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000 link/ether 3c:ec:ef:b7:dd:04 brd ff:ff:ff:ff:ff:ff vf 0 link/ether fe:6c:b5:da:c7:7d brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off [root@host ~]# ip link set enp2s0f0v0 down [root@host ~]# ip link show enp2s0f0 23: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000 link/ether 3c:ec:ef:b7:dd:04 brd ff:ff:ff:ff:ff:ff vf 0 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off Fixes: 700bbf6c ("i40e: allow VF to remove any MAC filter") Fixes: ceb29474 ("i40e: Add support for VF to specify its primary MAC address") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240208180335.1844996-1-anthony.l.nguyen@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 12 Feb, 2024 13 commits
-
-
Breno Leitao authored
The Documentation/ABI/testing/sysfs-class-net-statistics documentation is pointing to the wrong path for the interface. Documentation is pointing to /sys/class/<iface>, instead of /sys/class/net/<iface>. Fix it by adding the `net/` directory before the interface. Fixes: 6044f970 ("net: sysfs: document /sys/class/net/statistics/*") Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Matthieu Baerts says: ==================== mptcp: locking cleanup & misc. fixes Patches 1-4 are fixes for issues found by Paolo while working on adding TCP_NOTSENT_LOWAT support. The latter will need to track more states under the msk data lock. Since the locking msk locking schema is already quite complex, do a long awaited clean-up step by moving several confusing lockless initialization under the relevant locks. Note that it is unlikely a real race could happen even prior to such patches as the MPTCP-level state machine implicitly ensures proper serialization of the write accesses, even lacking explicit lock. But still, simplification is welcome and this will help for the maintenance. This can be backported up to v5.6. Patch 5 is a fix for the userspace PM, not to add new local address entries if the address is already in the list. This behaviour can be seen since v5.19. Patch 6 fixes an issue when Fastopen is used. The issue can happen since v6.2. A previous fix has already been applied, but not taking care of all cases according to syzbot. Patch 7 updates Geliang's email address in the MAINTAINERS file. ==================== Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Geliang Tang authored
Update my email-address in MAINTAINERS and .mailmap entries to my kernel.org account. Suggested-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Geliang Tang <geliang@kernel.org> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
Fastopen and PM-trigger subflow shutdown can race, as reported by syzkaller. In my first attempt to close such race, I missed the fact that the subflow status can change again before the subflow_state_change callback is invoked. Address the issue additionally copying with all the states directly reachable from TCP_FIN_WAIT1. Fixes: 1e777f39 ("mptcp: add MSG_FASTOPEN sendmsg flag support") Fixes: 4fd19a30 ("mptcp: fix inconsistent state on fastopen race") Cc: stable@vger.kernel.org Reported-by: syzbot+c53d4d3ddb327e80bc51@syzkaller.appspotmail.com Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/458Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Geliang Tang authored
Before adding a new entry in mptcp_userspace_pm_get_local_id(), it's better to check whether this address is already in userspace pm local address list. If it's in the list, no need to add a new entry, just return it's address ID and use this address. Fixes: 8b201370 ("mptcp: read attributes of addr entries managed by userspace PMs") Cc: stable@vger.kernel.org Signed-off-by: Geliang Tang <geliang.tang@linux.dev> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
Most MPTCP-level related fields are under the mptcp data lock protection, but are written one-off without such lock at MPC complete time, both for the client and the server Leverage the mptcp_propagate_state() infrastructure to move such initialization under the proper lock client-wise. The server side critical init steps are done by mptcp_subflow_fully_established(): ensure the caller properly held the relevant lock, and avoid acquiring the same lock in the nested scopes. There are no real potential races, as write access to such fields is implicitly serialized by the MPTCP state machine; the primary goal is consistency. Fixes: d22f4988 ("mptcp: process MP_CAPABLE data option") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
The 'msk->write_seq' and 'msk->snd_nxt' are always updated under the msk socket lock, except at MPC handshake completiont time. Builds-up on the previous commit to move such init under the relevant lock. There are no known problems caused by the potential race, the primary goal is consistency. Fixes: 6d0060f6 ("mptcp: Write MPTCP DSS headers to outgoing data packets") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
mptcp_rcv_space_init() is supposed to happen under the msk socket lock, but active msk socket does that without such protection. Leverage the existing mptcp_propagate_state() helper to that extent. We need to ensure mptcp_rcv_space_init will happen before mptcp_rcv_space_adjust(), and the release_cb does not assure that: explicitly check for such condition. While at it, move the wnd_end initialization out of mptcp_rcv_space_init(), it never belonged there. Note that the race does not produce ill effect in practice, but change allows cleaning-up and defying better the locking model. Fixes: a6b118fe ("mptcp: add receive buffer auto-tuning") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
Such field is there to avoid acquiring the data lock in a few spots, but it adds complexity to the already non trivial locking schema. All the relevant call sites (mptcp-level re-injection, set socket options), are slow-path, drop such field in favor of 'cb_flags', adding the relevant locking. This patch could be seen as an improvement, instead of a fix. But it simplifies the next patch. The 'Fixes' tag has been added to help having this series backported to stable. Fixes: e9d09bac ("mptcp: avoid atomic bit manipulation when possible") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Eric Dumazet says: ==================== net: more three misplaced fields We recently reorganized some structures for better data locality in networking fast paths. This series moves three fields that were not correctly classified. There probably more to come. Reference : https://lwn.net/Articles/951321/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
dev->lstats is notably used from loopback ndo_start_xmit() and other virtual drivers. Per cpu stats updates are dirtying per-cpu data, but the pointer itself is read-only. Fixes: 43a71cd6 ("net-device: reorganize net_device fast path variables") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Coco Li <lixiaoyan@google.com> Cc: Simon Horman <horms@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
tp->tcp_usec_ts is a read mostly field, used in rx and tx fast paths. Fixes: d5fed5ad ("tcp: reorganize tcp_sock fast path variables") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Coco Li <lixiaoyan@google.com> Cc: Wei Wang <weiwan@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
tp->scaling_ratio is a read mostly field, used in rx and tx fast paths. Fixes: d5fed5ad ("tcp: reorganize tcp_sock fast path variables") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Coco Li <lixiaoyan@google.com> Cc: Wei Wang <weiwan@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 10 Feb, 2024 8 commits
-
-
David S. Miller authored
Jakub Kicinski says: ==================== net: tls: fix some issues with async encryption valis was reporting a race on socket close so I sat down to try to fix it. I used Sabrina's async crypto debug patch to test... and in the process run into some of the same issues, and created very similar fixes :( I didn't realize how many of those patches weren't applied. Once I found Sabrina's code [1] it turned out to be so similar in fact that I added her S-o-b's and Co-develop'eds in a semi-haphazard way. With this series in place all expected tests pass with async crypto. Sabrina had a few more fixes, but I'll leave those to her, things are not crashing anymore. [1] https://lore.kernel.org/netdev/cover.1694018970.git.sd@queasysnail.net/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
We double count async, non-zc rx data. The previous fix was lucky because if we fully zc async_copy_bytes is 0 so we add 0. Decrypted already has all the bytes we handled, in all cases. We don't have to adjust anything, delete the erroneous line. Fixes: 4d42cd6b ("tls: rx: fix return value for async crypto") Co-developed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
This exact case was fail for async crypto and we weren't catching it. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sabrina Dubroca authored
tls_decrypt_sg doesn't take a reference on the pages from clear_skb, so the put_page() in tls_decrypt_done releases them, and we trigger a use-after-free in process_rx_list when we try to read from the partially-read skb. Fixes: fd31f399 ("tls: rx: decrypt into a fresh skb") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Since we're setting the CRYPTO_TFM_REQ_MAY_BACKLOG flag on our requests to the crypto API, crypto_aead_{encrypt,decrypt} can return -EBUSY instead of -EINPROGRESS in valid situations. For example, when the cryptd queue for AESNI is full (easy to trigger with an artificially low cryptd.cryptd_max_cpu_qlen), requests will be enqueued to the backlog but still processed. In that case, the async callback will also be called twice: first with err == -EINPROGRESS, which it seems we can just ignore, then with err == 0. Compared to Sabrina's original patch this version uses the new tls_*crypt_async_wait() helpers and converts the EBUSY to EINPROGRESS to avoid having to modify all the error handling paths. The handling is identical. Fixes: a54667f6 ("tls: Add support for encryption using async offload accelerator") Fixes: 94524d8f ("net/tls: Add support for async decryption of tls records") Co-developed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/netdev/9681d1febfec295449a62300938ed2ae66983f28.1694018970.git.sd@queasysnail.net/Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Similarly to previous commit, the submitting thread (recvmsg/sendmsg) may exit as soon as the async crypto handler calls complete(). Reorder scheduling the work before calling complete(). This seems more logical in the first place, as it's the inverse order of what the submitting thread will do. Reported-by: valis <sec@valis.email> Fixes: a42055e8 ("net/tls: Add support for async encryption of records for performance") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
The submitting thread (one which called recvmsg/sendmsg) may exit as soon as the async crypto handler calls complete() so any code past that point risks touching already freed data. Try to avoid the locking and extra flags altogether. Have the main thread hold an extra reference, this way we can depend solely on the atomic ref counter for synchronization. Don't futz with reiniting the completion, either, we are now tightly controlling when completion fires. Reported-by: valis <sec@valis.email> Fixes: 0cada332 ("net/tls: fix race condition causing kernel panic") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Factor out waiting for async encrypt and decrypt to finish. There are already multiple copies and a subsequent fix will need more. No functional changes. Note that crypto_wait_req() returns wait->err Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 09 Feb, 2024 5 commits
-
-
Jakub Kicinski authored
Breno Leitao says: ==================== net: Fix MODULE_DESCRIPTION() for net (p5) There are hundreds of network modules that misses MODULE_DESCRIPTION(), causing a warning when compiling with W=1. Example: WARNING: modpost: missing MODULE_DESCRIPTION() in net/sched/em_cmp.o WARNING: modpost: missing MODULE_DESCRIPTION() in net/sched/em_nbyte.o WARNING: modpost: missing MODULE_DESCRIPTION() in net/sched/em_u32.o WARNING: modpost: missing MODULE_DESCRIPTION() in net/sched/em_meta.o WARNING: modpost: missing MODULE_DESCRIPTION() in net/sched/em_text.o WARNING: modpost: missing MODULE_DESCRIPTION() in net/sched/em_canid.o WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv4/ip_tunnel.o WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv4/ipip.o WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv4/ip_gre.o WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv4/udp_tunnel.o WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv4/ip_vti.o WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv4/ah4.o WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv4/esp4.o WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv4/xfrm4_tunnel.o WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv4/tunnel4.o WARNING: modpost: missing MODULE_DESCRIPTION() in net/xfrm/xfrm_algo.o WARNING: modpost: missing MODULE_DESCRIPTION() in net/xfrm/xfrm_user.o WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv6/ah6.o WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv6/esp6.o WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv6/xfrm6_tunnel.o WARNING: modpost: missing MODULE_DESCRIPTION() in net/ipv6/tunnel6.o This part5 of the patchset focus on the missing net/ module, which are now warning free. v1: https://lore.kernel.org/all/20240205101400.1480521-1-leitao@debian.org/ v2: https://lore.kernel.org/all/20240207101929.484681-1-leitao@debian.org/ ==================== Link: https://lore.kernel.org/r/20240208164244.3818498-1-leitao@debian.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Breno Leitao authored
W=1 builds now warn if module is built without a MODULE_DESCRIPTION(). Add descriptions to the DSA loopback fixed PHY module. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20240208164244.3818498-10-leitao@debian.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Breno Leitao authored
W=1 builds now warn if module is built without a MODULE_DESCRIPTION(). Add descriptions to the IP-VLAN based tap driver. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240208164244.3818498-9-leitao@debian.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Breno Leitao authored
W=1 builds now warn if module is built without a MODULE_DESCRIPTION(). Add descriptions to the network schedulers. Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20240208164244.3818498-8-leitao@debian.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Breno Leitao authored
W=1 builds now warn if module is built without a MODULE_DESCRIPTION(). Add descriptions to the IPv4 modules. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240208164244.3818498-7-leitao@debian.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-