- 20 Sep, 2022 31 commits
-
-
Bhupesh Sharma authored
As suggested by Vinod, adding myself as the reviewer for the Qualcomm ETHQOS Ethernet driver. Recently I have enabled this driver on a few Qualcomm SoCs / boards and hence trying to keep a close eye on it. Signed-off-by: Bhupesh Sharma <bhupesh.sharma@linaro.org> Acked-by: Vinod Koul <vkoul@kernel.org> Link: https://lore.kernel.org/r/20220915112804.3950680-1-bhupesh.sharma@linaro.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Vladimir Oltean says: ==================== Fixes for tc-taprio software mode While working on some new features for tc-taprio, I found some strange behavior which looked like bugs. I was able to eventually trigger a NULL pointer dereference. This patch set fixes 2 issues I saw. Detailed explanation in patches. ==================== Link: https://lore.kernel.org/r/20220915100802.2308279-1-vladimir.oltean@nxp.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
taprio can only operate as root qdisc, and to that end, there exists the following check in taprio_init(), just as in mqprio: if (sch->parent != TC_H_ROOT) return -EOPNOTSUPP; And indeed, when we try to attach taprio to an mqprio child, it fails as expected: $ tc qdisc add dev swp0 root handle 1: mqprio num_tc 8 \ map 0 1 2 3 4 5 6 7 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0 $ tc qdisc replace dev swp0 parent 1:2 taprio num_tc 8 \ map 0 1 2 3 4 5 6 7 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \ base-time 0 sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \ flags 0x0 clockid CLOCK_TAI Error: sch_taprio: Can only be attached as root qdisc. (extack message added by me) But when we try to attach a taprio child to a taprio root qdisc, surprisingly it doesn't fail: $ tc qdisc replace dev swp0 root handle 1: taprio num_tc 8 \ map 0 1 2 3 4 5 6 7 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \ base-time 0 sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \ flags 0x0 clockid CLOCK_TAI $ tc qdisc replace dev swp0 parent 1:2 taprio num_tc 8 \ map 0 1 2 3 4 5 6 7 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \ base-time 0 sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \ flags 0x0 clockid CLOCK_TAI This is because tc_modify_qdisc() behaves differently when mqprio is root, vs when taprio is root. In the mqprio case, it finds the parent qdisc through p = qdisc_lookup(dev, TC_H_MAJ(clid)), and then the child qdisc through q = qdisc_leaf(p, clid). This leaf qdisc q has handle 0, so it is ignored according to the comment right below ("It may be default qdisc, ignore it"). As a result, tc_modify_qdisc() goes through the qdisc_create() code path, and this gives taprio_init() a chance to check for sch_parent != TC_H_ROOT and error out. Whereas in the taprio case, the returned q = qdisc_leaf(p, clid) is different. It is not the default qdisc created for each netdev queue (both taprio and mqprio call qdisc_create_dflt() and keep them in a private q->qdiscs[], or priv->qdiscs[], respectively). Instead, taprio makes qdisc_leaf() return the _root_ qdisc, aka itself. When taprio does that, tc_modify_qdisc() goes through the qdisc_change() code path, because the qdisc layer never finds out about the child qdisc of the root. And through the ->change() ops, taprio has no reason to check whether its parent is root or not, just through ->init(), which is not called. The problem is the taprio_leaf() implementation. Even though code wise, it does the exact same thing as mqprio_leaf() which it is copied from, it works with different input data. This is because mqprio does not attach itself (the root) to each device TX queue, but one of the default qdiscs from its private array. In fact, since commit 13511704 ("net: taprio offload: enforce qdisc to netdev queue mapping"), taprio does this too, but just for the full offload case. So if we tried to attach a taprio child to a fully offloaded taprio root qdisc, it would properly fail too; just not to a software root taprio. To fix the problem, stop looking at the Qdisc that's attached to the TX queue, and instead, always return the default qdiscs that we've allocated (and to which we privately enqueue and dequeue, in software scheduling mode). Since Qdisc_class_ops :: leaf is only called from tc_modify_qdisc(), the risk of unforeseen side effects introduced by this change is minimal. Fixes: 5a781ccb ("tc: Add support for configuring the taprio scheduler") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
In an incredibly strange API design decision, qdisc->destroy() gets called even if qdisc->init() never succeeded, not exclusively since commit 87b60cfa ("net_sched: fix error recovery at qdisc creation"), but apparently also earlier (in the case of qdisc_create_dflt()). The taprio qdisc does not fully acknowledge this when it attempts full offload, because it starts off with q->flags = TAPRIO_FLAGS_INVALID in taprio_init(), then it replaces q->flags with TCA_TAPRIO_ATTR_FLAGS parsed from netlink (in taprio_change(), tail called from taprio_init()). But in taprio_destroy(), we call taprio_disable_offload(), and this determines what to do based on FULL_OFFLOAD_IS_ENABLED(q->flags). But looking at the implementation of FULL_OFFLOAD_IS_ENABLED() (a bitwise check of bit 1 in q->flags), it is invalid to call this macro on q->flags when it contains TAPRIO_FLAGS_INVALID, because that is set to U32_MAX, and therefore FULL_OFFLOAD_IS_ENABLED() will return true on an invalid set of flags. As a result, it is possible to crash the kernel if user space forces an error between setting q->flags = TAPRIO_FLAGS_INVALID, and the calling of taprio_enable_offload(). This is because drivers do not expect the offload to be disabled when it was never enabled. The error that we force here is to attach taprio as a non-root qdisc, but instead as child of an mqprio root qdisc: $ tc qdisc add dev swp0 root handle 1: \ mqprio num_tc 8 map 0 1 2 3 4 5 6 7 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0 $ tc qdisc replace dev swp0 parent 1:1 \ taprio num_tc 8 map 0 1 2 3 4 5 6 7 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 base-time 0 \ sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \ flags 0x0 clockid CLOCK_TAI Unable to handle kernel paging request at virtual address fffffffffffffff8 [fffffffffffffff8] pgd=0000000000000000, p4d=0000000000000000 Internal error: Oops: 96000004 [#1] PREEMPT SMP Call trace: taprio_dump+0x27c/0x310 vsc9959_port_setup_tc+0x1f4/0x460 felix_port_setup_tc+0x24/0x3c dsa_slave_setup_tc+0x54/0x27c taprio_disable_offload.isra.0+0x58/0xe0 taprio_destroy+0x80/0x104 qdisc_create+0x240/0x470 tc_modify_qdisc+0x1fc/0x6b0 rtnetlink_rcv_msg+0x12c/0x390 netlink_rcv_skb+0x5c/0x130 rtnetlink_rcv+0x1c/0x2c Fix this by keeping track of the operations we made, and undo the offload only if we actually did it. I've added "bool offloaded" inside a 4 byte hole between "int clockid" and "atomic64_t picos_per_byte". Now the first cache line looks like below: $ pahole -C taprio_sched net/sched/sch_taprio.o struct taprio_sched { struct Qdisc * * qdiscs; /* 0 8 */ struct Qdisc * root; /* 8 8 */ u32 flags; /* 16 4 */ enum tk_offsets tk_offset; /* 20 4 */ int clockid; /* 24 4 */ bool offloaded; /* 28 1 */ /* XXX 3 bytes hole, try to pack */ atomic64_t picos_per_byte; /* 32 0 */ /* XXX 8 bytes hole, try to pack */ spinlock_t current_entry_lock; /* 40 0 */ /* XXX 8 bytes hole, try to pack */ struct sched_entry * current_entry; /* 48 8 */ struct sched_gate_list * oper_sched; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ Fixes: 9c66d156 ("taprio: Add support for hardware offloading") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Ido Schimmel authored
The global 'raw_v6_hashinfo' variable can be accessed even when IPv6 is administratively disabled via the 'ipv6.disable=1' kernel command line option, leading to a crash [1]. Fix by restoring the original behavior and always initializing the variable, regardless of IPv6 support being administratively disabled or not. [1] BUG: unable to handle page fault for address: ffffffffffffffc8 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 173e18067 P4D 173e18067 PUD 173e1a067 PMD 0 Oops: 0000 [#1] PREEMPT SMP KASAN CPU: 3 PID: 271 Comm: ss Not tainted 6.0.0-rc4-custom-00136-g0727a9a5 #1396 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014 RIP: 0010:raw_diag_dump+0x310/0x7f0 [...] Call Trace: <TASK> __inet_diag_dump+0x10f/0x2e0 netlink_dump+0x575/0xfd0 __netlink_dump_start+0x67b/0x940 inet_diag_handler_cmd+0x273/0x2d0 sock_diag_rcv_msg+0x317/0x440 netlink_rcv_skb+0x15e/0x430 sock_diag_rcv+0x2b/0x40 netlink_unicast+0x53b/0x800 netlink_sendmsg+0x945/0xe60 ____sys_sendmsg+0x747/0x960 ___sys_sendmsg+0x13a/0x1e0 __sys_sendmsg+0x118/0x1e0 do_syscall_64+0x34/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: 1da177e4 ("Linux-2.6.12-rc2") Fixes: 0daf07e5 ("raw: convert raw sockets to RCU") Reported-by: Roberto Ricci <rroberto2r@gmail.com> Tested-by: Roberto Ricci <rroberto2r@gmail.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220916084821.229287-1-idosch@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
TSN features on the ENETC (taprio, cbs, gate, police) are configured through a mix of command BD ring messages and port registers: enetc_port_rd(), enetc_port_wr(). Port registers are a region of the ENETC memory map which are only accessible from the PCIe Physical Function. They are not accessible from the Virtual Functions. Moreover, attempting to access these registers crashes the kernel: $ echo 1 > /sys/bus/pci/devices/0000\:00\:00.0/sriov_numvfs pci 0000:00:01.0: [1957:ef00] type 00 class 0x020001 fsl_enetc_vf 0000:00:01.0: Adding to iommu group 15 fsl_enetc_vf 0000:00:01.0: enabling device (0000 -> 0002) fsl_enetc_vf 0000:00:01.0 eno0vf0: renamed from eth0 $ tc qdisc replace dev eno0vf0 root taprio num_tc 8 map 0 1 2 3 4 5 6 7 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 base-time 0 \ sched-entry S 0x7f 900000 sched-entry S 0x80 100000 flags 0x2 Unable to handle kernel paging request at virtual address ffff800009551a08 Internal error: Oops: 96000007 [#1] PREEMPT SMP pc : enetc_setup_tc_taprio+0x170/0x47c lr : enetc_setup_tc_taprio+0x16c/0x47c Call trace: enetc_setup_tc_taprio+0x170/0x47c enetc_setup_tc+0x38/0x2dc taprio_change+0x43c/0x970 taprio_init+0x188/0x1e0 qdisc_create+0x114/0x470 tc_modify_qdisc+0x1fc/0x6c0 rtnetlink_rcv_msg+0x12c/0x390 Split enetc_setup_tc() into separate functions for the PF and for the VF drivers. Also remove enetc_qos.o from being included into enetc-vf.ko, since it serves absolutely no purpose there. Fixes: 34c6adf1 ("enetc: Configure the Time-Aware Scheduler via tc-taprio offload") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220916133209.3351399-2-vladimir.oltean@nxp.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
The VF netdev driver shouldn't respond to changes in the NETIF_F_HW_TC flag; only PFs should. Moreover, TSN-specific code should go to enetc_qos.c, which should not be included in the VF driver. Fixes: 79e49982 ("net: enetc: add hw tc hw offload features for PSPF capability") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220916133209.3351399-1-vladimir.oltean@nxp.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Jason A. Donenfeld says: ==================== wireguard patches for 6.0-rc6 1) The ratelimiter timing test doesn't help outside of development, yet it is currently preventing the module from being inserted on some kernels when it flakes at insertion time. So we disable it. 2) A fix for a build error on UML, caused by a recent change in a different tree. 3) A WARN_ON() is triggered by Kees' new fortified memcpy() patch, due to memcpy()ing over a sockaddr pointer with the size of a sockaddr_in[6]. The type safe fix is pretty simple. Given how classic of a thing sockaddr punning is, I suspect this may be the first in a few patches like this throughout the net tree, once Kees' fortify series is more widely deployed (current it's just in next). ==================== Link: https://lore.kernel.org/r/20220916143740.831881-1-Jason@zx2c4.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jason A. Donenfeld authored
Doing a variable-sized memcpy is slower, and the compiler isn't smart enough to turn this into a constant-size assignment. Further, Kees' latest fortified memcpy will actually bark, because the destination pointer is type sockaddr, not explicitly sockaddr_in or sockaddr_in6, so it thinks there's an overflow: memcpy: detected field-spanning write (size 28) of single field "&endpoint.addr" at drivers/net/wireguard/netlink.c:446 (size 16) Fix this by just assigning by using explicit casts for each checked case. Fixes: e7096c13 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reported-by: syzbot+a448cda4dba2dac50de5@syzkaller.appspotmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jason A. Donenfeld authored
Since 1b620d53 ("kbuild: disable header exports for UML in a straightforward way"), installing headers fails on UML, so just disable installing them, since they're not needed anyway on the architecture. Fixes: b438b3b8 ("wireguard: selftests: support UML") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jason A. Donenfeld authored
A previous commit tried to make the ratelimiter timings test more reliable but in the process made it less reliable on other configurations. This is an impossible problem to solve without increasingly ridiculous heuristics. And it's not even a problem that actually needs to be solved in any comprehensive way, since this is only ever used during development. So just cordon this off with a DEBUG_ ifdef, just like we do for the trie's randomized tests, so it can be enabled while hacking on the code, and otherwise disabled in CI. In the process we also revert 151c8e49. Fixes: 151c8e49 ("wireguard: ratelimiter: use hrtimer in selftest") Fixes: e7096c13 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Íñigo Huguet authored
Like in previous patch for sfc, prevent potential (but unlikely) NULL pointer dereference. Fixes: 12804793 ("sfc: decouple TXQ type from label") Reported-by: Tianhao Zhao <tizhao@redhat.com> Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Link: https://lore.kernel.org/r/20220915141958.16458-1-ihuguet@redhat.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Íñigo Huguet authored
As in previous commit for sfc, fix TX channels offset when efx_siena_separate_tx_channels is false (the default) Fixes: 25bde571 ("sfc/siena: fix wrong tx channel offset with efx_separate_tx_channels") Reported-by: Tianhao Zhao <tizhao@redhat.com> Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Link: https://lore.kernel.org/r/20220915141653.15504-1-ihuguet@redhat.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Tetsuo Handa authored
syzbot is still complaining uninit-value in tcp_recvmsg(), for commit 1228b34c ("net: clear msg_get_inq in __sys_recvfrom() and __copy_msghdr_from_user()") missed that __get_compat_msghdr() is called instead of copy_msghdr_from_user() when MSG_CMSG_COMPAT is specified. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Fixes: 1228b34c ("net: clear msg_get_inq in __sys_recvfrom() and __copy_msghdr_from_user()") Reviewed-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/d06d0f7f-696c-83b4-b2d5-70b5f2730a37@I-love.SAKURA.ne.jpSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Ido Schimmel says: ==================== ipmr: Always call ip{,6}_mr_forward() from RCU read-side critical section Patch #1 fixes a bug in ipmr code. Patch #2 adds corresponding test cases. ==================== Link: https://lore.kernel.org/r/20220914075339.4074096-1-idosch@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Ido Schimmel authored
Add IPv4 and IPv6 test cases for unresolved multicast routes, testing that queued packets are forwarded after installing a matching (S, G) route. The test cases can be used to reproduce the bugs fixed in "ipmr: Always call ip{,6}_mr_forward() from RCU read-side critical section". Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Ido Schimmel authored
These functions expect to be called from RCU read-side critical section, but this only happens when invoked from the data path via ip{,6}_mr_input(). They can also be invoked from process context in response to user space adding a multicast route which resolves a cache entry with queued packets [1][2]. Fix by adding missing rcu_read_lock() / rcu_read_unlock() in these call paths. [1] WARNING: suspicious RCU usage 6.0.0-rc3-custom-15969-g049d233c8bcc-dirty #1387 Not tainted ----------------------------- net/ipv4/ipmr.c:84 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by smcrouted/246: #0: ffffffff862389b0 (rtnl_mutex){+.+.}-{3:3}, at: ip_mroute_setsockopt+0x11c/0x1420 stack backtrace: CPU: 0 PID: 246 Comm: smcrouted Not tainted 6.0.0-rc3-custom-15969-g049d233c8bcc-dirty #1387 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x91/0xb9 vif_dev_read+0xbf/0xd0 ipmr_queue_xmit+0x135/0x1ab0 ip_mr_forward+0xe7b/0x13d0 ipmr_mfc_add+0x1a06/0x2ad0 ip_mroute_setsockopt+0x5c1/0x1420 do_ip_setsockopt+0x23d/0x37f0 ip_setsockopt+0x56/0x80 raw_setsockopt+0x219/0x290 __sys_setsockopt+0x236/0x4d0 __x64_sys_setsockopt+0xbe/0x160 do_syscall_64+0x34/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd [2] WARNING: suspicious RCU usage 6.0.0-rc3-custom-15969-g049d233c8bcc-dirty #1387 Not tainted ----------------------------- net/ipv6/ip6mr.c:69 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by smcrouted/246: #0: ffffffff862389b0 (rtnl_mutex){+.+.}-{3:3}, at: ip6_mroute_setsockopt+0x6b9/0x2630 stack backtrace: CPU: 1 PID: 246 Comm: smcrouted Not tainted 6.0.0-rc3-custom-15969-g049d233c8bcc-dirty #1387 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x91/0xb9 vif_dev_read+0xbf/0xd0 ip6mr_forward2.isra.0+0xc9/0x1160 ip6_mr_forward+0xef0/0x13f0 ip6mr_mfc_add+0x1ff2/0x31f0 ip6_mroute_setsockopt+0x1825/0x2630 do_ipv6_setsockopt+0x462/0x4440 ipv6_setsockopt+0x105/0x140 rawv6_setsockopt+0xd8/0x690 __sys_setsockopt+0x236/0x4d0 __x64_sys_setsockopt+0xbe/0x160 do_syscall_64+0x34/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: ebc31979 ("ipmr: add rcu protection over (struct vif_device)->dev") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Alex Elder authored
IPA can route packets between IPA-connected entities. The AP and modem are currently the only such entities supported, and no routing is required to transfer packets between them. The number of entries in each routing table is fixed, and defined at initialization time. Some of these entries are designated for use by the modem, and the rest are available for the AP to use. The AP sends a QMI message to the modem which describes (among other things) information about routing table memory available for the modem to use. Currently the QMI initialization packet gives wrong information in its description of routing tables. What *should* be supplied is the maximum index that the modem can use for the routing table memory located at a given location. The current code instead supplies the total *number* of routing table entries. Furthermore, the modem is granted the entire table, not just the subset it's supposed to use. This patch fixes this. First, the ipa_mem_bounds structure is generalized so its "end" field can be interpreted either as a final byte offset, or a final array index. Second, the IPv4 and IPv6 (non-hashed and hashed) table information fields in the QMI ipa_init_modem_driver_req structure are changed to be ipa_mem_bounds rather than ipa_mem_array structures. Third, we set the "end" value for each routing table to be the last index, rather than setting the "count" to be the number of indices. Finally, instead of allowing the modem to use all of a routing table's memory, it is limited to just the portion meant to be used by the modem. In all versions of IPA currently supported, that is IPA_ROUTE_MODEM_COUNT (8) entries. Update a few comments for clarity. Fixes: 530f9216 ("soc: qcom: ipa: AP/modem communications") Signed-off-by: Alex Elder <elder@linaro.org> Link: https://lore.kernel.org/r/20220913204602.1803004-1-elder@linaro.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Liang He authored
In of_mdiobus_register(), we should call of_node_put() for 'child' escaped out of for_each_available_child_of_node(). Fixes: 66bdede4 ("of_mdio: Fix broken PHY IRQ in case of probe deferral") Co-developed-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Liang He <windhl@126.com> Link: https://lore.kernel.org/r/20220913125659.3331969-1-windhl@126.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Cong Wang authored
Before we switched to ->read_skb(), ->read_sock() was passed with desc.count=1, which technically indicates we only read one skb per ->sk_data_ready() call. However, for TCP, this is not true. TCP at least has sk_rcvlowat which intentionally holds skb's in receive queue until this watermark is reached. This means when ->sk_data_ready() is invoked there could be multiple skb's in the queue, therefore we have to read multiple skbs in tcp_read_skb() instead of one. Fixes: 965b57b4 ("net: Introduce a new proto_ops ->read_skb()") Reported-by: Peilin Ye <peilin.ye@bytedance.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jakub Sitnicki <jakub@cloudflare.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Link: https://lore.kernel.org/r/20220912173553.235838-1-xiyou.wangcong@gmail.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Paolo Abeni authored
Francesco Dolcini says: ==================== Revert fec PTP changes Revert the last 2 FEC PTP changes from Csókás Bence, they are causing multiple issues and we are at 6.0-rc5. ==================== Link: https://lore.kernel.org/r/20220912070143.98153-1-francesco.dolcini@toradex.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Francesco Dolcini authored
This reverts commit b353b241, this is creating multiple issues, just not ready to be merged yet. Link: https://lore.kernel.org/all/CAHk-=wj1obPoTu1AHj9Bd_BGYjdjDyPP+vT5WMj8eheb3A9WHw@mail.gmail.com/ Link: https://lore.kernel.org/all/20220907143915.5w65kainpykfobte@pengutronix.de/ Fixes: b353b241 ("net: fec: Use a spinlock to guard `fep->ptp_clk_on`") Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Francesco Dolcini authored
This reverts commit f7995922, this is creating multiple issues, just not ready to be merged yet. Link: https://lore.kernel.org/all/20220905180542.GA3685102@roeck-us.net/ Link: https://lore.kernel.org/all/CAHk-=wj1obPoTu1AHj9Bd_BGYjdjDyPP+vT5WMj8eheb3A9WHw@mail.gmail.com/ Fixes: f7995922 ("fec: Restart PPS after link state change") Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Rakesh Sankaranarayanan authored
Maximum frame length check is enabled in lan937x switch on POR, But it is found to be disabled on driver during port setup operation. Due to this, packets are not dropped when transmitted with greater than configured value. For testing, setup made for lan1->lan2 transmission and configured lan1 interface with a frame length (less than 1500 as mentioned in documentation) and transmitted packets with greater than configured value. Expected no packets at lan2 end, but packets observed at lan2. Based on the documentation, packets should get discarded if the actual packet length doesn't match the frame length configured. Frame length check should be disabled only for cascaded ports due to tailtags. This feature was disabled on ksz9477 series due to ptp issue, which is not in lan937x series. But since lan937x took ksz9477 as base, frame length check disabled here as well. Patch added to remove this portion from port setup so that maximum frame length check will be active for normal ports. Fixes: 55ab6ffa ("net: dsa: microchip: add DSA support for microchip LAN937x") Signed-off-by: Rakesh Sankaranarayanan <rakesh.sankaranarayanan@microchip.com> Link: https://lore.kernel.org/r/20220912051228.1306074-1-rakesh.sankaranarayanan@microchip.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Shailend Chand authored
Use GFP_ATOMIC when allocating pages out of the hotpath, continue to use GFP_KERNEL when allocating pages during setup. GFP_KERNEL will allow blocking which allows it to succeed more often in a low memory enviornment but in the hotpath we do not want to allow the allocation to block. Fixes: 9b8dd5e5 ("gve: DQO: Add RX path") Signed-off-by: Shailend Chand <shailend@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Link: https://lore.kernel.org/r/20220913000901.959546-1-jeroendb@google.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vadim Fedorenko authored
The warning message of unsupported FW appears every time RX timestamps are disabled on the interface. The patch fixes the flags to correct set for the check. Fixes: 66ed81dc ("bnxt_en: Enable packet timestamping for all RX packets") Cc: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20220915234932.25497-1-vfedorenko@novek.ruSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wirelessJakub Kicinski authored
Kalle Valo says: ==================== wireless fixes for v6.0 Late stage fixes for v6.0. Temporarily mark iwlwifi's mei code broken as it breaks suspend for iwd users and also don't spam nss trimming messages. mt76 has fixes for aggregation sequence numbers and a regression related to the VHT extended NSS BW feature. * tag 'wireless-2022-09-19' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: mt76: fix 5 GHz connection regression on mt76x0/mt76x2 wifi: mt76: fix reading current per-tid starting sequence number for aggregation wifi: iwlwifi: Mark IWLMEI as broken wifi: iwlwifi: don't spam logs with NSS>2 messages ==================== Link: https://lore.kernel.org/r/20220919105003.1EAE7C433B5@smtp.kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.open-mesh.org/linux-mergeJakub Kicinski authored
Simon Wunderlich says: ==================== Here is a batman-adv bugfix: - Fix hang up with small MTU hard-interface, by Shigeru Yoshida * tag 'batadv-net-pullrequest-20220916' of git://git.open-mesh.org/linux-merge: batman-adv: Fix hang up with small MTU hard-interface ==================== Link: https://lore.kernel.org/r/20220916160931.1412407-1-sw@simonwunderlich.deSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Íñigo Huguet authored
Trying to get the channel from the tx_queue variable here is wrong because we can only be here if tx_queue is NULL, so we shouldn't dereference it. As the above comment in the code says, this is very unlikely to happen, but it's wrong anyway so let's fix it. I hit this issue because of a different bug that caused tx_queue to be NULL. If that happens, this is the error message that we get here: BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 [...] RIP: 0010:efx_hard_start_xmit+0x153/0x170 [sfc] Fixes: 12804793 ("sfc: decouple TXQ type from label") Reported-by: Tianhao Zhao <tizhao@redhat.com> Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Acked-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://lore.kernel.org/r/20220914111135.21038-1-ihuguet@redhat.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Íñigo Huguet authored
In legacy interrupt mode the tx_channel_offset was hardcoded to 1, but that's not correct if efx_sepparate_tx_channels is false. In that case, the offset is 0 because the tx queues are in the single existing channel at index 0, together with the rx queue. Without this fix, as soon as you try to send any traffic, it tries to get the tx queues from an uninitialized channel getting these errors: WARNING: CPU: 1 PID: 0 at drivers/net/ethernet/sfc/tx.c:540 efx_hard_start_xmit+0x12e/0x170 [sfc] [...] RIP: 0010:efx_hard_start_xmit+0x12e/0x170 [sfc] [...] Call Trace: <IRQ> dev_hard_start_xmit+0xd7/0x230 sch_direct_xmit+0x9f/0x360 __dev_queue_xmit+0x890/0xa40 [...] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 [...] RIP: 0010:efx_hard_start_xmit+0x153/0x170 [sfc] [...] Call Trace: <IRQ> dev_hard_start_xmit+0xd7/0x230 sch_direct_xmit+0x9f/0x360 __dev_queue_xmit+0x890/0xa40 [...] Fixes: c308dfd1 ("sfc: fix wrong tx channel offset with efx_separate_tx_channels") Reported-by: Tianhao Zhao <tizhao@redhat.com> Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Acked-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://lore.kernel.org/r/20220914103648.16902-1-ihuguet@redhat.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetoothJakub Kicinski authored
Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - Fix HCIGETDEVINFO regression * tag 'for-net-2022-09-09' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: Fix HCIGETDEVINFO regression ==================== Link: https://lore.kernel.org/r/20220909201642.3810565-1-luiz.dentz@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 19 Sep, 2022 5 commits
-
-
Lorenzo Bianconi authored
Disable page_pool/XDP support for MT7621 SoC in order fix a regression introduce adding XDP for MT7986 SoC. There is no a real use case for XDP on MT7621 since it is a low-end cpu. Moreover this patch reduces the memory footprint. Tested-by: Sergio Paracuellos <sergio.paracuellos@gmail.com> Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com> Fixes: 23233e57 ("net: ethernet: mtk_eth_soc: rely on page_pool for single page buffers") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/2bf31e27b888c43228b0d84dd2ef5033338269e2.1663074002.git.lorenzo@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Haiyang Zhang authored
Per GDMA spec, rmb is necessary after checking owner_bits, before reading EQ or CQ entries. Add rmb in these two places to comply with the specs. Cc: stable@vger.kernel.org Fixes: ca9c54d2 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)") Reported-by: Sinan Kaya <Sinan.Kaya@microsoft.com> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Link: https://lore.kernel.org/r/1662928805-15861-1-git-send-email-haiyangz@microsoft.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jeroen de Borst authored
Updating active developers. Signed-off-by: Jeroen de Borst <jeroendb@google.com> Link: https://lore.kernel.org/r/20220913185319.1061909-1-jeroendb@google.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Ido Schimmel authored
The hwstats debugfs files are only writeable, but they are created with read and write permissions, causing certain selftests to fail [1]. Fix by creating the files with write permission only. [1] # ./test_offload.py Test destruction of generic XDP... Traceback (most recent call last): File "/home/idosch/code/linux/tools/testing/selftests/bpf/./test_offload.py", line 810, in <module> simdev = NetdevSimDev() [...] Exception: Command failed: cat /sys/kernel/debug/netdevsim/netdevsim0//ports/0/dev/hwstats/l3/disable_ifindex cat: /sys/kernel/debug/netdevsim/netdevsim0//ports/0/dev/hwstats/l3/disable_ifindex: Invalid argument Fixes: 1a6d7ae7 ("netdevsim: Introduce support for L3 offload xstats") Reported-by: Jie2x Zhou <jie2x.zhou@intel.com> Tested-by: Jie2x Zhou <jie2x.zhou@intel.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/20220909153830.3732504-1-idosch@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
David Thompson authored
The MDIO gateway (GW) lock in BlueField-2 GIGE logic is set after read. This patch adds logic to make sure the lock is always cleared at the end of each MDIO transaction. Fixes: f92e1869 ("Add Mellanox BlueField Gigabit Ethernet driver") Reviewed-by: Asmaa Mnebhi <asmaa@nvidia.com> Signed-off-by: David Thompson <davthompson@nvidia.com> Link: https://lore.kernel.org/r/20220902164247.19862-1-davthompson@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 16 Sep, 2022 4 commits
-
-
Peilin Ye authored
Prevent tcp_read_skb() from flooding the syslog. Suggested-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
From: Benjamin Poirier <bpoirier@nvidia.com> To: netdev@vger.kernel.org Cc: Jay Vosburgh <j.vosburgh@gmail.com>, Veaceslav Falico <vfalico@gmail.com>, Andy Gospodarek <andy@greyhouse.net>, "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>, Jiri Pirko <jiri@resnulli.us>, Shuah Khan <shuah@kernel.org>, Jonathan Toppins <jtoppins@redhat.com>, linux-kselftest@vger.kernel.org Subject: [PATCH net v3 0/4] Unsync addresses from ports when stopping aggregated devices Date: Wed, 7 Sep 2022 16:56:38 +0900 [thread overview] Message-ID: <20220907075642.475236-1-bpoirier@nvidia.com> (raw) This series fixes similar problems in the bonding and team drivers. Because of missing dev_{uc,mc}_unsync() calls, addresses added to underlying devices may be leftover after the aggregated device is deleted. Add the missing calls and a few related tests. v2: * fix selftest installation, see patch 3 v3: * Split lacpdu_multicast changes to their own patch, #1 * In ndo_{add,del}_slave methods, only perform address list changes when the aggregated device is up (patches 2 & 3) * Add selftest function related to the above change (patch 4) ==================== Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Benjamin Poirier authored
Test that the bonding and team drivers clean up an underlying device's address lists (dev->uc, dev->mc) when the aggregated device is deleted. Test addition and removal of the LACPDU multicast address on underlying devices by the bonding driver. v2: * add lag_lib.sh to TEST_FILES v3: * extend bond_listen_lacpdu_multicast test to init_state up and down cases * remove some superfluous shell syntax and 'set dev ... up' commands Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Benjamin Poirier authored
Netdev drivers are expected to call dev_{uc,mc}_sync() in their ndo_set_rx_mode method and dev_{uc,mc}_unsync() in their ndo_stop method. This is mentioned in the kerneldoc for those dev_* functions. The team driver calls dev_{uc,mc}_unsync() during ndo_uninit instead of ndo_stop. This is ineffective because address lists (dev->{uc,mc}) have already been emptied in unregister_netdevice_many() before ndo_uninit is called. This mistake can result in addresses being leftover on former team ports after a team device has been deleted; see test_LAG_cleanup() in the last patch in this series. Add unsync calls at their expected location, team_close(). v3: * When adding or deleting a port, only sync/unsync addresses if the team device is up. In other cases, it is taken care of at the right time by ndo_open/ndo_set_rx_mode/ndo_stop. Fixes: 3d249d4c ("net: introduce ethernet teaming device") Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-