- 29 Sep, 2022 27 commits
-
-
Maxim Mikityanskiy authored
Currently, the driver can silently fall back to legacy RQ after enabling XDP, even if striding RQ was active before. It happens when PAGE_SIZE is bigger than the maximum supported stride size. This commit changes this behavior to more straightforward: if an operation (enabling XDP) doesn't support the current parameters (striding RQ mode), it fails. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Maxim Mikityanskiy authored
mlx5e_verify_rx_mpwqe_strides is only used in en/params.c, so it can be made static and removed from en/params.h. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Maxim Mikityanskiy authored
No need to keep max_sq_wqebbs in mlx5e_txqsq and mlx5e_xdpsq, as it's only used when allocating the queues. Removing an extra field reduces the struct size. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Maxim Mikityanskiy authored
The return value of mlx5e_get_max_sq_wqebbs is clamped down to MLX5_SEND_WQE_MAX_WQEBBS = 16, which fits into u8. This commit changes the return type of this function to u8 for stricter type safety. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Maxim Mikityanskiy authored
Add the capability that will allow the driver to determine the minimal MTT page size to be able to map the smallest possible pages in XSK. The older firmwares that don't have this capability default to 12 (i.e. 4096-byte pages). Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linuxJakub Kicinski authored
Saeed Mahameed says: ==================== updates from mlx5-next 2022-09-24 Updates form mlx5-next including[1]: 1) HW definitions and support for NPPS clock settings. 2) various cleanups 3) Enable hash mode by default for all NICs 4) page tracker and advanced virtualization HW definitions for vfio [1] https://lore.kernel.org/netdev/20220907233636.388475-1-saeed@kernel.org/ * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Remove from FPGA IFC file not-needed definitions net/mlx5: Remove unused structs net/mlx5: Remove unused functions net/mlx5: detect and enable bypass port select flow table net/mlx5: Lag, enable hash mode by default for all NICs net/mlx5: Lag, set active ports if support bypass port select flow table RDMA/mlx5: Don't set tx affinity when lag is in hash mode net/mlx5: add IFC bits for bypassing port select flow table net/mlx5: Add support for NPPS with real time mode net/mlx5: Expose NPPS related registers net/mlx5: Query ADV_VIRTUALIZATION capabilities net/mlx5: Introduce ifc bits for page tracker RDMA/mlx5: Move function mlx5_core_query_ib_ppcnt() to mlx5_ib ==================== Link: https://lore.kernel.org/all/20220927201906.234015-1-saeed@kernel.org/Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sean Anderson authored
Just use kzalloc instead. Fixes: d6f1e89b ("sunhme: Return an ERR_PTR from quattro_pci_find") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Sean Anderson <seanga2@gmail.com> Link: https://lore.kernel.org/r/20220928004157.279731-1-seanga2@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Shang XiaoJing authored
Use skb_put_data() instead of skb_put() and memcpy(), which is clear. Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Reviewed-by: M Chetan Kumar <m.chetan.kumar@intel.com> Link: https://lore.kernel.org/r/20220927023254.30342-1-shangxiaojing@huawei.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Vladimir Oltean says: ==================== Rework resource allocation in Felix DSA driver The Felix DSA driver controls NXP variations of Microchip switches. Colin Foster is trying to add support in this driver for "genuine" Microchip hardware, but some of the NXP-isms in this driver need to go away before that happens cleanly. https://patchwork.kernel.org/project/netdevbpf/cover/20220926002928.2744638-1-colin.foster@in-advantage.com/ The starting point was Colin's patch 08/14 "net: dsa: felix: update init_regmap to be string-based", and this continues to be the central theme here, but things are done differently. In short (full explanations are in patches), the goal is for MFD-based switches like Colin's SPI-controlled VSC7512 to be able to request a regmap that was created 100% externally (by drivers/mfd/ocelot-core.c) in a very simple way, that does not create dependencies on other modules. That is dev_get_regmap(), and as input it wants a string, for the resource name. So we rework the resource allocation in this driver to be based on string names provided by the specific instantiation (in Colin's case, ocelot_ext.c). Patch set was boot-tested on NXP LS1028A. ==================== Link: https://lore.kernel.org/r/20220927191521.1578084-1-vladimir.oltean@nxp.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
Existing felix DSA drivers (vsc9959, vsc9953) are all switches that were integrated in NXP SoCs, which makes them a bit unusual compared to the usual Microchip branded Ocelot switches. To be precise, looking at Documentation/devicetree/bindings/net/mscc,vsc7514-switch.yaml, one can see 21 memory regions for the "switch" node, and these correspond to the "targets" of the switch IP, which are spread throughout the guts of that SoC's memory space. In NXP integrations, those targets still exist, but they were condensed within a single memory region, with no other peripheral in between them, so it made more sense for the driver to ioremap the entire memory space of the switch, and then find the targets within that memory space via some offsets hardcoded in the driver. The effect of this design decision is that now, the felix driver expects hardware instantiations to provide their own resource definitions, which is kind of odd when considering a typical device (those are retrieved from 'reg' properties in the device tree, using platform_get_resource() or similar). Allow other hardware instantiations that share the felix driver to not provide a hardcoded array of resources in the future. Instead, make the common denominator based on which regmaps are created be just the resource "names". Each instantiation comes with its own array of names that are mandatory for it, and with an optional array of resources. So we split the resources in 2 arrays, one is what's requested and the other is what's provided. There is one pool of provided resources, in felix->info->resources (of length felix->info->num_resources). There are 2 different ways of requesting a resource. One is by enum ocelot_target (this handles the global regmaps), and one is by int port (this handles the per-port ones). For the existing vsc9959 and vsc9953, it would be a bit stupid to request something that's not provided, given that the 2 arrays are both defined in the same place. The advantage is that we can now modify felix_request_regmap_by_name() to make felix->info->resources[] optional, and if absent, the implementation can call dev_get_regmap() and this is something that is compatible with MFD. Co-developed-by: Colin Foster <colin.foster@in-advantage.com> Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
Use less verbose resource definitions in vsc9959 and vsc9953. This also sets IORESOURCE_MEM in the constant array of resources, so we don't have to do this from felix_init_structs() - in fact, in the future, we may even support IORESOURCE_REG resources. Note that this macro takes start and length as argument, and we had start and end before. So transform end into length. While at it, sort the resources according to their offset. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
It turns out that the idea of having a customizable implementation of a regmap creation from a resource is not exactly useful. The idea was for the new MFD-based VSC7512 driver to use something that creates a SPI regmap from a resource. But there are problems in actually getting those resources (it involves getting them from MFD). To avoid all that, we'll be getting resources by name, so this custom init_regmap() method won't be needed. Remove it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
This address is only relevant for the vsc9959, which is a PCIe device that holds its switch registers in a different PCIe BAR compared to the registers for the internal MDIO controller. Hide this aspect from the common felix driver and move the pci_resource_start() call to the only place that needs it, which is in vsc9959_mdio_bus_alloc(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
The imdio_res is used only by vsc9959, which references its own vsc9959_imdio_res through the common felix_info->imdio_res pointer. Since the common code doesn't care about this resource (and it can't be part of the common array of resources, either, because it belongs in a different PCI BAR), just reference it directly. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
We tell driver developers to always pass NAPI_POLL_WEIGHT as the weight to netif_napi_add(). This may be confusing to newcomers, drop the weight argument, those who really need to tweak the weight can use netif_napi_add_weight(). Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for CAN Link: https://lore.kernel.org/r/20220927132753.750069-1-kuba@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Martin KaFai Lau authored
The v6_rcv_saddr and rcv_saddr are inside a union in the 'struct inet_bind2_bucket'. When searching a bucket by following the bhash2 hashtable chain, eg. inet_bind2_bucket_match, it is only using the sk->sk_family and there is no way to check if the inet_bind2_bucket has a v6 or v4 address in the union. This leads to an uninit-value KMSAN report in [0] and also potentially incorrect matches. This patch fixes it by adding a family member to the inet_bind2_bucket and then tests 'sk->sk_family != tb->family' before matching the sk's address to the tb's address. Cc: Joanne Koong <joannelkoong@gmail.com> Fixes: 28044fc1 ("net: Add a bhash2 table hashed by port and address") Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Tested-by: Alexander Potapenko <glider@google.com> Link: https://lore.kernel.org/r/20220927002544.3381205-1-kafai@fb.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Mat Martineau says: ==================== mptcp: MPTCP support for TCP_FASTOPEN_CONNECT RFC 8684 appendix B describes how to use TCP Fast Open with MPTCP. This series allows TFO use with MPTCP using the TCP_FASTOPEN_CONNECT socket option. The scope here is limited to the initiator of the connection - support for MSG_FASTOPEN and the listener side of the connection will be in a separate series. The preexisting TCP fastopen code does most of the work, so these changes mostly involve plumbing MPTCP through to those TCP functions. Patch 1 changes the MPTCP socket option code to pass the TCP_FASTOPEN_CONNECT option through to the initial unconnected subflow. Patch 2 exports the existing tcp_sendmsg_fastopen() function from tcp.c Patch 3 adds the call to tcp_sendmsg_fastopen() from the MPTCP send function. Patch 4 modifies mptcp_poll() to handle the deferred TFO connection. ==================== Link: https://lore.kernel.org/r/20220926232739.76317-1-mathew.j.martineau@linux.intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Benjamin Hesmans authored
If fastopen is used, poll must allow a first write that will trigger the SYN+data Similar to what is done in tcp_poll(). Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Benjamin Hesmans <benjamin.hesmans@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Dmytro Shytyi authored
When TCP_FASTOPEN_CONNECT has been set on the socket before a connect, the defer flag is set and must be handled when sendmsg is called. This is similar to what is done in tcp_sendmsg_locked(). Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Co-developed-by: Benjamin Hesmans <benjamin.hesmans@tessares.net> Signed-off-by: Benjamin Hesmans <benjamin.hesmans@tessares.net> Signed-off-by: Dmytro Shytyi <dmytro@shytyi.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Benjamin Hesmans authored
It will be used to support TCP FastOpen with MPTCP in the following commit. Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Co-developed-by: Dmytro Shytyi <dmytro@shytyi.net> Signed-off-by: Dmytro Shytyi <dmytro@shytyi.net> Signed-off-by: Benjamin Hesmans <benjamin.hesmans@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Benjamin Hesmans authored
Set the option for the first subflow only. For the other subflows TFO can't be used because a mapping would be needed to cover the data in the SYN. Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Benjamin Hesmans <benjamin.hesmans@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Gustavo A. R. Silva authored
Zero-length arrays are deprecated and we are moving towards adopting C99 flexible-array members, instead. So, replace zero-length arrays declarations in anonymous union with the new DECLARE_FLEX_ARRAY() helper macro. This helper allows for flexible-array members in unions. Link: https://github.com/KSPP/linux/issues/193 Link: https://github.com/KSPP/linux/issues/225 Link: https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.htmlSigned-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/YzIvfGXxfjdXmIS3@workSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Pavel Begunkov says: ==================== shrink struct ubuf_info struct ubuf_info is large but not all fields are needed for all cases. We have limited space in io_uring for it and large ubuf_info prevents some struct embedding, even though we use only a subset of the fields. It's also not very clean trying to use this typeless extra space. Shrink struct ubuf_info to only necessary fields used in generic paths, namely ->callback, ->refcnt and ->flags, which take only 16 bytes. And make MSG_ZEROCOPY and some other users to embed it into a larger struct ubuf_info_msgzc mimicking the former ubuf_info. Note, xen/vhost may also have some cleaning on top by creating new structs containing ubuf_info but with proper types. ==================== Link: https://lore.kernel.org/r/cover.1663892211.git.asml.silence@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Pavel Begunkov authored
We can benefit from a smaller struct ubuf_info, so leave only mandatory fields and let users to decide how they want to extend it. Convert MSG_ZEROCOPY to struct ubuf_info_msgzc and remove duplicated fields. This reduces the size from 48 bytes to just 16. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Pavel Begunkov authored
struct ubuf_info will be changed, use ubuf_info_msgzc instead. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Pavel Begunkov authored
struct ubuf_info will be changed, use ubuf_info_msgzc instead. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Pavel Begunkov authored
We're going to split struct ubuf_info and leave there only mandatory fields. Users are free to extend it. Add struct ubuf_info_msgzc, which will be an extended version for MSG_ZEROCOPY and some other users. It duplicates of struct ubuf_info for now and will be removed in a couple of patches. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 28 Sep, 2022 13 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-nextJakub Kicinski authored
Florian Westphal says: ==================== netfilter fix for net-next This is a late bug fix for the *net-next* tree to make nftables "fib" expression play nice with VRF devices. This was broken since day 1 (v4.10) so I don't see a compelling reason to push this via net at the last minute. * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: nft_fib: Fix for rpath check with VRF devices ==================== Link: https://lore.kernel.org/r/20220928113908.4525-1-fw@strlen.deSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Phil Sutter authored
Analogous to commit b575b24b ("netfilter: Fix rpfilter dropping vrf packets by mistake") but for nftables fib expression: Add special treatment of VRF devices so that typical reverse path filtering via 'fib saddr . iif oif' expression works as expected. Fixes: f6d0cbcf ("netfilter: nf_tables: add fib expression") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Florian Westphal <fw@strlen.de>
-
David S. Miller authored
Edward Cree says: ==================== sfc: bare bones TC offload This series begins the work of supporting TC flower offload on EF100 NICs. This is the absolute minimum viable TC implementation to get traffic to VFs and allow them to be tested; it supports no match fields besides ingress port, no actions besides mirred and drop, and no stats. More matches, actions, and counters will be added in subsequent patches. Changed in v2: - Add missing 'static' on declarations (kernel test robot, sparse) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
This is the absolute minimum viable TC implementation to get traffic to VFs and allow them to be tested; it supports no match fields besides ingress port, no actions besides mirred and drop, and no stats. Example usage: tc filter add dev $PF parent ffff: flower skip_sw \ action mirred egress mirror dev $VFREP tc filter add dev $VFREP parent ffff: flower skip_sw \ action mirred egress redirect dev $PF gives a VF unfiltered access to the network out the physical port ($PF acts here as a physical port representor). More matches, actions, and counters will be added in subsequent patches. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
Different versions of EF100 firmware and FPGA bitstreams support different matching capabilities in the Match-Action Engine. Probe for these at start of day; subsequent patches will validate TC offload requests against the reported capabilities. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
Nothing inserts into this table yet, but we have code to remove rules on FLOW_CLS_DESTROY or at driver teardown time, in both cases also attempting to remove the corresponding hardware rules. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
TC offload support will involve complex limitations on what matches and actions a rule can do, in some cases potentially depending on rules already offloaded. So add an ethtool private flag "log-tc-errors" which controls reporting the reasons for un-offloadable TC rules at NETIF_INFO. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
Bind indirect blocks for recognised tunnel netdevices. Currently these connect to a stub efx_tc_flower() that only returns -EOPNOTSUPP; subsequent patches will implement flower offloads to the Match-Action Engine. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
Bind direct blocks for the MAE-admin PF and each VF representor. Currently these connect to a stub efx_tc_flower() that only returns -EOPNOTSUPP; subsequent patches will implement flower offloads to the Match-Action Engine. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gustavo A. R. Silva authored
Zero-length arrays are deprecated and we are moving towards adopting C99 flexible-array members, instead. So, replace zero-length arrays declarations in anonymous union with the new DECLARE_FLEX_ARRAY() helper macro. This helper allows for flexible-array members in unions. Link: https://github.com/KSPP/linux/issues/193 Link: https://github.com/KSPP/linux/issues/221 Link: https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.htmlSigned-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zhengchao Shao authored
Both is_bpf and is_ebpf are boolean types, so (!is_bpf && !is_ebpf) || (is_bpf && is_ebpf) can be reduced to is_bpf == is_ebpf in tcf_bpf_init(). Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Horatiu Vultur says: ==================== net: lan966x: Add tbf, cbs, ets support Add support for offloading QoS features with tc command to lan966x. The offloaded Qos features are tbf, cbs and ets. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Horatiu Vultur authored
Add ets qdisc which allows to mix strict priority with bandwidth-sharing bands. The ets qdisc needs to be attached as root qdisc. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-