- 03 Jun, 2021 18 commits
-
-
Vladimir Oltean authored
Since all the remaining members of struct mdio_xpcs_ops have direct equivalents in struct phylink_pcs_ops, it is about time we remove it altogether. Since the phylink ops return void, we need to remove the error propagation from the various xpcs methods and simply print an error message where appropriate. Since xpcs_get_state_c73() detects link faults and attempts to reset the link on its own by calling xpcs_config(), but xpcs_config() now has a lot of phylink arguments which are not needed and cannot be simply fabricated by anybody else except phylink, the actual implementation has been moved into a smaller xpcs_do_config(). The const struct mdio_xpcs_ops *priv->hw->xpcs has been removed, so we need to look at the struct mdio_xpcs_args pointer now as an indication whether the port has an XPCS or not. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
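As a rough illustration of the pattern described above (an int-returning inner helper wrapped by a void callback that can only log failures), here is a minimal user-space sketch. All names (`demo_pcs`, `demo_pcs_do_config`, `demo_pcs_link_up`) are hypothetical and are not the driver's actual symbols.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's private PCS state. */
struct demo_pcs {
	int fail_next;   /* force the next configuration attempt to fail */
	int errors_seen; /* how many failures were reported */
};

/* Inner helper keeps an int return so internal callers can still react. */
int demo_pcs_do_config(struct demo_pcs *pcs)
{
	if (pcs->fail_next) {
		pcs->fail_next = 0;
		return -5; /* stands in for -EIO */
	}
	return 0;
}

/* Void callback: it cannot propagate the error, so it only reports it. */
void demo_pcs_link_up(struct demo_pcs *pcs)
{
	int ret = demo_pcs_do_config(pcs);

	if (ret) {
		pcs->errors_seen++;
		fprintf(stderr, "demo_pcs: config failed: %d\n", ret);
	}
}
```

The split mirrors the reasoning in the message: the small helper can be called internally without fabricating the arguments that only the framework can supply.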
-
Vladimir Oltean authored
Unify the 2 existing PCS drivers (lynx and xpcs) by doing a similar thing on probe, which is to have a *_create function that takes a struct mdio_device * given by the caller, and builds a private PCS structure around that. This changes stmmac to hold only a pointer to the xpcs, as opposed to the full structure. This will be used in the next patch when struct mdio_xpcs_ops is removed. Currently a pointer to struct mdio_xpcs_ops is used as a shorthand to determine whether the port has an XPCS or not. We can do the same now with the mdio_xpcs_args pointer. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
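The *_create convention and the "NULL pointer means no PCS" check could look roughly like the following user-space sketch. The types and functions (`demo_mdio_device`, `demo_pcs_create`, `demo_mac_has_pcs`) are invented for illustration and only approximate the shape of the real code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical miniature of an MDIO device handed in by the caller. */
struct demo_mdio_device {
	int addr;
};

/* Private PCS state built around the caller's MDIO device. */
struct demo_pcs {
	struct demo_mdio_device *mdiodev;
};

/* The MAC driver stores only a pointer; NULL means "no PCS on this port". */
struct demo_mac {
	struct demo_pcs *pcs;
};

struct demo_pcs *demo_pcs_create(struct demo_mdio_device *mdiodev)
{
	struct demo_pcs *pcs;

	if (!mdiodev)
		return NULL;

	pcs = calloc(1, sizeof(*pcs));
	if (!pcs)
		return NULL;

	pcs->mdiodev = mdiodev;
	return pcs;
}

void demo_pcs_destroy(struct demo_pcs *pcs)
{
	free(pcs);
}

int demo_mac_has_pcs(const struct demo_mac *mac)
{
	return mac->pcs != NULL;
}
```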
-
Vladimir Oltean authored
Use the dedicated helper for abstracting away how the clause 45 address is packed in reg_addr. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
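The message does not name the helper, so the following is only a sketch of how a clause 45 device address and register number are commonly packed into one 32-bit reg_addr. The constants are assumed to mirror the kernel's MII_ADDR_C45 flag (bit 30) and a 16-bit device-address shift; `demo_c45_addr` itself is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: bit 30 flags clause 45, bits 20:16 carry the MMD
 * (device address), bits 15:0 carry the register number.
 */
#define DEMO_MII_ADDR_C45          (1u << 30)
#define DEMO_MII_DEVADDR_C45_SHIFT 16
#define DEMO_MII_REGADDR_C45_MASK  0xffffu

uint32_t demo_c45_addr(unsigned int devad, unsigned int regnum)
{
	return DEMO_MII_ADDR_C45 |
	       ((uint32_t)devad << DEMO_MII_DEVADDR_C45_SHIFT) |
	       (regnum & DEMO_MII_REGADDR_C45_MASK);
}
```

Centralizing the packing in one helper means callers pass the MMD and register separately and never open-code the shifts.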
-
Vladimir Oltean authored
Similar to the other recently exported functions, it is not necessary for xpcs_probe to be a function pointer, so export it so that it can be called directly. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
There is no good reason why we need to go through: stmmac_xpcs_config_eee -> stmmac_do_callback -> mdio_xpcs_ops->config_eee -> xpcs_config_eee when we can simply call xpcs_config_eee. priv->hw->xpcs is of the type "const struct mdio_xpcs_ops *" and is used as a placeholder/synonym for priv->plat->mdio_bus_data->has_xpcs. It is done that way because the mdio_bus_data pointer might or might not be populated in all stmmac instantiations. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
Calling a function pointer with a single implementation through struct mdio_xpcs_ops is clunky, and the stmmac_do_callback system forces this to return int, even though it always returns zero. Simply remove the "validate" function pointer from struct mdio_xpcs_ops and replace it with an exported xpcs_validate symbol which is called directly by stmmac. priv->hw->xpcs is of the type "const struct mdio_xpcs_ops *" and is used as a placeholder/synonym for priv->plat->mdio_bus_data->has_xpcs. It is done that way because the mdio_bus_data pointer might or might not be populated in all stmmac instantiations. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
The operating mode of the driver is currently to populate its struct mdio_xpcs_args::supported and struct mdio_xpcs_args::an_mode statically in xpcs_probe(), based on the passed phy_interface_t, and work with those. However this is not the operation that phylink expects from a PCS driver, because the port might be attached to an SFP cage that triggers changes of the phy_interface_t dynamically as one SFP module is unplugged and another is plugged. To migrate towards that model, the struct mdio_xpcs_args should not cache anything related to the phy_interface_t, but just look up the statically defined, const struct xpcs_compat structure corresponding to the detected PCS OUI/model number. So we delete the "supported" and "an_mode" members of struct mdio_xpcs_args, and add the "id" structure there (since the ID is not expected to change at runtime). Since xpcs->supported is used deep in the code in _xpcs_config_aneg_c73(), we need to modify some function headers to pass the xpcs_compat from all callers. In turn, the xpcs_compat is always supplied externally to the xpcs module:
- Most of the time by phylink
- In xpcs_probe() it is needed because xpcs_soft_reset() writes to MDIO_MMD_PCS or to MDIO_MMD_VEND2 depending on whether an_mode is clause 37 or clause 73. In order to not introduce functional changes related to when the soft reset is issued, we continue to require the initial phy_interface_t argument to be passed to xpcs_probe() so we can pass this on to xpcs_soft_reset().
- stmmac_open() wants to know whether to call stmmac_init_phy() or not, and for that it looks inside xpcs->an_mode, because the clause 73 (backplane) AN modes supposedly do not have a PHY. Because we moved an_mode outside of struct mdio_xpcs_args, this is now no longer directly possible, so we introduce a helper function xpcs_get_an_mode() which protects the data encapsulation of the xpcs module and requires a phy_interface_t to be passed as argument. This function can look up the appropriate compat based on the phy_interface_t.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
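To make the lookup concrete, here is a small sketch of resolving a compat entry from the interface type and deriving the AN mode from it, rather than caching either value at probe time. The enums, the table contents and the function names (`demo_find_compat`, `demo_get_an_mode`) are illustrative assumptions, not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for phy_interface_t and the AN mode constants. */
enum demo_interface { DEMO_IF_SGMII, DEMO_IF_USXGMII, DEMO_IF_10GKR, DEMO_IF_RGMII };
enum demo_an_mode { DEMO_AN_C73, DEMO_AN_C37_SGMII };

struct demo_compat {
	enum demo_interface interface;
	enum demo_an_mode an_mode;
};

/* Static, const table: nothing here depends on runtime state.
 * The interface-to-AN-mode pairs are assumptions for illustration.
 */
static const struct demo_compat demo_compat_table[] = {
	{ DEMO_IF_USXGMII, DEMO_AN_C73 },
	{ DEMO_IF_10GKR,   DEMO_AN_C73 },
	{ DEMO_IF_SGMII,   DEMO_AN_C37_SGMII },
};

const struct demo_compat *demo_find_compat(enum demo_interface interface)
{
	size_t n = sizeof(demo_compat_table) / sizeof(demo_compat_table[0]);
	size_t i;

	for (i = 0; i < n; i++)
		if (demo_compat_table[i].interface == interface)
			return &demo_compat_table[i];

	return NULL;
}

/* Returns the AN mode for the interface, or -1 if it is unsupported. */
int demo_get_an_mode(enum demo_interface interface)
{
	const struct demo_compat *compat = demo_find_compat(interface);

	if (!compat)
		return -1;

	return compat->an_mode;
}
```

Because the caller supplies the interface on every call, a change of interface (for example after an SFP module swap) needs no cached state to be refreshed.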
-
Vladimir Oltean authored
The xpcs driver has an apparently inadequate structure for the actual hardware it drives. These defines and the xpcs_probe() function would suggest that there is one PHY ID per supported PHY interface type, and the driver simply validates whether the mode it should operate in (the argument of xpcs_probe) matches what the hardware is capable of:
    #define SYNOPSYS_XPCS_USXGMII_ID 0x7996ced0
    #define SYNOPSYS_XPCS_10GKR_ID 0x7996ced0
    #define SYNOPSYS_XPCS_XLGMII_ID 0x7996ced0
    #define SYNOPSYS_XPCS_SGMII_ID 0x7996ced0
    #define SYNOPSYS_XPCS_MASK 0xffffffff
but that is not the case, because upon closer inspection, all the above 4 PHY ID definitions are in fact equal. So it is the same XPCS that is compatible with all 4 sets of PHY interface types. This change introduces an array of struct xpcs_compat which is populated by the single struct xpcs_id instance. It also eliminates the bogus defines for multiple Synopsys XPCS PHY IDs and replaces them with a single XPCS_ID, which better reflects the way in which the hardware operates. Because we are touching this area of the code anyway, the new array of struct xpcs_compat, as well as the array of xpcs_id, have been moved towards the end of the file, since they are variable declarations not definitions. If whichever of struct xpcs_compat or struct xpcs_id need to gain a function pointer member in the future, it is easier to reference functions (no forward declarations needed) if we have the const variable declarations at the end of the file. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
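A single ID entry matched through a mask might be sketched as follows. The ID and mask values are the ones quoted in the message; the structure layout, the name string and `demo_match_id` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ID descriptor: one entry per distinct piece of hardware. */
struct demo_xpcs_id {
	uint32_t id;
	uint32_t mask;
	const char *name;
};

/* One entry replaces the four identical per-interface defines. */
static const struct demo_xpcs_id demo_id_table[] = {
	{ 0x7996ced0u, 0xffffffffu, "synopsys-xpcs" },
};

/* Returns the matching entry for the ID read from hardware, or NULL. */
const struct demo_xpcs_id *demo_match_id(uint32_t hw_id)
{
	size_t n = sizeof(demo_id_table) / sizeof(demo_id_table[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		const struct demo_xpcs_id *entry = &demo_id_table[i];

		if ((hw_id & entry->mask) == entry->id)
			return entry;
	}

	return NULL;
}
```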
-
Vladimir Oltean authored
CONFIG_STMMAC_ETH selects CONFIG_PCS_XPCS, so there should be no situation where the shim should be needed. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Peng Li says: ==================== net: hdlc_cisco: clean up some code style issues This patchset cleans up some code style issues. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Peng Li authored
Space prohibited between function name and open parenthesis '('. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Peng Li authored
This patch fixes the checkpatch error about missing a blank line after declarations. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Peng Li authored
This patch removes unnecessary out of memory message, to fix the following checkpatch.pl warning: "WARNING: Possible unnecessary 'out of memory' message" Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Peng Li authored
Add spaces required after the close brace '}'. Add spaces required after that ','. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Peng Li authored
Fix the checkpatch error as "foo* bar" and should be "foo *bar", and "(foo*)" should be "(foo *)". Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Peng Li authored
This patch removes some redundant blank lines. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zheng Yongjun authored
Fix some spelling mistakes in comments: enconding ==> encoding ambigous ==> ambiguous orignal ==> original encyption ==> encryption Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zheng Yongjun authored
Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 02 Jun, 2021 22 commits
-
-
David S. Miller authored
Dmytro Linkin says: ==================== devlink: rate objects API
Resending without RFC.
Currently the kernel provides a way to change the tx rate of a single VF in switchdev mode via the tc-police action. When lots of VFs are configured, management of their rates becomes a non-trivial task and some grouping mechanism is required. Implementing such grouping in tc-police will bring flow related limitations and unwanted complications, like:
- tc-police is a policer and there is a user request for a traffic shaper, so a shared tc-police action is not suitable;
- flows require a net device to be placed on, which means "groups" wouldn't have a net device instance themselves. Taking the previous point into account, a solution was reviewed where the representor has a policer and the driver uses a shaper if the qdisc contains a group of VFs; such an approach is ugly, complicated and misleading;
- TC is ingress only, while configuring the "other" side of the wire looks more like a "real" picture where shaping is outside of the steering world, similar to the "ip link" command;
According to that, devlink is the most appropriate place.
This series introduces a devlink API for managing the tx rate of a single devlink port or of a group by invoking callbacks (see below) of the corresponding driver. Also a devlink port or a group can be added to a parent group, where the driver is responsible for handling the rates of the group elements. To achieve all of that a new rate object is added. It can be one of two types:
- leaf - represents a single devlink port; created/destroyed by the driver and bound to the devlink port. As an example, some driver may create a leaf rate object for every devlink port associated with a VF. Since a leaf has a 1-to-1 mapping to its devlink port, in user space it is referred to as pci/<bus_addr>/<port_index>;
- node - represents a group of rate objects; created/deleted by request from userspace; initially empty (no rate objects added). In userspace it is referred to as pci/<bus_addr>/<node_name>, where the node name can be anything except a decimal number, to avoid collisions with leafs.
devlink_ops extended with the following callbacks:
- rate_{leaf|node}_tx_{share|max}_set
- rate_node_{new|del}
- rate_{leaf|node}_parent_set
KAPI provides:
- creation/destruction of the leaf rate object associated with a devlink port
- destruction of rate nodes to allow a vendor driver to free allocated resources on driver removal or for other reasons when node destruction is required
UAPI provides:
- dumping all or single rate objects
- setting tx_{share|max} of a rate object of any type
- creating/deleting a node rate object
- setting/unsetting the parent of any rate object
Added devlink rate object support for the netdevsim driver.
Issues/open questions:
- Does the user need a DEVLINK_CMD_RATE_DEL_ALL_CHILD command to clean all children of a particular parent node? For example:
  $ devlink port function rate flush netdevsim/netdevsim10/group
- the priv pointer passed to the callbacks is a source of bugs; in the leaf case the driver can embed the rate object into an internal structure and use container_of() on it; in the node case it cannot be done since nodes are created from userspace
v1->v2:
- fixed kernel-doc for devlink_rate_leaf_{create|destroy}()
- s/func/function/ for all devlink port command occurrences
v2->v3:
- devlink:
  - added devlink_rate_nodes_destroy() function
- netdevsim:
  - added call of devlink_rate_nodes_destroy() function
==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dmytro Linkin authored
Add devlink rate objects section at devlink port documentation. Add devlink rate support info at netdevsim devlink documentation. Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dmytro Linkin authored
Test verifies that netdevsim correctly implements devlink ops callbacks that set node as a parent of devlink leaf or node rate object. Co-developed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dmytro Linkin authored
Implement new devlink ops that allow setting rate node as a parent for devlink port (leaf) or another devlink node through devlink API. Expose parent names to netdevsim debugfs in read only mode. Co-developed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dmytro Linkin authored
Refactor DEVLINK_CMD_RATE_{GET|SET} command handlers to support setting a node as a parent for another rate object (leaf or node) by means of new attribute DEVLINK_ATTR_RATE_PARENT_NODE_NAME. Extend devlink ops with new callbacks rate_{leaf|node}_parent_set() to set node as a parent for rate object to allow supporting drivers to implement rate grouping through devlink. Driver implementations are allowed to support leafs or node children only. Invoking callback with NULL as parent should be treated by the driver as unset parent action. Extend rate object struct with reference counter to disallow deleting a node with any child pointing to it. User should unset parent for the child explicitly. Example:
$ devlink port function rate add netdevsim/netdevsim10/group1
$ devlink port function rate add netdevsim/netdevsim10/group2
$ devlink port function rate set netdevsim/netdevsim10/group1 parent group2
$ devlink port function rate show netdevsim/netdevsim10/group1
netdevsim/netdevsim10/group1: type node parent group2
$ devlink port function rate set netdevsim/netdevsim10/group1 noparent
Co-developed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
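The reference-counting rule (a node cannot be deleted while any child points to it, and a NULL parent means "unset") can be sketched in plain C as below. `demo_rate`, `demo_rate_parent_set` and `demo_rate_node_del` are hypothetical names; the sketch omits locking and does not check for parent loops.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical rate object: leaf or node, reduced to what the rule needs. */
struct demo_rate {
	struct demo_rate *parent;
	unsigned int refcnt; /* number of children pointing at this object */
};

/* Passing NULL as parent is treated as "unset parent". */
void demo_rate_parent_set(struct demo_rate *child, struct demo_rate *parent)
{
	if (child->parent)
		child->parent->refcnt--;

	child->parent = parent;

	if (parent)
		parent->refcnt++;
}

/* Returns 0 if the node may be deleted, -EBUSY while children remain. */
int demo_rate_node_del(struct demo_rate *node)
{
	if (node->refcnt)
		return -EBUSY;

	/* Drop this node's own reference on its parent, if any. */
	demo_rate_parent_set(node, NULL);
	return 0;
}
```

Following the example in the message, deleting group2 would be refused until group1 has been set to noparent.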
-
Dmytro Linkin authored
Test verifies that it is possible to create, delete and set min/max tx rate of devlink rate node on netdevsim VF. Co-developed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dmytro Linkin authored
Implement new devlink ops that allow creation, deletion and setting of shared/max tx rate of devlink rate nodes through devlink API. Expose rate node and its tx rates to netdevsim debugfs. Co-developed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dmytro Linkin authored
Implement support for DEVLINK_CMD_RATE_{NEW|DEL} commands that are used to create and delete devlink rate nodes. Add new attribute DEVLINK_ATTR_RATE_NODE_NAME that specifies the node name string. The node name is an alphanumeric identifier. No valid node name can be a devlink port index, i.e. a decimal number. Extend devlink ops with new callbacks rate_node_{new|del}() and rate_node_tx_{share|max}_set() to allow supporting drivers to implement ports rate grouping and setting tx rate of rate nodes through devlink. Expose devlink_rate_nodes_destroy() function to allow a vendor driver to do proper cleanup of internally allocated resources for the nodes if the driver goes down or due to any other reasons which require nodes to be destroyed. Disallow moving device from switchdev to legacy mode if any node exists on that device. User must explicitly delete nodes before switching mode. Example:
$ devlink port function rate add netdevsim/netdevsim10/group1
$ devlink port function rate set netdevsim/netdevsim10/group1 \
    tx_share 10mbit tx_max 100mbit
Add + set command can be combined:
$ devlink port function rate add netdevsim/netdevsim10/group1 \
    tx_share 10mbit tx_max 100mbit
$ devlink port function rate show netdevsim/netdevsim10/group1
netdevsim/netdevsim10/group1: type node tx_share 10mbit tx_max 100mbit
$ devlink port function rate del netdevsim/netdevsim10/group1
Co-developed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
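The naming rule (a node name must not be a plain decimal number, so it cannot collide with a port index) could be checked along these lines. This is a sketch only: `demo_rate_node_name_valid` is a hypothetical helper, it checks just the decimal-number rule and rejects the empty string, and it does not enforce the "alphanumeric" part.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the name is acceptable under the decimal-number rule. */
int demo_rate_node_name_valid(const char *name)
{
	size_t len = strlen(name);

	if (len == 0)
		return 0;

	/* A name made only of digits would be ambiguous with a port index. */
	return strspn(name, "0123456789") != len;
}
```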
-
Dmytro Linkin authored
Test verifies that netdevsim VFs can set and retrieve shared/max tx rate through new devlink API. Co-developed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dmytro Linkin authored
Implement new devlink ops that allow shared and max tx rate control for devlink port rate objects (leafs) through devlink API. Expose rate values of VF ports to netdevsim debugfs. Co-developed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dmytro Linkin authored
Implement support for DEVLINK_CMD_RATE_SET command with new attributes DEVLINK_ATTR_RATE_TX_{SHARE|MAX} that are used to set devlink rate shared/max tx rate values. Extend devlink ops with new callbacks rate_leaf_tx_{share|max}_set() to allow supporting drivers to implement rate control through devlink. New attributes are optional. Driver implementations are allowed to support either or both of them. Shared rate example:
$ devlink port function rate set netdevsim/netdevsim10/0 tx_share 10mbit
$ devlink port function rate show netdevsim/netdevsim10/0
netdevsim/netdevsim10/0: type leaf tx_share 10mbit
Max rate example:
$ devlink port function rate set netdevsim/netdevsim10/0 tx_max 100mbit
$ devlink port function rate show netdevsim/netdevsim10/0
netdevsim/netdevsim10/0: type leaf tx_max 100mbit
Co-developed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dmytro Linkin authored
Test verifies that all netdevsim VF ports have rate leaf object created by default. Co-developed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dmytro Linkin authored
Register devlink rate leaf objects per VF. Co-developed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dmytro Linkin authored
Allow registering rate object for devlink ports with dedicated devlink_rate_leaf_{create|destroy}() API. Implement new netlink DEVLINK_CMD_RATE_GET command that is used to retrieve rate object info. Add new DEVLINK_CMD_RATE_{NEW|DEL} commands that are used for notifications when creating/deleting leaf rate object. Rate API is intended to be used for rate limiting of individual devlink ports (leafs) and their aggregates (nodes). Example: $ devlink port show pci/0000:03:00.0/0 pci/0000:03:00.0/1 $ devlink port function rate show pci/0000:03:00.0/0: type leaf pci/0000:03:00.0/1: type leaf Co-developed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dmytro Linkin authored
Implement callbacks to set/get eswitch mode value. Add helpers to check current mode. Instantiate VFs' net devices and devlink ports on switchdev enabling and remove them on legacy enabling. Changing number of VFs while in switchdev mode triggers VFs creation/deletion. Also disable NDO API callback to set VF rate, since it's legacy API. Switchdev API to set VF rate will be implemented in one of the next patches. Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dmytro Linkin authored
Allow creation of netdevsim ports for VFs along with allocations of corresponding net devices and devlink ports. Add enums and helpers to distinguish PFs' ports from VFs' ports. The port creation/deletion debugfs API is intended to be used with physical ports only. VFs instantiation will be done in one of the next patches. Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dmytro Linkin authored
Define the type of ports which the netdevsim driver currently operates with as PF. Define a new port type, VF, which will be implemented in following patches. Add helper functions to distinguish them. Add a helper function to get the VF index from the port index. Add port indexing logic where PFs' indexes start from 0 and VFs' indexes start from NSIM_DEV_VF_PORT_INDEX_BASE. All ports use the same index pool, which means that a PF port may be created with an index from the VFs' index range. The maximum number of VFs which the driver can allocate is limited by UINT_MAX - BASE. Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
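The indexing scheme can be illustrated with a short sketch. The base value of 128 is an arbitrary assumption (the message does not give the value of NSIM_DEV_VF_PORT_INDEX_BASE), and all names are hypothetical. Since PFs and VFs share one index pool, the port type is stored explicitly instead of being inferred from the index.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical base; the real NSIM_DEV_VF_PORT_INDEX_BASE may differ. */
#define DEMO_VF_PORT_INDEX_BASE 128u
/* Upper bound on VFs implied by "UINT_MAX - BASE". */
#define DEMO_VF_MAX (UINT_MAX - DEMO_VF_PORT_INDEX_BASE)

enum demo_port_type { DEMO_PORT_PF, DEMO_PORT_VF };

struct demo_port {
	enum demo_port_type type;
	unsigned int index;
};

/* Build the port descriptor for the VF with the given VF index. */
struct demo_port demo_vf_port(unsigned int vf_index)
{
	struct demo_port port = { DEMO_PORT_VF, DEMO_VF_PORT_INDEX_BASE + vf_index };

	return port;
}

/* The type field decides; a PF may legitimately use an index >= BASE. */
int demo_port_is_vf(const struct demo_port *port)
{
	return port->type == DEMO_PORT_VF;
}

/* Only meaningful for VF ports. */
unsigned int demo_port_vf_index(const struct demo_port *port)
{
	assert(demo_port_is_vf(port));
	return port->index - DEMO_VF_PORT_INDEX_BASE;
}
```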
-
Dmytro Linkin authored
Move VFs disabling from device release() to nsim_dev_reload_destroy() to make VFs disabling and ports removal simultaneous. This is a requirement for VFs ports implemented in next patches. Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dmytro Linkin authored
Currently there is no limit to the number of VFs netdevsim can enable. In real systems this value exists and is used by the driver. For example, some features might need to consider this value when allocating memory. Expose the max_vfs variable to debugfs as a configurable resource. If VFs are configured (num_vfs != 0) then changing max_vfs is not allowed. Co-developed-by: Yuval Avnery <yuvalav@nvidia.com> Signed-off-by: Yuval Avnery <yuvalav@nvidia.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Simon Horman says: ==================== Introduce conntrack offloading to the nfp driver
Louis Peens says:
This is the first in a series of patches to offload conntrack to the nfp. The approach followed is to flatten out three different flow rules into a single offloaded flow. The three different flows are:
1) The rule sending the packet to conntrack (pre_ct)
2) The rule matching on +trk+est after a packet has been through conntrack. (post_ct)
3) The rule received via callback from the netfilter (nft)
In order to offload a flow we need a combination of all three flows, but they could be added/deleted at different times and in different order. To solve this we save potential offloadable CT flows in the driver, and every time we receive a callback we check against these saved flows for valid merges. Once we have a valid combination of all three flows this will be offloaded to the NFP. This is demonstrated in the diagram below.
    +-------------+                      +----------+
    | pre_ct flow +--------+             | nft flow |
    +-------------+        v             +------+---+
                      +----------+              |
                      | tc_merge +--------+     |
                      +----------+        v     v
    +--------------+       ^           +-------------+
    | post_ct flow +-------+       +---+nft_tc merge |
    +--------------+               |   +-------------+
                                   |
                                   |
                                   |
                                   v
                            Offload to nfp
This series is only up to the point of the pre_ct and post_ct merges into the tc_merge. Follow up series will continue to add the nft flows and merging of these flows with the result of the pre_ct and post_ct merged flows.
Changes since v2:
- nfp: flower-ct: add zone table entry when handling pre/post_ct flows
  Fixed another docstring. Should finally have the patch check environment properly configured now to avoid more of these.
- nfp: flower-ct: add tc merge functionality
  Fixed warning found by "kernel test robot <lkp@intel.com>"
  Added code comment explaining chain_index comparison
Changes since v1:
- nfp: flower-ct: add ct zone table
  Fixed unused variable compile warning
  Fixed missing colon in struct description
==================== Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Louis Peens authored
Add merging of pre/post_ct flow rules into the tc_merge table. Pre_ct flows need to be merged with post_ct flows and vice versa. This needs to be done for all flows in the same zone table, as well as with the wc_zone_table, which is for flows masking out ct_zone info. Cleanup happens when all the tables are cleared up and prints a warning traceback, as this is not expected in the final version. At this point we are not actually returning success for the offload, so we do not get any delete requests for flows, so we can't delete them that way yet. This means that cleanup happens in what would usually be an exception path. Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
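The merge-on-arrival idea can be sketched as follows: every newly added flow is compared against the saved flows of the opposite type, and a merge entry is counted whenever the zones are compatible. This is a heavily simplified model with hypothetical names; it treats a wildcard-zone flow as compatible with any zone, uses a fixed array instead of the driver's tables, and ignores the other match criteria a real merge would need.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_FLOWS 8

enum demo_ct_type { DEMO_PRE_CT, DEMO_POST_CT };

struct demo_ct_flow {
	enum demo_ct_type type;
	int wildcard_zone; /* nonzero: the flow masks out ct_zone info */
	unsigned int zone;
};

struct demo_ct_state {
	struct demo_ct_flow flows[DEMO_MAX_FLOWS];
	size_t nflows;
	size_t nmerged; /* entries that would land in the merge table */
};

static int demo_zones_compatible(const struct demo_ct_flow *a,
				 const struct demo_ct_flow *b)
{
	return a->wildcard_zone || b->wildcard_zone || a->zone == b->zone;
}

/* Saves the flow and returns how many new merges it produced,
 * or -1 if the fixed-size table is full.
 */
int demo_ct_add_flow(struct demo_ct_state *st, struct demo_ct_flow flow)
{
	int merges = 0;
	size_t i;

	if (st->nflows >= DEMO_MAX_FLOWS)
		return -1;

	for (i = 0; i < st->nflows; i++) {
		const struct demo_ct_flow *other = &st->flows[i];

		/* Only pre_ct/post_ct pairs merge, in either arrival order. */
		if (other->type != flow.type &&
		    demo_zones_compatible(other, &flow))
			merges++;
	}

	st->flows[st->nflows++] = flow;
	st->nmerged += (size_t)merges;
	return merges;
}
```

Checking on every insertion is what lets the flows arrive in any order, as the cover letter for this series describes.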
-
Louis Peens authored
Add the table required to store the merge result of pre_ct and post_ct flows. This is just the initial setup and teardown of the table, the implementation will be in follow-up patches. Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-