- 25 Feb, 2022 28 commits
-
-
Russell King (Oracle) authored
When the supported interfaces bitmap is populated, phylink will itself check that the interface mode is present in this bitmap. Drivers no longer need to perform this check themselves. Remove these checks. Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Russell King (Oracle) authored
Populate the supported interfaces bitmap for the SJA1105 DSA switch. This switch only supports a static model of configuration, so we restrict the interface modes to the configured setting. Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Vladimir Oltean <vladimir. │ Signed-off-by: David S. Miller <davem@davemloft.net>
-
Toms Atteka authored
This change adds a new OpenFlow field OFPXMT_OFB_IPV6_EXTHDR and packets can be filtered using ipv6_ext flag. Signed-off-by: Toms Atteka <cpp.code.lv@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Simon Horman says: ==================== nfp: flow-independent tc action hardware offload Baowen Zheng says: Allow nfp NIC to offload tc actions independent of flows. The motivation for this work is to offload tc actions independent of flows for nfp NIC. We allow nfp driver to provide hardware offload of OVS metering feature - which calls for policers that may be used by multiple flows and whose lifecycle is independent of any flows that use them. When nfp driver tries to offload a flow table using the independent action, the driver will search if the action is already offloaded to the hardware. If not, the flow table offload will fail. When the nfp NIC successes to offload an action, the user can check in_hw_count when dumping the tc action. Tc cli command to offload and dump an action: # tc actions add action police rate 100mbit burst 10000k index 200 skip_sw # tc -s -d actions list action police total acts 1 action order 0: police 0xc8 rate 100Mbit burst 10000Kb mtu 2Kb action reclassify overhead 0b linklayer ethernet ref 1 bind 0 installed 142 sec used 0 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 skip_sw in_hw in_hw_count 1 used_hw_stats delayed ==================== Link: https://lore.kernel.org/r/20220223162302.97609-1-simon.horman@corigine.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Baowen Zheng authored
Add NFP_FL_FEATS_QOS_METER to host features to enable meter offload in driver. Before adding this feature, we will not offload any police action since we will check the host features before offloading any police action. Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Baowen Zheng authored
Offload flow table if the action is already offloaded to hardware when flow table uses this action. Change meter id to type of u32 to support all the action index. Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Baowen Zheng authored
Add a process to update action stats from hardware. This stats data will be updated to tc action when dumping actions or filters. Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Baowen Zheng authored
Add a hash table to store meter table. This meter table will also be used by flower action. Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Baowen Zheng authored
Add process to offload tc action to hardware. Currently we only support to offload police action. Add meter capability to check if firmware supports meter offload. Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Baowen Zheng authored
Add an policer API to support ingress/egress meter. Change ingress police to compatible with the new API. Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Dmitry Safonov authored
The functions do essentially the same work to verify TCP-MD5 sign. Code can be merged into one family-independent function in order to reduce copy'n'paste and generated code. Later with TCP-AO option added, this will allow to create one function that's responsible for segment verification, that will have all the different checks for MD5/AO/non-signed packets, which in turn will help to see checks for all corner-cases in one function, rather than spread around different families and functions. Cc: Eric Dumazet <edumazet@google.com> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Signed-off-by: Dmitry Safonov <dima@arista.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220223175740.452397-1-dima@arista.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Vladimir Oltean says: ==================== FDB entries on DSA LAG interfaces This work permits having static and local FDB entries on LAG interfaces that are offloaded by DSA ports. New API needs to be introduced in drivers. To maintain consistency with the bridging offload code, I've taken the liberty to reorganize the data structures added by Tobias in the DSA core a little bit. Tested on NXP LS1028A (felix switch). Would appreciate feedback/testing on other platforms too. Testing procedure was the one described here: https://patchwork.kernel.org/project/netdevbpf/cover/20210205130240.4072854-1-vladimir.oltean@nxp.com/ with this script: ip link del bond0 ip link add bond0 type bond mode 802.3ad ip link set swp1 down && ip link set swp1 master bond0 && ip link set swp1 up ip link set swp2 down && ip link set swp2 master bond0 && ip link set swp2 up ip link del br0 ip link add br0 type bridge && ip link set br0 up ip link set br0 arp off ip link set bond0 master br0 && ip link set bond0 up ip link set swp0 master br0 && ip link set swp0 up ip link set dev bond0 type bridge_slave flood off learning off bridge fdb add dev bond0 <mac address of other eno0> master static I'm noticing a problem in 'bridge fdb dump' with the 'self' entries, and I didn't solve this. On Ocelot, an entry learned on a LAG is reported as being on the first member port of it (so instead of saying 'self bond0', it says 'self swp1'). This is better than not seeing the entry at all, but when DSA queries for the FDBs on a port via ds->ops->port_fdb_dump, it never queries for FDBs on a LAG. Not clear what we should do there, we aren't in control of the ->ndo_fdb_dump of the bonding/team drivers. Alternatively, we could just consider the 'self' entries reported via ndo_fdb_dump as "better than nothing", and concentrate on the 'master' entries that are in sync with the bridge when packets are flooded to software. ==================== Link: https://lore.kernel.org/r/20220223140054.3379617-1-vladimir.oltean@nxp.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
This adds the logic in the Felix DSA driver and Ocelot switch library. For Ocelot switches, the DEST_IDX that is the output of the MAC table lookup is a logical port (equal to physical port, if no LAG is used, or a dynamically allocated number otherwise). The allocation we have in place for LAG IDs is different from DSA's, so we can't use that: - DSA allocates a continuous range of LAG IDs starting from 1 - Ocelot appears to require that physical ports and LAG IDs are in the same space of [0, num_phys_ports), and additionally, ports that aren't in a LAG must have physical port id == logical port id The implication is that an FDB entry towards a LAG might need to be deleted and reinstalled when the LAG ID changes. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
This change introduces support for installing static FDB entries towards a bridge port that is a LAG of multiple DSA switch ports, as well as support for filtering towards the CPU local FDB entries emitted for LAG interfaces that are bridge ports. Conceptually, host addresses on LAG ports are identical to what we do for plain bridge ports. Whereas FDB entries _towards_ a LAG can't simply be replicated towards all member ports like we do for multicast, or VLAN. Instead we need new driver API. Hardware usually considers a LAG to be a "logical port", and sets the entire LAG as the forwarding destination. The physical egress port selection within the LAG is made by hashing policy, as usual. To represent the logical port corresponding to the LAG, we pass by value a copy of the dsa_lag structure to all switches in the tree that have at least one port in that LAG. To illustrate why a refcounted list of FDB entries is needed in struct dsa_lag, it is enough to say that: - a LAG may be a bridge port and may therefore receive FDB events even while it isn't yet offloaded by any DSA interface - DSA interfaces may be removed from a LAG while that is a bridge port; we don't want FDB entries lingering around, but we don't want to remove entries that are still in use, either For all the cases below to work, the idea is to always keep an FDB entry on a LAG with a reference count equal to the DSA member ports. So: - if a port joins a LAG, it requests the bridge to replay the FDB, and the FDB entries get created, or their refcount gets bumped by one - if a port leaves a LAG, the FDB replay deletes or decrements refcount by one - if an FDB is installed towards a LAG with ports already present, that entry is created (if it doesn't exist) and its refcount is bumped by the amount of ports already present in the LAG echo "Adding FDB entry to bond with existing ports" ip link del bond0 ip link add bond0 type bond mode 802.3ad ip link set swp1 down && ip link set swp1 master bond0 && ip link set swp1 up ip link set swp2 down && ip link set swp2 master bond0 && ip link set swp2 up ip link del br0 ip link add br0 type bridge ip link set bond0 master br0 bridge fdb add dev bond0 00:01:02:03:04:05 master static ip link del br0 ip link del bond0 echo "Adding FDB entry to empty bond" ip link del bond0 ip link add bond0 type bond mode 802.3ad ip link del br0 ip link add br0 type bridge ip link set bond0 master br0 bridge fdb add dev bond0 00:01:02:03:04:05 master static ip link set swp1 down && ip link set swp1 master bond0 && ip link set swp1 up ip link set swp2 down && ip link set swp2 master bond0 && ip link set swp2 up ip link del br0 ip link del bond0 echo "Adding FDB entry to empty bond, then removing ports one by one" ip link del bond0 ip link add bond0 type bond mode 802.3ad ip link del br0 ip link add br0 type bridge ip link set bond0 master br0 bridge fdb add dev bond0 00:01:02:03:04:05 master static ip link set swp1 down && ip link set swp1 master bond0 && ip link set swp1 up ip link set swp2 down && ip link set swp2 master bond0 && ip link set swp2 up ip link set swp1 nomaster ip link set swp2 nomaster ip link del br0 ip link del bond0 Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
When switchdev_handle_fdb_event_to_device() replicates a FDB event emitted for the bridge or for a LAG port and DSA offloads that, we should notify back to switchdev that the FDB entry on the original device is what was offloaded, not on the DSA slave devices that the event is replicated on. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
By construction, the struct net_device *dev passed to dsa_slave_switchdev_event_work() via struct dsa_switchdev_event_work is always a DSA slave device. Therefore, it is redundant to pass struct dsa_switch and int port information in the deferred work structure. This can be retrieved at all times from the provided struct net_device via dsa_slave_to_port(). For the same reason, we can drop the dsa_is_user_port() check in dsa_fdb_offload_notify(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
When the switchdev_handle_fdb_event_to_device() event replication helper was created, my original thought was that FDB events on LAG interfaces should most likely be special-cased, not just replicated towards all switchdev ports beneath that LAG. So this replication helper currently does not recurse through switchdev lower interfaces of LAG bridge ports, but rather calls the lag_mod_cb() if that was provided. No switchdev driver uses this helper for FDB events on LAG interfaces yet, so that was an assumption which was yet to be tested. It is certainly usable for that purpose, as my RFC series shows: https://patchwork.kernel.org/project/netdevbpf/cover/20220210125201.2859463-1-vladimir.oltean@nxp.com/ however this approach is slightly convoluted because: - the switchdev driver gets a "dev" that isn't its own net device, but rather the LAG net device. It must call switchdev_lower_dev_find(dev) in order to get a handle of any of its own net devices (the ones that pass check_cb). - in order for FDB entries on LAG ports to be correctly refcounted per the number of switchdev ports beneath that LAG, we haven't escaped the need to iterate through the LAG's lower interfaces. Except that is now the responsibility of the switchdev driver, because the replication helper just stopped half-way. So, even though yes, FDB events on LAG bridge ports must be special-cased, in the end it's simpler to let switchdev_handle_fdb_* just iterate through the LAG port's switchdev lowers, and let the switchdev driver figure out that those physical ports are under a LAG. The switchdev_handle_fdb_event_to_device() helper takes a "foreign_dev_check" callback so it can figure out whether @dev can autonomously forward to @foreign_dev. DSA fills this method properly: if the LAG is offloaded by another port in the same tree as @dev, then it isn't foreign. If it is a software LAG, it is foreign - forwarding happens in software. Whether an interface is foreign or not decides whether the replication helper will go through the LAG's switchdev lowers or not. Since the lan966x doesn't properly fill this out, FDB events on software LAG uppers will get called. By changing lan966x_foreign_dev_check(), we can suppress them. Whereas DSA will now start receiving FDB events for its offloaded LAG uppers, so we need to return -EOPNOTSUPP, since we currently don't do the right thing for them. Cc: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
The main purpose of this change is to create a data structure for a LAG as seen by DSA. This is similar to what we have for bridging - we pass a copy of this structure by value to ->port_lag_join and ->port_lag_leave. For now we keep the lag_dev, id and a reference count in it. Future patches will add a list of FDB entries for the LAG (these also need to be refcounted to work properly). The LAG structure is created using dsa_port_lag_create() and destroyed using dsa_port_lag_destroy(), just like we have for bridging. Because now, the dsa_lag itself is refcounted, we can simplify dsa_lag_map() and dsa_lag_unmap(). These functions need to keep a LAG in the dst->lags array only as long as at least one port uses it. The refcounting logic inside those functions can be removed now - they are called only when we should perform the operation. dsa_lag_dev() is renamed to dsa_lag_by_id() and now returns the dsa_lag structure instead of the lag_dev net_device. dsa_lag_foreach_port() now takes the dsa_lag structure as argument. dst->lags holds an array of dsa_lag structures. dsa_lag_map() now also saves the dsa_lag->id value, so that linear walking of dst->lags in drivers using dsa_lag_id() is no longer necessary. They can just look at lag.id. dsa_port_lag_id_get() is a helper, similar to dsa_port_bridge_num_get(), which can be used by drivers to get the LAG ID assigned by DSA to a given port. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
Make the intent of the code more clear by using the dedicated helper for iterating over the ports of a switch. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
The DSA LAG API will be changed to become more similar with the bridge data structures, where struct dsa_bridge holds an unsigned int num, which is generated by DSA and is one-based. We have a similar thing going with the DSA LAG, except that isn't stored anywhere, it is calculated dynamically by dsa_lag_id() by iterating through dst->lags. The idea of encoding an invalid (or not requested) LAG ID as zero for the purpose of simplifying checks in drivers means that the LAG IDs passed by DSA to drivers need to be one-based too. So back-and-forth conversion is needed when indexing the dst->lags array, as well as in drivers which assume a zero-based index. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
In preparation of converting struct net_device *dp->lag_dev into a struct dsa_lag *dp->lag, we need to rename, for consistency purposes, all occurrences of the "lag" variable in qca8k to "lag_dev". Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
In preparation of converting struct net_device *dp->lag_dev into a struct dsa_lag *dp->lag, we need to rename, for consistency purposes, all occurrences of the "lag" variable in mv88e6xxx to "lag_dev". Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
In preparation of converting struct net_device *dp->lag_dev into a struct dsa_lag *dp->lag, we need to rename, for consistency purposes, all occurrences of the "lag" variable in the DSA core to "lag_dev". Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Oleksij Rempel authored
This functions are mostly same except of one hard coded "in_pm" variable. So, rework them to reduce maintenance overhead. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20220223110633.3006551-1-o.rempel@pengutronix.deSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Yang Yingliang authored
rhashtable_lookup_fast() returns NULL pointer not ERR_PTR(), so it can return fib_node directly in prestera_kern_fib_cache_find(). Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20220223084954.1771075-2-yangyingliang@huawei.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Yang Yingliang authored
rhashtable_lookup_fast() returns NULL pointer not ERR_PTR(), so it can return fib_node directly in prestera_fib_node_find(). Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20220223084954.1771075-1-yangyingliang@huawei.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Casper Andersson authored
Though the SparX-5i can control IPv4/6 multicasts separately from non-IP multicasts, these are all muxed onto the bridge's BR_MCAST_FLOOD flag. Signed-off-by: Casper Andersson <casper.casan@gmail.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Link: https://lore.kernel.org/r/20220223082700.qrot7lepwqcdnyzw@wse-c0155Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski authored
tools/testing/selftests/net/mptcp/mptcp_join.sh 34aa6e3b ("selftests: mptcp: add ip mptcp wrappers") 857898eb ("selftests: mptcp: add missing join check") 6ef84b15 ("selftests: mptcp: more robust signal race test") https://lore.kernel.org/all/20220221131842.468893-1-broonie@kernel.org/ drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.h drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ct.c fb7e76ea ("net/mlx5e: TC, Skip redundant ct clear actions") c63741b4 ("net/mlx5e: Fix MPLSoUDP encap to use MPLS action information") 09bf9792 ("net/mlx5e: TC, Move pedit_headers_action to parse_attr") 84ba8062 ("net/mlx5e: Test CT and SAMPLE on flow attr") efe6f961 ("net/mlx5e: CT, Don't set flow flag CT for ct clear flow") 3b49a7ed ("net/mlx5e: TC, Reject rules with multiple CT actions") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 24 Feb, 2022 12 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pciLinus Torvalds authored
Pull pci fixes from Bjorn Helgaas: - Fix a merge error that broke PCI device enumeration on mvebu platforms, including Turris Omnia (Armada 385) (Pali Rohár) - Avoid using ATS on all AMD Navi10 and Navi14 GPUs because some VBIOSes don't account for "harvested" (disabled) parts of the chip when initializing caches (Alex Deucher) * tag 'pci-v5.17-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Mark all AMD Navi10 and Navi14 GPU ATS as broken PCI: mvebu: Fix device enumeration regression
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds authored
Pull networking fixes from Jakub Kicinski: "Including fixes from bpf and netfilter. Current release - regressions: - bpf: fix crash due to out of bounds access into reg2btf_ids - mvpp2: always set port pcs ops, avoid null-deref - eth: marvell: fix driver load from initrd - eth: intel: revert "Fix reset bw limit when DCB enabled with 1 TC" Current release - new code bugs: - mptcp: fix race in overlapping signal events Previous releases - regressions: - xen-netback: revert hotplug-status changes causing devices to not be configured - dsa: - avoid call to __dev_set_promiscuity() while rtnl_mutex isn't held - fix panic when removing unoffloaded port from bridge - dsa: microchip: fix bridging with more than two member ports Previous releases - always broken: - bpf: - fix crash due to incorrect copy_map_value when both spin lock and timer are present in a single value - fix a bpf_timer initialization issue with clang - do not try bpf_msg_push_data with len 0 - add schedule points in batch ops - nf_tables: - unregister flowtable hooks on netns exit - correct flow offload action array size - fix a couple of memory leaks - vsock: don't check owner in vhost_vsock_stop() while releasing - gso: do not skip outer ip header in case of ipip and net_failover - smc: use a mutex for locking "struct smc_pnettable" - openvswitch: fix setting ipv6 fields causing hw csum failure - mptcp: fix race in incoming ADD_ADDR option processing - sysfs: add check for netdevice being present to speed_show - sched: act_ct: fix flow table lookup after ct clear or switching zones - eth: intel: fixes for SR-IOV forwarding offloads - eth: broadcom: fixes for selftests and error recovery - eth: mellanox: flow steering and SR-IOV forwarding fixes Misc: - make __pskb_pull_tail() & pskb_carve_frag_list() drop_monitor friends not report freed skbs as drops - force inlining of checksum functions in net/checksum.h" * tag 'net-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (85 commits) net: mv643xx_eth: process retval from of_get_mac_address ping: remove pr_err from ping_lookup Revert "i40e: Fix reset bw limit when DCB enabled with 1 TC" openvswitch: Fix setting ipv6 fields causing hw csum failure ipv6: prevent a possible race condition with lifetimes net/smc: Use a mutex for locking "struct smc_pnettable" bnx2x: fix driver load from initrd Revert "xen-netback: Check for hotplug-status existence before watching" Revert "xen-netback: remove 'hotplug-status' once it has served its purpose" net/mlx5e: Fix VF min/max rate parameters interchange mistake net/mlx5e: Add missing increment of count net/mlx5e: MPLSoUDP decap, fix check for unsupported matches net/mlx5e: Fix MPLSoUDP encap to use MPLS action information net/mlx5e: Add feature check for set fec counters net/mlx5e: TC, Skip redundant ct clear actions net/mlx5e: TC, Reject rules with forward and drop actions net/mlx5e: TC, Reject rules with drop and modify hdr action net/mlx5e: kTLS, Use CHECKSUM_UNNECESSARY for device-offloaded packets net/mlx5e: Fix wrong return value on ioctl EEPROM query failure net/mlx5: Fix possible deadlock on rule deletion ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queueJakub Kicinski authored
Nguyen, Anthony L says: ==================== 10GbE Intel Wired LAN Driver Updates 2022-02-23 Yang Li fixes incorrect indenting as reported by smatch for ixgbevf. Piotr removes non-inclusive language from ixgbe driver. * '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ixgbe: Remove non-inclusive language ixgbevf: clean up some inconsistent indenting ==================== Link: https://lore.kernel.org/r/20220223185424.2129067-1-anthony.l.nguyen@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.dk/linux-blockLinus Torvalds authored
Pull block fixes from Jens Axboe: - NVMe pull request: - send H2CData PDUs based on MAXH2CDATA (Varun Prakash) - fix passthrough to namespaces with unsupported features (Christoph Hellwig) - Clear iocb->private at poll completion (Stefano) * tag 'block-5.17-2022-02-24' of git://git.kernel.dk/linux-block: nvme-tcp: send H2CData PDUs based on MAXH2CDATA nvme: also mark passthrough-only namespaces ready in nvme_update_ns_info nvme: don't return an error from nvme_configure_metadata block: clear iocb->private in blkdev_bio_end_io_async()
-
git://git.kernel.dk/linux-blockLinus Torvalds authored
Pull io_uring fixes from Jens Axboe: - Add a conditional schedule point in io_add_buffers() (Eric) - Fix for a quiesce speedup merged in this release (Dylan) - Don't convert to jiffies for event timeout waiting, it's way too coarse when we accept a timespec as input (me) * tag 'io_uring-5.17-2022-02-23' of git://git.kernel.dk/linux-block: io_uring: disallow modification of rsrc_data during quiesce io_uring: don't convert to jiffies for waiting on timeouts io_uring: add a schedule point in io_add_buffers()
-
Linus Torvalds authored
Merge tag 'platform-drivers-x86-v5.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull more x86 platform driver fixes from Hans de Goede: "Two more fixes: - Fix suspend/resume regression on AMD Cezanne APUs in >= 5.16 - Fix Microsoft Surface 3 battery readings" * tag 'platform-drivers-x86-v5.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: surface: surface3_power: Fix battery readings on batteries without a serial number platform/x86: amd-pmc: Set QOS during suspend on CZN w/ timer wakeup
-
Mauri Sandberg authored
Obtaining a MAC address may be deferred in cases when the MAC is stored in an NVMEM block, for example, and it may not be ready upon the first retrieval attempt and return EPROBE_DEFER. It is also possible that a port that does not rely on NVMEM has been already created when getting the defer request. Thus, also the resources allocated previously must be freed when doing a roll-back. Fixes: 76723bca ("net: mv643xx_eth: add DT parsing support") Signed-off-by: Mauri Sandberg <maukka@ext.kapsi.fi> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220223142337.41757-1-maukka@ext.kapsi.fiSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Xin Long authored
As Jakub noticed, prints should be avoided on the datapath. Also, as packets would never come to the else branch in ping_lookup(), remove pr_err() from ping_lookup(). Fixes: 35a79e64 ("ping: fix the dif and sdif check in ping_lookup") Reported-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Xin Long <lucien.xin@gmail.com> Link: https://lore.kernel.org/r/1ef3f2fcd31bd681a193b1fcf235eee1603819bd.1645674068.git.lucien.xin@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Mateusz Palczewski authored
Revert of a patch that instead of fixing a AQ error when trying to reset BW limit introduced several regressions related to creation and managing TC. Currently there are errors when creating a TC on both PF and VF. Error log: [17428.783095] i40e 0000:3b:00.1: AQ command Config VSI BW allocation per TC failed = 14 [17428.783107] i40e 0000:3b:00.1: Failed configuring TC map 0 for VSI 391 [17428.783254] i40e 0000:3b:00.1: AQ command Config VSI BW allocation per TC failed = 14 [17428.783259] i40e 0000:3b:00.1: Unable to configure TC map 0 for VSI 391 This reverts commit 3d250466. Fixes: 3d250466 (i40e: Fix reset bw limit when DCB enabled with 1 TC) Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20220223175347.1690692-1-anthony.l.nguyen@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Paul Blakey authored
Ipv6 ttl, label and tos fields are modified without first pulling/pushing the ipv6 header, which would have updated the hw csum (if available). This might cause csum validation when sending the packet to the stack, as can be seen in the trace below. Fix this by updating skb->csum if available. Trace resulted by ipv6 ttl dec and then sending packet to conntrack [actions: set(ipv6(hlimit=63)),ct(zone=99)]: [295241.900063] s_pf0vf2: hw csum failure [295241.923191] Call Trace: [295241.925728] <IRQ> [295241.927836] dump_stack+0x5c/0x80 [295241.931240] __skb_checksum_complete+0xac/0xc0 [295241.935778] nf_conntrack_tcp_packet+0x398/0xba0 [nf_conntrack] [295241.953030] nf_conntrack_in+0x498/0x5e0 [nf_conntrack] [295241.958344] __ovs_ct_lookup+0xac/0x860 [openvswitch] [295241.968532] ovs_ct_execute+0x4a7/0x7c0 [openvswitch] [295241.979167] do_execute_actions+0x54a/0xaa0 [openvswitch] [295242.001482] ovs_execute_actions+0x48/0x100 [openvswitch] [295242.006966] ovs_dp_process_packet+0x96/0x1d0 [openvswitch] [295242.012626] ovs_vport_receive+0x6c/0xc0 [openvswitch] [295242.028763] netdev_frame_hook+0xc0/0x180 [openvswitch] [295242.034074] __netif_receive_skb_core+0x2ca/0xcb0 [295242.047498] netif_receive_skb_internal+0x3e/0xc0 [295242.052291] napi_gro_receive+0xba/0xe0 [295242.056231] mlx5e_handle_rx_cqe_mpwrq_rep+0x12b/0x250 [mlx5_core] [295242.062513] mlx5e_poll_rx_cq+0xa0f/0xa30 [mlx5_core] [295242.067669] mlx5e_napi_poll+0xe1/0x6b0 [mlx5_core] [295242.077958] net_rx_action+0x149/0x3b0 [295242.086762] __do_softirq+0xd7/0x2d6 [295242.090427] irq_exit+0xf7/0x100 [295242.093748] do_IRQ+0x7f/0xd0 [295242.096806] common_interrupt+0xf/0xf [295242.100559] </IRQ> [295242.102750] RIP: 0033:0x7f9022e88cbd [295242.125246] RSP: 002b:00007f9022282b20 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffda [295242.132900] RAX: 0000000000000005 RBX: 0000000000000010 RCX: 0000000000000000 [295242.140120] RDX: 00007f9022282ba8 RSI: 00007f9022282a30 RDI: 00007f9014005c30 [295242.147337] RBP: 00007f9014014d60 R08: 0000000000000020 R09: 00007f90254a8340 [295242.154557] R10: 00007f9022282a28 R11: 0000000000000246 R12: 0000000000000000 [295242.161775] R13: 00007f902308c000 R14: 000000000000002b R15: 00007f9022b71f40 Fixes: 3fdbd1ce ("openvswitch: add ipv6 'set' action") Signed-off-by: Paul Blakey <paulb@nvidia.com> Link: https://lore.kernel.org/r/20220223163416.24096-1-paulb@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Niels Dossche authored
valid_lft, prefered_lft and tstamp are always accessed under the lock "lock" in other places. Reading these without taking the lock may result in inconsistencies regarding the calculation of the valid and preferred variables since decisions are taken on these fields for those variables. Signed-off-by: Niels Dossche <dossche.niels@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Niels Dossche <niels.dossche@ugent.be> Link: https://lore.kernel.org/r/20220223131954.6570-1-niels.dossche@ugent.beSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Fabio M. De Francesco authored
smc_pnetid_by_table_ib() uses read_lock() and then it calls smc_pnet_apply_ib() which, in turn, calls mutex_lock(&smc_ib_devices.mutex). read_lock() disables preemption. Therefore, the code acquires a mutex while in atomic context and it leads to a SAC bug. Fix this bug by replacing the rwlock with a mutex. Reported-and-tested-by: syzbot+4f322a6d84e991c38775@syzkaller.appspotmail.com Fixes: 64e28b52 ("net/smc: add pnet table namespace support") Confirmed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Acked-by: Karsten Graul <kgraul@linux.ibm.com> Link: https://lore.kernel.org/r/20220223100252.22562-1-fmdefrancesco@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-