- 26 Jan, 2022 18 commits
-
-
David S. Miller authored
Russell King says: ==================== net: stmmac/xpcs: modernise PCS support This series updates xpcs and stmmac for the recent changes to phylink to better support split PCS and to get rid of private MAC validation functions. This series is slightly more involved than other conversions as stmmac has already had optional proper split PCS support. The first six patches of this series were originally posted on 16th December for CFT, and Wong Vee Khee reported his Intel Elkhart Lake setup was fine the first six these. However, no tested-by was given. The patches: 1) Provide a function to query the xpcs for the interface modes that are supported. 2) Populates the MAC capabilities and switches stmmac_validate() to use phylink_get_linkmodes(). We do not use phylink_generic_validate() yet as (a) we do not always have the supported interfaces populated, and (b) the existing code does not restrict based on interface. There should be no functional effect from this patch. 3) Populates phylink's supported interfaces from the xpcs when the xpcs is configured by firmware and also the firmware configured interface mode. Note: this will restrict stmmac to only supporting these interfaces modes - stmmac maintainers need to verify that this behaviour is acceptable. 4) stmmac_validate() tail-calls xpcs_validate(), but we don't need it to now that PCS have their own validation method. Convert stmmac and xpcs to use this method instead. 5) xpcs sets the poll field of phylink_pcs to true, meaning xpcs requires its status to be polled. There is no need to also set the phylink_config.pcs_poll. Remove this. 6) Switch to phylink_generic_validate(). This is probably the most contravertial change in this patch set as this will cause the MAC to restrict link modes based on the interface mode. From an inspection of the xpcs driver, this should be safe, as XPCS only further restricts the link modes to a subset of these (whether that is correct or not is not an issue I am addressing here.) For implementations that do not use xpcs, this is a more open question and needs feedback from stmmac maintainers. 7) Convert to use mac_select_pcs() rather than phylink_set_pcs() to set the PCS - the intention is to eventually remove phylink_set_pcs() once there are no more users of this. v2: fix signoff and temporary warning in patch 4 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Russell King (Oracle) authored
Convert stmmac to use the mac_select_pcs() interface rather than using phylink_set_pcs(). The intention here is to unify the approach for PCS and eventually to remove phylink_set_pcs(). Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Russell King (Oracle) authored
Convert stmmac to use phylink_generic_validate() now that we have the MAC capabilities and supported interfaces filled in, and we have the PCS validation handled via the PCS operations. Tested-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> # Intel EHL Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Russell King (Oracle) authored
Phylink will use PCS polling whenever the PCS's poll member is set, so setting phylink_config.pcs_poll as well is redundant. Tested-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> # Intel EHL Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Russell King (Oracle) authored
stmmac explicitly calls the xpcs driver to validate the ethtool linkmodes. This is no longer necessary as phylink now supports validation through a PCS method. Convert both drivers to use this new mechanism. Tested-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> # Intel EHL Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Russell King (Oracle) authored
Fill in phylink's supported_interfaces bitmap with the PHY interface modes which can be used to talk to the PHY. We indicate that the PHY interface mode passed in platform data is always supported, as this is the initial mode passed into phylink. When there is no PCS specified, we assume that this is the only mode that is supported - indeed, the driver appears not to support dynamic switching of interface types at present. When a xpcs is present, it defines the PHY interface modes that the stmmac driver can support. Request the supported interfaces from the xpcs driver, and pass them to phylink. Tested-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> # Intel EHL Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Russell King (Oracle) authored
Add the MAC speed, duplex and pause capabilities to the phylink_config structure, and switch stmmac_validate() to use phylink_get_linkmodes() to generate the mask of supported ethtool link modes. Tested-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> # Intel EHL Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Russell King (Oracle) authored
Add a function to the xpcs driver to retrieve the supported PHY interface modes, which can be used by drivers to fill in phylink's supported_interfaces mask. We validate the interface bit index to ensure that it fits within the bitmap as xpcs lists PHY_INTERFACE_MODE_MAX in an entry. Tested-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> # Intel EHL Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Ido Schimmel says: ==================== mlxsw: Add RJ45 ports support We are in the process of qualifying a new system that has RJ45 ports as opposed to the transceiver modules (e.g., SFP, QSFP) present on all existing systems. This patchset adds support for these ports in mlxsw by adding a couple of missing BaseT link modes and rejecting ethtool operations that are specific to transceiver modules. Patchset overview: Patches #1-#3 are cleanups and preparations. Patch #4 adds support for two new link modes. Patches #5-#6 query and cache the port module's type (e.g., QSFP, RJ45) during initialization. Patches #7-#9 forbid ethtool operations that are invalid on RJ45 ports. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Danielle Ratson authored
Transceiver module reset through 'rst' field in PMAOS register is not supported on RJ45 ports, so module reset should be rejected. Therefore, before trying to access this field, validate the port module type that was queried during initialization and return an error to user space in case the port module type is RJ45 (twisted pair). Output example: # ethtool --reset swp11 phy ETHTOOL_RESET 0x40 Cannot issue ETHTOOL_RESET: Invalid argument $ dmesg mlxsw_spectrum 0000:03:00.0 swp11: Reset module is not supported on port module type Signed-off-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Danielle Ratson authored
PMMP (Port Module Memory Map Properties) and MCION (Management Cable IO and Notifications) registers are not supported on RJ45 ports, so setting and getting power mode should be rejected. Therefore, before trying to access those registers, validate the port module type that was queried during initialization and return an error to user space in case the port module type is RJ45 (twisted pair). Set output example: # ethtool --set-module swp1 power-mode-policy auto netlink error: mlxsw_core: Power mode is not supported on port module type netlink error: Invalid argument Get output example: $ ethtool --show-module swp11 netlink error: mlxsw_core: Power mode is not supported on port module type netlink error: Invalid argument Signed-off-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Danielle Ratson authored
MCIA (Management Cable Info Access) register is not supported on RJ45 ports, so getting module EEPROM should be rejected. Therefore, before trying to access this register, validate the port module type that was queried during initialization and return an error to user space in case the port module type is RJ45 (twisted pair). Examples for output when trying to get EEPROM module: Using netlink: # ethtool -m swp1 netlink error: mlxsw_core: EEPROM is not equipped on port module type netlink error: Invalid argument Using IOCTL: # ethtool -m swp1 Cannot get module EEPROM information: Invalid argument $ dmesg mlxsw_spectrum 0000:03:00.0 swp1: EEPROM is not equipped on port module type Signed-off-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Danielle Ratson authored
Query and store port module's type during initialization so that it could be later used to determine if certain configurations are allowed based on the type. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Danielle Ratson authored
Add the Port Module Type Mapping (PMTP) register. It will be used by subsequent patches to query port module types and forbid certain configurations based on the port module's type. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Danielle Ratson authored
As part of a process for supporting a new system with RJ45 connectors, 100BaseT and 1000BaseT link modes need to be supported. Add support for these two link modes by adding the two corresponding bits in PTYS (Port Type and Speed) register. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Danielle Ratson authored
The next patches will forbid querying the port module's EEPROM info when its type is RJ45 as in this case no transceiver module can ever be connected to the port. Add netdev argument to mlxsw_env_get_module_info() so it could be used to print an error to the kernel log via netdev_err(). Signed-off-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
The number of modules can be resolved from the first argument, so do not pass it. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
Remove the 'err' variable and simply return. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 25 Jan, 2022 22 commits
-
-
David Ahern authored
sk_gso_max_size is set based on the dst dev. Both users of it adjust the value by the same offset - (MAX_TCP_HEADER + 1). Rather than compute the same adjusted value on each call do the adjustment once when set. Signed-off-by: David Ahern <dsahern@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220125024511.27480-1-dsahern@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Colin Ian King authored
Variable new_csr6 is being initialized with a value that is never read, it is being re-assigned later on. The assignment is redundant and can be removed. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20220123183440.112495-1-colin.i.king@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
IPv6 GRO considers packets to belong to different flows when their hop_limit is different. This seems counter-intuitive, the flow is the same. hop_limit may vary because of various bugs or hacks but that doesn't mean it's okay for GRO to reorder packets. Practical impact of this problem on overall TCP performance is unclear, but TCP itself detects this reordering and bumps TCPSACKReorder resulting in user complaints. Eric warns that there may be performance regressions in setups which do packet spraying across links with similar RTT but different hop count. To be safe let's target -next and not treat this as a fix. If the packet spraying is using flow label there should be no difference in behavior as flow label is checked first. Note that the code plays an easy to miss trick by upcasting next_hdr to a u16 pointer and compares next_hdr and hop_limit in one go. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gustavo A. R. Silva authored
Make use of the struct_size() helper instead of an open-coded version, in order to avoid any potential type mistakes or integer overflows that, in the worst scenario, could lead to heap overflows. Also, address the following sparse warnings: drivers/net/ethernet/microsoft/mana/gdma_main.c:677:24: warning: using sizeof on a flexible structure Link: https://github.com/KSPP/linux/issues/174Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
This is based on series [0] that extended the PM core. Now the compiler can see the PM callbacks also on systems not defining CONFIG_PM. The optimizer will remove the functions then in this case. [0] https://lore.kernel.org/netdev/20211207002102.26414-1-paul@crapouillou.net/Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Tobias Waldekranz says: ==================== net: dsa: Avoid cross-chip syncing of VLAN filtering This bug has been latent in the source for quite some time, I suspect due to the homogeneity of both typical configurations and hardware. On singlechip systems, this would never be triggered. The only reason I saw it on my multichip system was because not all chips had the same number of ports, which means that the misdemeanor alien call turned into a felony array-out-of-bounds access. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tobias Waldekranz authored
Changes to VLAN filtering are not applicable to cross-chip notifications. On a system like this: .-----. .-----. .-----. | sw1 +---+ sw2 +---+ sw3 | '-1-2-' '-1-2-' '-1-2-' Before this change, upon sw1p1 leaving a bridge, a call to dsa_port_vlan_filtering would also be made to sw2p1 and sw3p1. In this scenario: .---------. .-----. .-----. | sw1 +---+ sw2 +---+ sw3 | '-1-2-3-4-' '-1-2-' '-1-2-' When sw1p4 would leave a bridge, dsa_port_vlan_filtering would be called for sw2 and sw3 with a non-existing port - leading to array out-of-bounds accesses and crashes on mv88e6xxx. Fixes: d371b7c9 ("net: dsa: Unset vlan_filtering when ports leave the bridge") Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tobias Waldekranz authored
Most of dsa_switch_bridge_leave was, in fact, dealing with the syncing of VLAN filtering for switches on which that is a global setting. Separate the two phases to prepare for the cross-chip related bugfix in the following commit. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Eric Dumazet says: ==================== netns: speedup netns dismantles netns are dismantled by a single thread, from cleanup_net() On hosts with many TCP sockets, and/or many cpus, this thread is spending too many cpu cycles, and can not keep up with some workloads. - Removing 3*num_possible_cpus() sockets per netns, for icmp and tcp protocols. - Iterating over all TCP sockets to remove stale timewait sockets. This patch series removes ~50% of cleanup_net() cpu costs on hosts with 256 cpus. It also reduces per netns memory footprint. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
TCP ipv4 uses per-cpu/per-netns ctl sockets in order to send RST and some ACK packets (on behalf of TIMEWAIT sockets). This adds memory and cpu costs, which do not seem needed. Now typical servers have 256 or more cores, this adds considerable tax to netns users. tcp sockets are used from BH context, are not receiving packets, and do not store any persistent state but the 'struct net' pointer in order to be able to use IPv4 output functions. Note that I attempted a related change in the past, that had to be hot-fixed in commit bdbbb852 ("ipv4: tcp: get rid of ugly unicast_sock") This patch could very well surface old bugs, on layers not taking care of sk->sk_kern_sock properly. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Back in linux-2.6.25 (commit 98c6d1b2 "[NETNS]: Make icmpv6_sk per namespace.", we added private per-cpu/per-netns ipv6 icmp sockets. This adds memory and cpu costs, which do not seem needed. Now typical servers have 256 or more cores, this adds considerable tax to netns users. icmp sockets are used from BH context, are not receiving packets, and do not store any persistent state but the 'struct net' pointer. icmpv6_xmit_lock() already makes sure to lock the chosen per-cpu socket. This patch has a considerable impact on the number of netns that the worker thread in cleanup_net() can dismantle per second, because ip6mr_sk_done() is no longer called, meaning we no longer acquire the rtnl mutex, competing with other threads adding new netns. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Back in linux-2.6.25 (commit 4a6ad7a1 "[NETNS]: Make icmp_sk per namespace."), we added private per-cpu/per-netns ipv4 icmp sockets. This adds memory and cpu costs, which do not seem needed. Now typical servers have 256 or more cores, this adds considerable tax to netns users. icmp sockets are used from BH context, are not receiving packets, and do not store any persistent state but the 'struct net' pointer. icmp_xmit_lock() already makes sure to lock the chosen per-cpu socket. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Prior patches in the series made sure tw_timer_handler() can be fired after netns has been dismantled/freed. We no longer have to scan a potentially big TCP ehash table at netns dismantle. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
We will soon get rid of inet_twsk_purge(). This means that tw_timer_handler() might fire after a netns has been dismantled/freed. Instead of adding a function (and data structure) to find a netns from tw->tw_net_cookie, just update the SNMP counters a bit earlier, when the netns is known to be alive. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
We want to allow inet_twsk_kill() working even if netns has been dismantled/freed, to get rid of inet_twsk_purge(). This patch adds tw->tw_bslot to cache the bind bucket slot so that inet_twsk_kill() no longer needs to dereference twsk_net(tw) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Shannon Nelson says: ==================== ionic: updates for stable FW recovery Recent FW work has tightened up timings in its error recovery handling and uncovered weaknesses in the driver's responses, so this is a set of updates primarily for better handling of the firmware's recovery mechanisms. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shannon Nelson authored
This (ab)use of a data buffer made some static code checkers rather itchy, so we replace the a generic data buffer with the union in the struct ionic_vf_setattr_cmd. Fixes: fbb39807 ("ionic: support sr-iov operations") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shannon Nelson authored
The driver can be premature in detecting stalled firmware when the heartbeat is not updated because the firmware can occasionally take a long time (more than 2 seconds) to service a request, and doesn't update the heartbeat during that time. The firmware heartbeat is not necessarily a steady 1 second periodic beat, but better described as something that should progress at least once in every DECVMD_TIMEOUT period. The single-threaded design in the FW means that if a devcmd or adminq request launches a large internal job, it is stuck waiting for that job to finish before it can get back to updating the heartbeat. Since all requests are "guaranteed" to finish within the DEVCMD_TIMEOUT period, the driver needs to less aggressive in checking the heartbeat progress. We change our current 2 second window to something bigger than DEVCMD_TIMEOUT which should take care of most of the issue. We stop checking for the heartbeat while waiting for a request, as long as we're still watching for the FW status. Lastly, we make sure our FW status is up to date before running a devcmd request. Once we do this, we need to not check the heartbeat on DEV commands because it may be stalled while we're on the fw_down path. Instead, we can rely on the is_fw_running check. Fixes: b2b9a8d7 ("ionic: avoid races in ionic_heartbeat_check") Signed-off-by: Brett Creeley <brett@pensando.io> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shannon Nelson authored
The dbid_inuse bitmap is not useful in this driver so remove it. Fixes: 6461b446 ("ionic: Add interrupts and doorbells") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Brett Creeley authored
When the driver is going through reset, it will eventually call ionic_lif_init(), which does a lot of re-initialization. One of the re-initialization steps is to setup the adminq and enable napi for it. If something breaks after this point we can end up with a kernel NULL pointer dereference through ionic_adminq_napi. Fix this by making sure to call napi_disable() in the cleanup path of ionic_lif_init(). This forces any pending napi contexts to finish and prevents them from being recalled before deleting the napi context. Fixes: 77ceb68e ("ionic: Add notifyq support") Signed-off-by: Brett Creeley <brett@pensando.io> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Brett Creeley authored
Buffer DMA mapping happens in ionic_tx_map_skb() and this function is called from ionic_tx() and ionic_tx_tso(). If ionic_tx_map_skb() succeeds, but a failure is encountered later in ionic_tx() or ionic_tx_tso() we aren't unmapping the buffers. This can be fixed in ionic_tx() by changing functions it calls to return void because they always return 0. For ionic_tx_tso(), there's an actual possibility that we leave the buffers mapped, so fix this by introducing the helper function ionic_tx_desc_unmap_bufs(). This function is also re-used in ionic_tx_clean(). Fixes: 0f3154e6 ("ionic: Add Tx and Rx handling") Signed-off-by: Brett Creeley <brett@pensando.io> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Brett Creeley authored
Currently when a request for add/deleting a filter is made when ionic_heartbeat_check() returns failure the driver will be overly verbose about failures, especially when these are usually temporary fails and the request will be retried later. An example of this is a filter add when the FW is in the middle of resetting: IONIC_CMD_RX_FILTER_ADD (31) failed: IONIC_RC_ERROR (-6) rx_filter add failed: ADDR 01:80:c2:00:00:0e Fix this by checking for -ENXIO and other error values on filter request fails before printing the error message. Add similar checking to the delete filter code. Fixes: f91958cc ("ionic: tame the filter no space message") Signed-off-by: Brett Creeley <brett@pensando.io> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
-