- 17 Feb, 2020 40 commits
-
-
Ido Schimmel authored
Up until now only local ports and the router port (which is also a local port) took a reference on the corresponding FID (Filtering Identifier) when joining a bridge. For example: 192.0.2.1/24 br0 | +------+------+ | | swp1 vxlan0 In this case the reference count of the FID will be '2'. Since the VXLAN device does not take a reference on the FID, whenever a local port joins the bridge it needs to check if a VXLAN device is already enslaved. If the VXLAN device should be mapped to the FID in question, then the VXLAN device's VNI is set on the FID. Beside the fact that this scheme special-cases the VXLAN device, it also creates an unnecessary dependency between the routing and bridge code: 1. [R] IP address is added on 'br0', which prompts the creation of a RIF and a backing FID 2. [B] VNI is enabled on backing FID 3. [R] Host route corresponding to VXLAN device's source address is promoted to perform NVE decapsulation [R] - Routing code [B] - Bridge code This back and forth dependency will become problematic when a lock is added in the routing code instead of relying on RTNL, as it will result in an AA deadlock. Instead, have the VXLAN device take a reference on the FID just like all the other netdev members of the bridge. In order to correctly handle the case where VXLAN devices are already enslaved to the bridge when it is offloaded, walk the bridge's slaves and replay the configuration. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
Propagate extack to bridge creation function so that error messages could be passed to user space via netlink instead of printing them to kernel log. A subsequent patch will pass the new extack argument to more functions. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
'refcount_t' is very useful for catching over/under flows. Convert the FID (Filtering Identifier) objects to use it instead of 'unsigned int' for their reference count. A subsequent patch in the series will change the way VXLAN devices hold / release the FID reference, which is why the conversion is made now. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
This enables ndo_dflt_bridge_getlink() to report a bridge port's offload settings for multicast and broadcast flooding. CC: Roopa Prabhu <roopa@cumulusnetworks.com> CC: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Edward Cree says: ==================== couple more ARFS tidy-ups Tie up some loose ends from the recent ARFS work. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
efx_filter_rfs_expire() is a work-function, so it being inline makes no sense. It's only ever used in efx_channels.c, so move it there. While we're at it, clean out some related unused cruft. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
Prevent excessive CPU time spent running a workitem with nothing to do. We avoid any races by keeping the same check in efx_filter_rfs_expire(). Suggested-by: Martin Habets <mhabets@solarflare.com> Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
When a real dev unregisters, vlan_device_event() also unregisters all of its vlan interfaces. For each VID this ends up in __vlan_vid_del(), which attempts to remove the VID from the real dev's VLAN filter. But the unregistering real dev might no longer be able to issue the required IOs, and return an error. Subsequently we raise a noisy warning msg that is not appropriate for this situation: the real dev is being torn down anyway, there shouldn't be any worry about cleanly releasing all of its HW-internal resources. So to avoid scaring innocent users, suppress this warning when the failed deletion happens on an unregistering device. While at it also convert the raw pr_warn() to a more fitting netdev_warn(). Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andy Shevchenko authored
Since PCI core provides a generic PCI_DEVICE_DATA() macro, replace STMMAC_DEVICE() with former one. No functional change intended. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Vlad Buslov says: ==================== Remove rtnl lock dependency from flow_action infra Currently, TC flow_action infrastructure code obtain rtnl lock before accessing action state in tc_setup_flow_action() function and releases it afterwards. This behavior is not supposed to impact TC filter insertion rate because filling flow_action representation is only a small part of creating new filter and expensive operations (hardware offload callbacks, classifiers, cls API code that creates chains and classifiers instances) already support unlocked execution. However, typical vswitch implementation might need to also dump TC filters concurrently, for example to age out unused flows or update flow counters. TC dump is fully serialized and holds rtnl lock during its whole execution in kernel space. As such, it can significantly impact concurrent tasks that try to intermittently obtain rtnl lock when filling intermediate representation for new filter offload (performance evaluation at the end of this mail). Refactor flow_action cls API infrastructure and its dependencies to not rely on rtnl lock for synchronization. Patch set overview: - Refactor tc_setup_flow_action() to obtain action tcf_lock when accessing action state. Fix its dependencies to not obtain tcf_lock themselves and assume that caller already holds it (needs to be done in same patch to prevent deadlock) and not to call sleeping functions (needs to be done in same patch to prevent "sleeping while atomic" dmesg warnings). - Refactor action helper functions to require tcf_lock instead of rtnl. Internally, all of the actions already use tcf_lock for synchronization to accommodate unlocked classifier API, so this change relies on already existing functionality. - Remove rtnl lock and "rtnl_held" argument from tc_setup_flow_action() function. To test the change, multiple concurrent TC instances are invoked with following command: time ls add* | xargs -n 1 -P 100 sudo tc -b Ten batch files with following typical rules (100k each) are used: filter add dev ens1f0_0 protocol ip ingress prio 1 handle 1 flower src_mac e4:11:0:0:0:0 dst_mac e4:12:0:0:0:0 src_ip 192.168.111.1 dst_ip 192.168.111.2 ip_proto udp dst_port 1 src_port 1 action tunnel_key set id 1 src_ip 2.2.2.2 dst_ip 2.2.2.3 dst_port 4789 no_percpu action mirred egress redirect dev vxlan1 no_percpu TC dump of same device is called in infinite loop from five concurrent instances: while true do tc -s filter show dev $NIC ingress >/dev/null done Results obtained on current net-next commit 9f68e365 ("Merge tag 'drm-next-2020-01-30' of git://anongit.freedesktop.org/drm/drm"): | net-next | this change ---------------+----------+------------- TC add | 6.3s | 6.3s TC add + dump | 29.3s | 6.8s Test results confirm significant impact of concurrent TC dump. The impact is almost fully mitigated by proposed change (differences can be attributed to contention for chain and tp locks between add and dump TC instances). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vlad Buslov authored
Refactor tc_setup_flow_action() function not to use rtnl lock and remove 'rtnl_held' argument that is no longer needed. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vlad Buslov authored
In order to remove rtnl lock dependency from flow_action representation translator, change rtnl_dereference() to rcu_dereference_protected() in ct action helpers that provide external access to zone and action values. This is safe to do because the functions are not called from anywhere else outside flow_action infrastructure which was modified to obtain tcf_lock when accessing action data in one of previous patches in the series. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vlad Buslov authored
In order to remove rtnl lock dependency from flow_action representation translator, change rcu_dereference_bh_rtnl() to rcu_dereference_protected() in police action helpers that provide external access to rate and burst values. This is safe to do because the functions are not called from anywhere else outside flow_action infrastructure which was modified to obtain tcf_lock when accessing action data in one of previous patches in the series. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vlad Buslov authored
In order to remove dependency on rtnl lock, take action's tcfa_lock when constructing its representation as flow_action_entry structure. Refactor tcf_sample_get_group() to assume that caller holds tcf_lock and don't take it manually. This callback is only called from flow_action infra representation translator which now calls it with tcf_lock held, so this refactoring is necessary to prevent deadlock. Allocate memory with GFP_ATOMIC flag for ip_tunnel_info copy because tcf_tunnel_info_copy() is only called from flow_action representation infra code with tcf_lock spinlock taken. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Lorenzo Bianconi says: ==================== add xdp ethtool stats to mvneta driver Rework mvneta stats accounting in order to introduce xdp ethtool statistics in the mvneta driver. Introduce xdp_redirect, xdp_pass, xdp_drop and xdp_tx counters to ethtool statistics. Fix skb_alloc_error and refill_error ethtool accounting ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lorenzo Bianconi authored
Get rid of xdp_ret in mvneta_swbm_rx_frame routine since now we can rely on xdp_stats to flush in case of xdp_redirect Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lorenzo Bianconi authored
Add xdp_redirect, xdp_pass, xdp_drop and xdp_tx counters to ethtool statistics Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lorenzo Bianconi authored
Introduce mvneta_stats structure in mvneta_update_stats routine signature in order to collect all the rx stats and update them at the end at the napi loop. mvneta_stats will be reused adding xdp statistics support to ethtool. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lorenzo Bianconi authored
In oreder to avoid unnecessary instructions rely on open-coding updating per-cpu stats in mvneta_tx/mvneta_xdp_submit_frame and mvneta_rx_hwbm routines. This patch will be used to add xdp support to ethtool for the mvneta driver Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lorenzo Bianconi authored
mvneta_ethtool_update_stats routine is currently reporting skb_alloc_error and refill_error only for the first rx queue. Fix the issue moving skb_alloc_err and refill_err in mvneta_pcpu_stats structure. Moreover this patch will be used to introduce xdp statistics to ethtool for the mvneta driver Fixes: 17a96da6 ("net: mvneta: discriminate error cause for missed packet") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Andrew Lunn says: ==================== mv88e6xxx: Add SERDES/PCS registers to ethtool -d ethtool -d will dump the registers of an interface. For mv88e6xxx switch ports, this dump covers the port specific registers. Extend this with the SERDES/PCS registers, if a port has a SERDES. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew Lunn authored
The mv88e6390 has upto 8 sets of PCS registers, depending on how ports 9 and 10 are configured. The can be spread over 8 ports. If a port has a PCS register set, return it along with the port registers. The register space is sparse, so hard code a list of registers which will be returned. It can later be extended, if needed, by append to the end of the list. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew Lunn authored
The mv88e6352 has one PCS which can be used for 1000BaseX or SGMII. Add the registers to the dump for the port which the PCS is associated to. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew Lunn authored
ethtool provides a generic mechanism for a driver to return the registers of an ethernet device. DSA uses this to give the port registers associated with an interfaces. Extend this to allow PCS registers to also be returned, if the port has a PCS associated to it. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queueDavid S. Miller authored
Jeff Kirsher says: ==================== 100GbE Intel Wired LAN Driver Updates 2020-02-15 This series contains updates to ice driver only. Brett adds support for "Queue in Queue" (QinQ) support, by supporting S-tag & C-tag VLAN traffic by disabling pruning when there are no 0x8100 VLAN interfaces currently on top of the PF. Also refactored the port VLAN configuration to re-use the common code for enabling and disabling a port VLAN in single function. Added a helper function to determine if the VF link is up. Fixed how the port VLAN configures the priority bits for a VF interface. Fixed the port VLAN to only see its own broadcast and multicast traffic. Added support to enable and disable all receive queues, by refactoring adding a new function to do the necessary steps to enable/disable a queue with the necessary read flush. Fixed how we set the mapping mode for transmit and receive queues. Added support for VF queues to handle LAN overflow events. Fixed and refactored how receive queues get disabled for VFs, which was being handled one queue at at time, so improve it to handle when the VF is requesting more than one queue to be disabled. Fixed how the virtchnl_queue_select bitmap is validated. Finally a patch not authored by Brett, Bruce cleans up "fallthrough" comments which are unnecessary. Also replaces the "fallthough" comments with the GCC reserved word fallthrough, along with other GCC compiler fixes. Add missing function header comment regarding a function argument that was missing. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Finn Thain says: ==================== Improvements for SONIC ethernet drivers Now that the necessary sonic driver fixes have been merged, and the merge window has closed again, I'm sending the remainder of my sonic driver patch queue. A couple of these patches will have to be applied in sequence to avoid 'git am' rejects. The others are independent and could have been submitted individually. Please let me know if I should do that. The complete sonic driver patch queue was tested on National Semiconductor hardware (macsonic), qemu-system-m68k (macsonic) and qemu-system-mips64el (jazzsonic). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Finn Thain authored
On m68k, local irqs remain enabled while interrupt handlers execute. Therefore the macsonic driver has had to disable interrupts to avoid re-entering sonic_interrupt(). As of commit 865ad2f2 ("net/sonic: Add mutual exclusion for accessing shared state"), sonic_interrupt() became re-entrant, and its wrapper became redundant. Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Finn Thain authored
Give the transmit command as soon as the transmit descriptor is ready. Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Finn Thain authored
The explicit memory barriers are redundant now that proper locking and MMIO accessors have been employed. Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Finn Thain authored
The transmit queue must be running already otherwise sonic_send_packet() would not have been called. If the queue was stopped by the interrupt handler, the interrupt handler will restart it again. Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Finn Thain authored
The eol_tx variable is the one that matters to the tx algorithm because packets are always placed at the end of the list. The next_tx variable just confuses things so remove it. Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Finn Thain authored
No functional change. Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Finn Thain authored
The comment is meaningless since mark_bh() was removed a long time ago. Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Sergei Shtylyov says: ==================== sh_eth: get rid of the dedicated regiseter mapping for RZ/A1 (R7S72100) Here's a set of 5 patches against DaveM's 'net-next.git' repo. I changed my mind about the RZ/A1 SoC needing its own register map -- now that we don't depend on the register map array in order to determine whether a given register exists any more, we can add a new flag to determine if the GECMR exists (this register is present only on true GEther chips, not RZ/A1). We also need to add the sh_eth_cpu_data::* flag checks where they were missing so far: in the ethtool API for the register dump. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sergei Shtylyov authored
The register maps for the Gigabit controllers and the Ether one used on RZ/A1 (AKA R7S72100) are identical except for GECMR which is only present on the true GEther controllers. We no longer use the register map arrays to determine if a given register exists, and have added the GECMR flag to the 'struct sh_eth_cpu_data' in the previous patch, so we're ready to drop the R7S72100 specific register map -- this saves 216 bytes of object code (ARM gcc 4.8.5). Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Tested-by: Chris Brandt <chris.brandt@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sergei Shtylyov authored
Not all Ether controllers having the Gigabit register layout have GECMR -- RZ/A1 (AKA R7S72100) actually has the same layout but no Gigabit speed support and hence no GECMR. In the past, the new register map table was added for this SoC, now I think we should have used the existing Gigabit table with the differences (such as GECMR) covered by the mere flags in the 'struct sh_eth_cpu_data'. Add such flag for GECMR -- and then we can get rid of the R7S72100 specific layout in the next patch... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Tested-by: Chris Brandt <chris.brandt@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sergei Shtylyov authored
When adding the sh_eth_cpu_data::no_xdfar flag I forgot to add the flag check to __sh_eth_get_regs(), causing the non-existing RDFAR/TDFAR to be considered for dumping on the R-Car gen1/2 SoCs (the register offset check has the final say here)... Fixes: 4c1d4585 ("sh_eth: add sh_eth_cpu_data::cexcr flag") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Tested-by: Chris Brandt <chris.brandt@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sergei Shtylyov authored
When adding the sh_eth_cpu_data::cexcr flag I forgot to add the flag check to __sh_eth_get_regs(), causing the non-existing RX packet counter registers to be considered for dumping on the R7S72100 SoC (the register offset sanity check has the final say here)... Fixes: 4c1d4585 ("sh_eth: add sh_eth_cpu_data::cexcr flag") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Tested-by: Chris Brandt <chris.brandt@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sergei Shtylyov authored
When adding the sh_eth_cpu_data::no_tx_cntrs flag I forgot to add the flag check to __sh_eth_get_regs(), causing the non-existing TX counter registers to be considered for dumping on the R7S72100 SoC (the register offset sanity check has the final say here)... Fixes: ce9134df ("sh_eth: add sh_eth_cpu_data::no_tx_cntrs flag") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Tested-by: Chris Brandt <chris.brandt@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Russell King says: ==================== Pause updates for phylib and phylink Currently, phylib resolves the speed and duplex settings, which MAC drivers use directly. phylib also extracts the "Pause" and "AsymPause" bits from the link partner's advertisement, and stores them in struct phy_device's pause and asym_pause members with no further processing. It is left up to each MAC driver to implement decoding for this information. phylink converted drivers are able to take advantage of code therein which resolves the pause advertisements for the MAC driver, but this does nothing for unconverted drivers. It also does not allow us to make use of hardware-resolved pause states offered by several PHYs. This series aims to address this by: 1. Providing a generic implementation, linkmode_resolve_pause(), that takes the ethtool linkmode arrays for the link partner and local advertisements, decoding the result to whether pause frames should be allowed to be transmitted or received and acted upon. I call this the pause enablement state. 2. Providing a phylib implementation, phy_get_pause(), which allows MAC drivers to request the pause enablement state from phylib. 3. Providing a generic linkmode_set_pause() for setting the pause advertisement according to the ethtool tx/rx flags - note that this design has some shortcomings, see the comments in the kerneldoc for this function. 4. Remove the ability in phylink to set the pause states for fixed links, which brings them into line with how we deal with the speed and duplex parameters; we can reintroduce this later if anyone requires it. This could be a userspace-visible change. 5. Split application of manual pause enablement state from phylink's resolution of the same to allow use of phylib's new phy_get_pause() interface by phylink, and make that switch. 6. Resolve the fixed-link pause enablement state using the generic linkmode_resolve_pause() helper introduced earlier. This, in connection with the previous commits, allows us to kill the MLO_PAUSE_SYM and MLO_PAUSE_ASYM flags. 7. make phylink's ethtool pause setting implementation update the pause advertisement in the same way that phylib does, with the same caveats that are present there (as mentioned above w.r.t linkmode_set_pause()). 8. create a more accurate initial configuration for MACs, used when phy_start() is called or a SFP is detected. In particular, this ensures that the pause bits seen by MAC drivers in state->pause are accurate for SGMII. 9. finally, update the kerneldoc descriptions for mac_config() for the above changes. This series has been build-tested against net-next; the boot tested patches are in my "phy" branch against v5.5 plus the queued phylink changes that were merged for 5.6. The next series will introduce the ability for phylib drivers to provide hardware resolved pause enablement state. These patches can be found in my "phy" branch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-