- 13 Jun, 2019 12 commits
-
-
Moshe Shemesh authored
Create mlx5_devlink_health_reporter for fw fatal reporter. The fw fatal reporter is added in addition to the fw reporter and implements the recover callback. The point of having two reporters for FW issues, is that we don't want to run FW recover on any issue, but only fatal ones. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Moshe Shemesh authored
Use devlink_health_report() to report any symptom of FW issue as FW counter miss or new health syndrome. The FW issues detected in mlx5 during poll_health which is called in timer atomic context and so health work queue is used to schedule the reports. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Moshe Shemesh authored
Add support of dump callback for mlx5 FW reporter. Once we trigger FW dump, the FW will write the core dump to its raw data buffer. The tracer translates the raw data to traces and save it to a cyclic array. Once dump is done, the saved traces data is filled into the dump buffer. In case syndrome is not zero the health buffer content will be printed as well. FW dump example: $ devlink health dump show pci/0000:82:00.0 reporter fw dump fw traces: timestamp: 509006640427 lost: false event_id: 185 msg: dump general info GVMI=0x0000 timestamp: 509006645474 lost: false event_id: 185 msg: GVMI management info, gvmi_management context: timestamp: 509006654463 lost: false event_id: 185 msg: [000]: 00000000 00000000 00000000 00000000 timestamp: 509006656127 lost: false event_id: 185 msg: [010]: 00000000 00000000 00000000 00000000 timestamp: 509006656255 lost: false event_id: 185 msg: [020]: 00000000 00000000 00000000 00000000 timestamp: 509006656511 lost: false event_id: 185 msg: [030]: 00000000 00000000 00000000 00000000 timestamp: 509006656639 lost: false event_id: 185 msg: [040]: 00000000 00000000 00000000 00000000 timestamp: 509006656895 lost: false event_id: 185 msg: [050]: 00000000 00000000 00000000 00000000 timestamp: 509006657023 lost: false event_id: 185 msg: [060]: 00000000 00000000 00000000 00000000 timestamp: 509006657180 lost: false event_id: 185 msg: [070]: 00000000 00000000 00000000 00000000 timestamp: 509006659839 lost: false event_id: 185 msg: CMDIF dbase from IRON: active_dbase_slots = 0x00000000 timestamp: 509006667391 lost: false event_id: 185 msg: GVMI=0x0000 hw_toc context: timestamp: 509006667647 lost: false event_id: 185 msg: [000]: 00000000 00000000 00000000 fffff000 timestamp: 509006667775 lost: false event_id: 185 msg: [010]: 00000000 00000000 00000000 80d00000 ... ... Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Moshe Shemesh authored
Create mlx5_devlink_health_reporter for FW reporter. The FW reporter implements devlink_health_reporter diagnose callback. The fw reporter diagnose command can be triggered any time by the user to check current fw status. In healthy status, it will return clear syndrome. Otherwise it will return the syndrome and description of the error type. Command example and output on healthy status: $ devlink health diagnose pci/0000:82:00.0 reporter fw Syndrome: 0 Command example and output on non healthy status: $ devlink health diagnose pci/0000:82:00.0 reporter fw Syndrome: 8 Description: unrecoverable hardware error Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Feras Daoud authored
If a FW assert is considered fatal, indicated by a new bit in the health buffer, reset the FW. After the reset go through the normal recovery flow. Only one PF needs to issue the reset, so an attempt is made to prevent the 2nd function from also issuing the reset. It's not an error if that happens, it just slows recovery. Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Feras Daoud authored
Since the FW can be shared between different PFs/VFs it is common that more than one health poll will detected a failure, this can lead to multiple resets which are unneeded. The solution is to use a FW locking mechanism using semaphore space to provide a way to allow only one device to collect the cr-dump and to issue a sw-reset. Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Feras Daoud authored
New mlx5 adapters allow the driver to reset the FW in the event of an error, this action called "SW Reset". When an SW reset is issued on any PF all PFs enter reset state which is a recoverable condition. The existing recovery flow was designed to allow the recovery of a VF after a PF driver reload. This patch adds the sw reset to the NIC states as a preparation for sw reset handling. When a software reset is issued the following occurs: 1. The NIC interface mode is set to 7 while the reset is in progress. 2. Once the reset completes the NIC interface mode is set to 1. Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Alex Vesker authored
Crdump allows the driver to retrieve a dump of the FW PCI crspace. This is useful in case of catastrophic issues which may require FW reset. The crspace dump can be used for later debug. Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Alex Vesker authored
The Vendor Specific Capability (VSC) is used to activate a gateway interfacing with the device. The gateway is used to read or write device configurations, which are organized in different domains (spaces). A configuration access may result in multiple actions, reads, writes. Example usages are accessing the Crspace domain to read the crspace or locking a device semaphore using the Semaphore domain. The configuration access use pci_cfg_access to prevent parallel access to the VSC space by the driver and userspace calls. Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Eran Ben Elisha authored
Centralize all devlink related callbacks in one file. In the downstream patch, some more functionality will be added, this patch is preparing the driver infrastructure for it. Currently, move devlink un/register functions calls into this file. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Saeed Mahameed authored
Add initial documentation for mlx5 driver. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Aya Levin authored
The devlink health reporter provides a dump method on an error. Dump may contain a large amount of data, in this case doit cb isn't sufficient. This is because the user side is blocking and doesn't allow draining of the socket until the socket runs out of buffers. Using dumpit cb is the correct way to go. Please note that thankfully the dump op is not yet implemented in any driver and therefore this change is not breaking userspace. Fixes: 35455e23 ("devlink: Add health dump {get,clear} commands") Signed-off-by: Aya Levin <ayal@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
- 12 Jun, 2019 19 commits
-
-
Eric Dumazet authored
Adding delays to TCP flows is crucial for studying behavior of TCP stacks, including congestion control modules. Linux offers netem module, but it has unpractical constraints : - Need root access to change qdisc - Hard to setup on egress if combined with non trivial qdisc like FQ - Single delay for all flows. EDT (Earliest Departure Time) adoption in TCP stack allows us to enable a per socket delay at a very small cost. Networking tools can now establish thousands of flows, each of them with a different delay, simulating real world conditions. This requires FQ packet scheduler or a EDT-enabled NIC. This patchs adds TCP_TX_DELAY socket option, to set a delay in usec units. unsigned int tx_delay = 10000; /* 10 msec */ setsockopt(fd, SOL_TCP, TCP_TX_DELAY, &tx_delay, sizeof(tx_delay)); Note that FQ packet scheduler limits might need some tweaking : man tc-fq PARAMETERS limit Hard limit on the real queue size. When this limit is reached, new packets are dropped. If the value is lowered, packets are dropped so that the new limit is met. Default is 10000 packets. flow_limit Hard limit on the maximum number of packets queued per flow. Default value is 100. Use of TCP_TX_DELAY option will increase number of skbs in FQ qdisc, so packets would be dropped if any of the previous limit is hit. Use of a jump label makes this support runtime-free, for hosts never using the option. Also note that TSQ (TCP Small Queues) limits are slightly changed with this patch : we need to account that skbs artificially delayed wont stop us providind more skbs to feed the pipe (netem uses skb_orphan_partial() for this purpose, but FQ can not use this trick) Because of that, using big delays might very well trigger old bugs in TSO auto defer logic and/or sndbuf limited detection. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Sameeh Jubran says: ==================== Support for dynamic queue size changes This patchset introduces the following: * add new admin command for supporting different queue size for Tx/Rx * add support for Tx/Rx queues size modification through ethtool * allow queues allocation backoff when low on memory * update driver version Difference from v2: * Dropped superfluous range checks which are already done in ethtool. [patch 5/7] * Dropped inline keyword from function. [patch 4/7] * Added a new patch which drops inline keyword all *.c files. [patch 6/7] Difference from v1: * Changed ena_update_queue_sizes() signature to use u32 instead of int type for the size arguments. [patch 5/7] ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sameeh Jubran authored
Update driver version to match device specification. Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sameeh Jubran authored
Let the compiler decide if the function should be inline in *.c files Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sameeh Jubran authored
Implement the set_ringparam() function of the ethtool interface to enable the changing of io queue sizes. Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sameeh Jubran authored
If there is not enough memory to allocate io queues the driver will try to allocate smaller queues. The backoff algorithm is as follows: 1. Try to allocate TX and RX and if successful. 1.1. return success 2. Divide by 2 the size of the larger of RX and TX queues (or both if their size is the same). 3. If TX or RX is smaller than 256 3.1. return failure. 4. else 4.1. go back to 1. Also change the tx_queue_size, rx_queue_size field names in struct adapter to requested_tx_queue_size and requested_rx_queue_size, and use RX and TX queue 0 for actual queue sizes. Explanation: The original fields were useless as they were simply used to assign values once from them to each of the queues in the adapter in ena_probe(). They could simply be deleted. However now that we have a backoff feature, we have use for them. In case of backoff there is a difference between the requested queue sizes and the actual sizes. Therefore there is a need to save the requested queue size for future retries of queue allocation (for example if allocation failed and then ifdown + ifup was called we want to start the allocation from the original requested size of the queues). Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sameeh Jubran authored
Currently ethtool -g shows the same size for current and max queue sizes. Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sameeh Jubran authored
Use MAX_QUEUES_EXT get feature capability to query the device. Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arthur Kiyanovski authored
Add a new admin command to support different queue size for Tx/Rx queues (the change also support different SQ/CQ sizes) Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Ioana Radulescu says: ==================== dpaa2-eth: Add support for MQPRIO offloading Add support for adding multiple TX traffic classes with mqprio. We can have up to one netdev queue and hardware frame queue per TC per core. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ioana Radulescu authored
Implement mqprio qdisc support by mapping traffic classes to different hardware enqueue priorities. The maximum number of supported traffic classes is an attribute of each DPNI object. The traffic classes map to hardware priorities from highest (0) to lowest (highest prio number). The skb priority information received from the stack is used to select the hardware Tx queue on which to enqueue the frame. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: Bogdan Purcareata <bogdan.purcareata@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ioana Radulescu authored
DPNI objects can have multiple traffic classes, as reflected by the num_tc attribute. Until now we ignored its value and only used traffic class 0. This patch adds support for multiple Tx traffic classes; we have num_queues x num_tcs hardware queues available for each interface. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ioana Radulescu authored
Move the code configuring xps on the netdev TX queues to a separate function. A subsequent patch will need to call this in another context as well. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Grygorii Strashko authored
Add dependency to TI CPTS from Common CLK framework COMMON_CLK to fix allyesconfig build for Powerpc: drivers/net/ethernet/ti/cpts.c: In function 'cpts_of_mux_clk_setup': drivers/net/ethernet/ti/cpts.c:567:2: error: implicit declaration of function 'of_clk_parent_fill'; did you mean 'of_clk_get_parent_name'? [-Werror=implicit-function-declaration] of_clk_parent_fill(refclk_np, parent_names, num_parents); ^~~~~~~~~~~~~~~~~~ of_clk_get_parent_name Fixes: a3047a81 ("net: ethernet: ti: cpts: add support for ext rftclk selection") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
We need to specifically deal with phylink_of_phy_connect() returning -ENODEV, because this can happen when a CPU/DSA port does connect neither to a PHY, nor has a fixed-link property. This is a valid use case that is permitted by the binding and indicates to the switch: auto-configure port with maximum capabilities. Fixes: 0e279218 ("net: dsa: Use PHYLINK for the CPU/DSA ports") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vivien Didelot authored
During a port FDB dump operation, the mutex protecting the concurrent access to the switch registers is currently held by the internal mv88e6xxx_port_db_dump and mv88e6xxx_port_db_dump_fid helpers. It must be held at the higher level in mv88e6xxx_port_fdb_dump which is called directly by DSA through ds->ops->port_fdb_dump. Fix this. Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nicolas Saenz Julienne authored
Add bindings for Wiznet's w5x00 series of SPI interfaced Ethernet chips. Based on the bindings for microchip,enc28j60. Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nicolas Saenz Julienne authored
The w5X00 chip provides an SPI to Ethernet inteface. This patch allows platform devices to be defined through the device tree. Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vlad Buslov authored
To remove rtnl lock dependency in tc filter update API when using ingress Qdisc, set QDISC_CLASS_OPS_DOIT_UNLOCKED flag in ingress Qdisc_class_ops. Ingress Qdisc ops don't require any modifications to be used without rtnl lock on tc filter update path. Ingress implementation never changes its q->block and only releases it when Qdisc is being destroyed. This means it is enough for RTM_{NEWTFILTER|DELTFILTER|GETTFILTER} message handlers to hold ingress Qdisc reference while using it without relying on rtnl lock protection. Unlocked Qdisc ops support is already implemented in filter update path by unlocked cls API patch set. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 11 Jun, 2019 9 commits
-
-
David S. Miller authored
Jakub Kicinski says: ==================== tls: add support for kernel-driven resync and nfp RX offload This series adds TLS RX offload for NFP and completes the offload by providing resync strategies. When TLS data stream looses segments or experiences reorder NIC can no longer perform in line offload. Resyncs provide information about placement of records in the stream so that offload can resume. Existing TLS resync mechanisms are not a great fit for the NFP. In particular the TX resync is hard to implement for packet-centric NICs. This patchset adds an ability to perform TX resync in a way similar to the way initial sync is done - by calling down to the driver when new record is created after driver indicated sync had been lost. Similarly on the RX side, we try to wait for a gap in the stream and send record information for the next record. This works very well for RPC workloads which are the primary focus at this time. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
When TCP stream gets out of sync (driver stops receiving skbs with expected TCP sequence numbers) request a TX resync from the kernel. We try to distinguish retransmissions from missed transmissions by comparing the sequence number to expected - if it's further than the expected one - we probably missed packets. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
TLS offload drivers keep track of TCP seq numbers to make sure the packets are fed into the HW in order. When packets get dropped on the way through the stack, the driver will get out of sync and have to use fallback encryption, but unless TCP seq number is resynced it will never match the packets correctly (or even worse - use incorrect record sequence number after TCP seq wraps). Existing drivers (mlx5) feed the entire record on every out-of-order event, allowing FW/HW to always be in sync. This patch adds an alternative, more akin to the RX resync. When driver sees a frame which is past its expected sequence number the stream must have gotten out of order (if the sequence number is smaller than expected its likely a retransmission which doesn't require resync). Driver will ask the stack to perform TX sync before it submits the next full record, and fall back to software crypto until stack has performed the sync. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Currently only RX direction is ever resynced, however, TX may also get out of sequence if packets get dropped on the way to the driver. Rename the resync callback and add a direction parameter. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Set ethtool TLS RX feature based on NIC capabilities, and enable TLS RX when connections are added for decryption. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dirk van der Merwe authored
Enable kernel-controlled RX resync and propagate TLS connection RX resync from kernel TLS to firmware. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Some control messages must be sent from atomic context. The mailbox takes sleeping locks and uses a waitqueue so add a "posted" version of communication. Trylock the semaphore and if that's successful kick of the device communication. The device communication will be completed from a workqueue, which will also release the semaphore. If locks are taken queue the message and return. Schedule a different workqueue to take the semaphore and run the communication. Note that the there are currently no atomic users which would actually need the return value, so all replies to posted messages are just freed. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
We need the name nfp_ccm_mbox_alloc() for allocating the mailbox communication channel itself. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dirk van der Merwe authored
Firmware indicates when a packet has been decrypted by reusing the currently unused BPF flag. Transfer this information into the skb and provide a statistic of all decrypted segments. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-