- 15 Oct, 2021 1 commit
-
-
Maciej Fijalkowski authored
This field is dead and the driver makes no use of it. Simply remove it. Signed-off-by:
Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by:
Gurucharan G <gurucharanx.g@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
- 11 Oct, 2021 1 commit
-
-
Kiran Patil authored
Implement the ndo_setup_tc net device callback for TC HW offload on the PF device. ndo_setup_tc provides support for HW offloading of various TC filters. Add support for configuring the following filters with tc-flower:
  - default L2 filters (src/dst MAC addresses, ethertype, VLAN)
  - variations of L3, L3+L4, L2+L3+L4 filters using advanced filters (including IPv4 and IPv6 addresses)
Allow adding/removing TC flows when the PF device is configured in eswitch switchdev mode. Two types of actions are supported at the moment: FLOW_ACTION_DROP and FLOW_ACTION_REDIRECT. Co-developed-by:
Priyalee Kushwaha <priyalee.kushwaha@intel.com> Signed-off-by:
Priyalee Kushwaha <priyalee.kushwaha@intel.com> Signed-off-by:
Kiran Patil <kiran.patil@intel.com> Signed-off-by:
Wojciech Drewek <wojciech.drewek@intel.com> Tested-by:
Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
- 07 Oct, 2021 5 commits
-
-
Grzegorz Nitka authored
Resetting all VFs behaves mostly like creating new VFs, so the eswitch infrastructure has to be recreated as well. The easiest way to do that is to rebuild the eswitch after resetting the VFs. Implement helper functions to start and stop all representor queues. These are used to disable traffic on port representors. In the rebuild path:
  - NAPI has to be disabled
  - the eswitch environment has to be set up
  - new port representors have to be created, because the old ones held pointers to VFs that no longer exist
  - the new control plane VSI ring should be remapped
  - NAPI has to be enabled
  - rxdid has to be set to FLEX_NIC_2, because this descriptor ID supports source_vsi, which is needed on control plane VSI queues
  - port representor queues have to be started
Signed-off-by:
Grzegorz Nitka <grzegorz.nitka@intel.com> Tested-by:
Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
Grzegorz Nitka authored
The only way to enable switchdev is to create VFs while the eswitch mode is set to switchdev. Check whether the correct mode is set and enable switchdev in the function that creates VFs. Disable switchdev when the user changes the number of VFs to 0. Changing the eswitch mode back to legacy while VFs created in switchdev mode exist isn't allowed. As switchdev takes care of managing filter rules, adding new rules on a VF is blocked. When a VF is reset, the driver has to update the pointer in the ice_repr struct, because VSI-related data can change after a reset. Co-developed-by:
Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by:
Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by:
Grzegorz Nitka <grzegorz.nitka@intel.com> Tested-by:
Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
Grzegorz Nitka authored
A new type of VSI has to be defined for the switchdev control plane VSI. The number of allocated Tx and Rx queues has to equal the number of VFs, because each port representor should have one Tx and one Rx queue. To avoid increasing the number of used IRQs too much, the control plane VSI uses only one q_vector and handles all queues in one IRQ. To allow handling all queues in one IRQ, a new function to clean MSI-X for the eswitch was introduced. This function schedules NAPI for each representor instead of scheduling it for only one, as the normal clean IRQ function does. Only one additional MSI-X vector has to be requested. Always try to request it in the ice_ena_msix_range function. Signed-off-by:
Grzegorz Nitka <grzegorz.nitka@intel.com> Tested-by:
Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
Grzegorz Nitka authored
The switchdev environment has to be set up when the user creates VFs and the eswitch mode is switchdev. It is released when the user deletes all VFs. The data path in this implementation is based on the control plane VSI. This VSI is used to pass traffic from port representors to the corresponding VFs and vice versa. A default Tx rule has to be added to forward packets to the control plane VSI. This redirects packets from VFs that don't match other rules to the control plane VSI. On the Rx side, a default rule is added on the uplink VSI to receive all traffic that doesn't match other rules. When setting up the switchdev environment, all other rules from VFs should be removed. Packets to VFs will be forwarded by the control plane VSI. Because a VF without any MAC rules can't send any packets due to the antispoof mechanism, VSI antispoof should be turned off on each VF. To send a packet from a representor to the correct VSI, the destination VSI field in the Tx descriptor has to be filled. Allow that by setting the destination override bit in the control plane VSI security config. Packets from VFs will be received on the control plane VSI. The driver has to decide which netdev to forward each packet to. The decision is made based on the src_vsi field from the descriptor. The control plane VSI struct holds a target netdev list, from which the netdev is chosen based on the src_vsi number. Co-developed-by:
Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by:
Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by:
Grzegorz Nitka <grzegorz.nitka@intel.com> Tested-by:
Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
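The receive-side decision described in this commit (pick the destination netdev from the src_vsi field of the descriptor) can be modelled in a few lines. This is a minimal userspace sketch in Python, not the driver's C code; the class name `ControlPlaneVsi`, the `target_netdevs` table, and the netdev names are hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional


class ControlPlaneVsi:
    """Toy model of the control plane VSI's target netdev table."""

    def __init__(self) -> None:
        # Maps a VF's source VSI number to its port representor netdev name.
        self.target_netdevs: Dict[int, str] = {}

    def add_representor(self, src_vsi: int, netdev: str) -> None:
        self.target_netdevs[src_vsi] = netdev

    def pick_netdev(self, src_vsi: int) -> Optional[str]:
        # Returns None when no representor is registered for this VSI.
        return self.target_netdevs.get(src_vsi)


vsi = ControlPlaneVsi()
vsi.add_representor(src_vsi=5, netdev="repr_vf0")  # hypothetical names
vsi.add_representor(src_vsi=6, netdev="repr_vf1")
```

A lookup keyed by src_vsi keeps the per-packet decision cheap, which matters because it runs for every packet arriving on the control plane VSI.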
-
Wojciech Drewek authored
Keeping the devlink port inside the VSI data structure causes some issues. Since the VF VSI is released during reset, we have to unregister the devlink port and register it again every time a reset is triggered. With the new changes in the devlink API, this might cause deadlock issues. After calling devlink_port_register/devlink_port_unregister, the devlink API is going to lock rtnl_mutex. That is an issue when a VF reset is triggered in netlink operation context (like setting a VF MAC address or VLAN), because rtnl_lock is already taken by netlink. Another call of rtnl_lock from the devlink API results in a deadlock. By moving the devlink port to the PF/VF we avoid creating/destroying it during reset. Since this patch, devlink ports are created during ice_probe and destroyed during ice_remove for the PF, and created during ice_repr_add and destroyed during ice_repr_rem for VFs. Signed-off-by:
Wojciech Drewek <wojciech.drewek@intel.com> Tested-by:
Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
- 05 Oct, 2021 1 commit
-
-
Jakub Kicinski authored
Convert all Ethernet drivers from memcpy(... dev->addr_len) to eth_hw_addr_set():
  @@
  expression dev, np;
  @@
  - memcpy(dev->dev_addr, np, dev->addr_len)
  + eth_hw_addr_set(dev, np)
In theory addr_len may not be ETH_ALEN, but we don't expect non-Ethernet devices to live under this directory, and only the following cases of setting addr_len exist:
  - cxgb4 for mgmt device,
and the drivers which set it to ETH_ALEN: s2io, mlx4, vxge. Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 02 Oct, 2021 1 commit
-
-
Jakub Kicinski authored
Convert Ethernet from ether_addr_copy() to eth_hw_addr_set():
  @@
  expression dev, np;
  @@
  - ether_addr_copy(dev->dev_addr, np)
  + eth_hw_addr_set(dev, np)
Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 28 Sep, 2021 3 commits
-
-
Anirudh Venkataramanan authored
The messaging for unsupported module detection is different for lenient mode and strict mode. Update the code to print the right messaging for a given link mode. Media topology conflict is not an error in lenient mode, so return an error code only if not in lenient mode. Signed-off-by:
Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by:
Gurucharan G <gurucharanx.g@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
Anirudh Venkataramanan authored
DSCP a.k.a L3 QoS is only supported on certain devices. To enforce this, this patch introduces a bitmap of features and helper functions. The feature bitmap is set based on device IDs on driver init. Currently, DSCP is the only feature in this bitmap, but there will be more in the future. In the DCB netlink flow, check if the feature bit is set before exercising DSCP. Signed-off-by:
Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by:
Gurucharan G <gurucharanx.g@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
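The feature-bitmap idea from this commit can be sketched as follows. This is a hedged Python model rather than the driver's implementation: the device IDs are invented placeholders, and the helper names `init_features` and `is_feature_supported` are hypothetical. Only the DSCP feature and the "set at init from the device ID, check before use" flow come from the commit message.

```python
from enum import IntFlag


class Feature(IntFlag):
    DSCP = 1 << 0  # the only feature the commit message mentions


# Hypothetical device IDs, chosen only for illustration.
DSCP_CAPABLE_DEVICE_IDS = {0x1001, 0x1002}


def init_features(device_id: int) -> Feature:
    """Build the feature bitmap once, at driver init, from the device ID."""
    features = Feature(0)
    if device_id in DSCP_CAPABLE_DEVICE_IDS:
        features |= Feature.DSCP
    return features


def is_feature_supported(features: Feature, feature: Feature) -> bool:
    """Check a single feature bit before exercising the feature."""
    return bool(features & feature)
```

Callers such as the DCB netlink flow would call the check helper first and refuse the request when the bit is clear.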
-
Dave Ertman authored
Implement code to handle submission of APP TLVs containing DSCP to TC mapping. The first such mapping received on an interface will cause that PF to switch to L3 DSCP QoS mode, apply the default config for that mode, and apply the received mapping. Only one such mapping will be allowed per DSCP value, and when the last DSCP mapping is deleted, the PF will switch back into L2 VLAN QoS mode, applying the appropriate default QoS settings. L3 DSCP QoS mode will only be allowed in SW DCBx mode, in other words, when the FW LLDP engine is disabled. Commands that break this mutual exclusivity will be blocked. Co-developed-by:
Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by:
Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by:
Dave Ertman <david.m.ertman@intel.com> Tested-by:
Gurucharan G <gurucharanx.g@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
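The mode switching described in this commit behaves like a small state machine. The sketch below is a Python model under stated assumptions: the class `PfQosState`, its method names, and the mode strings are hypothetical, and the default QoS configuration applied on each switch is omitted. The rules it encodes (first mapping enters L3 DSCP mode, one mapping per DSCP value, last deletion returns to L2 VLAN mode, FW LLDP must be disabled) are taken from the commit message.

```python
from typing import Dict

L2_VLAN_MODE = "l2-vlan"
L3_DSCP_MODE = "l3-dscp"


class PfQosState:
    """Toy model of one PF's QoS mode and DSCP-to-TC table."""

    def __init__(self, fw_lldp_enabled: bool) -> None:
        self.fw_lldp_enabled = fw_lldp_enabled
        self.mode = L2_VLAN_MODE
        self.dscp_to_tc: Dict[int, int] = {}

    def add_dscp_mapping(self, dscp: int, tc: int) -> bool:
        if self.fw_lldp_enabled:
            return False  # L3 DSCP mode is only allowed in SW DCBx mode
        if dscp in self.dscp_to_tc:
            return False  # only one mapping per DSCP value
        if not self.dscp_to_tc:
            self.mode = L3_DSCP_MODE  # first mapping switches the mode
        self.dscp_to_tc[dscp] = tc
        return True

    def del_dscp_mapping(self, dscp: int) -> bool:
        if dscp not in self.dscp_to_tc:
            return False
        del self.dscp_to_tc[dscp]
        if not self.dscp_to_tc:
            self.mode = L2_VLAN_MODE  # last mapping gone: back to L2 VLAN QoS
        return True
```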
-
- 27 Sep, 2021 1 commit
-
-
Leon Romanovsky authored
Move devlink_registration routine to be the last command, when the device is fully initialized. Signed-off-by:
Leon Romanovsky <leonro@nvidia.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 24 Sep, 2021 1 commit
-
-
Leon Romanovsky authored
PF pointer is always valid when PCI core calls its .shutdown() and .remove() callbacks. There is no need to check it again. Fixes: 837f08fd ("ice: Add basic driver framework for Intel(R) E800 Series") Signed-off-by:
Leon Romanovsky <leonro@nvidia.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 22 Sep, 2021 1 commit
-
-
Leon Romanovsky authored
devlink_register() can't fail and always returns success, but all drivers are obligated to check returned status anyway. This adds a lot of boilerplate code to handle impossible flow. Make devlink_register() void and simplify the drivers that use that API call. Signed-off-by:
Leon Romanovsky <leonro@nvidia.com> Acked-by:
Simon Horman <simon.horman@corigine.com> Acked-by: Vladimir Oltean <olteanv@gmail.com> # dsa Reviewed-by:
Jiri Pirko <jiri@nvidia.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 27 Aug, 2021 1 commit
-
-
Brett Creeley authored
commit 3ba7f53f ("ice: don't remove netdev->dev_addr from uc sync list") introduced calls to netif_addr_lock_bh() and netif_addr_unlock_bh() in the driver's ndo_set_mac() callback. This is fine since the driver is updating the netdev's dev_addr, but since this is a spinlock, the driver cannot sleep while the lock is held. Unfortunately the functions to add/delete MAC filters depend on a mutex. This was causing a trace with the lock debug kernel config options enabled when changing the MAC address via iproute.
  [ 203.273059] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:281
  [ 203.273065] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 6698, name: ip
  [ 203.273068] Preemption disabled at:
  [ 203.273068] [<ffffffffc04aaeab>] ice_set_mac_address+0x8b/0x1c0 [ice]
  [ 203.273097] CPU: 31 PID: 6698 Comm: ip Tainted: G S W I 5.14.0-rc4 #2
  [ 203.273100] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0010.010620200716 01/06/2020
  [ 203.273102] Call Trace:
  [ 203.273107] dump_stack_lvl+0x33/0x42
  [ 203.273113] ? ice_set_mac_address+0x8b/0x1c0 [ice]
  [ 203.273124] ___might_sleep.cold.150+0xda/0xea
  [ 203.273131] mutex_lock+0x1c/0x40
  [ 203.273136] ice_remove_mac+0xe3/0x180 [ice]
  [ 203.273155] ? ice_fltr_add_mac_list+0x20/0x20 [ice]
  [ 203.273175] ice_fltr_prepare_mac+0x43/0xa0 [ice]
  [ 203.273194] ice_set_mac_address+0xab/0x1c0 [ice]
  [ 203.273206] dev_set_mac_address+0xb8/0x120
  [ 203.273210] dev_set_mac_address_user+0x2c/0x50
  [ 203.273212] do_setlink+0x1dd/0x10e0
  [ 203.273217] ? __nla_validate_parse+0x12d/0x1a0
  [ 203.273221] __rtnl_newlink+0x530/0x910
  [ 203.273224] ? __kmalloc_node_track_caller+0x17f/0x380
  [ 203.273230] ? preempt_count_add+0x68/0xa0
  [ 203.273236] ? _raw_spin_lock_irqsave+0x1f/0x30
  [ 203.273241] ? kmem_cache_alloc_trace+0x4d/0x440
  [ 203.273244] rtnl_newlink+0x43/0x60
  [ 203.273245] rtnetlink_rcv_msg+0x13a/0x380
  [ 203.273248] ? rtnl_calcit.isra.40+0x130/0x130
  [ 203.273250] netlink_rcv_skb+0x4e/0x100
  [ 203.273256] netlink_unicast+0x1a2/0x280
  [ 203.273258] netlink_sendmsg+0x242/0x490
  [ 203.273260] sock_sendmsg+0x58/0x60
  [ 203.273263] ____sys_sendmsg+0x1ef/0x260
  [ 203.273265] ? copy_msghdr_from_user+0x5c/0x90
  [ 203.273268] ? ____sys_recvmsg+0xe6/0x170
  [ 203.273270] ___sys_sendmsg+0x7c/0xc0
  [ 203.273272] ? copy_msghdr_from_user+0x5c/0x90
  [ 203.273274] ? ___sys_recvmsg+0x89/0xc0
  [ 203.273276] ? __netlink_sendskb+0x50/0x50
  [ 203.273278] ? mod_objcg_state+0xee/0x310
  [ 203.273282] ? __dentry_kill+0x114/0x170
  [ 203.273286] ? get_max_files+0x10/0x10
  [ 203.273288] __sys_sendmsg+0x57/0xa0
  [ 203.273290] do_syscall_64+0x37/0x80
  [ 203.273295] entry_SYSCALL_64_after_hwframe+0x44/0xae
  [ 203.273296] RIP: 0033:0x7f8edf96e278
  [ 203.273298] Code: 89 02 48 c7 c0 ff ff ff ff eb b5 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 25 63 2c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 41 89 d4 55
  [ 203.273300] RSP: 002b:00007ffcb8bdac08 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
  [ 203.273303] RAX: ffffffffffffffda RBX: 000000006115e0ae RCX: 00007f8edf96e278
  [ 203.273304] RDX: 0000000000000000 RSI: 00007ffcb8bdac70 RDI: 0000000000000003
  [ 203.273305] RBP: 0000000000000000 R08: 0000000000000001 R09: 00007ffcb8bda5b0
  [ 203.273306] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
  [ 203.273306] R13: 0000555e10092020 R14: 0000000000000000 R15: 0000000000000005
Fix this by only locking when changing the netdev->dev_addr. Also, make sure to restore the old netdev->dev_addr on any failures. Fixes: 3ba7f53f ("ice: don't remove netdev->dev_addr from uc sync list") Signed-off-by:
Brett Creeley <brett.creeley@intel.com> Tested-by:
Gurucharan G <gurucharanx.g@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
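The shape of this fix (hold the non-sleeping lock only around the address update, run the sleeping filter work outside it, and restore the old address on failure) can be illustrated with a small Python model. This is a sketch, not the driver code: `Netdev`, `set_mac_address`, and the `update_filters` callback are hypothetical names, a `threading.Lock` merely stands in for the netdev address spinlock, and the exact ordering of steps inside the real driver is an assumption.

```python
import threading
from typing import Callable


class Netdev:
    """Toy netdev holding a MAC address and the lock that protects it."""

    def __init__(self, mac: str) -> None:
        self.dev_addr = mac
        self.addr_lock = threading.Lock()  # stands in for the addr spinlock


def set_mac_address(dev: Netdev, new_mac: str,
                    update_filters: Callable[[str, str], bool]) -> bool:
    old_mac = dev.dev_addr
    # Hold the lock only for the address update itself.
    with dev.addr_lock:
        dev.dev_addr = new_mac
    # Filter updates may sleep (they take a mutex in the driver), so they
    # run with the lock released.
    if update_filters(old_mac, new_mac):
        return True
    # On failure, restore the previous address, again under the lock.
    with dev.addr_lock:
        dev.dev_addr = old_mac
    return False
```

Keeping the critical section this small is what removes the "sleeping function called from invalid context" splat: nothing that can block runs while the lock is held.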
-
- 09 Aug, 2021 2 commits
-
-
Brett Creeley authored
In some circumstances, such as with bridging, it's possible that the stack will add the device's own MAC address to its unicast address list. If, later, the stack deletes this address, the driver will receive a request to remove this address. The driver stores its current MAC address as part of the VSI MAC filter list instead of separately. So, this causes a problem when the device's MAC address is deleted unexpectedly, which results in traffic failure in some cases. The following configuration steps will reproduce the previously mentioned problem:
  > ip link set eth0 up
  > ip link add dev br0 type bridge
  > ip link set br0 up
  > ip addr flush dev eth0
  > ip link set eth0 master br0
  > echo 1 > /sys/class/net/br0/bridge/vlan_filtering
  > modprobe -r veth
  > modprobe -r bridge
  > ip addr add 192.168.1.100/24 dev eth0
The following ping command fails due to the netdev->dev_addr being deleted when removing the bridge module.
  > ping <link partner>
Fix this by making sure to not delete the netdev->dev_addr during MAC address sync. After fixing this issue it was noticed that the netdev_warn() in .set_mac was overly verbose, so demote it to netdev_dbg(). Also, there is a possibility of a race condition between .set_mac and .set_rx_mode. Fix this by calling netif_addr_lock_bh() and netif_addr_unlock_bh() on the device's netdev when the netdev->dev_addr is going to be updated in .set_mac. Fixes: e94d4478 ("ice: Implement filter sync, NDO operations and bump version") Signed-off-by:
Brett Creeley <brett.creeley@intel.com> Tested-by:
Liang Li <liali@redhat.com> Tested-by:
Gurucharan G <gurucharanx.g@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
Anirudh Venkataramanan authored
The userspace utility "driverctl" can be used to change/override the system's default driver choices. This is useful in some situations (buggy driver, old driver missing a device ID, trying a workaround, etc.) where the user needs to load a different driver. However, this is also prone to user error, where a driver is mapped to a device it's not designed to drive. For example, if the ice driver is mapped to iavf devices, the ice driver crashes. Add a check to return an error if the ice driver is being used to probe a virtual function. Fixes: 837f08fd ("ice: Add basic driver framework for Intel(R) E800 Series") Signed-off-by:
Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by:
Gurucharan G <gurucharanx.g@intel.com> Tested-by:
Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
- 27 Jul, 2021 1 commit
-
-
Arnd Bergmann authored
Most users of ndo_do_ioctl are ethernet drivers that implement the MII commands SIOCGMIIPHY/SIOCGMIIREG/SIOCSMIIREG, or hardware timestamping with SIOCSHWTSTAMP/SIOCGHWTSTAMP. Separate these from the few drivers that use ndo_do_ioctl to implement SIOCBOND, SIOCBR and SIOCWANDEV commands. This is a purely cosmetic change intended to help readers find their way through the implementation. Cc: Doug Ledford <dledford@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Vivien Didelot <vivien.didelot@gmail.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Vladimir Oltean <olteanv@gmail.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: linux-rdma@vger.kernel.org Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Acked-by:
Jason Gunthorpe <jgg@nvidia.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 25 Jun, 2021 2 commits
-
-
Maciej Machnikowski authored
The E810 device supports programmable pins for enabling both input and output events related to the PTP hardware clock. This includes both output signals with programmable period, as well as timestamping of events on input pins. Add support for enabling these using the CONFIG_PTP_1588_CLOCK interface. This allows programming the software defined pins to take advantage of the hardware clock features. Signed-off-by:
Maciej Machnikowski <maciej.machnikowski@intel.com> Signed-off-by:
Jacob Keller <jacob.e.keller@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
Jesse Brandeburg authored
This patch is modeled after one by Scott Peterson for i40e. Add tracepoints to the driver, via a new file ice_trace.h and some new trace calls added in interesting places in the driver. Add some tracing for DIMLIB to help debug interrupt moderation problems. Performance should not be affected, and this can be very useful for debugging and adding new trace events to paths in the future. Note that eBPF programs can attach to these events, and perf can count them, since we're attaching to the events subsystem in the kernel. Co-developed-by:
Ben Shelton <benjamin.h.shelton@intel.com> Signed-off-by:
Ben Shelton <benjamin.h.shelton@intel.com> Signed-off-by:
Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by:
Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
- 17 Jun, 2021 2 commits
-
-
Paul M Stillwell Jr authored
Remove the local variable since it's only used once, and use the value directly instead. Signed-off-by:
Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by:
Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
Paul M Stillwell Jr authored
There are some places where the scope of a variable can be reduced so do that. Signed-off-by:
Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by:
Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
- 11 Jun, 2021 4 commits
-
-
Jacob Keller authored
Add support for enabling Tx timestamp requests for outgoing packets on E810 devices. The ice hardware can support multiple outstanding Tx timestamp requests. When sending a descriptor to hardware, a Tx timestamp request is made by setting a request bit, and assigning an index that represents which Tx timestamp index to store the timestamp in. Hardware makes no effort to synchronize the index use, so it is up to software to ensure that Tx timestamp indexes are not re-used before the timestamp is reported back. To do this, introduce a Tx timestamp tracker which will keep track of currently in-use indexes. In the hot path, if a packet has a timestamp request, an index will be requested from the tracker. Unfortunately, this does require a lock as the indexes are shared across all queues on a PHY. There are not enough indexes to reliably assign only 1 to each queue. For the E810 devices, the timestamp indexes are not shared across PHYs, so each port can have its own tracking. Once hardware captures a timestamp, an interrupt is fired. In this interrupt, trigger a new work item that will figure out which timestamp was completed, and report the timestamp back to the stack. This function loops through the Tx timestamp indexes and checks whether there is now a valid timestamp. If so, it clears the PHY timestamp indication in the PHY memory, locks and removes the SKB and bit in the tracker, then reports the timestamp to the stack. It is possible in some cases that a timestamp request will be initiated but never completed. This might occur if the packet is dropped by software or hardware before it reaches the PHY. Add a task to the periodic work function that will check whether a timestamp request is more than a few seconds old. If so, the timestamp index is cleared in the PHY, and the SKB is released. Just as with Rx timestamps, the Tx timestamps are only 40 bits wide, and use the same overall logic for extending to 64 bits of nanoseconds. 
With this change, E810 devices should be able to perform basic PTP functionality. Future changes will extend the support to cover the E822-based devices. Signed-off-by:
Jacob Keller <jacob.e.keller@intel.com> Tested-by:
Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
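The Tx timestamp tracker described in this commit can be modelled as a lock-protected table of in-use indexes. The following is a hedged Python sketch, not the driver's data structure: the class and method names are hypothetical, a plain list replaces the driver's bitmap and SKB array, and the 2-second timeout is an assumed stand-in for "more than a few seconds old".

```python
import threading
from typing import List, Optional, Tuple


class TxTimestampTracker:
    """Toy tracker for Tx timestamp indexes shared by all queues on a PHY."""

    def __init__(self, num_indexes: int, timeout_s: float = 2.0) -> None:
        self._lock = threading.Lock()
        # Each slot is None (free) or (skb, time the request was made).
        self._slots: List[Optional[Tuple[object, float]]] = [None] * num_indexes
        self._timeout_s = timeout_s  # assumed value, for illustration only

    def request_index(self, skb: object, now: float) -> Optional[int]:
        """Hot path: claim a free index, or None if all are in use."""
        with self._lock:
            for idx, slot in enumerate(self._slots):
                if slot is None:
                    self._slots[idx] = (skb, now)
                    return idx
        return None

    def complete(self, idx: int) -> Optional[object]:
        """Interrupt work item: release the index and hand back the skb."""
        with self._lock:
            slot = self._slots[idx]
            self._slots[idx] = None
        return slot[0] if slot else None

    def drop_stale(self, now: float) -> List[object]:
        """Periodic task: free requests that never completed."""
        dropped = []
        with self._lock:
            for idx, slot in enumerate(self._slots):
                if slot is not None and now - slot[1] > self._timeout_s:
                    dropped.append(slot[0])
                    self._slots[idx] = None
        return dropped
```

When `request_index` returns None, the caller would send the packet without a timestamp request rather than reuse an index that hardware may still write to.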
-
Jacob Keller authored
Add SIOCGHWTSTAMP and SIOCSHWTSTAMP ioctl handlers to respond to requests to enable timestamping support. If the request is for enabling Rx timestamps, set a bit in the Rx descriptors to indicate that receive timestamps should be reported. Hardware captures receive timestamps in the PHY which only captures part of the timer, and reports only 40 bits into the Rx descriptor. The upper 32 bits represent the contents of GLTSYN_TIME_L at the point of packet reception, while the lower 8 bits represent the upper 8 bits of GLTSYN_TIME_0. The networking and PTP stack expect 64 bit timestamps in nanoseconds. To support this, implement some logic to extend the timestamps by using the full PHC time.
If the Rx timestamp was captured prior to the PHC time, then the real timestamp is
  PHC - (lower_32_bits(PHC) - timestamp)
If the Rx timestamp was captured after the PHC time, then the real timestamp is
  PHC + (timestamp - lower_32_bits(PHC))
These calculations are correct as long as neither the PHC timestamp nor the Rx timestamps are more than 2^32-1 nanoseconds old. Further, we can detect when the Rx timestamp is before or after the PHC as long as the PHC timestamp is no more than 2^31-1 nanoseconds old. In that case, we calculate the delta between the lower 32 bits of the PHC and the Rx timestamp. If it's larger than 2^31-1 then the Rx timestamp must have been captured in the past. If it's smaller, then the Rx timestamp must have been captured after PHC time. Add an ice_ptp_extend_32b_ts function that relies on a cached copy of the PHC time and implements this algorithm to calculate the proper upper 32 bits of the Rx timestamps. Cache the PHC time periodically in all of the Rx rings. This enables each Rx ring to simply call the extension function with a recent copy of the PHC time. By ensuring that the PHC time is kept up to date periodically, we ensure this algorithm doesn't use stale data and produce incorrect results. 
To cache the time, introduce a kworker and a kwork item to periodically store the Rx time. It might seem like we should use the .do_aux_work interface of the PTP clock. This doesn't work because all PFs must cache this time, but only one PF owns the PTP clock device. Thus, the ice driver will manage its own kthread instead of relying on the PTP do_aux_work handler. With this change, the driver can now report Rx timestamps on all incoming packets. Signed-off-by:
Jacob Keller <jacob.e.keller@intel.com> Tested-by:
Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
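The extension arithmetic in this commit message is easy to check with a worked model. The sketch below is written in Python for illustration and masks to 32 bits explicitly where C would rely on unsigned wraparound; the function names `rx_ts40_to_ns32` and `extend_32b_ts` are hypothetical (the commit names the driver function ice_ptp_extend_32b_ts, which this does not claim to reproduce line for line).

```python
U32_MASK = 0xFFFFFFFF


def rx_ts40_to_ns32(ts40: int) -> int:
    """Drop the low 8 bits of the 40-bit descriptor value.

    Per the commit message, the upper 32 bits hold the nanosecond counter
    (GLTSYN_TIME_L) and the low 8 bits hold sub-nanosecond data.
    """
    return (ts40 >> 8) & U32_MASK


def extend_32b_ts(cached_phc_time: int, in_tstamp: int) -> int:
    """Extend a 32-bit timestamp to 64 bits using a recent 64-bit PHC time."""
    phc_lo = cached_phc_time & U32_MASK
    # Unsigned 32-bit difference: timestamp - lower_32_bits(PHC).
    delta = (in_tstamp - phc_lo) & U32_MASK
    if delta > U32_MASK // 2:
        # Larger than 2^31-1: the timestamp was captured before the PHC time,
        # so the result is PHC - (lower_32_bits(PHC) - timestamp).
        delta = (phc_lo - in_tstamp) & U32_MASK
        return cached_phc_time - delta
    # Otherwise it was captured after: PHC + (timestamp - lower_32_bits(PHC)).
    return cached_phc_time + delta
```

For example, with a cached PHC time of 0x1FFFFFFF0 and a 32-bit timestamp of 0x10, the delta is 0x20 and the extended value is 0x200000010, so a timestamp taken just after the low word wrapped is placed in the correct 2^32 ns window.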
-
Jacob Keller authored
Add a new ice_ptp.c file for holding the basic PTP clock interface functions. If the device supports PTP, call the new ice_ptp_init and ice_ptp_release functions where appropriate. If the function owns the hardware resource associated with the PTP hardware clock, register with the PTP_1588_CLOCK infrastructure to allocate a new clock object that represents the device hardware clock. Implement basic functionality for reading and setting the clock time, performing clock adjustments, and adjusting the clock frequency. Future changes will introduce functionality for handling related features including Tx and Rx timestamps. Signed-off-by:
Jacob Keller <jacob.e.keller@intel.com> Tested-by:
Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
Jacob Keller authored
In order to support certain device features, including enabling the PTP hardware clock, the ice driver needs to control some registers on the device PHY. These registers are accessed by sending sideband messages. For some hardware, these messages must be sent over the device admin queue, while other hardware has a dedicated control queue for the sideband messages. Add the neighbor device message structure for sending a message to the neighboring device. Where supported, initialize the sideband control queue and handle cleanup. Add a wrapper function for sending sideband control queue messages that read or write a neighboring device register. Because some devices send sideband messages over the AdminQ, also increase the length of the admin queue to allow more messages to be queued up. This is important because the sideband messages add additional pressure on the AQ usage. This support will be used in following patches to enable support for CONFIG_PTP_1588_CLOCK. Signed-off-by:
Jacob Keller <jacob.e.keller@intel.com> Tested-by:
Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
- 09 Jun, 2021 1 commit
-
-
Maciej Fijalkowski authored
The ice driver requires a programmable pipeline firmware package in order to support advanced features. Otherwise, the driver falls back to so-called 'safe mode'. In that mode, the ndo_bpf callback is not exposed, and when the user tries to load an XDP program, the following happens:
  $ sudo ./xdp1 enp179s0f1
  libbpf: Kernel error message: Underlying driver does not support XDP in native mode
  link set xdp fd failed
which is somewhat confusing, as there is native XDP support, just not in the current mode. Improve the user experience by providing a specific ndo_bpf callback dedicated to safe mode, which uses extack to explicitly let the user know that the DDP package is missing and that this is the reason XDP can't currently be loaded onto the interface. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Fixes: efc2214b ("ice: Add support for XDP") Signed-off-by:
Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by:
Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
- 07 Jun, 2021 3 commits
-
-
Anirudh Venkataramanan authored
Determine whether an unsupported power configuration is preventing link establishment by storing and checking the link_cfg_err_byte. Print error messages when module power levels are unsupported. Also add a new flag bit to prevent spamming said error messages. Co-developed-by:
Jeb Cramer <jeb.j.cramer@intel.com> Signed-off-by:
Jeb Cramer <jeb.j.cramer@intel.com> Signed-off-by:
Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by:
Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
Jacob Keller authored
After performing a flash update, a device EMP reset may occur. This reset will cause the newly downloaded firmware to be initialized. When this happens, the driver still reports the previous NVM version information. This is because the NVM versions are cached within the hw structure. This can be confusing, as the new firmware is in fact running in this case. Handle this by calling ice_init_nvm when rebuilding the driver state. This will update the flash version information and ensures that the current values are displayed when reporting the NVM versions to the stack. Signed-off-by:
Jacob Keller <jacob.e.keller@intel.com> Tested-by:
Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
Jacob Keller authored
Requesting device firmware information while the device is busy cleaning up after a reset can result in an unexpected failure. This occurs because the command is attempting to access the device AdminQ while it is down. Resolve this by having the command wait for a while until the reset is complete. To do this, introduce a reset_wait_queue and associated helper function "ice_wait_for_reset". This helper will use the wait queue to sleep until the driver is done rebuilding. Use of a wait queue is preferred because the potential sleep duration can be several seconds. To ensure that the thread wakes up properly, a new wake_up call is added during all code paths which clear the reset state bits associated with the driver rebuild flow. Using this ensures that tools can request device information without worrying about whether the driver is cleaning up from a reset. Specifically, it is expected that a flash update could result in a device reset, and it is better to delay the response for information until the reset is complete rather than exit with an immediate failure. Signed-off-by:
Jacob Keller <jacob.e.keller@intel.com> Tested-by:
Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
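The wait-queue pattern in this commit (sleep until the rebuild finishes, wake sleepers from every path that clears the reset state) maps onto a condition variable. This Python sketch is an analogy, not the kernel API: `ResetState` and its methods are hypothetical names, and `threading.Condition` stands in for the kernel wait queue.

```python
import threading


class ResetState:
    """Toy model of 'sleep until the driver has finished rebuilding'."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._resetting = False

    def begin_reset(self) -> None:
        with self._cond:
            self._resetting = True

    def end_reset(self) -> None:
        # Every path that clears the reset state must also wake the waiters.
        with self._cond:
            self._resetting = False
            self._cond.notify_all()

    def wait_for_reset(self, timeout_s: float) -> bool:
        """Return True once no reset is in progress, False on timeout."""
        with self._cond:
            return self._cond.wait_for(lambda: not self._resetting,
                                       timeout=timeout_s)
```

A caller that reports firmware information would first call `wait_for_reset` and only touch the device once it returns True, instead of failing immediately while the rebuild is still running.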
-
- 29 May, 2021 3 commits
-
-
Dave Ertman authored
Register the ice client auxiliary RDMA device on the auxiliary bus per PCIe device function for the auxiliary driver (irdma) to attach to. This allows a single RDMA driver (irdma) to work with multiple netdev drivers across multiple generations of Intel HW supporting RDMA. There are no load-ordering dependencies between ice and irdma. Signed-off-by:
Dave Ertman <david.m.ertman@intel.com> Signed-off-by:
Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
Dave Ertman authored
Add implementations for supporting iidc operations for device operation such as allocation of resources and event notifications. Signed-off-by:
Dave Ertman <david.m.ertman@intel.com> Signed-off-by:
Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
Dave Ertman authored
Probe the device's capabilities to see if it supports RDMA. If so, allocate and reserve resources to support its operation; populate structures with initial values. Signed-off-by:
Dave Ertman <david.m.ertman@intel.com> Signed-off-by:
Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
- 22 Apr, 2021 1 commit
-
-
Vignesh Sridhar authored
Attempt to detect malicious VFs and, if suspected, log the information but keep going to allow the user to take any desired actions. Potentially malicious VFs are identified by checking if the VFs are transmitting too many messages via the PF-VF mailbox which could cause an overflow of this channel resulting in denial of service. This is done by creating a snapshot or static capture of the mailbox buffer which can be traversed and in which the messages sent by VFs are tracked. Co-developed-by:
Yashaswini Raghuram Prathivadi Bhayankaram <yashaswini.raghuram.prathivadi.bhayankaram@intel.com> Signed-off-by:
Yashaswini Raghuram Prathivadi Bhayankaram <yashaswini.raghuram.prathivadi.bhayankaram@intel.com> Co-developed-by:
Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Signed-off-by:
Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Co-developed-by:
Brett Creeley <brett.creeley@intel.com> Signed-off-by:
Brett Creeley <brett.creeley@intel.com> Signed-off-by:
Vignesh Sridhar <vignesh.sridhar@intel.com> Tested-by:
Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
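The detection idea described in this commit can be illustrated with a minimal userspace sketch. It is not the driver's code: the structure, the function names, the VF count and the threshold below are all hypothetical, and the real snapshot of the mailbox buffer is reduced here to a simple per-VF message counter. The sketch only shows the stated policy of counting messages per VF, logging a suspect, and continuing to run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_VFS       8  /* hypothetical number of VFs */
#define SKETCH_MSG_THRESHOLD 4  /* hypothetical per-snapshot message limit */

/* Hypothetical stand-in for a mailbox snapshot: one counter per VF. */
struct sketch_mbx_snapshot {
	unsigned int msg_count[SKETCH_MAX_VFS];
};

/* Begin a new snapshot by forgetting the counts of the previous one. */
static void sketch_mbx_snapshot_reset(struct sketch_mbx_snapshot *snap)
{
	assert(snap != NULL);
	memset(snap, 0, sizeof(*snap));
}

/*
 * Record one mailbox message from vf_id.
 * Returns true once the VF has sent more than SKETCH_MSG_THRESHOLD
 * messages within the current snapshot. The caller is only informed;
 * nothing is blocked, matching the "log but keep going" policy.
 */
static bool sketch_mbx_track_msg(struct sketch_mbx_snapshot *snap,
				 unsigned int vf_id)
{
	assert(snap != NULL);

	if (vf_id >= SKETCH_MAX_VFS)
		return false; /* unknown VF: ignored in this sketch */

	snap->msg_count[vf_id]++;
	if (snap->msg_count[vf_id] > SKETCH_MSG_THRESHOLD) {
		fprintf(stderr, "VF %u is potentially malicious (%u messages)\n",
			vf_id, snap->msg_count[vf_id]);
		return true;
	}
	return false;
}
```

A caller would reset the snapshot at the start of each tracking window and call sketch_mbx_track_msg() for every message it traverses; a true return is only a hint for the administrator, not a reason to stop servicing the VF.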
-
- 15 Apr, 2021 5 commits
-
-
Paul M Stillwell Jr authored
We were saving the return value from ice_vsi_manage_rss_lut(), but the errors from that function are not critical so change it to return void and remove the code that saved the value. Signed-off-by:
Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by:
Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
Jesse Brandeburg authored
The driver previously printed its PCI address in the name field for the PCI resource, which, when displayed via /proc/iomem, would print the same thing twice. It's more useful for debugging to see the driver name, as most other modules do. Here's a diff of before and after this change:
 99100000-991fffff : 0000:3b:00.1
 9a000000-a04fffff : PCI Bus 0000:3b
   9a000000-9bffffff : 0000:3b:00.1
-    9a000000-9bffffff : 0000:3b:00.1
+    9a000000-9bffffff : ice
   9c000000-9dffffff : 0000:3b:00.0
-    9c000000-9dffffff : 0000:3b:00.0
+    9c000000-9dffffff : ice
   9e000000-9effffff : 0000:3b:00.1
   9f000000-9fffffff : 0000:3b:00.0
   a0000000-a000ffff : 0000:3b:00.1
Signed-off-by:
Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by:
Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-
Jacob Keller authored
The ice driver has support for adaptive interrupt moderation, an algorithm for tuning the interrupt rate dynamically. This algorithm is based on various assumptions about ring size, socket buffer size, link speed, SKB overhead, ethernet frame overhead and more. The Linux kernel has support for a dynamic interrupt moderation algorithm known as "dimlib". Replace the custom driver-specific implementation of dynamic interrupt moderation with the kernel's algorithm. The Intel hardware has a different hardware implementation than the originators of the dimlib code had to work with, which requires the driver to use a slightly different set of inputs for the actual moderation values, while getting all the advice from dimlib of better/worse, shift left or right. The change made for this implementation is to use a pair of values for each of the 5 "slots" that the dimlib moderation expects, and the driver will program those pairs when dimlib recommends a slot to use. The current implementation uses two tables, one for receive and one for transmit, and the pairs of values in each slot set the maximum delay of an interrupt and a maximum number of interrupts per second (both expressed in microseconds). Two separate kinds of bugs are fixed by using DIMLIB: UDP single stream send was too slow, and 8K ping-pong was going to the most aggressive moderation and had much too high latency. The overall result of using DIMLIB is that we meet or exceed our performance expectations set based on the old algorithm. Co-developed-by:
Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by:
Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by:
Jacob Keller <jacob.e.keller@intel.com> Tested-by:
Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
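The "pair of values per slot" scheme described in this commit can be sketched in plain C as follows. This is only an illustration under stated assumptions: the type name, the table names, the helper and every numeric value are hypothetical placeholders, not the driver's actual profiles, and the sketch does not call dimlib itself. It shows two five-entry tables, one for receive and one for transmit, and a lookup that maps a recommended slot to the pair that would be programmed.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NUM_SLOTS 5 /* number of slots mentioned in the commit message */

/* One pair per slot; both fields are expressed in microseconds. */
struct sketch_moder_pair {
	unsigned int max_delay_us;  /* maximum delay of an interrupt */
	unsigned int rate_limit_us; /* interrupt rate limit, as a time value */
};

/* Hypothetical receive profile: placeholder values only. */
static const struct sketch_moder_pair sketch_rx_profile[SKETCH_NUM_SLOTS] = {
	{ 2, 4 }, { 8, 8 }, { 16, 16 }, { 64, 32 }, { 128, 64 },
};

/* Hypothetical transmit profile: placeholder values only. */
static const struct sketch_moder_pair sketch_tx_profile[SKETCH_NUM_SLOTS] = {
	{ 1, 2 }, { 4, 4 }, { 8, 8 }, { 32, 16 }, { 64, 32 },
};

/*
 * Return the pair for the slot recommended by the moderation algorithm.
 * Out-of-range slots are clamped to the first or last entry.
 */
static const struct sketch_moder_pair *
sketch_pick_pair(const struct sketch_moder_pair *table, int slot)
{
	assert(table != NULL);

	if (slot < 0)
		slot = 0;
	if (slot >= SKETCH_NUM_SLOTS)
		slot = SKETCH_NUM_SLOTS - 1;
	return &table[slot];
}
```

In this sketch the moderation library only chooses the slot index; what each slot means in hardware terms stays in the driver-owned tables, which is the separation the commit message describes.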
-
Anirudh Venkataramanan authored
Add two new VSI states, one to track if a netdev for the VSI has been allocated and the other to track if the netdev has been registered. Call unregister_netdev/free_netdev only when the corresponding state bits are set. Signed-off-by:
Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by:
Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
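The guarded teardown described in this commit can be sketched in userspace C. The state names, the structure and the counters are hypothetical; the two counters merely stand in for the real unregister_netdev and free_netdev calls so that the ordering and the guards can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state bits, one per netdev lifecycle step. */
enum sketch_vsi_state {
	SKETCH_NETDEV_ALLOCD     = 1u << 0, /* netdev has been allocated */
	SKETCH_NETDEV_REGISTERED = 1u << 1, /* netdev has been registered */
};

/* Hypothetical VSI; the counters stand in for the real kernel calls. */
struct sketch_vsi {
	unsigned int state;
	int unregister_calls; /* stand-in for unregister_netdev() */
	int free_calls;       /* stand-in for free_netdev() */
};

/*
 * Undo only the steps that actually happened, in reverse order:
 * unregister first, then free. Each bit is cleared after its step,
 * so calling this twice is harmless.
 */
static void sketch_vsi_teardown(struct sketch_vsi *vsi)
{
	assert(vsi != NULL);

	if (vsi->state & SKETCH_NETDEV_REGISTERED) {
		vsi->unregister_calls++;
		vsi->state &= ~SKETCH_NETDEV_REGISTERED;
	}
	if (vsi->state & SKETCH_NETDEV_ALLOCD) {
		vsi->free_calls++;
		vsi->state &= ~SKETCH_NETDEV_ALLOCD;
	}
}
```

With this shape, an error path that fails between allocation and registration can call the same teardown routine as the normal removal path without unregistering a netdev that was never registered.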
-
Anirudh Venkataramanan authored
Remove the leading underscores in enum ice_pf_state. They do not communicate anything and are unnecessary. No functional change. Signed-off-by:
Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by:
Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
-