- 27 Feb, 2018 40 commits
-
-
Thomas Falcon authored
Sorry, the previous change introduced a race condition between transmit completion processing and tracking TX descriptors. If a completion is received before the number of descriptors is logged, the number of descriptors will be add but not removed. After enough times, this could halt the transmit queue forever. Log the number of descriptors used by a transmit before sending. I stress tested the fix on two different systems running over the weekend without any issues. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Niklas Cassel says: ==================== stmmac barrier fixes and cleanup ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Niklas Cassel authored
Make dwmac4_release_tx_desc() clear all descriptor fields, not just TDES2 and TDES3. I'm suspecting that TDES0 and TDES1 wasn't cleared because the DMA engine uses them to store the tx hardware timestamp (if PTP is enabled). However, stmmac_tx_clean() calls stmmac_get_tx_hwtstamp(), which reads and saves the timestamp, before it calls release_tx_desc(), so this is not an issue. stmmac_xmit() and stmmac_tso_xmit() both always overwrite TDES0, however, stmmac_tso_xmit() sometimes sets TDES1, and since neither stmmac_xmit() nor stmmac_tso_xmit() explicitly clears TDES1, both functions might reuse a DMA descriptor with old TDES1 data. I haven't observed any misbehavior even though TDES1 sometimes point to an old skb, however, explicitly clearing both TDES0 and TDES1 in dwmac4_release_tx_desc() minimizes the chances of undefined behavior. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Niklas Cassel authored
According to Documentation/memory-barriers.txt, we need to use a dma_rmb() after reading the status/own bit, to ensure that all descriptor fields are read after reading the own bit. This way, we ensure that the DMA engine is done with the DMA descriptor before we read the other descriptor fields, e.g. reading the tx hardware timestamp (if PTP is enabled). Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Niklas Cassel authored
The last memory barrier in stmmac_xmit()/stmmac_tso_xmit() is placed between a coherent memory write and a MMIO write: The own bit is written in First Desc (TSO: MSS desc or First Desc). <barrier> The DMA engine is started by a write to the tx desc tail pointer/ enable dma transmission register, i.e. a MMIO write. This barrier cannot be a simple dma_wmb(), since a dma_wmb() is only used to guarantee the ordering, with respect to other writes, to cache coherent DMA memory. To guarantee that the cache coherent memory writes have completed before we attempt to write to the cache incoherent MMIO region, we need to use the more heavyweight barrier wmb(). Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Niklas Cassel authored
A dma_wmb() is used to guarantee the ordering, with respect to other writes, to cache coherent DMA memory. There is a dma_wmb() in prepare_tx_desc()/prepare_tso_tx_desc() which ensures that TDES0/1/2 is written before TDES3 (which contains the own bit), for First Desc. However, in the rare case that MSS changes, there will be a MSS context descriptor in front of the regular DMA descriptors: <MSS desc> <- DMA Next Descriptor <First Desc> <desc n> <Last Desc> Thus, for this special case, we need a dma_wmb() after prepare_tso_tx_desc()/before writing the own bit to the MSS desc, so that we flush the write to TDES3 for First Desc, in order to ensure that the MSS descriptor is the last descriptor to set the own bit. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Sowmini Varadhan says: ==================== RDS: optimized notification for zerocopy completion Resending with acked-by additions: previous attempt does not show up in Patchwork. This time with a new mail Message-Id. RDS applications use predominantly request-response, transacation based IPC, so that ingress and egress traffic are well-balanced, and it is possible/desirable to reduce system-call overhead by piggybacking the notifications for zerocopy completion response with data. Moreover, it has been pointed out that socket functions block if sk_err is non-zero, thus if the RDS code does not plan/need to use sk_error_queue path for completion notification, it is preferable to remove the sk_errror_queue related paths in RDS. Both of these goals are implemented in this series. v2: removed sk_error_queue support v3: incorporated additional code review comments (details in each patch) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sowmini Varadhan authored
PF_RDS sockets pass up cookies for zerocopy completion as ancillary data. Update msg_zerocopy to reap this information. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sowmini Varadhan authored
This commit is an optimization over commit 01883eda ("rds: support for zcopy completion notification") for PF_RDS sockets. RDS applications are predominantly request-response transactions, so it is more efficient to reduce the number of system calls and have zerocopy completion notification delivered as ancillary data on the POLLIN channel. Cookies are passed up as ancillary data (at level SOL_RDS) in a struct rds_zcopy_cookies when the returned value of recvmsg() is greater than, or equal to, 0. A max of RDS_MAX_ZCOOKIES may be passed with each message. This commit removes support for zerocopy completion notification on MSG_ERRQUEUE for PF_RDS sockets. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sowmini Varadhan authored
In preparation for optimized reception of zerocopy completion, revert the Rx side changes introduced by Commit dfb8434b ("selftests/net: add zerocopy support for PF_RDS test case") Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queueDavid S. Miller authored
Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2018-02-26 This series contains updates to i40e and i40evf only. Mariusz adds a new ethtool private flag for forcing true link state with the requested changes from Jakub Kicinski. Paweł fixes an issue where we were double locking the same resource which would generate a kernel panic after bringing an interface up for i40evf. Alan modifies both drivers to use software values to determine if there are packets stalled on the ring with the added benefit of being less CPU intensive since we do not need to reach into the hardware to get the values. Colin Ian King provides a few fixes detected by Coverity, first was to pass a struct by reference versus by value to be more efficient. Then verify the VSI pointer is not NULL before trying to dereference it. Cleaned up redundant checks that always return true. Dan Carpenter fixes over indented lines of code. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
This patch improves few aspects of interrupt handling: - update to current interrupt allocation API (use pci_alloc_irq_vectors() instead of deprecated pci_enable_msi()) - this implicitly will allocate a MSI-X interrupt if available - get rid of flag RTL_FEATURE_MSI - remove some dead code, intentionally disabling (unreliable) MSI being partially available on old PCI chips. The patch works fine on a RTL8168evl (chip version 34) and on a RTL8169SB (chip version 04). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
Fixes: 153e1b84 ("selftests: Add FIB onlink tests") Reported-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Madalin Bucur says: ==================== DPAA Ethernet fixes Fixed an issue on the Tx path that was visible in netperf TCP_SENDFILE tests. Addressed another issue with Rx errors not being always counted. Adding control for allmulti. v2: rephrased commit message, reduced changes in the SG mapping fix ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Radu Bulie authored
This patch adds allmulticast option for memac, dtsec and 10GEC controllers. Signed-off-by: Radu Bulie <radu-andrei.bulie@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Madalin Bucur authored
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Madalin Bucur authored
Simplify the code and avoid some Rx errors not being accounted. Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Madalin Bucur authored
An issue in the code mapping the skb fragments into scatter-gather frames was evidentiated by netperf TCP_SENDFILE tests. The size was set wrong for all fragments but the first, affecting the transmission of any skb with more than one fragment. Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Merge branch 'ieee802154-for-davem-2018-02-26' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan-next Stefan Schmidt says: ==================== pull-request: ieee802154-next 2018-02-26 An update from ieee802154 for *net-next* Alexander corrected a setting which got lost during some 6lowpan rework a while back and Xue Liu provided us with a new driver for the MCR20A transceiver. If there are any issues let me know. If not, please pull. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Kirill Tkhai says: ==================== Converting pernet_operations (part #3) This patchset continues to review and to convert pernet_operations to async. Where it is possible, they are grouped by type of actions init/exit methods ([1/28], for example). I hope this will make the review a little bit easier. The changes are tree-wide: in net, fs, drivers and security. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet_operations only register and unregister nf hooks. So, they are able to be marked as async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet_operations only register and unregister nf hooks. So, they are able to be marked as async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet_operations only unregister nf hooks. So, they are able to be marked as async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet_operations register and unregister nf hooks. Also they populate and depopulate ila_net_id-pointed hash table. The table is changed by hooks during skb processing and via netlink request. It looks impossible for another net pernet_operations to force the table reading or writing, so, they are able to be marked as async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet_operations only unregister nf hooks. So, they are able to be marked as async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet_operations register and unregister nf hooks, and populate and destroy /proc entry. So, they are able to be marked as async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet_operations only unregister nf hooks. So, they are able to be marked as async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet_operations unregister ipvlan net hooks. nf_unregister_net_hooks() removes hooks one-by-one, and then frees the memory via rcu. This looks similar to that happens, when a new hooks is added: allocation of bigger memory region, copy of old content, and rcu freeing the old memory. So, all of net code should be well with this behavior. Also at the time of hook unregistering, there are no packets, and foreign net pernet_operations are not interested in others hooks. So, we mark them as async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet_operations have only exit method, which moves devices from cfg802154_rdev_list to init_net. This may occur in any time from nl802154_wpan_phy_netns(), so we are nice with rtnl_lock() synchronization. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet_operations are similar to ip6_tnl_net_ops. Exit method unregisters all net sit devices, and it looks like another pernet_operations are not interested in foreign net sit list. Init method registers netdevice. So, it's possible to mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet_operations are similar to ip6_tnl_net_ops. Exit method unregisters all net vti6 tunnels, and it looks like another pernet_operations are not interested in foreign net vti6 list. Init method registers netdevice. So, it's possible to mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet_operations are similar to ip6gre_net_ops. Exit method unregisters all net ip6_tnl tunnels, and it looks like another pernet_operations are not interested in foreign net tunnels list. So, it's possible to mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet_operations are similar to bond_net_ops. Exit method unregisters all net ip6gre devices, and it looks like another pernet_operations are not interested in foreign net ip6gre list or net_generic()->tunnels_wc. Init method registers net device. So, it's possible to mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet_operations are similar to bond_net_ops. Exit methods unregisters all net ipgre/ipgre_tap/erspan/vti/ipip devices, and it looks like another pernet_operations are not interested in foreign net ipgre/ipgre_tap/erspan/vti/ipip list. Init method also does not intersect with something pernet-specific. So, it's possible to mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet_operations are similar to bond_net_ops. Exit method unregisters all net bridge devices, and it looks like another pernet_operations are not interested in foreign net bridge list. So, it's possible to mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet_operations are similar to bond_net_ops. Exit method unregisters all net vlanx devices, and it looks like another pernet_operations are not interested in foreign net vlanx list. So, it's possible to mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet_operations are similar to bond_net_ops. Exit method unregisters all net ppp devices, and it looks like another pernet_operations are not interested in foreign net ppp list. So, it's possible to mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet_operations are similar to bond_net_ops. Exit method unregisters all net gtp devices, and it looks like another pernet_operations are not interested in foreign net gtp list. So, it's possible to mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet_operations are similar to bond_net_ops. Exit method unregisters all net geneve devices, and it looks like another pernet_operations are not interested in foreign net geneve list. So, it's possible to mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet_operations populate/depopulate /proc and /sys entries. Exit method unregisters all net bond devices, and it seems another pernet_operations are not interested in foreign net bond list. So, it's possible to mark them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-