- 18 Aug, 2015 32 commits
-
-
Stefan Assmann authored
During driver probing the following code path is triggered:

igb_probe
 ->igb_sw_init
  ->igb_probe_vfs
   ->igb_pci_enable_sriov
    ->igb_sriov_reinit

Doing the SR-IOV re-init is not necessary during probing since we're starting from scratch. Here we can call igb_enable_sriov() right away.

Running igb_sriov_reinit() during igb_probe() also seems to cause occasional packet loss on some onboard 82576 NICs. Reproduced on Dell and HP servers with onboard 82576 NICs.

Example:
Intel Corporation 82576 Gigabit Network Connection [8086:10c9] (rev 01)
Subsystem: Dell Device [1028:0481]

Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Vasily Averin authored
Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Shota Suzuki authored
When initializing igb driver (e.g. 82576, I350), IGB_FLAG_QUEUE_PAIRS is set if adapter->rss_queues exceeds half of max_rss_queues in igb_init_queue_configuration(). On the other hand, IGB_FLAG_QUEUE_PAIRS is not set even if the number of queues exceeds half of max_combined in igb_set_channels() when changing the number of queues by "ethtool -L". In this case, if numvecs is larger than MAX_MSIX_ENTRIES (10), the size of adapter->msix_entries[], an overflow can occur in igb_set_interrupt_capability(), which in turn leads to an oops.

Fix this problem as follows:
- When changing the number of queues by "ethtool -L", set IGB_FLAG_QUEUE_PAIRS in the same way as initializing igb driver.
- When increasing the size of q_vector, reallocate it appropriately. (With IGB_FLAG_QUEUE_PAIRS set, the size of q_vector gets larger.)

Another possible way to fix this problem is to cap the number of queues at its initial value, which is the number of initially online cpus. But this is not the optimal way because we cannot increase queues when another cpu becomes online.

Note that before commit cd14ef54 ("igb: Change to use statically allocated array for MSIx entries"), this problem did not cause an oops but just made the number of queues become 1 because of entering msi_only mode in igb_set_interrupt_capability().

Fixes: 907b7835 ("igb: Add ethtool support to configure number of channels") CC: stable <stable@vger.kernel.org> Signed-off-by: Shota Suzuki <suzuki_shota_t3@lab.ntt.co.jp> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
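
The vector arithmetic behind the overflow can be sketched in plain C. This is a simplified user-space model and not the driver code: the `model_*` names are hypothetical, the pairing threshold follows the "exceeds half of the maximum" rule described in the message, and the one extra vector for link/other events is an assumption about how the driver counts.

```c
#include <stdbool.h>

#define MODEL_MAX_MSIX_ENTRIES 10 /* size of msix_entries[] per the commit message */

/* Pair Tx and Rx queues on one vector once the count exceeds half the maximum. */
bool model_should_pair(unsigned int queues, unsigned int max_queues)
{
	return queues > max_queues / 2;
}

/* One vector per Rx queue, one per Tx queue unless paired,
 * plus one assumed vector for link/other events. */
unsigned int model_count_vectors(unsigned int rx, unsigned int tx, bool paired)
{
	unsigned int numvecs = rx;

	if (!paired)
		numvecs += tx;
	return numvecs + 1;
}

/* True when the request fits in the statically sized array. */
bool model_fits(unsigned int numvecs)
{
	return numvecs <= MODEL_MAX_MSIX_ENTRIES;
}
```

Taking 8 Rx and 8 Tx queues as an example, the unpaired count is 17, which does not fit in a 10-entry array, while the paired count is 9, which does.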
-
David S. Miller authored
Phil Sutter says: ==================== net: Convert drivers to IFF_NO_QUEUE and cleanup afterwards This series converts in-tree users away from the old and deprecated 'tx_queue_len = 0' idiom, adds a warning to notify out-of-tree driver maintainers that there is need for action on their behalf and finally drops any workarounds in scheduling algorithm implementations. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
These were all workarounds for the former double meaning of tx_queue_len, which broke scheduling algorithms if left untreated. Now that all in-tree drivers have been converted away from setting tx_queue_len = 0, it should be safe to drop these workarounds for categorically broken setups. Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
Due to the introduction of IFF_NO_QUEUE, there is a better way for drivers to indicate that no qdisc should be attached by default. However, the old convention can't be dropped yet, since ignoring that setting would break drivers still using it. Instead, add a warning so out-of-tree driver maintainers get a chance to adjust their code before we finally get rid of any special handling of tx_queue_len == 0. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
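
The transition logic described here can be modelled in a few lines of user-space C. This is only a sketch of the decision, not the kernel implementation: `struct model_netdev`, `MODEL_IFF_NO_QUEUE` (a placeholder bit, not the kernel's value), the function name, and the warning text are all hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

#define MODEL_IFF_NO_QUEUE 0x1u /* placeholder bit, not the kernel's value */

struct model_netdev {
	unsigned int priv_flags;
	unsigned long tx_queue_len;
};

/* Returns true when a default qdisc should be attached to the device. */
bool model_wants_default_qdisc(const struct model_netdev *dev)
{
	/* Preferred way: the driver states explicitly that it wants no queue. */
	if (dev->priv_flags & MODEL_IFF_NO_QUEUE)
		return false;

	/* Old convention: still honoured, but the maintainer is warned. */
	if (dev->tx_queue_len == 0) {
		fprintf(stderr, "warning: tx_queue_len == 0 is deprecated, "
				"set the no-queue flag instead\n");
		return false;
	}
	return true;
}
```

Both conventions produce the same result for now; only the old one triggers the warning.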
-
Phil Sutter authored
Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: Johnny Kim <johnny.kim@atmel.com> Cc: Rachel Kim <rachel.kim@atmel.com> Cc: Dean Lee <dean.lee@atmel.com> Cc: Chris Park <chris.park@atmel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: Dmitry Tarnyagin <dmitry.tarnyagin@lockless.no> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: Marek Lindner <mareklindner@neomailbox.ch> Cc: Simon Wunderlich <sw@simonwunderlich.de> Cc: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: Jouni Malinen <j@w1.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Tom Herbert says:

====================
net: Identifier Locator Addressing - Part I

This patch set provides rudimentary support for Identifier Locator Addressing or ILA. The basic concept of ILA is that we split an IPv6 address into a 64 bit locator and 64 bit identifier. The identifier is the identity of an entity in communication ("who"), and the locator expresses the location of the entity ("where"). Applications use an externally visible address that contains the identifier. When a packet is actually sent, a translation is done that overwrites the first 64 bits of the address with a locator. The packet can then be forwarded over the network to the host where the addressed entity is located. At the receiver, the reverse translation is done so that the application sees the original, untranslated address. Presumably an external control plane will provide identifier->locator mappings.

v2:
 - Fix compilation errors when LWT not configured
 - Consolidate ILA into a single ila.c
v3:
 - Change pseudohdr argument of inet_proto_csum_replace functions to be a bool
v4:
 - In ila_build_state check locator being in netlink params before allocating tunnel state

The data path for ILA is a simple NAT translation that only operates on the upper 64 bits of a destination address in IPv6 packets. The basic process is:

1) Look up 64 bit identifier (lower 64 bits of destination)
2) If a match is found
   a) Overwrite locator (upper 64 bits of destination) with the new locator
   b) Adjust any checksum that has destination address included in pseudo header
3) Send or receive packet

ILA is a means to implement tunnels or network virtualization without encapsulation. Since there is no encapsulation involved, we assume that stateless support in the network for IPv6 (e.g. RSS, ECMP, TSO, etc.) just works. Also, since we're minimally changing the packet many of the worries about encapsulation (MTU, checksum, fragmentation) are not relevant.

The downside is that ILA is not extensible like other encapsulations (GUE for instance) so it might not be appropriate for all use cases. Also, this only makes sense to do in IPv6!

A key aspect of ILA is performance. The intent is that ILA would be used in data centers in virtualizing tasks or jobs. In the fullest incarnation all intra data center communications might be targeted to virtual ILA addresses. This is basically adding a new virtualization capability to the existing services in a datacenter, so there is a strong expectation that this does not degrade performance for existing applications.

Performance seems to be dependent on how ILA is hooked into the kernel. ILA can be implemented under some different models:

 - Mechanically it is a form of stateless DNAT
 - It can be thought of as a type of (source) routing
 - As a functional replacement of encapsulation

In this patch set we hook into the data path using Light Weight Tunnels (LWT) infrastructure. As part of that, we add support in LWT to redirect dst input. iproute will be modified to take a new ila encap type. ILA can be configured like:

ip route add 3333:0:0:1:5555:0:2:0/128 \
    encap ila 2001:0:0:2 via 2401:db00:20:911a:face:0:27:0

ip -6 addr add 3333:0:0:1:5555:0:1:0/128 dev eth0

ip route add table local local 2001:0:0:1:5555:0:1:0/128 encap ila 3333:0:0:1 dev lo

So sending to destination 3333:0:0:1:5555:0:2:0 will have destination of 2001:0:0:2:5555:0:2:0 on the wire.

Performance results are below. With ILA we see about a 10% drop in pps compared to non-ILA. Much of this drop can be attributed to the loss of early demux on input (translation occurs after it is attempted). We will address this in the next patch set. Also, IPvlan input path does not work with ILA since the routing is bypassed; this will be addressed in a future patch.

Performance testing:

Performing netperf TCP_RR with 200 clients:

Non-ILA baseline
  84.92% CPU utilization
  1861922.9 tps
  93/163/330 50/90/99% latencies

ILA single destination
  83.16% CPU utilization
  1679683.4 tps
  105/180/332 50/90/99% latencies

References:

Slides from netconf: http://vger.kernel.org/netconf2015Herbert-ILA.pdf
Slides from presentation at IETF: https://www.ietf.org/proceedings/92/slides/slides-92-nvo3-1.pdf
I-D: https://tools.ietf.org/html/draft-herbert-nvo3-ila-00
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
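
The translation step of the ILA data path (overwrite the locator, then adjust the checksum) can be sketched in user-space C. This is a minimal illustration, not the code in ila.c: all `model_*` names are hypothetical, the identifier lookup is assumed to have matched already, and the checksum is a plain 16-bit one's-complement sum over a small buffer standing in for the pseudo header and payload. The incremental update uses the standard form HC' = ~(~HC + ~m + m') for each changed 16-bit word.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fold a 32-bit accumulator into a 16-bit one's-complement sum. */
static uint16_t model_fold(uint32_t acc)
{
	while (acc >> 16)
		acc = (acc & 0xffff) + (acc >> 16);
	return (uint16_t)acc;
}

/* Full checksum over an even-length buffer of big-endian 16-bit words. */
uint16_t model_csum_full(const uint8_t *p, size_t len)
{
	uint32_t acc = 0;

	for (size_t i = 0; i + 1 < len; i += 2)
		acc += ((uint32_t)p[i] << 8) | p[i + 1];
	return (uint16_t)~model_fold(acc);
}

/* Overwrite the upper 64 bits of daddr with the locator and patch the
 * checksum incrementally instead of recomputing it. */
void model_ila_translate(uint8_t daddr[16], const uint8_t locator[8],
			 uint16_t *check)
{
	uint32_t acc = (uint16_t)~*check;

	for (size_t i = 0; i < 8; i += 2) {
		uint16_t old_word = (uint16_t)((daddr[i] << 8) | daddr[i + 1]);
		uint16_t new_word = (uint16_t)((locator[i] << 8) | locator[i + 1]);

		acc += (uint16_t)~old_word; /* remove the old word */
		acc += new_word;            /* add the new word */
	}
	memcpy(daddr, locator, 8);
	*check = (uint16_t)~model_fold(acc);
}
```

Applied to the address from the configuration example, translating 3333:0:0:1:5555:0:2:0 with locator 2001:0:0:2 leaves the identifier half untouched, and the patched checksum matches a full recomputation over the translated buffer.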
-
Tom Herbert authored
Add a new module named ila. This implements ILA translation. Light weight tunnel redirection is used to perform the translation in the data path. This is configured by the "ip -6 route" command using the "encap ila <locator>" option, where <locator> is the value to set in the destination locator of the packet, e.g.

ip -6 route add 3333:0:0:1:5555:0:1:0/128 \
    encap ila 2001:0:0:1 via 2401:db00:20:911a:face:0:25:0

sets a route where 3333:0:0:1 will be overwritten by 2001:0:0:1 on output.

Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tom Herbert authored
This function updates a checksum field value and skb->csum based on a value which is the difference between the old and new checksum. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tom Herbert authored
inet_proto_csum_replace4,2,16 take a pseudohdr argument which indicates the checksum field carries a pseudo header. This argument should be a boolean instead of an int. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tom Herbert authored
This patch adds the capability to redirect dst input in the same way that dst output is redirected by LWT. Also, save the original dst.input and dst.output when setting up lwtunnel redirection. These can be called by the client as a pass-through. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
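
The save-and-redirect pattern described here can be illustrated with plain function pointers. This is a user-space sketch under stated assumptions, not the LWT code: every `model_*` type and function is hypothetical, and the packet carries a pointer to the tunnel state only to keep the example short.

```c
struct model_pkt;
typedef int (*model_handler)(struct model_pkt *);

/* Tunnel state that remembers the handlers it displaced. */
struct model_lwt_state {
	model_handler orig_input;
	model_handler orig_output;
};

struct model_pkt {
	int translated; /* set by the tunnel hook */
	int delivered;  /* set by the original handler */
	struct model_lwt_state *state;
};

/* Stand-in for a dst entry with its two handlers. */
struct model_dst {
	model_handler input;
	model_handler output;
};

int model_orig_input(struct model_pkt *pkt)  { pkt->delivered = 1; return 0; }
int model_orig_output(struct model_pkt *pkt) { pkt->delivered = 1; return 0; }

/* Tunnel hooks: do their own work, then pass through to the saved handler. */
int model_tunnel_input(struct model_pkt *pkt)
{
	pkt->translated = 1;
	return pkt->state->orig_input(pkt);
}

int model_tunnel_output(struct model_pkt *pkt)
{
	pkt->translated = 1;
	return pkt->state->orig_output(pkt);
}

/* Save the original handlers, then install the redirected ones. */
void model_redirect(struct model_dst *dst, struct model_lwt_state *state,
		    model_handler in, model_handler out)
{
	state->orig_input = dst->input;
	state->orig_output = dst->output;
	dst->input = in;
	dst->output = out;
}
```

Saving the originals before overwriting them is what lets a client such as a translation module modify the packet and still hand it to the normal input or output path.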
-
David S. Miller authored
>> drivers/net/ethernet/cisco/enic/vnic_dev.c:1095:13: sparse: incorrect type in assignment (different address spaces)
   drivers/net/ethernet/cisco/enic/vnic_dev.c:1095:13: expected void *res
   drivers/net/ethernet/cisco/enic/vnic_dev.c:1095:13: got void [noderef] <asn:2>*

Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
>> drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:173:44: sparse: incorrect type in argument 1 (different base types)
   drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:173:44: expected restricted __sum16 [usertype] n
   drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:173:44: got restricted __be16 [usertype] check_sum

Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 17 Aug, 2015 8 commits
-
-
David Ahern authored
Table lookup compiles out when VRF is not enabled. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
kbuild test robot reported:

tree: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
head: d52736e2
commit: 4e3c8992 [751/762] net: Introduce VRF related flags and helpers
reproduce: make htmldocs

>> Warning(include/linux/netdevice.h:1293): Enum value 'IFF_VRF_MASTER' not described in enum 'netdev_priv_flags'

Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
As Eric noted, netif_index_is_vrf is not called with rcu_read_lock held, so wrap the dev_get_by_index_rcu call in rcu_read_lock and unlock. If VRF is not enabled or oif is 0, skip the device lookup. In both cases the index cannot be the VRF master. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Achiad Shochat says: ==================== Driver updates 16-Aug-2015 This patchset contains bug fixes, new RSS and pause parameters ethtool options, and support for RX CHECKSUM_COMPLETE. Patchset was applied and tested over commit adc6310 ("Merge branch 'mv88e6xxx-switchdev-fdb'"). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Achiad Shochat authored
Only for packets with first ethertype set to IPv4/6 for now. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Achiad Shochat authored
Only rx/tx pause settings. Autoneg setting is currently not supported. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Achiad Shochat authored
- Port speed settings are applied by the device only upon port admin status transition from DOWN to UP. So we enforce this transition regardless of the port's current operation state (which may be occasionally DOWN if for example the network cable is disconnected).
- Fix the PORT_UP/DOWN device interface enum
- Set the local_port bit in the device PAOS register
- EXPORT the PAOS (Port Administrative and Operational Status) register set/query access functions.

Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Achiad Shochat authored
- Change the maximum LRO session size from 16KB to 64KB
- Reduce the LRO session timeout from 512us to 32us in order to reduce the TCP latency of non-LRO'ed flows.
- Fix skb_shinfo(skb)->gso_size and set skb_shinfo(skb)->gso_type.
- Fix a bug accessing un-initialized mdev pointer.

Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-