- 30 Jul, 2018 21 commits
-
-
Yidong Ren authored
This patch implements following ethtool stats fields for netvsc: cpu<n>_tx/rx_packets/bytes cpu<n>_vf_tx/rx_packets/bytes Corresponding per-cpu counters already exist in current code. Exposing these counters will help troubleshooting performance issues. for_each_present_cpu() was used instead of for_each_possible_cpu(). for_each_possible_cpu() would create very long and useless output. It is still being used for internal buffer, but not for ethtool output. There could be an overflow if cpu was added between ethtool call netvsc_get_sset_count() and netvsc_get_ethtool_stats() and netvsc_get_strings(). (still safe if cpu was removed) ethtool makes these three function calls separately. As long as we use ethtool, I can't see any clean solution. Currently and in foreseeable short term, Hyper-V doesn't support cpu hot-plug. Plus, ethtool is for admin use. Unlikely the admin would perform such combo operations. Changes in v2: - Remove cpp style comment - Resubmit after freeze Changes in v3: - Reimplemented with kvmalloc instead of alloc_percpu Changes in v4: - Fixed inconsistent array size - Use kvmalloc_array instead of kvmalloc Signed-off-by: Yidong Ren <yidren@microsoft.com> Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Petr Machata says: ==================== A test for mirror-to-gretap with team in UL packet path This patchset adds a test for "tc action mirred mirror" where the mirrored-to device is a gretap, and underlay path contains a team device. In patch #1 require_command() is added, which should henceforth be used to declare dependence on a certain tool. In patch #2, two new functions, team_create() and team_destroy(), are added to lib.sh. The newly-added test uses arping, which isn't necessarily available. Therefore patch #3 introduces $ARPING, and a preexisting test is fixed to require_command $ARPING. In patches #4 and #5, two new tests are added. In both cases, a team device is on egress path of a mirrored packet in a mirror-to-gretap scenario. In the first one, the team device is in loadbalance mode, in the second one it's in lacp mode. (The difference in modes necessitates a different testing strategy, hence two test cases instead of just parameterizing one.) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
This tests mirror-to-gretap when an underlay packet path includes a team device which is not in loadbalance mode, but in LACP mode. The test manipulates LAG membership to achieve changes in txability, thus making sure that a driver that offloads mirror-to-gretap doesn't just consider upness of a device. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
Test for "tc action mirred egress mirror" that mirrors to gretap when the underlay route points at a VLAN-aware bridge (802.1q), and the traffic egresses the bridge through a team device. Test upping and downing individual team device slaves and verify the traffic flows as expected. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
Instead of relying on "arping" being installed everywhere under that name, introduce a variable $ARPING like the other tools do. Convert an existing test, mirror_gre_vlan_bridge_1q.sh to require_command $ARPING and then invoke arping through the variable. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
Add team_create() and team_destroy() to manage team netdevices. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
The logic for testing whether a certain command is available is used several times in the current code base. The tests in follow-up patches add more requirements like that. Therefore extract the logic into a named function, require_command(), that can be used directly from lib.sh as well as from any test that wishes to declare dependence on some command. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
YueHaibing authored
kfree(NULL) is safe,so this removes NULL check before freeing the mem Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Quentin Schulz authored
The Extended Page Access is a 16-bit register, so change the page parameter of vsc85xx_phy_page_set to a u16. Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vakul Garg authored
On receipt of a complete tls record, use socket's saved data_ready callback instead of state_change callback. In function tls_queue(), the TLS record is queued in encrypted state. But the decryption happen inline when tls_sw_recvmsg() or tls_sw_splice_read() get invoked. So it should be ok to notify the waiting context about the availability of data as soon as we could collect a full TLS record. For new data availability notification, sk_data_ready callback is more appropriate. It points to sock_def_readable() which wakes up specifically for EPOLLIN event. This is in contrast to the socket callback sk_state_change which points to sock_def_wakeup() which issues a wakeup unconditionally (without event mask). Signed-off-by: Vakul Garg <vakul.garg@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Paolo Abeni says: ==================== TC: refactor act_mirred packets re-injection This series is aimed at improving the act_mirred redirect performances. Such action is used by OVS to represent TC S/W flows, and it's current largest bottle-neck is the need for a skb_clone() for each packet. The first 2 patches introduce some cleanup and safeguards to allow extending tca_result - we will use it to store RCU protected redirect information - and introduce a clear separation between user-space accessible tcfa_action values and internal values accessible only by the kernel. Then a new tcfa_action value is introduced: TC_ACT_REINJECT, similar to TC_ACT_REDIRECT, but preserving the mirred semantic. Such value is not accessible from user-space. The last patch exploits the newly introduced infrastructure in the act_mirred action, to avoid a skb_clone, when possible. Overall this the above gives a ~10% performance improvement in forwarding tput, when using the TC S/W datapath. v1 -> v2: - preserve the rcu lock in act_bpf - add and use a new action value to reinject the packets, preserving the mirred semantic v2 -> v3: - renamed to new action as TC_ACT_REINJECT - TC_ACT_REINJECT is not exposed to user-space v3 -> v4: - dropped the TC_ACT_REDIRECT patch - report failure via extack, too - rename the new action as TC_ACT_REINSERT - skip clone only if the control action don't touch tcf_result v4 -> v5: - fix a couple of build issue reported by kbuild bot - dont split messages ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
When mirred is invoked from the ingress path, and it wants to redirect the processed packet, it can now use the TC_ACT_REINSERT action, filling the tcf_result accordingly, and avoiding a per packet skb_clone(). Overall this gives a ~10% improvement in forwarding performance for the TC S/W data path and TC S/W performances are now comparable to the kernel openvswitch datapath. v1 -> v2: use ACT_MIRRED instead of ACT_REDIRECT v2 -> v3: updated after action rename, fixed typo into the commit message v3 -> v4: updated again after action rename, added more comments to the code (JiriP), skip the optimization if the control action need to touch the tcf_result (Paolo) v4 -> v5: fix sparse warning (kbuild bot) Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
This is similar TC_ACT_REDIRECT, but with a slightly different semantic: - on ingress the mirred skbs are passed to the target device network stack without any additional check not scrubbing. - the rcu-protected stats provided via the tcf_result struct are updated on error conditions. This new tcfa_action value is not exposed to the user-space and can be used only internally by clsact. v1 -> v2: do not touch TC_ACT_REDIRECT code path, introduce a new action type instead v2 -> v3: - rename the new action value TC_ACT_REINJECT, update the helper accordingly - take care of uncloned reinjected packets in XDP generic hook v3 -> v4: - renamed again the new action value (JiriP) v4 -> v5: - fix build error with !NET_CLS_ACT (kbuild bot) Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
Each lockless action currently does its own RCU locking in ->act(). This allows using plain RCU accessor, even if the context is really RCU BH. This change drops the per action RCU lock, replace the accessors with the _bh variant, cleans up a bit the surrounding code and documents the RCU status in the relevant header. No functional nor performance change is intended. The goal of this patch is clarifying that the RCU critical section used by the tc actions extends up to the classifier's caller. v1 -> v2: - preserve rcu lock in act_bpf: it's needed by eBPF helpers, as pointed out by Daniel v3 -> v4: - fixed some typos in the commit message (JiriP) Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
Currently, when initializing an action, the user-space can specify and use arbitrary values for the tcfa_action field. If the value is unknown by the kernel, is implicitly threaded as TC_ACT_UNSPEC. This change explicitly checks for unknown values at action creation time, and explicitly convert them to TC_ACT_UNSPEC. No functional changes are introduced, but this will allow introducing tcfa_action values not exposed to user-space in a later patch. Note: we can't use the above to hide TC_ACT_REDIRECT from user-space, as the latter is already part of uAPI. v3 -> v4: - use an helper to check for action validity (JiriP) - emit an extack for invalid actions (JiriP) v4 -> v5: - keep messages on a single line, drop net_warn (Marcelo) Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
YueHaibing authored
There are no in-tree callers of cn23xx_dump_iq_regs. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Christoph Hellwig says: ==================== socket poll related cleanups v2 A couple of cleanups I stumbled upon when studying the networking poll code. Changes since v1: - drop a dispute patch from this series (to be sent separately) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Christoph Hellwig authored
Fold it into the only caller to make the code simpler and easier to read. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Christoph Hellwig authored
There is no point in hiding this logic in a helper. Also remove the useless events != 0 check and only busy loop once we know we actually have a poll method. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Christoph Hellwig authored
For any open socket file descriptor sock->sk->sk_wq->wait will always point to sock->wq->wait. That means we can do the shorter dereference and removal a NULL check and don't have to not worry about any RCU protection. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Christoph Hellwig authored
The wait_address argument is always directly derived from the filp argument, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 29 Jul, 2018 19 commits
-
-
YueHaibing authored
Replace calls to kmalloc followed by a memcpy with a direct call to kmemdup. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
YueHaibing authored
Replace calls to kmalloc followed by a memcpy with a direct call to kmemdup. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
YueHaibing authored
net/sched/act_pedit.c:289:2-3: Unneeded semicolon Remove unneeded semicolon. Generated by: scripts/coccinelle/misc/semicolon.cocci Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
YueHaibing authored
There are no in-tree callers of qed_get_cm_pq_idx_rl since it be there, so it can be removed. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sean Wang authored
Since driver is devicetree-based, all device type and charateristic can be determined by the compatible string and its data. It's unnecessary to create another dependent function to check chip ID and then decide whether the specific funciton is being supported on a certain device. It can be totally replaced by the existing flag, so a cleanup is made by removing the function and the only user, HWLRO. MT2701 also have a missing HWLRO support in old code, so add it the same patch. Signed-off-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sean Wang authored
Improve more in the existing code by reusing dma_zalloc_coherent instead of dma_alloc_coherent with __GFP_ZERO or superfluous zeroing buffer. Signed-off-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tyler Hicks authored
Commit 5f81880d ("sysfs, kobject: allow creating kobject belonging to arbitrary users") incorrectly changed the argument passed as the parent parameter when calling sysfs_add_file_mode_ns(). This caused some sysfs attribute files to not be added correctly to certain groups. Fixes: 5f81880d ("sysfs, kobject: allow creating kobject belonging to arbitrary users") Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Reported-by: Heiner Kallweit <hkallweit1@gmail.com> Tested-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linuxDavid S. Miller authored
Saeed Mahameed says: ==================== mlx5e-updates-2018-07-27 (Vxlan updates) This series from Gal and Saeed provides updates to mlx5 vxlan implementation. Gal, started with three cleanups to reflect the actual hardware vxlan state - reflect 4789 UDP port default addition to software database - check maximum number of vxlan UDP ports - cleanup an unused member in vxlan work Then Gal provides performance optimization by replacing the vxlan radix tree with a hash table. Measuring mlx5e_vxlan_lookup_port execution time: Radix Tree Hash Table --------------- ------------ ------------ Single Stream 161 ns 79 ns (51% improvement) Multi Stream 259 ns 136 ns (47% improvement) Measuring UDP stream packet rate, single fully utilized TX core: Radix Tree: 498,300 PPS Hash Table: 555,468 PPS (11% improvement) Next, from Saeed, vxlan refactoring to allow sharing the vxlan table between different mlx5 netdevice instances like PF and VF representors, this is done by making mlx5 vxlan interface more generic and decoupling it from PF netdevice structures and logic, then moving it into mlx5 core as a low level interface so it can be used by VF representors, which is illustrated in the last patch of the serious. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
There are two problems in this test case: - When indexing in bash associative array, the subscript is interpreted as string, not as a variable name to be expanded. - The keys stored to t0s and t1s are not DSCP values, but priority + base (i.e. the logical DSCP value, not the full bitfield value). In combination these two bugs conspire to make the test just work, except it doesn't really test anything and always passes. Fix the above two problems in obvious manner. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Stephen Hemminger says: ==================== mtu related changes While looking at other MTU issues, noticed a couple oppurtunties for improving user experience. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
If an invalid MTU value is set through rtnetlink return extra error information instead of putting message in kernel log. For other cases where there is no visible API, keep the error report in the log. Example: # ip li set dev enp12s0 mtu 10000 Error: mtu greater than device maximum. # ifconfig enp12s0 mtu 10000 SIOCSIFMTU: Invalid argument # dmesg | tail -1 [ 2047.795467] enp12s0: mtu greater than device maximum Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
Report the minimum and maximum MTU allowed on a device via netlink so that it can be displayed by tools like ip link. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
When changing MTU, RTNL is held so use rtnl_dereference instead of rcu_dereference. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit ee205981 ("net/dcb: Add dscp to priority selector type") added a define for the new DSCP selector type created by IEEE 802.1Qcd, but missed the comment enumerating all selector types. Update the comment. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ivan Khoronzhuk authored
Seems it was missed while adding for first net dev in dual-emac mode. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Xin Long says: ==================== route: add support and selftests for directed broadcast forwarding Patch 1/2 is the feature and 2/2 is the selftest. Check the changelog on each of them to know the details. v1->v2: - fix a typo in changelog. - fix an uapi break that Davide noticed. - flush route cache when bc_forwarding is changed. - add the selftest for this patch as Ido's suggestion. v2->v3: - fix an incorrect 'if check' in devinet_conf_proc as David Ahern noticed. - extend the selftest after one David Ahern fix for vrf. v3->v4: - improve the output log in the selftest as David Ahern suggested. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Xin Long authored
As Ido's suggestion, this patch is to add a selftest for directed broadcast forwarding with vrf. It does the assertion by checking the src IP of the echo-reply packet in ping_test_from. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Xin Long authored
This patch implements the feature described in rfc1812#section-5.3.5.2 and rfc2644. It allows the router to forward directed broadcast when sysctl bc_forwarding is enabled. Note that this feature could be done by iptables -j TEE, but it would cause some problems: - target TEE's gateway param has to be set with a specific address, and it's not flexible especially when the route wants forward all directed broadcasts. - this duplicates the directed broadcasts so this may cause side effects to applications. Besides, to keep consistent with other os router like BSD, it's also necessary to implement it in the route rx path. Note that route cache needs to be flushed when bc_forwarding is changed. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vincent Bernat authored
When freebind feature is set of an IPv6 socket, any source address can be used when sending UDP datagrams using IPv6 PKTINFO ancillary message. Global non-local bind feature was added in commit 35a256fe ("ipv6: Nonlocal bind") for IPv6. This commit also allows IPv6 source address spoofing when non-local bind feature is enabled. Signed-off-by: Vincent Bernat <vincent@bernat.im> Signed-off-by: David S. Miller <davem@davemloft.net>
-