- 12 Mar, 2014 27 commits
-
-
Jack Morgenstein authored
Since there is no connection between the MAC/VLAN and the GID when using IP-based addressing, the proxy QP1 (running on the slave) must pass the source MAC, destination MAC, and vlan_id information separately from the GID. Additionally, the host must pass the remote source MAC and vlan_id back to the slave. This is achieved as follows:

Outgoing MADs:
1. Source MAC: obtained from the CQ completion structure (struct ib_wc, smac field).
2. Destination MAC: obtained from the tunnel header.
3. vlan_id: obtained from the tunnel header.

Incoming MADs:
1. The source (i.e., remote) MAC and vlan_id are passed in the tunnel header to the proxy QP1.

VST mode support: For outgoing MADs, the vlan_id obtained from the header is discarded, and the vlan_id specified by the Hypervisor is used instead. For incoming MADs, the incoming vlan_id (in the wc) is discarded, and the "invalid" vlan (0xffff) is substituted when forwarding to the slave.

Signed-off-by: Moni Shoua <monis@mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
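As a purely illustrative aid, the extra per-MAD addressing information described above can be pictured as a small side-band structure; the struct and helper below are hypothetical and do not match the actual mlx4 tunnel header layout.

    /* Hypothetical side-band info carried with each tunneled MAD;
     * field names are illustrative only.
     */
    #include <linux/types.h>
    #include <linux/if_ether.h>

    struct roce_mad_addr_info {
            u8  smac[ETH_ALEN];     /* outgoing: from struct ib_wc smac field;
                                     * incoming: the remote (source) MAC */
            u8  dmac[ETH_ALEN];     /* outgoing: taken from the tunnel header */
            u16 vlan_id;            /* 0xffff = "invalid"/no vlan toward the slave */
    };

    /* VST mode: ignore the vlan_id from the header and use the VLAN the
     * hypervisor assigned to the slave.
     */
    static inline u16 roce_egress_vlan(bool vst, u16 hdr_vlan, u16 hv_vlan)
    {
            return vst ? hv_vlan : hdr_vlan;
    }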
-
Jack Morgenstein authored
The IB side of RoCE requires the MAC table index of the MAC address used by its QPs. To obtain the real MAC index, the IB side registers the MAC (increasing its ref count, and also returning the real MAC index) during the modify-qp sequence. This protects against the ETH side deleting or modifying that MAC table entry while the QP is active. Note that until the modify-qp command returns success, the MAC and VLAN information only has "candidate" status. If the modify-qp succeeds, the "candidate" info is promoted to the operational MAC/VLAN info for the QP. If the modify fails, the candidate MAC/VLAN is unregistered, and the old QP info is preserved.

The patch is a bit complex, because there are multiple QP transitions where the primary-path information may be modified: INIT-to-RTR and SQD-to-SQD, and similarly for the alternate-path information. Therefore the code must handle cases where path information has already been entered into the QP context by previous QP transitions.

For the MAC address, the success logic is as follows:
1. If there was no previous MAC, simply move the candidate MAC information to the operational information, and reset the candidate MAC info.
2. If there was a previous MAC, unregister it. Then move the MAC information from candidate to operational, and reset the candidate info (as in 1. above).

The MAC address failure logic is the same for all cases: unregister the candidate MAC, and reset the candidate MAC info. For VLAN registration, the logic is similar.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
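A minimal sketch of the success/failure handling described above, using hypothetical state and helper names (the real mlx4 code operates on the QP context and MAC table directly):

    #include <linux/types.h>

    /* Sketch only: candidate vs. operational MAC bookkeeping around modify-qp. */
    struct qp_mac_state {
            int candidate_mac_idx;          /* -1: no candidate registered */
            int operational_mac_idx;        /* -1: no operational MAC yet */
    };

    static void unregister_mac(int mac_idx)
    {
            /* placeholder: would drop the MAC table reference taken earlier */
    }

    static void qp_mac_commit(struct qp_mac_state *s, bool modify_qp_ok)
    {
            if (modify_qp_ok) {
                    /* Success: drop any previous MAC, promote the candidate. */
                    if (s->operational_mac_idx >= 0)
                            unregister_mac(s->operational_mac_idx);
                    s->operational_mac_idx = s->candidate_mac_idx;
            } else {
                    /* Failure: unregister the candidate, keep the old QP info. */
                    if (s->candidate_mac_idx >= 0)
                            unregister_mac(s->candidate_mac_idx);
            }
            s->candidate_mac_idx = -1;
    }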
-
Jack Morgenstein authored
The GIDs are statically distributed, as follows:

PF: gets 16 GIDs.
VFs: the remaining GIDs are divided evenly between the VFs activated by the driver. If the division is not even, lower-numbered VFs get an extra GID.

For an IB interface, the number of GIDs per guest remains as before: one GID per guest.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
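The split of the remaining GIDs is simple integer arithmetic. The standalone sketch below illustrates the "lower-numbered VFs get an extra GID" rule; the table size of 128 and the VF count are assumptions, only the PF share of 16 comes from the commit message.

    #include <stdio.h>

    #define TOTAL_GIDS 128   /* assumed GID table size */
    #define PF_GIDS    16    /* PF share, per the commit message */

    static int gids_for_vf(int vf, int num_vfs)
    {
            int remaining = TOTAL_GIDS - PF_GIDS;
            int base = remaining / num_vfs;
            int extra = remaining % num_vfs;

            /* Lower-numbered VFs (vf < extra) each get one additional GID. */
            return base + (vf < extra ? 1 : 0);
    }

    int main(void)
    {
            int num_vfs = 5;     /* example VF count: 112 GIDs -> 23,23,22,22,22 */

            for (int vf = 0; vf < num_vfs; vf++)
                    printf("VF %d: %d GIDs\n", vf, gids_for_vf(vf, num_vfs));
            return 0;
    }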
-
Jack Morgenstein authored
For IB transport, the host determines the slave GIDs. For ETH (RoCE), however, the slave's GID is determined by the IP address that the slave itself assigns to the ETH device used by RoCE. In this case, the slave must be able to write its GIDs to the HCA GID table (at the GID indices that the slave "owns"). This commit adds processing for the SET_PORT_GID_TABLE opcode modifier for the SET_PORT command wrapper (so that slaves may modify their GIDs for RoCE). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jack Morgenstein authored
This requires the following modifications:
1. Fix build_mlx4_header to properly fill in the ETH fields.
2. Adjust mux and demux QP1 flow to support RoCE.

This commit still assumes only one GID per slave for RoCE. The commit enabling multiple GIDs is a subsequent commit, and is done separately because of its complexity. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Jon Maloy says: ==================== tipc: simplifications in socket and port layer After the removal of the tipc native API the relation between tipc_port and its API types is strictly one-to-one, i.e, the latter can now only be a socket API. This change opens up for simplifications both in the code, data and locking structure. We start with this series, where we ensure that port and socket structures are co-allocated. Note that the first commit in the series is unrelated to the above. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jon Paul Maloy authored
As an artefact from the native interface, the message sending functions in the port take a port ref as their first parameter, and then look up the corresponding port pointer in the registry, despite the fact that the only currently existing caller, tipc_sock, already knows this pointer. We change the signature of these functions to take a struct tipc_port* argument, and remove the redundant lookups. We also remove an unmotivated extra lookup in the function socket.c:auto_connect(), and, as the lookup functions tipc_port_deref() and ref_deref() now become unused, we remove these two functions. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
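A hypothetical before/after of such a signature change (the function names and bodies are placeholders, not the actual TIPC send path):

    #include <linux/socket.h>

    /* Before: the caller passed a reference and the callee re-resolved it:
     *
     *	int tipc_send(u32 ref, struct msghdr *m)
     *	{
     *		struct tipc_port *p_ptr = tipc_port_deref(ref);  // redundant lookup
     *		...
     *	}
     */

    struct tipc_port_example;   /* stand-in for struct tipc_port */

    /* After: the socket layer already holds the port pointer, so pass it directly. */
    static int tipc_send_example(struct tipc_port_example *p_ptr, struct msghdr *m)
    {
            /* ... build and send the message via p_ptr ... */
            return 0;
    }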
-
Jon Paul Maloy authored
The practice of naming variables in TIPC is inconsistent, sometimes even within the same file. In this commit we align variable names and declarations within socket.c, and function and macro names within socket.h. We also reduce the number of conversion macros to two, in order to make usage less obscure. These changes are purely cosmetic. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jon Paul Maloy authored
The three functions tipc_portimportance(), tipc_portunreliable() and tipc_portunreturnable(), and their corresponding tipc_set* functions, all grab port_lock when accessing the targeted port. This is unnecessary in the current code, since these calls are only made from within socket downcalls, already protected by sock_lock. We remove the redundant locking. Also, since the functions now become trivial one-liners, we move them to port.h and make them inline. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
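A rough illustration of moving a trivial, lock-free accessor into the header as an inline function; the struct layout and 'importance' field are assumptions, not the actual TIPC code.

    /* Sketch only: a trivial accessor kept lock-free and inlined in port.h. */
    struct tipc_port_example {
            unsigned int importance;
            /* ... */
    };

    /* No port_lock needed: callers are socket downcalls already holding sock_lock. */
    static inline unsigned int tipc_port_importance_example(struct tipc_port_example *p)
    {
            return p->importance;
    }

    static inline void tipc_port_set_importance_example(struct tipc_port_example *p,
                                                         unsigned int imp)
    {
            p->importance = imp;
    }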
-
Jon Paul Maloy authored
Due to the original one-to-many relation between port and user API layers, upcalls to the API have been performed via function pointers, installed in struct tipc_port at creation. Since this relation now always is one-to-one, we can instead use ordinary function calls. We remove the function pointers 'dispatcher' and 'wakeup' from struct tipc_port, and replace them with calls to the renamed functions tipc_sk_rcv() and tipc_sk_wakeup(). At the same time we change the name and signature of the functions tipc_createport() and tipc_deleteport() to reflect their new role as mere initialization/destruction functions. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jon Paul Maloy authored
After the removal of the tipc native API the relation between a tipc_port and its API types is strictly one-to-one, i.e, the latter can now only be a socket API. There is therefore no need to allocate struct tipc_port and struct sock independently. In this commit, we aggregate struct tipc_port into struct tipc_sock, hence saving both CPU cycles and structure complexity. There are no functional changes in this commit, except for the elimination of the separate allocation/freeing of tipc_port. All other changes are just adaptations to the new data structure. This commit also opens up for further code simplifications and code volume reduction, something we will do in later commits. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
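A hedged sketch of the co-allocation pattern (simplified stand-in structs, not the real TIPC ones): embedding one struct in the other lets a single allocation serve both, and container_of() recovers the outer object from the inner one.

    #include <linux/kernel.h>       /* container_of() */
    #include <linux/types.h>

    /* Simplified stand-ins for struct sock / struct tipc_port / struct tipc_sock. */
    struct sock_example { int dummy; };

    struct tipc_port_example {
            u32 ref;
            /* ... port state ... */
    };

    struct tipc_sock_example {
            struct sock_example sk;          /* socket part */
            struct tipc_port_example port;   /* embedded: allocated together with sk */
    };

    /* Recover the enclosing socket from a pointer to its embedded port. */
    static inline struct tipc_sock_example *tipc_sk_example(struct tipc_port_example *p)
    {
            return container_of(p, struct tipc_sock_example, port);
    }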
-
Jon Paul Maloy authored
The field 'peer_name' in struct tipc_sock is redundant, since this information is already available from tipc_port, to which tipc_sock has a reference. We remove the field, and ensure that peer node and peer port info is instead fetched via the functions that already exist for this purpose. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jon Paul Maloy authored
The lock for protecting the reference table is declared as an RWLOCK, although it is only used in write mode, never in read mode. We redefine it to become a spinlock. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
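For illustration, converting a write-only rwlock to a spinlock is mechanical; the lock and critical-section names below are placeholders, not the actual TIPC reference-table code.

    #include <linux/spinlock.h>

    /* Before: an rwlock that was only ever taken for writing.
     *	static DEFINE_RWLOCK(ref_table_lock);
     *	write_lock_bh(&ref_table_lock);  ...  write_unlock_bh(&ref_table_lock);
     *
     * After: a plain spinlock expresses the same exclusivity more cheaply.
     */
    static DEFINE_SPINLOCK(ref_table_lock);

    static void ref_table_update_example(void)
    {
            spin_lock_bh(&ref_table_lock);
            /* ... modify the reference table ... */
            spin_unlock_bh(&ref_table_lock);
    }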
-
Steffen Klassert authored
We leak an active timer, the hotcpu notifier and all allocated resources when we exit a namespace. Fix this by introducing a flow_cache_fini() function where we release the resources before we exit. Fixes: ca925cf1 ("flowcache: Make flow cache name space aware") Reported-by: Jakub Kicinski <moorray3@wp.pl> Tested-by: Jakub Kicinski <moorray3@wp.pl> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Fan Du <fan.du@windriver.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Joe Perches authored
The use of __constant_<foo> has been unnecessary for quite a while now. Make these uses consistent with the rest of the kernel. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
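Since the same cleanup appears in several commits below, one hedged example (protocol value chosen arbitrarily, not taken from any particular patch) shows the pattern: htons() already folds to a compile-time constant for constant input, so the __constant_ variant adds nothing.

    #include <linux/types.h>
    #include <linux/if_ether.h>     /* ETH_P_IP */
    #include <asm/byteorder.h>      /* htons() */

    /* Before:  if (skb->protocol == __constant_htons(ETH_P_IP)) ...
     * After:   if (skb->protocol == htons(ETH_P_IP)) ...
     */
    static inline bool is_ipv4_example(__be16 protocol)
    {
            return protocol == htons(ETH_P_IP);
    }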
-
Joe Perches authored
The use of __constant_<foo> has been unnecessary for quite a while now. Make these uses consistent with the rest of the kernel. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Joe Perches authored
The use of __constant_<foo> has been unnecessary for quite a while now. Make these uses consistent with the rest of the kernel. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Joe Perches authored
The use of __constant_<foo> has been unnecessary for quite a while now. Make these uses consistent with the rest of the kernel. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Joe Perches authored
The use of __constant_<foo> has been unnecessary for quite a while now. Make these uses consistent with the rest of the kernel. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Joe Perches authored
The use of __constant_<foo> has been unnecessary for quite a while now. Make these uses consistent with the rest of the kernel. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Joe Perches authored
The use of __constant_<foo> has been unnecessary for quite a while now. Make these uses consistent with the rest of the kernel. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Claudiu Manoil authored
priv is not instantiated at gfar_of_init() time, when parsing the DT for info on supported HW queues. Before the netdev can be allocated, the number of supported queues must be known. Because the number of supported queues depends on device type, move the compatibility checks before netdev allocation. Local variables are used to hold the operation-mode info before netdev allocation. This fixes the NULL accesses to priv->.. in gfar_of_init(). Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
hayeswang authored
Add support for dumping the tally counter via ethtool. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Thomas Stilwell authored
The rf233 and rf231 are sufficiently similar that we can treat rf233 like rf231. rf233 is missing some features that rf231 has, but we don't currently make use of them so there's nothing to handle differently yet. Should we add support in the future for rf231 *_NOCLK or SLEEP states, or PAD_IO drive strength, exceptions will need to be made for rf233. Signed-off-by: Thomas Stilwell <stilwellt@openlabs.co> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Eric Dumazet says: ==================== pkt_sched: allow scheduling points We have seen delays of more than 50ms in class or qdisc dumps, in case device is under high TX stress, even with the prior 4KB per skb limit. With the new 16KB limit, this could translate to 200ms delays. Add cond_resched() to give a chance to higher prio tasks to get cpu. But before doing so, we need to remove the rcu locking from tc_dump_qdisc() as David spotted. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
We have seen delays of more than 50ms in class or qdisc dumps, in case device is under high TX stress, even with the prior 4KB per skb limit. Add cond_resched() to give a chance to higher prio tasks to get cpu. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Like all rtnetlink dump operations, we hold RTNL in tc_dump_qdisc(), so we do not need to use rcu protection to protect list of netdevices. This will allow preemption to occur, thus reducing latencies. Following patch adds explicit cond_resched() calls. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
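A hedged sketch of the resulting pattern across these two patches (simplified, not the actual tc_dump_qdisc() code): with RTNL held instead of an RCU read section, the per-device loop may sleep, so a cond_resched() between devices bounds dump latency under TX stress.

    #include <linux/netdevice.h>
    #include <linux/rtnetlink.h>    /* ASSERT_RTNL() */
    #include <linux/sched.h>        /* cond_resched() */

    /* Simplified illustration of an RTNL-protected dump loop. */
    static void dump_all_qdiscs_example(struct net *net)
    {
            struct net_device *dev;

            ASSERT_RTNL();                  /* caller holds RTNL, no RCU needed */

            for_each_netdev(net, dev) {
                    /* ... dump this device's qdiscs into the netlink skb ... */
                    cond_resched();         /* let higher-prio tasks run */
            }
    }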
-
- 11 Mar, 2014 6 commits
-
-
stephen hemminger authored
The option code is taking IP address and putting it into a generic container. Force cast to silence sparse warnings. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
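For context only, the kind of cast that silences this class of sparse warning looks roughly like the following; the struct and field names are made up and do not reflect the actual bonding option code.

    #include <linux/compiler.h>
    #include <linux/types.h>

    /* Storing a big-endian IP address in a generic host-order container
     * trips a sparse endianness warning unless the cast is forced.
     */
    struct generic_opt_value_example {
            u64 value;
    };

    static void store_target_ip_example(struct generic_opt_value_example *opt,
                                        __be32 ip)
    {
            opt->value = (__force u32)ip;   /* intentional: silences sparse */
    }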
-
stephen hemminger authored
This patch fixes sparse warnings in vlan driver. It propagates the sparse __percpu attribute from alloc_percpu into netdev_alloc_pcpu_stats. I expect it may trigger additional sparse warnings from other drivers that are missing the __percpu attribute. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>
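A minimal sketch of what propagating the annotation means (simplified, not the actual vlan driver or macro): the pointer returned by alloc_percpu() carries the __percpu address-space attribute, so the variable holding it must be declared with the same attribute or sparse complains.

    #include <linux/errno.h>
    #include <linux/percpu.h>
    #include <linux/types.h>
    #include <linux/u64_stats_sync.h>

    struct pcpu_stats_example {
            u64 rx_packets;
            u64 tx_packets;
            struct u64_stats_sync syncp;
    };

    struct my_priv_example {
            /* __percpu must appear here too, matching alloc_percpu()'s return type. */
            struct pcpu_stats_example __percpu *stats;
    };

    static int my_priv_init(struct my_priv_example *priv)
    {
            priv->stats = alloc_percpu(struct pcpu_stats_example);
            return priv->stats ? 0 : -ENOMEM;
    }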
-
Li RongQing authored
Packets which have L2 address different from ours should be already filtered before entering into ip6_forward(). Perform that check at the beginning to avoid processing such packets. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
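One common way to express such an early check is shown below as a sketch (placeholder function, not necessarily the exact code added to ip6_forward()): anything the lower layer did not address to us at L2 is rejected before any further processing.

    #include <linux/skbuff.h>
    #include <linux/if_packet.h>    /* PACKET_HOST */

    /* PACKET_HOST means the frame's destination MAC was our own address;
     * a forwarding path would drop anything else at the very beginning.
     */
    static bool l2_addressed_to_us_example(const struct sk_buff *skb)
    {
            return skb->pkt_type == PACKET_HOST;
    }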
-
hayeswang authored
Call skb_cow_head() before editing the tx packet header. The header would be reallocated if it is shared. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
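A sketch of the pattern (placeholder function name, not the r8152 code): ensure the header area is private before editing it in place.

    #include <linux/errno.h>
    #include <linux/skbuff.h>

    static int tx_fixup_example(struct sk_buff *skb)
    {
            /* Reallocates the header if the skb data is shared or cloned. */
            if (skb_cow_head(skb, 0) < 0)
                    return -ENOMEM;

            /* Now it is safe to modify the packet headers in place,
             * e.g. for checksum offload fixups.
             */
            return 0;
    }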
-
Tobias Klauser authored
Instead of using an own copy of struct net_device_stats in struct cpsw_priv, use stats from struct net_device. Also remove the thus unnecessary .ndo_get_stats function, as it just returns dev->stats, which is the default. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
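For illustration (function name is a placeholder): once a driver counts directly into netdev->stats, a .ndo_get_stats callback that merely returns &dev->stats can be dropped, since the networking core falls back to dev->stats by default.

    #include <linux/netdevice.h>

    /* Count into the core-provided stats instead of a private copy. */
    static void record_tx_example(struct net_device *ndev, unsigned int len)
    {
            ndev->stats.tx_packets++;
            ndev->stats.tx_bytes += len;
    }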
-
Eric Dumazet authored
It is not legal to create multiple kmem_caches with the same name. flowcache can use a single kmem_cache; there is no need for a per-netns one. Fixes: ca925cf1 ("flowcache: Make flow cache name space aware") Reported-by: Jakub Kicinski <moorray3@wp.pl> Tested-by: Jakub Kicinski <moorray3@wp.pl> Tested-by: Fan Du <fan.du@windriver.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 10 Mar, 2014 7 commits
-
-
Gu Zheng authored
We do not need to switch the net_ns if the target net_ns is the same as the current one, so here we add a pre-check of the net_ns to avoid this, as David suggested. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
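A sketch of the pre-check (placeholder function name, not the actual patched code): skip the namespace switch when source and target are identical.

    #include <net/net_namespace.h>

    static int maybe_change_net_ns_example(struct net *current_net,
                                           struct net *target_net)
    {
            if (net_eq(current_net, target_net))
                    return 0;       /* nothing to do: already in the target namespace */

            /* ... perform the actual namespace move ... */
            return 0;
    }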
-
Eric Dumazet authored
All skbs in the socket write queue should be properly timestamped. In the case of FastOpen, we special-case the SYN+DATA 'message', as we queue the two fallback skbs in the socket write queue: 1) the SYN message by itself; 2) the DATA segment by itself. We should make sure these skbs have proper timestamps. Add a WARN_ON_ONCE() to eventually catch future violations. Fixes: 740b0f18 ("tcp: switch rtt estimations to usec resolution") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Haiyang Zhang authored
Due to a bug in the Hyper-V host version 2008R2, we need to use a slightly smaller receive buffer size; otherwise the buffer will not be accepted by the legacy hosts. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alexander Aring authored
The correct offset of the 6lowpanfrag_max_datagram_size value in the proc entry ctl table is 3, not 2. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
K. Y. Srinivasan says: ==================== Drivers: net: hyperv: Enable various offloads This patch set enables both checksum as well as segmentation offload. As part of this effort I have enabled scatter-gather I/O as well. In version 2 of these patches, I addressed comments from David Miller and Dan Carpenter. In this version I have addressed the latest comments from David Miller. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
KY Srinivasan authored
Enable segmentation offload. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
KY Srinivasan authored
Enable send side checksum offload. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-