- 18 Jun, 2017 12 commits
-
-
Wei Wang authored
Similar as ipv4, ipv6 path also needs to call dst_hold_safe() when necessary to avoid double free issue on the dst. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Wang authored
As the intend of this patch series is to completely remove dst gc, we need to call dst_dev_put() to release the reference to dst->dev when removing routes from fib because we won't keep the gc list anymore and will lose the dst pointer right after removing the routes. Without the gc list, there is no way to find all the dst's that have dst->dev pointing to the going-down dev. Hence, we are doing dst_dev_put() immediately before we lose the last reference of the dst from the routing code. The next dst_check() will trigger a route re-lookup to find another route (if there is any). Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Wang authored
In IPv6 routing code, struct rt6_info is created for each static route and RTF_CACHE route and inserted into fib6 tree. In both cases, dst ref count is not taken. As explained in the previous patch, this leads to the need of the dst garbage collector. This patch holds ref count of dst before inserting the route into fib6 tree and properly releases the dst when deleting it from the fib6 tree as a preparation in order to fully get rid of dst gc later. Also, correct fib6_age() logic to check dst->__refcnt to be 1 to indicate no user is referencing the dst. And remove dst_hold() in vrf_rt6_create() as ip6_dst_alloc() already puts dst->__refcnt to 1. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Wang authored
With the previous preparation patches, we are ready to get rid of the dst gc operation in ipv4 code and release dst based on refcnt only. So this patch adds DST_NOGC flag for all IPv4 dst and remove the calls to dst_free(). At this point, all dst created in ipv4 code do not use the dst gc anymore and will be destroyed at the point when refcnt drops to 0. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Wang authored
This patch checks all the calls to dst_hold()/skb_dst_force()/dst_clone()/dst_use() to see if dst_hold_safe() is needed to avoid double free issue if dst gc is removed and dst_release() directly destroys dst when dst->__refcnt drops to 0. In tx path, TCP hold sk->sk_rx_dst ref count and also hold sock_lock(). UDP and other similar protocols always hold refcount for skb->_skb_refdst. So both paths seem to be safe. In rx path, as it is lockless and skb_dst_set_noref() is likely to be used, dst_hold_safe() should always be used when trying to hold dst. In the routing code, if dst is held during an rcu protected session, it is necessary to call dst_hold_safe() as the current dst might be in its rcu grace period. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Wang authored
As the intend of this patch series is to completely remove dst gc, we need to call dst_dev_put() to release the reference to dst->dev when removing routes from fib because we won't keep the gc list anymore and will lose the dst pointer right after removing the routes. Without the gc list, there is no way to find all the dst's that have dst->dev pointing to the going-down dev. Hence, we are doing dst_dev_put() immediately before we lose the last reference of the dst from the routing code. The next dst_check() will trigger a route re-lookup to find another route (if there is any). Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Wang authored
In IPv4 routing code, fib_nh and fib_nh_exception can hold pointers to struct rtable but they never increment dst->__refcnt. This leads to the need of the dst garbage collector because when user is done with this dst and calls dst_release(), it can only decrement dst->__refcnt and can not free the dst even it sees dst->__refcnt drops from 1 to 0 (unless DST_NOCACHE flag is set) because the routing code might still hold reference to it. And when the routing code tries to delete a route, it has to put the dst to the gc_list if dst->__refcnt is not yet 0 and have a gc thread running periodically to check on dst->__refcnt and finally to free dst when refcnt becomes 0. This patch increments dst->__refcnt when fib_nh/fib_nh_exception holds reference to this dst and properly release the dst when fib_nh/fib_nh_exception has been updated with a new dst. This patch is a preparation in order to fully get rid of dst gc later. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Wang authored
This function should be called when removing routes from fib tree after the dst gc is no longer in use. We first mark DST_OBSOLETE_DEAD on this dst to make sure next dst_ops->check() fails and returns NULL. Secondly, as we no longer keep the gc_list, we need to properly release dst->dev right at the moment when the dst is removed from the fib/fib6 tree. It does the following: 1. change dst->input and output pointers to dst_discard/dst_dscard_out to discard all packets 2. replace dst->dev with loopback interface Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Wang authored
The current mechanism of freeing dst is a bit complicated. dst has its ref count and when user grabs the reference to the dst, the ref count is properly taken in most cases except in IPv4/IPv6/decnet/xfrm routing code due to some historic reasons. If the reference to dst is always taken properly, we should be able to simplify the logic in dst_release() to destroy dst when dst->__refcnt drops from 1 to 0. And this should be the only condition to determine if we can call dst_destroy(). And as dst is always ref counted, there is no need for a dst garbage list to hold the dst entries that already get removed by the routing code but are still held by other users. And the task to periodically check the list to free dst if ref count become 0 is also not needed anymore. This patch introduces a temporary flag DST_NOGC(no garbage collector). If it is set in the dst, dst_release() will call dst_destroy() when dst->__refcnt drops to 0. dst_hold_safe() will also check for this flag and do atomic_inc_not_zero() similar as DST_NOCACHE to avoid double free issue. This temporary flag is mainly used so that we can make the transition component by component without breaking other parts. This flag will be removed after all components are properly transitioned. This patch also introduces a new function dst_release_immediate() which destroys dst without waiting on the rcu when refcnt drops to 0. It will be used in later patches. Follow-up patches will correct all the places to properly take ref count on dst and mark DST_NOGC. dst_release() or dst_release_immediate() will be used to release the dst instead of dst_free() and its related functions. And final clean-up patch will remove the DST_NOGC flag. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Wang authored
Existing ipv4/6_blackhole_route() code generates a blackhole route with dst->dev pointing to the passed in dst->dev. It is not necessary to hold reference to the passed in dst->dev because the packets going through this route are dropped anyway. A loopback interface is good enough so that we don't need to worry about releasing this dst->dev when this dev is going down. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Wang authored
In udp_v4/6_early_demux() code, we try to hold dst->__refcnt for dst with DST_NOCACHE flag. This is because later in udp_sk_rx_dst_set() function, we will try to cache this dst in sk for connected case. However, a better way to achieve this is to not try to hold dst in early_demux(), but in udp_sk_rx_dst_set(), call dst_hold_safe(). This approach is also more consistant with how tcp is handling it. And it will make later changes simpler. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Wang authored
In ipv6 tx path, rcu_read_lock() is taken so that dst won't get freed during the execution of ip6_fragment(). Hence, no need to hold dst in it. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 16 Jun, 2017 28 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linuxDavid S. Miller authored
Saeed Mahameed says: ==================== Mellanox mlx5 updates and cleanups 2017-06-16 mlx5-updates-2017-06-16 This series provide some updates and cleanups for mlx5 core and netdevice driver. From Eli Cohen, add a missing event string. From Or Gerlitz, some checkpatch cleanups. From Moni, Disalbe HW level LAG when SRIOV is enabled. From Tariq, A code reuse cleanup in aRFS flow. From Itay Aveksis, Typo fix. From Gal Pressman, ethtool statistics updates and "update stats" deferred work optimizations. From Majd Dibbiny, Fast unload support on kernel shutdown. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vivien Didelot authored
Similarly to how cross-chip VLAN works, define a bitmap of multicast group members for a switch, now including its DSA ports, so that multicast traffic can be sent to all switches of the fabric. A switch may drop the frames if no user port is a member. This brings support for multicast in a multi-chip environment. As of now, all switches of the fabric must support the multicast operations in order to program a single fabric port. Reported-by: Jason Cobham <jcobham@questertangent.com> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Tested-by: Jason Cobham <jcobham@questertangent.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nathan Fontenot authored
When booting into the kdump/kexec kernel, pHyp and vios are not prepared for the initialization crq request and a failover transport event is generated. This is not handled correctly. At this point in initialization the driver is still in the 'probing' state and cannot handle a full reset of the driver as is normally done for a failover transport event. To correct this we catch driver resets while still in the 'probing' state and return EAGAIN. This results in the driver tearing down the main crq and calling ibmvnic_init() again. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Wang authored
In the existing dn_route.c code, dn_route_output_slow() takes dst->__refcnt before calling dn_insert_route() while dn_route_input_slow() does not take dst->__refcnt before calling dn_insert_route(). This makes the whole routing code very buggy. In dn_dst_check_expire(), dnrt_free() is called when rt expires. This makes the routes inserted by dn_route_output_slow() not able to be freed as the refcnt is not released. In dn_dst_gc(), dnrt_drop() is called to release rt which could potentially cause the dst->__refcnt to be dropped to -1. In dn_run_flush(), dst_free() is called to release all the dst. Again, it makes the dst inserted by dn_route_output_slow() not able to be released and also, it does not wait on the rcu and could potentially cause crash in the path where other users still refer to this dst. This patch makes sure both input and output path do not take dst->__refcnt before calling dn_insert_route() and also makes sure dnrt_free()/dst_free() is called when removing dst from the hash table. The only difference between those 2 calls is that dnrt_free() waits on the rcu while dst_free() does not. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Sowmini Varadhan says: ==================== rds: tcp: misc bug fixes This series contains 2 bug fixes (patch2, patch3) and one bit of code cleanup (patch1) identified during database testing ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sowmini Varadhan authored
Each time we get an incoming SYN to the RDS_TCP_PORT, the TCP layer accepts the connection and then the rds_tcp_accept_one() callback is invoked to process the incoming connection. rds_tcp_accept_one() may reject the incoming syn for a number of reasons, e.g., commit 1a0e100f ("RDS: TCP: Force every connection to be initiated by numerically smaller IP address"), or because we are getting spammed by a malicious node that is triggering a flood of connection attempts to RDS_TCP_PORT. If the incoming syn is rejected, no data would have been sent on the TCP socket, and we do not need to be in TIME_WAIT state, so we set linger on the TCP socket before closing, thereby closing the socket efficiently with a RST. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Tested-by: Imanti Mendez <imanti.mendez@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sowmini Varadhan authored
Found when testing between sparc and x86 machines on different subnets, so the address comparison patterns hit the corner cases and brought out some bugs fixed by this patch. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Tested-by: Imanti Mendez <imanti.mendez@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sowmini Varadhan authored
After commit 1a0e100f ("RDS: TCP: Force every connection to be initiated by numerically smaller IP address") we no longer need the logic associated with cp_outgoing, so clean up usage of this field. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Tested-by: Imanti Mendez <imanti.mendez@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Florian Fainelli says: ==================== net: dsa: loop: Driver updates This patch series updates drivers/net/dsa/dsa_loop.c to provide more useful coverage of the DSA APIs and how we mangle the CPU ports ethtool statistics to include its switch-facing port counters. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
When a DSA driver implements ethtool statistics, we also override the master network device's ethtool statistics with the CPU port's statistics and this has proven to be a possible source of bugs in the past. Enhance the dsa_loop.c driver to provide statistics under the forme of ok/error reads and writes from the per-port PHY read/writes. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
This is a simple function that only gets used in the driver's remove function, inline it there. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Tariq Toukan says: ==================== pktgen new parameters This patchset adds two parameters to the pktgen scripts. * The first patch adds a parameter to control the number of packets sent by every pktgen thread. * The second patch adds a parameter to control the index of first thread, instead of always starting from cpu 0. Series generated against net-next commit: f7aec129 rxrpc: Cache the congestion window setting Thanks, Tariq. V2: * rebased to latest net-next. * per Jesper's comments on Patch 2, fixed $F_THREAD description and $L_THREAD evaluation. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tariq Toukan authored
Use "-f <num>", to specify the index of the first sender thread. In default first thread is #0. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tariq Toukan authored
Use -n <num>, to specify the number of packets every thread sends. Zero means indefinitely. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Antoine Tenart says: ==================== net: mvmdio: add xMDIO xSMI support This series aims to add the xSMI support on the xMDIO bus to the mvmdio driver. The xSMI interface complies with the IEEE 802.3 clause 45 and is used by 10GbE devices. On 7k and 8k (as of now), such an interface is found and is used by Ethernet controllers. Patches 1-4 and 9 are cosmetic cleanups. Patches 5-7 are prerequisites to the xSMI support. Patches 8 and 10-11 add the xSMI support to the mvmdio driver, and a node is added both in the cp110 slave and master device trees. This was tested on an Armada 8040 mcbin, as well as on both the Armada 7040 DB and the Armada 8040 DB to ensure the SMI interface was still working. @Dave: patch 11 should go through the mvebu tree as asked by Gregory, thanks! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Ténart authored
A new compatible for Marvell xMDIO interfaces was added into the Marvell MDIO driver. Document this new compatible. Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Ténart authored
Cosmetic patch simplifying the smi read and write error paths. It also align their error paths with the ones of the xsmi functions. Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Ténart authored
This patch adds the xmdio xsmi interface support in the mvmdio driver. This interface is used in Ethernet controllers on Marvell 370, 7k and 8k (as of now). The xsmi interface supported by this driver complies with the IEEE 802.3 clause 45. The xSMI interface is used by 10GbE devices. Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Ténart authored
Add a check for the read and write smi operations, to ensure the MII_ADDR_C45 bit isn't set. This will be needed as soon as the xSMI support is added to the mvmdio driver. Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Ténart authored
Put the two poll intervals (min and max) in the driver's ops structure. This is needed to add the xmdio support later. Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Ténart authored
Introduce an ops structure to add an indirection on the is_done function, as this is needed to add the xMDIO support later. Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Russell King authored
The MDIO layer already provides per-bus locking, so there's no need for MDIO bus drivers to do their own internal locking. Remove this. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Ténart authored
Cosmetic patch to use the GENMASK helper for masks. Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Ténart authored
Cosmetic patch replacing spaces by tabs for defined values. Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Ténart authored
Cosmetic fix reordering headers alphabetically. Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Martin KaFai Lau says: ==================== bpf: xdp: Report bpf_prog ID in IFLA_XDP This is the first usage of the new bpf_prog ID. It is for reporting the ID of a xdp_prog through netlink. It rides on the existing IFLA_XDP. This patch adds IFLA_XDP_PROG_ID for the bpf_prog ID reporting. It starts with changing the generic_xdp first. After that, the hardware driver is changed one by one. Jakub Kicinski mentioned that he will soon introduce XDP_ATTACHED_HW (on top of the existing XDP_ATTACHED_DRV and XDP_ATTACHED_SKB) and he is going to reuse the prog_attached for this purpose. Hence, this patch set keeps the prog_attached even though !!prog_id also implies there is xdp_prog attached. I have tested with generic_xdp, mlx4 and mlx5. v3: 1. Replace 'if' by '?' when checking the xdp_prog pointer as suggested by Jakub Kicinski (thanks!) v2: 1. Remove READ_ONCE since it is alredy under rtnl lock 2. Keep prog_attached in 'struct netdev_xdp' as requested by Jakub Kicinski. The existing prog_attached and the new prog_id are put under a struct for XDP_QUERY_PROG. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Martin KaFai Lau authored
Add support to qede to report bpf_prog ID during XDP_QUERY_PROG. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Mintz Yuval <Yuval.Mintz@cavium.com> Acked-by: Alexei Starovoitov <ast@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Martin KaFai Lau authored
Add support to nfp to report bpf_prog ID during XDP_QUERY_PROG. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Alexei Starovoitov <ast@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-