- 22 Aug, 2016 1 commit
-
-
David Ahern authored
Subash reported that commit 42a7b32b ("xfrm: Add oif to dst lookups") broke a wifi use case that uses fib rules and xfrms. The intent of 42a7b32b was driven by VRFs with IPsec. As a compromise, relax the use of oif in xfrm lookups to L3 master devices only (i.e., oif is either an L3 master device or is enslaved to one). Fixes: 42a7b32b ("xfrm: Add oif to dst lookups") Reported-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
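A minimal sketch of the relaxed oif check, assuming a small helper along these lines (the helper name and exact call site are illustrative, not the actual patch):

  /* Use oif in the xfrm dst lookup only when it refers to an L3 master
   * domain (a VRF device, or a port enslaved to one); otherwise leave
   * the lookup unconstrained, restoring the old wifi/fib-rule behaviour.
   */
  static int xfrm_lookup_oif(struct net *net, int oif)
  {
          struct net_device *dev;
          int res = 0;

          rcu_read_lock();
          dev = dev_get_by_index_rcu(net, oif);
          if (dev && (netif_is_l3_master(dev) || netif_is_l3_slave(dev)))
                  res = oif;
          rcu_read_unlock();

          return res;     /* 0 means "do not constrain the lookup" */
  }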
-
- 11 Aug, 2016 1 commit
-
-
Alexey Kodanev authored
Running the LTP 'icmp-uni-basic.sh -6 -p ipcomp -m tunnel' test over openvswitch + veth can trigger a kernel panic:

BUG: unable to handle kernel NULL pointer dereference at 00000000000000e0
IP: [<ffffffff8169d1d2>] xfrm_input+0x82/0x750
...
[<ffffffff816d472e>] xfrm6_rcv_spi+0x1e/0x20
[<ffffffffa082c3c2>] xfrm6_tunnel_rcv+0x42/0x50 [xfrm6_tunnel]
[<ffffffffa082727e>] tunnel6_rcv+0x3e/0x8c [tunnel6]
[<ffffffff8169f365>] ip6_input_finish+0xd5/0x430
[<ffffffff8169fc53>] ip6_input+0x33/0x90
[<ffffffff8169f1d5>] ip6_rcv_finish+0xa5/0xb0
...

It seems that tunnel.ip6 can contain garbage and is dereferenced without a proper check; only tunnel.ip4 is being verified. Fix it by adding one more if block for AF_INET6 and initializing tunnel.ip6 to NULL inside xfrm6_rcv_spi() (similar to xfrm4_rcv_spi()). Fixes: 049f8e2e ("xfrm: Override skb->mark with tunnel->parm.i_key in xfrm_input") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
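A rough sketch of the family-aware check in xfrm_input() described above; the surrounding code is omitted and the exact shape is illustrative:

  switch (family) {
  case AF_INET:
          if (XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip4)
                  mark = be32_to_cpu(XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip4->parms.i_key);
          break;
  case AF_INET6:
          /* only dereference tunnel.ip6 for IPv6, and only after
           * xfrm6_rcv_spi() has initialized it (possibly to NULL) */
          if (XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip6)
                  mark = be32_to_cpu(XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip6->parms.i_key);
          break;
  }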
-
- 29 Jul, 2016 1 commit
-
-
Tobias Brunner authored
Whenever thresholds are changed the hash tables are rebuilt. This is done by enumerating all policies and hashing and inserting them into the right table according to the thresholds and direction. Because socket policies are also contained in net->xfrm.policy_all but no hash tables are defined for their direction (dir + XFRM_POLICY_MAX), this causes a NULL or invalid pointer dereference after returning from policy_hash_bysel() if the rebuild is done while any socket policies are installed. Because the rebuild after changing thresholds is scheduled, this crash could even occur if userland seemingly sets the thresholds before installing any socket policies. Fixes: 53c2e285 ("xfrm: Do not hash socket policies") Signed-off-by: Tobias Brunner <tobias@strongswan.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
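A sketch of the guard in the rebuild loop; the policy-walk context is abbreviated and only the skip itself follows the description above:

  list_for_each_entry(policy, &net->xfrm.policy_all, walk.all) {
          int dir = xfrm_policy_id2dir(policy->index);

          /* socket policies (dir >= XFRM_POLICY_MAX) have no hash
           * tables, so skip them instead of indexing past the end */
          if (dir >= XFRM_POLICY_MAX)
                  continue;

          /* ... rehash 'policy' into the table for 'dir' ... */
  }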
-
- 27 Jul, 2016 2 commits
-
-
Vegard Nossum authored
During fuzzing I regularly run into this WARN(). According to Herbert Xu, this "certainly shouldn't be a WARN, it probably shouldn't print anything either". Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
-
Vegard Nossum authored
AFAICT this message is just printed whenever input validation fails. This is a normal failure and we shouldn't be dumping the stack over it. Looks like it was originally a printk that was maybe incorrectly upgraded to a WARN:

commit 62db5cfd
Author: stephen hemminger <shemminger@vyatta.com>
Date: Wed May 12 06:37:06 2010 +0000

    xfrm: add severity to printk

Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
-
- 18 Jul, 2016 1 commit
-
-
Vegard Nossum authored
If we hit any of the error conditions inside xfrm_dump_sa(), then xfrm_state_walk_init() never gets called. However, we still call xfrm_state_walk_done() from xfrm_dump_sa_done(), which will crash because the state walk was never initialized properly. We can fix this by setting cb->args[0] only after we've processed the first element and checking this before calling xfrm_state_walk_done(). Fixes: d3623099 ("ipsec: add support of limited SA dump") Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
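A sketch of the teardown guard, assuming cb->args[0] is only set to 1 once the walk has really been initialized:

  static int xfrm_dump_sa_done(struct netlink_callback *cb)
  {
          struct xfrm_state_walk *walk = (struct xfrm_state_walk *)&cb->args[1];
          struct net *net = sock_net(cb->skb->sk);

          if (cb->args[0])        /* only tear down a walk we started */
                  xfrm_state_walk_done(walk, net);
          return 0;
  }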
-
- 17 Jul, 2016 4 commits
-
-
Florian Fainelli authored
The label lio_xmit_failed is used 3 times in liquidio_xmit(), but it always calls dma_unmap_single() using potentially uninitialized fields of the "ndata" variable. Of the 3 gotos, 2 run after ndata has been initialized and after a prior dma_map_single() call. Fix this by adding a new error label, lio_xmit_dma_failed, which does the dma_unmap_single() and then falls through to lio_xmit_failed. Fixes: f21fb3ed ("Add support of Cavium Liquidio ethernet adapters") Reported-by: coverity (CID 1309740) Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
In case nb8800_receive() fails to allocate a fragment, we would leak the freshly allocated SKB and just return; instead, free it. Reported-by: coverity (CID 1341750) Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Mans Rullgard <mans@mansr.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
We should be using a logical check here instead of a bitwise operation to check if the device is closed already in et131x_tx_timeout(). Reported-by: coverity (CID 146498) Fixes: 38df6492 ("et131x: Add PCIe gigabit ethernet driver et131x to drivers/net") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
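The gist of the bug, using a hypothetical flag name purely for illustration:

  /* bitwise NOT: for a single-bit flag, ~(flags & FLAG) is always
   * non-zero, so this test is effectively always true */
  if (~(adapter->flags & FMP_ADAPTER_CLOSED))
          /* ... */;

  /* logical NOT: true only when the flag is actually clear */
  if (!(adapter->flags & FMP_ADAPTER_CLOSED))
          /* ... */;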
-
Paolo Abeni authored
macsec can't cope with mtu frames which need vlan tag insertion, and vlan devices set the default mtu equal to the underlying dev's one. By default, vlan over macsec devices therefore use an invalid mtu, dropping all large packets. This patch adds a netif helper to check if an upper vlan device needs mtu reduction. The helper is used during vlan device initialization to set a valid default and during mtu updates to forbid invalid, too big, mtu values. The helper currently only checks if the lower dev is a macsec device; if we get more users, we only need to update the helper (possibly reserving an additional IFF bit). Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
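A sketch of the helper idea; netif_is_macsec() already exists as a device-type test, while the helper name simply follows the commit's description:

  /* true if an upper vlan device must use a reduced default mtu,
   * because the lower device cannot carry full-mtu tagged frames */
  static inline bool netif_reduces_vlan_mtu(struct net_device *dev)
  {
          /* macsec is currently the only lower device needing this;
           * if more users appear, only this helper has to change */
          return netif_is_macsec(dev);
  }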
-
- 15 Jul, 2016 14 commits
-
-
Florian Fainelli authored
Nothing is decrementing the index "i" while we are cleaning up the fragments we could not successfully transmit. Fixes: 9cde9450 ("bgmac: implement scatter/gather support") Reported-by: coverity (CID 1352048) Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Jiri Pirko says: ==================== mlxsw: Couple of fixes Couple of fixes for mlxsw driver from Ido. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
Packets entering the switch are mapped to a Switch Priority (SP) according to their PCP value (untagged frames are mapped to SP 0). The packets are classified to a priority group (PG) buffer in the port's headroom according to their SP. The switch maintains another mapping (SP to IEEE priority), which is used to generate PFC frames for lossless PGs. This mapping is initialized to IEEE = SP % 8. Therefore, when mapping SP 'x' to PG 'y' we create a situation in which an IEEE priority is mapped to two different PGs:

IEEE 'x' ---> SP 'x' ---> PG 'y'
IEEE 'x' ---> SP 'x + 8' ---> PG '0' (default)

This is invalid, as a flow can use only one PG buffer. Fix this by mapping both SP 'x' and 'x + 8' to the same PG buffer. Fixes: 8e8dfe9f ("mlxsw: spectrum: Add IEEE 802.1Qaz ETS support") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
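Schematically, with a hypothetical set_sp_to_pg() helper standing in for the actual register programming:

  /* map both SP 'i' and SP 'i + 8' to the same PG buffer, so an IEEE
   * priority can never resolve to two different PGs */
  for (i = 0; i < IEEE_8021QAZ_MAX_TCS; i++) {
          set_sp_to_pg(port, i, prio_to_pg[i]);
          set_sp_to_pg(port, i + 8, prio_to_pg[i]);
  }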
-
Ido Schimmel authored
The number of supported traffic classes that can have ETS and PFC simultaneously enabled is not subject to user configuration, so make sure we always initialize them to the correct values following a set operation. Fixes: 8e8dfe9f ("mlxsw: spectrum: Add IEEE 802.1Qaz ETS support") Fixes: d81a6bdb ("mlxsw: spectrum: Add IEEE 802.1Qbb PFC support") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
We can't have PAUSE frames and PFC both enabled on the same port, but the fact that ieee_setpfc() was called doesn't necessarily mean PFC is enabled. Only emit errors when PAUSE frames and PFC are enabled simultaneously. Fixes: d81a6bdb ("mlxsw: spectrum: Add IEEE 802.1Qbb PFC support") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
The device supports link autonegotiation, so let the user know about it by indicating support via ethtool ops. Fixes: 56ade8fe ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
When setting a new speed we need to disable and enable the port for the changes to take effect. We currently only do that if the operational state of the port is up. However, setting a new speed following link training failure will require us to explicitly set the port down and then up. Instead, disable and enable the port based on its administrative state. Fixes: 56ade8fe ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored (merge of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue)
Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2016-07-14 This series contains fixes to i40e and ixgbe. Alex fixes issues found in i40e_rx_checksum(), which was broken and reported the checksum as valid when it was not. Kiran fixes a bug, found when abruptly removing a cable, which caused a panic: set VSI broadcast promiscuous mode during the VSI add sequence and prevent adding a MAC filter if the specified MAC address is broadcast. Paolo Abeni fixes a bug by returning the actual work done, capped to weight - 1, since the core doesn't allow returning the full budget when the driver modifies the NAPI status. Guilherme Piccoli fixes an issue where the q_vector initialization routine sets the affinity_mask of a q_vector based on the v_idx value. This means a loop iterates on v_idx, which is an incremental value, and the cpumask is created based on this value. This is a problem in systems with multiple logical CPUs per core (like in SMT scenarios). The way a q_vector's affinity_mask is created was changed to resolve the issue. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Grant Grundler authored
ethtool -i provides a driver version that is hard coded. Export the same value via "modinfo". Signed-off-by: Grant Grundler <grundler@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jason Baron authored
The per-socket rate limit for 'challenge acks' was introduced in the context of limiting ack loops: commit f2b2c582 ("tcp: mitigate ACK loops for connections as tcp_sock") And I think it can be extended to rate limit all 'challenge acks' on a per-socket basis. Since we have the global tcp_challenge_ack_limit, this patch allows tcp_challenge_ack_limit to be set to a large value and effectively rely on the per-socket limit, or to be set to a lower value and still prevent a single connection from consuming the entire challenge ack quota. It further moves in the direction of eliminating the global limit at some point, as Eric Dumazet has suggested. This is a follow-up to: Subject: tcp: make challenge acks less predictable Cc: Eric Dumazet <edumazet@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Yue Cao <ycao009@ucr.edu> Signed-off-by: Jason Baron <jbaron@akamai.com> Signed-off-by: David S. Miller <davem@davemloft.net>
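A sketch of the per-socket check in tcp_send_challenge_ack(), reusing the existing out-of-window rate limiter before the global quota is consulted; the shape follows the description above and is illustrative:

  /* first apply the per-socket limit, as for other dup-ACK cases */
  if (__tcp_oow_rate_limited(sock_net(sk),
                             LINUX_MIB_TCPACKSKIPPEDCHALLENGE,
                             &tp->last_oow_ack_time))
          return;

  /* ... then fall through to the global tcp_challenge_ack_limit ... */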
-
Guilherme G. Piccoli authored
Currently, the q_vector initialization routine sets the affinity_mask of a q_vector based on the v_idx value. Meaning a loop iterates on v_idx, which is an incremental value, and the cpumask is created based on this value. This is a problem in systems with multiple logical CPUs per core (like in SMT scenarios). If we disable some logical CPUs, by turning SMT off for example, we will end up with a sparse cpu_online_mask, i.e., only the first CPU in a core is online, and incremental filling of the q_vector cpumask might lead to multiple offline CPUs being assigned to q_vectors. Example: if we have a system with 8 cores, each containing 8 logical CPUs (SMT == 8 in this case), we have 64 CPUs in total. But if SMT is disabled, only the 1st CPU in each core remains online, so the cpu_online_mask in this case would have only 8 bits set, in a sparse way. In the general case, when SMT is off the cpu_online_mask has only C bits set: 0, N, 2*N, ..., (C-1)*N, where C == # of cores and N == # of logical CPUs per core. In our example, only bits 0, 8, 16, 24, 32, 40, 48, 56 would be set. This patch changes the way a q_vector's affinity_mask is created: it iterates on v_idx, but consumes the CPU index from the cpu_online_mask instead of just using the v_idx incremental value. No functional changes were introduced. Signed-off-by: Guilherme G Piccoli <gpiccoli@linux.vnet.ibm.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
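A sketch of the changed assignment: still loop over v_idx, but take the CPU number from cpu_online_mask; names loosely follow the i40e driver and are illustrative:

  int cpu, v_idx;

  cpu = cpumask_first(cpu_online_mask);
  for (v_idx = 0; v_idx < vsi->num_q_vectors; v_idx++) {
          struct i40e_q_vector *q_vector = vsi->q_vectors[v_idx];

          /* use the next online CPU instead of v_idx itself, so a
           * sparse online mask (SMT off) never yields offline CPUs */
          cpumask_set_cpu(cpu, &q_vector->affinity_mask);
          cpu = cpumask_next(cpu, cpu_online_mask);
  }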
-
Paolo Abeni authored
Currently the function ixgbe_poll() returns 0 when it completely cleans the rx rings, but this fouls budget accounting in the core code. Fix this by returning the actual work done, capped to weight - 1, since the core doesn't allow returning the full budget when the driver modifies the NAPI status. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Venkatesh Srinivas <venkateshs@google.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
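The NAPI accounting pattern being fixed, roughly; ixgbe internals are elided and the snippet is illustrative:

  if (!clean_complete)
          return budget;          /* still busy, stay scheduled */

  /* all rings clean: report the real work done, but never the full
   * budget, since we are about to modify the NAPI state */
  work_done = min(work_done, budget - 1);
  napi_complete(napi);
  /* ... re-enable interrupts for this vector ... */

  return work_done;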
-
Kiran Patil authored
This patch sets VSI broadcast promiscuous mode during VSI add sequence and prevents adding MAC filter if specified MAC address is broadcast. Change-ID: Ia62251fca095bc449d0497fc44bec3a5a0136773 Signed-off-by: Kiran Patil <kiran.patil@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Alexander Duyck authored
There are a couple of issues I found in i40e_rx_checksum while doing some recent testing. As a result I have found the Rx checksum logic is pretty much broken and returns that the checksum is valid for tunnels in cases where it is not. First, the inner types are not the correct values to use to test for whether a tunnel is present or not. In addition, the inner protocol types are not a bitmask, so performing an OR of the values doesn't make sense. I have instead changed the code so that the inner protocol types are used to determine if we report CHECKSUM_UNNECESSARY or not. For anything that does not end in UDP, TCP, or SCTP it doesn't make much sense to report a checksum offload since it won't contain a checksum anyway. This leaves us with the need to set the csum_level based on some value. For that purpose I am using the tunnel_type field. If the tunnel type is GRENAT or greater, this means we have a GRE or UDP tunnel with an inner header. In the case of GRE or UDP we will have a possible checksum present, so for this reason it should be safe to set the csum_level to 1 to indicate that we are reporting the state of the inner header. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
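The end result described above, roughly; the enum/field names follow the i40e Rx descriptor decoding and the snippet is illustrative:

  /* csum_level 1 tells the stack the validated checksum belongs to
   * the inner packet of a GRE/UDP (GRENAT or deeper) tunnel */
  skb->csum_level = decoded.tunnel_type >= I40E_RX_PTYPE_TUNNEL_IP_GRENAT;
  skb->ip_summed = CHECKSUM_UNNECESSARY;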
-
- 14 Jul, 2016 1 commit
-
-
Beniamino Galvani authored
Commit e826eafa ("bonding: Call netif_carrier_off after register_netdevice") moved netif_carrier_off() from bond_init() to bond_create(), but the latter is called only for initial default devices and ones created through sysfs:

$ modprobe bonding
$ echo +bond1 > /sys/class/net/bonding_masters
$ ip link add bond2 type bond
$ grep "MII Status" /proc/net/bonding/*
/proc/net/bonding/bond0:MII Status: down
/proc/net/bonding/bond1:MII Status: down
/proc/net/bonding/bond2:MII Status: up

Ensure that carrier is initially off also for devices created through netlink. Signed-off-by: Beniamino Galvani <bgalvani@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
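A sketch of the netlink-path fix, mirroring what bond_create() already does; the surrounding bond_newlink() body is abbreviated:

  err = register_netdevice(bond_dev);
  if (!err)
          /* start with carrier off, exactly as bond_create() does
           * for sysfs- and module-created devices */
          netif_carrier_off(bond_dev);

  return err;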
-
- 13 Jul, 2016 9 commits
-
-
David S. Miller authored
Willem de Bruijn says: ==================== limit sk_filter trim to payload Sockets can apply a filter to incoming packets to drop or trim them. Fix two codepaths that call skb_pull/__skb_pull after sk_filter without checking for packet length. Reading beyond skb->tail after trimming happens in more codepaths, but safety of reading in the linear segment is based on minimum allocation size (MAX_HEADER, GRO_MAX_HEAD, ..). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Willem de Bruijn authored
Dccp verifies packet integrity, including length, at initial rcv in dccp_invalid_packet, later pulls headers in dccp_enqueue_skb. A call to sk_filter in-between can cause __skb_pull to wrap skb->len. skb_copy_datagram_msg interprets this as a negative value, so (correctly) fails with EFAULT. The negative length is reported in ioctl SIOCINQ or possibly in a DCCP_WARN in dccp_close. Introduce an sk_receive_skb variant that caps how small a filter program can trim packets, and call this in dccp with the header length. Excessively trimmed packets are now processed normally and queued for reception as 0B payloads. Fixes: 7c657876 ("[DCCP]: Initial implementation") Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
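A sketch of the capped receive, with the variant named as in the series; the signature is an assumption and the call is illustrative:

  /* generic wrapper keeps today's behaviour (1-byte cap) ... */
  static inline int sk_receive_skb(struct sock *sk, struct sk_buff *skb,
                                   const int nested)
  {
          return __sk_receive_skb(sk, skb, nested, 1);
  }

  /* ... while dccp passes its header length as the trim cap */
  return __sk_receive_skb(sk, skb, 1, dh->dccph_doff * 4);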
-
Willem de Bruijn authored
Sockets can have a filter program attached that drops or trims incoming packets based on the filter program return value. Rose requires data packets to have at least ROSE_MIN_LEN bytes. It verifies this on arrival in rose_route_frame and unconditionally pulls the bytes in rose_recvmsg. The filter can trim packets to below this value in-between, causing pull to fail, leaving the partial header at the time of skb_copy_datagram_msg. Place a lower bound on the size to which sk_filter may trim packets by introducing sk_filter_trim_cap and call this for rose packets. Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
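The corresponding capped filter call for rose, sketched with a hypothetical error label:

  /* a filter may still trim the payload, but never below the minimum
   * rose header that rose_recvmsg() unconditionally pulls */
  if (sk_filter_trim_cap(sk, skb, ROSE_MIN_LEN))
          goto drop;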
-
David S. Miller authored
Saeed Mahameed says: ==================== mlx5 tx timeout watchdog fixes This patch set provides two trivial fixes for the tx timeout series lately applied into net 4.7. From Daniel, detect stuck queues due to BQL From Mohamad, fix tx timeout watchdog false alarm Hopefully those two fixes will make it to -stable, assuming 3947ca18 ('net/mlx5e: Implement ndo_tx_timeout callback') was also backported to -stable. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mohamad Haj Yahia authored
Start all tx queues (including inactive ones) when opening the netdev. Stop all tx queues (including inactive ones) when closing the netdev. This is a workaround for the tx timeout watchdog false alarm issue in which the netdev watchdog is polling all the tx queues which may include inactive queues and thus once lowering the real tx queues number (ethtool -L) it will generate tx timeout watchdog false alarms. Fixes: 3947ca18 ('net/mlx5e: Implement ndo_tx_timeout callback') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
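Sketched, the open/close paths now touch every allocated tx queue rather than only the active ones (illustrative):

  /* mlx5e_open_locked(): wake every queue, active or not, so the
   * watchdog never polls a stale stopped queue */
  netif_tx_start_all_queues(priv->netdev);

  /* mlx5e_close_locked(): symmetrically stop them all */
  netif_tx_disable(priv->netdev);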
-
Daniel Jurgens authored
Change netif_tx_queue_stopped to netif_xmit_stopped. This will show when queues are stopped due to byte queue limits. Fixes: 3947ca18 ('net/mlx5e: Implement ndo_tx_timeout callback') Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
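The check being swapped, sketched inside the timeout handler's per-queue loop (illustrative):

  struct netdev_queue *txq = netdev_get_tx_queue(dev, i);

  /* netif_xmit_stopped() also reports queues frozen by byte queue
   * limits, which netif_tx_queue_stopped() does not see */
  if (netif_xmit_stopped(txq))
          /* ... report and recover the stuck queue ... */;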
-
David S. Miller authored
Florian Fainelli says: ==================== net: ethoc: Error path and transmit fixes This patch series contains two patches for the ethoc driver while testing on a TS-7300 board where ethoc is provided by an on-board FPGA. The first patch was cooked after chasing crashes with invalid resources passed to the driver. The second patch was cooked after seeing that an interface configured with IP 192.168.2.2 was sending ARP packets for 192.168.0.0; no wonder it could not work. I don't have access to any other platform using an ethoc interface, so it would be good to get some testing on Xtensa, for instance. Changes in v3: - corrected the error path if skb_put_padto() fails, thanks to Max for spotting this! Changes in v2: - fixed the first commit message ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
Even though the hardware can do zero padding, we want the SKB going out on the wire to have the appropriate size. This fixes packet truncations observed with, e.g., ARP packets. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
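A sketch of the software padding in the transmit path; note that skb_put_padto() frees the skb on failure, which is what the v3 error-path correction mentioned above accounts for:

  /* pad short frames ourselves instead of relying on hardware zero
   * padding, so the frame on the wire has the proper length */
  if (skb_put_padto(skb, ETH_ZLEN)) {
          /* skb was already freed by skb_put_padto() */
          dev->stats.tx_errors++;
          return NETDEV_TX_OK;
  }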
-
Florian Fainelli authored
In case any operation fails before we can successfully get to the point where we would register a MDIO bus, we would be going to an error label which involves unregistering and then freeing this yet-to-be-created MDIO bus. Update all error paths to go to the label 'free', which is the only valid one until either the clock is enabled, or the MDIO bus is allocated and registered. This fixes a kernel oops observed while trying to dereference the MDIO bus structure which is not yet allocated. Fixes: a1702857 ("net: Add support for the OpenCores 10/100 Mbps Ethernet MAC.") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 12 Jul, 2016 6 commits
-
-
Noam Camus authored
In commit b54b8c2d ("net: ezchip: adapt driver to little endian architecture"), while adapting the driver to little endian architecture, zeroing of the controller was left out. Signed-off-by: Elad Kanfi <eladkan@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored (merge of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf)
Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter/IPVS fixes for your net tree. They are:

1) Fix leak in the error path of nft_expr_init(), from Liping Zhang.
2) Tracing from nf_tables cannot be disabled, also from Zhang.
3) Fix an integer overflow on 32bit archs when setting the number of hashtable buckets, from Florian Westphal.
4) Fix configuration of ipvs sync in backup mode with IPv6 address, from Quentin Armitage via Simon Horman.
5) Fix incorrect timeout calculation in nft_ct NFT_CT_EXPIRATION, from Florian Westphal.
6) Skip clash resolution in conntrack insertion races if NAT is in place.

==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pablo Neira Ayuso authored
The clash resolution is not easy to apply if the NAT table is registered. Even if no NAT rules are installed, the null binding ensures that a unique tuple is used, thus the packet that loses the race gets a different source port number, as described by: http://marc.info/?l=netfilter-devel&m=146818011604484&w=2 Clash resolution with NAT is also problematic if address/port ranges are used, since the conntrack that wins the race may describe a different mangling than the one we may have earlier applied to the packet via nf_nat_setup_info(). Fixes: 71d8c47f ("netfilter: conntrack: introduce clash resolution on insertion race") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Tested-by: Marc Dionne <marc.c.dionne@gmail.com>
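The added condition, roughly; the surrounding clash-resolution context is elided and the snippet is illustrative:

  /* resolve the clash only for NAT-free connections: with NAT (even a
   * null binding) the losing packet may carry a different mangling
   * than the conntrack entry that won the race */
  if (l4proto->allow_clash &&
      !nfct_nat(ct) &&
      !nf_ct_is_dying(ct) &&
      atomic_inc_not_zero(&ct->ct_general.use)) {
          /* ... adopt the existing conntrack entry ... */
  }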
-
David S. Miller authored
Jon Maloy says: ==================== tipc: three small fixes Fixes for some broadcast link problems that may occur in large systems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jon Paul Maloy authored
In test situations with many nodes and a heavily stressed system we have observed that the transmission broadcast link may fail due to an excessive number of retransmissions of the same packet. In such situations we need to reset all unicast links to all peers, in order to reset and re-synchronize the broadcast link. In this commit, we add a new function tipc_bearer_reset_all() to be used in such situations. The function scans across all bearers and resets all their pertaining links. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jon Paul Maloy authored
After a new receiver peer has been added to the broadcast transmission link, we allow immediate transmission of new broadcast packets, trusting that the new peer will not accept the packets until it has received the previously sent unicast broadcast initialization message. In the same way, the sender must not accept any acknowledges until it has itself received the broadcast initialization from the peer, as well as confirmation of the reception of its own initialization message. Furthermore, when a receiver peer goes down, the sender has to produce the missing acknowledges from the lost peer locally, in order to ensure correct release of the buffers that were expected to be acknowledged by the said peer. In a highly stressed system we have observed that contact with a peer may come up and be lost before the above mentioned broadcast initialization and confirmation have been received. This leads to the locally produced acknowledges being rejected, and the non-acknowledged buffers lingering in the broadcast link transmission queue until it fills up and the link goes into permanent congestion. In this commit, we remedy this by temporarily setting the corresponding broadcast receive link state to ESTABLISHED and the 'bc_peer_is_up' state to true before we issue the local acknowledges. This ensures that those acknowledges will always be accepted. The mentioned state values are restored immediately afterwards when the link is reset. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-