Commits · 5d261913ca3daf6c2d21d38924235667b3d07c40 · Kirill Smelkov / linux

29 Aug, 2013 29 commits

net: add lower_dev_list to net_device and make a full mesh · 5d261913

Veaceslav Falico authored Aug 28, 2013

This patch adds lower_dev_list list_head to net_device, which is the same
as upper_dev_list, only for lower devices, and begins to use it in the same
way as the upper list.

It also changes the way the whole adjacent device lists work - now they
contain *all* of upper/lower devices, not only the first level. The first
level devices are distinguished by the bool neighbour field in
netdev_adjacent, also added by this patch.

There are cases when a device can be added several times to the adjacent
list, the simplest would be:

     /---- eth0.10 ---\
eth0-		       --- bond0
     \---- eth0.20 ---/

where both bond0 and eth0 'see' each other in the adjacent lists two times.
To avoid duplication of netdev_adjacent structures ref_nr is being kept as
the number of times the device was added to the list.

The 'full view' is achieved by adding, on link creation, all of the
upper_dev's upper_dev_list devices as upper devices to all of the
lower_dev's lower_dev_list devices (and to the lower_dev itself), and vice
versa. On unlink they are removed using the same logic.

I've tested it with thousands vlans/bonds/bridges, everything works ok and
no observable lags even on a huge number of interfaces.

Memory footprint for 128 devices interconnected with each other via both
upper and lower (which is impossible, but for the comparison) lists would be:

128*128*2*sizeof(netdev_adjacent) = 1.5MB

but in the real world we usualy have at most several devices with slaves
and a lot of vlans, so the footprint will be much lower.

CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Cong Wang <amwang@redhat.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5d261913

net: rename netdev_upper to netdev_adjacent · aa9d8560

Veaceslav Falico authored Aug 28, 2013

Rename the structure to reflect the upcoming addition of lower_dev_list.

CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Cong Wang <amwang@redhat.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

aa9d8560

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next · 6d508cce

David S. Miller authored Aug 29, 2013

Jeff Kirsher says:

====================
This series contains updates to ixgbe.

Jacob provides a fix for 82599 devices where it can potentially keep link
lights up when the adapter has gone down.

Mark provides a fix to resolve the possible use of uninitialized memory
by checking the return value on EEPROM reads.

Don provides 2 patches, one to fix a issue where we were traversing the
Tx ring with the value of IXGBE_NUM_RX_QUEUES which currently happens
to have the correct value but this is misleading.  A change later, could
easily make this no longer correct so when traversing the Tx ring, use
netdev->num_tx_queues.  His second patch does some minor clean ups of log
messages.

Emil provides the remaining ixgbe patches.  First he fixes the link test
where forcing the laser before the link check can lead to inconsistent
results because it does not guarantee that the link will be negotiated
correctly.  Then he initializes the message buffer array to 0 in order
to avoid using random numbers from the memory as a MAC address for the
VF.  Emil also fixes the read loop for the I2C data to account for the
offset for SFP+ modules.  Lastly, Emil provides several patches to add
support for QSFP modules where 1Gbps support is added as well as support
for older QSFP active direct attach cables which pre-date SFF-8436 v3.6.

v2: Fixed patch 4 description and added blank line based on feedback from
    Sergei Shtylyov
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

6d508cce

fec: Use NAPI_POLL_WEIGHT · 322555f5

Fabio Estevam authored Aug 27, 2013

Instead of using a custom 'FEC_NAPI_WEIGHT', just use the generic
'NAPI_POLL_WEIGHT' definition instead.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

322555f5

net: sctp: sctp_verify_init: clean up mandatory checks and add comment · 7613f5fe

Daniel Borkmann authored Aug 27, 2013

Add a comment related to RFC4960 explaning why we do not check for initial
TSN, and while at it, remove yoda notation checks and clean up code from
checks of mandatory conditions. That's probably just really minor, but makes
reviewing easier.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7613f5fe

tcp: TSO packets automatic sizing · 95bd09eb

Eric Dumazet authored Aug 27, 2013

After hearing many people over past years complaining against TSO being
bursty or even buggy, we are proud to present automatic sizing of TSO
packets.

One part of the problem is that tcp_tso_should_defer() uses an heuristic
relying on upcoming ACKS instead of a timer, but more generally, having
big TSO packets makes little sense for low rates, as it tends to create
micro bursts on the network, and general consensus is to reduce the
buffering amount.

This patch introduces a per socket sk_pacing_rate, that approximates
the current sending rate, and allows us to size the TSO packets so
that we try to send one packet every ms.

This field could be set by other transports.

Patch has no impact for high speed flows, where having large TSO packets
makes sense to reach line rate.

For other flows, this helps better packet scheduling and ACK clocking.

This patch increases performance of TCP flows in lossy environments.

A new sysctl (tcp_min_tso_segs) is added, to specify the
minimal size of a TSO packet (default being 2).

A follow-up patch will provide a new packet scheduler (FQ), using
sk_pacing_rate as an input to perform optional per flow pacing.

This explains why we chose to set sk_pacing_rate to twice the current
rate, allowing 'slow start' ramp up.

sk_pacing_rate = 2 * cwnd * mss / srtt

v2: Neal Cardwell reported a suspect deferring of last two segments on
initial write of 10 MSS, I had to change tcp_tso_should_defer() to take
into account tp->xmit_size_goal_segs
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Van Jacobson <vanj@google.com>
Cc: Tom Herbert <therbert@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

95bd09eb

ipv6: drop fragmented ndisc packets by default (RFC 6980) · b800c3b9

Hannes Frederic Sowa authored Aug 27, 2013

This patch implements RFC6980: Drop fragmented ndisc packets by
default. If a fragmented ndisc packet is received the user is informed
that it is possible to disable the check.

Cc: Fernando Gont <fernando@gont.com.ar>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

b800c3b9

ARM: at91/dt: fix phy address in sama5xmb to match the reg property · a3a975b1

Boris BREZILLON authored Aug 27, 2013

Fix phy0 address to match the reg property defined in phy0 node.
Signed-off-by: Boris BREZILLON <b.brezillon@overkiz.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a3a975b1

net/cadence/macb: fix invalid 0 return if no phy is discovered on mii init · 7daa78e3

Boris BREZILLON authored Aug 27, 2013

Replace misleading -1 (-EPERM) by a more appropriate return code (-ENXIO)
in macb_mii_probe function.
Save macb_mii_probe return before branching to err_out_unregister to avoid
erronous 0 return.
Signed-off-by: Boris BREZILLON <b.brezillon@overkiz.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7daa78e3

bridge: inherit slave devices needed_headroom · fd094808

Florian Fainelli authored Aug 27, 2013

Some slave devices may have set a dev->needed_headroom value which is
different than the default one, most likely in order to prepend a
hardware descriptor in front of the Ethernet frame to send. Whenever a
new slave is added to a bridge, ensure that we update the
needed_headroom value accordingly to account for the slave
needed_headroom value.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fd094808

net: sctp: reorder sctp_globals to reduce cacheline usage · 76bfd898

Daniel Borkmann authored Aug 26, 2013

Reduce cacheline usage from 2 to 1 cacheline for sctp_globals structure. By
reordering elements, we can close gaps and simply achieve the following:

Current situation:
  /* size: 80, cachelines: 2, members: 10 */
  /* sum members: 57, holes: 4, sum holes: 16 */
  /* padding: 7 */
  /* last cacheline: 16 bytes */

Afterwards:
  /* size: 64, cachelines: 1, members: 10 */
  /* padding: 7 */
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

76bfd898

net: mdio-sun4i: Convert to devm_* api · 03536cc3

Jisheng Zhang authored Aug 26, 2013

Use devm_ioremap_resource instead of of_iomap() and devm_kzalloc()
instead of kmalloc() to make cleanup paths simpler. This patch also
fixes the resource leak caused by missing corresponding iounamp()
of the of_iomap().
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

03536cc3

ixgbe: add support for older QSFP active DA cables · 9a84fea2

Emil Tantilov authored Aug 16, 2013

This patch adds support for QSFP active direct attach (DA) cables which
pre-date SFF-8436 v3.6.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

9a84fea2

ixgbe: include QSFP PHY types in ixgbe_is_sfp() · 987e1d56

Emil Tantilov authored Aug 14, 2013

This patch makes sure that QSFP+ modules use the SFP+ code path for
setting up link.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

987e1d56

ixgbe: add 1Gbps support for QSFP+ · 61aaf9e8

Emil Tantilov authored Aug 13, 2013

This patch adds GB speed support for QSFP+ modules.
Autonegotiation is not supported with QSFP+. The user will have to set
the desired speed on both link partners using ethtool advertise setting.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

61aaf9e8

ixgbe: fix SFF data dumps of SFP+ modules from an offset · 31c7d2b0

Emil Tantilov authored Aug 13, 2013

This patch fixes the read loop for the I2C data to account for the offset.

Also includes a whitespace cleanup and removes ret_val as it is not needed.

CC: Ben Hutchings <bhutchings@solarflare.com>
Reported-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

31c7d2b0

ixgbe: cleanup some log messages · 1b1bf31a

Don Skidmore authored Jul 31, 2013

Some minor log messages cleanup, changing the level one message is logged,
adding a bit of detail to another and put all the text on one line.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

1b1bf31a

ixgbe: zero out mailbox buffer on init · b08e1ed9

Emil Tantilov authored Jul 26, 2013

This patch initializes the msgbuf array to 0 in order to avoid using random
numbers from the memory as MAC address for the VF.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

b08e1ed9

ixgbe: fix link test when connected to 1Gbps link partner · 4ec375b1

Emil Tantilov authored Jul 10, 2013

This patch is a partial reverse of:
commit dfcc4615
Author: Jacob Keller <jacob.e.keller@intel.com>
Date: Thu Nov 8 07:07:08 2012 +0000

  ixgbe: ethtool ixgbe_diag_test cleanup

Specifically forcing the laser before the link check can lead to
inconsistent results because it does not guarantee that the link will be
negotiated correctly. Such is the case when dual speed SFP+ module is
connected to a gigabit link partner.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

4ec375b1

ixgbe: fix incorrect limit value in ring transverse · bd8a1b12

Don Skidmore authored Jun 28, 2013

We were transversing the tx_ring with IXGBE_NUM_RX_QUEUES. Now this define
happens to have the correct value but this is misleading and a change later
could easily make this no longer true. I updated it to netdev->num_tx_queues
like we use in ixgbe_get_strings().
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

bd8a1b12

ixgbe: Check return value on eeprom reads · be0c27b4

Mark Rustad authored May 24, 2013

This patch fixes the possible use of uninitialized memory by checking the
return value on eeprom reads. These issues were identified by static
analysis. In many cases error messages will be produced so that corrupted
eeprom issues will be more visible.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

be0c27b4

ixgbe: disable link when adapter goes down · f4f1040a

Jacob Keller authored Jun 25, 2013

This patch fixes an issue with the 82599 adapter where it can potentially keep
link lights up when the adapter has gone down. The patch adds a function which
ensures link is disabled, and calls this function when the adapter transitions
to a down state.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

f4f1040a

Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next · 4c9d546f

David S. Miller authored Aug 29, 2013

Ben Hutchings says:

====================
1. Further cleanup and refactoring in preparation for EF10.
2. Remove ethtool stats that are always zero on Falcon boards.
3. Add an ethtool stat for merged TX completions.
4. Prepare to support merged RX completions.
5. Prepare to support more hwmon sensors.
6. Add support for new events that are generated by EF10 firmware.
7. Update MC reboot detection for EF10.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

4c9d546f

Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge · cc328dea

David S. Miller authored Aug 29, 2013

Included changes:
- set the protocol field in the skb structure according to the encapsulated
  payload
- make the gateway component send a uevent in case of "gw client mode"
  de-selection
- increment version number
- minor code rearrangement
Signed-off-by: David S. Miller <davem@davemloft.net>

cc328dea

qlcnic: underflow in qlcnic_validate_max_tx_rings() · 80b17be7

Dan Carpenter authored Aug 27, 2013

This function checks the upper bound but it doesn't check for negative
numbers:

	if (txq > QLCNIC_MAX_TX_RINGS) {

I've solved this by making "txq" a u32 type.  I chose that because
->tx_count in the ethtool_channels struct is a __u32.

This bug was added in aa4a1f7d ('qlcnic: Enable Tx queue changes using
ethtool for 82xx Series adapter.').
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

80b17be7

Merge branch 'xen-netback' · 823a19e0

David S. Miller authored Aug 29, 2013

Wei Liu says:

====================
xen-netback: switch to NAPI + kthread 1:1 model

This series implements NAPI + kthread 1:1 model for Xen netback.

This model
 - provides better scheduling fairness among vifs
 - is prerequisite for implementing multiqueue for Xen network driver

The second patch has the real meat:
 - make use of NAPI to mitigate interrupt
 - kthreads are not bound to CPUs any more, so that we can take
   advantage of backend scheduler and trust it to do the right thing

Benchmark is done on a Dell T3400 workstation with 4 cores, running 4
DomUs. Netserver runs in Dom0. DomUs do netperf to Dom0 with
following command: /root/netperf -H Dom0 -fm -l120

IRQs are distributed to 4 cores by hand in the new model, while in the
old model vifs are automatically distributed to 4 kthreads.

* New model
%Cpu0  :  0.5 us, 20.3 sy,  0.0 ni, 28.9 id,  0.0 wa,  0.0 hi, 24.4 si, 25.9 st
%Cpu1  :  0.5 us, 17.8 sy,  0.0 ni, 28.8 id,  0.0 wa,  0.0 hi, 27.7 si, 25.1 st
%Cpu2  :  0.5 us, 18.8 sy,  0.0 ni, 30.7 id,  0.0 wa,  0.0 hi, 22.9 si, 27.1 st
%Cpu3  :  0.0 us, 20.1 sy,  0.0 ni, 30.4 id,  0.0 wa,  0.0 hi, 22.7 si, 26.8 st
Throughputs: 2027.89 2025.95 2018.57 2016.23 aggregated: 8088.64

* Old model
%Cpu0  :  0.5 us, 68.8 sy,  0.0 ni, 16.1 id,  0.5 wa,  0.0 hi,  2.8 si, 11.5 st
%Cpu1  :  0.4 us, 45.1 sy,  0.0 ni, 31.1 id,  0.4 wa,  0.0 hi,  2.1 si, 20.9 st
%Cpu2  :  0.9 us, 44.8 sy,  0.0 ni, 30.9 id,  0.0 wa,  0.0 hi,  1.3 si, 22.2 st
%Cpu3  :  0.8 us, 46.4 sy,  0.0 ni, 28.3 id,  1.3 wa,  0.0 hi,  2.1 si, 21.1 st
Throughputs: 1899.14 2280.43 1963.33 1893.47 aggregated: 8036.37

We can see that the impact is mainly on CPU usage. The new model moves
processing from kthread to NAPI (software interrupt).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

823a19e0

xen-netback: rename functions · 7376419a

Wei Liu authored Aug 26, 2013

As we move to 1:1 model and melt xen_netbk and xenvif together, it would
be better to use single prefix for all functions in xen-netback.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7376419a

xen-netback: switch to NAPI + kthread 1:1 model · b3f980bd

Wei Liu authored Aug 26, 2013

This patch implements 1:1 model netback. NAPI and kthread are utilized
to do the weight-lifting job:

- NAPI is used for guest side TX (host side RX)
- kthread is used for guest side RX (host side TX)

Xenvif and xen_netbk are made into one structure to reduce code size.

This model provides better scheduling fairness among vifs. It is also
prerequisite for implementing multiqueue for Xen netback.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b3f980bd

xen-netback: remove page tracking facility · 43e9d194

Wei Liu authored Aug 26, 2013

The data flow from DomU to DomU on the same host in current copying
scheme with tracking facility:

       copy
DomU --------> Dom0          DomU
 |                            ^
 |____________________________|
             copy

The page in Dom0 is a page with valid MFN. So we can always copy from
page Dom0, thus removing the need for a tracking facility.

       copy           copy
DomU --------> Dom0 -------> DomU

Simple iperf test shows no performance regression (obviously we copy
twice either way):

  W/  tracking: ~5.3Gb/s
  W/o tracking: ~5.4Gb/s
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Matt Wilson <msw@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

43e9d194

28 Aug, 2013 7 commits

batman-adv: send GW_DEL event when the gw client mode is deselected · c6eaa3f0

Antonio Quartulli authored Jul 13, 2013

Whenever the GW client mode is deselected, a DEL event has
to be sent in order to tell userspace that the current
gateway has been lost. Send the uevent on state change only
if a gateway was currently selected.
Reported-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>

c6eaa3f0

batman-adv: Start new development cycle · c00a072d

Simon Wunderlich authored Jul 21, 2013

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>

c00a072d

batman-adv: move enum definition at the top of the file · 791c2a2d
Antonio Quartulli authored Aug 17, 2013
```
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
```
791c2a2d

batman-adv: set skb priority according to content · c54f38c9

Simon Wunderlich authored Jul 29, 2013

The skb priority field may help the wireless driver to choose the right
queue (e.g. WMM queues). This should be set in batman-adv, as this
information is only available here.

This patch adds support for IPv4/IPv6 DS fields and VLAN PCP. Note that
only VLAN PCP is used if a VLAN header is present. Also initially set
TC_PRIO_CONTROL only for self-generated packets, and keep the priority
set by higher layers.
Signed-off-by: Simon Wunderlich <simon@open-mesh.com>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>

c54f38c9

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch · 5b2941b1

David S. Miller authored Aug 27, 2013

Jesse Gross says:

====================
A number of significant new features and optimizations for net-next/3.12.
Highlights are:
 * "Megaflows", an optimization that allows userspace to specify which
   flow fields were used to compute the results of the flow lookup.
   This allows for a major reduction in flow setups (the major
   performance bottleneck in Open vSwitch) without reducing flexibility.
 * Converting netlink dump operations to use RCU, allowing for
   additional parallelism in userspace.
 * Matching and modifying SCTP protocol fields.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

5b2941b1

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · b6750b40

David S. Miller authored Aug 27, 2013

Pablo Neira Ayuso says:

====================
The following patchset contains Netfilter updates for your net-next tree,
they are:

* The new SYNPROXY target for iptables, including IPv4 and IPv6 support,
  from Patrick McHardy.

* nf_defrag_ipv6.o should be only linked to nf_defrag_ipv6.ko, from
  Nathan Hintz.

* Fix an old bug in REJECT, which replies with wrong MAC source address
  from the bridge, by Phil Oester.

* Fix uninitialized helper variable in the expectation support over
  nfnetlink_queue, from Florian Westphal.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

b6750b40

Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next · 45cc3a0c

David S. Miller authored Aug 27, 2013

Ben Hutchings says:

====================
More refactoring and cleanup, particularly around filter management.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

45cc3a0c

27 Aug, 2013 4 commits

netfilter: ctnetlink: fix uninitialized variable · b7e092c0

Florian Westphal authored Aug 27, 2013

net/netfilter/nf_conntrack_netlink.c: In function 'ctnetlink_nfqueue_attach_expect':
'helper' may be used uninitialized in this function

It was only initialized in if CTA_EXPECT_HELP_NAME attribute was
present, it must be NULL otherwise.

Problem added recently in bd077937
(netfilter: nfnetlink_queue: allow to attach expectations to conntracks).
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

b7e092c0

netfilter: add IPv6 SYNPROXY target · 4ad36228

Patrick McHardy authored Aug 27, 2013

Add an IPv6 version of the SYNPROXY target. The main differences to the
IPv4 version is routing and IP header construction.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Tested-by: Martin Topholm <mph@one.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

4ad36228

net: syncookies: export cookie_v6_init_sequence/cookie_v6_check · 81eb6a14

Patrick McHardy authored Aug 27, 2013

Extract the local TCP stack independant parts of tcp_v6_init_sequence()
and cookie_v6_check() and export them for use by the upcoming IPv6 SYNPROXY
target.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: David S. Miller <davem@davemloft.net>
Tested-by: Martin Topholm <mph@one.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

81eb6a14

netfilter: add SYNPROXY core/target · 48b1de4c

Patrick McHardy authored Aug 27, 2013

Add a SYNPROXY for netfilter. The code is split into two parts, the synproxy
core with common functions and an address family specific target.

The SYNPROXY receives the connection request from the client, responds with
a SYN/ACK containing a SYN cookie and announcing a zero window and checks
whether the final ACK from the client contains a valid cookie.

It then establishes a connection to the original destination and, if
successful, sends a window update to the client with the window size
announced by the server.

Support for timestamps, SACK, window scaling and MSS options can be
statically configured as target parameters if the features of the server
are known. If timestamps are used, the timestamp value sent back to
the client in the SYN/ACK will be different from the real timestamp of
the server. In order to now break PAWS, the timestamps are translated in
the direction server->client.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Tested-by: Martin Topholm <mph@one.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

48b1de4c