Commits · 995dca4ce9dddf48597bd3e0427447acd4509f1d · nexedi / linux

18 Mar, 2014 15 commits

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next · 995dca4c

David S. Miller authored Mar 18, 2014

Steffen Klassert says:

====================
One patch to rename a newly introduced struct. The rest is
the rework of the IPsec virtual tunnel interface for ipv6 to
support inter address family tunneling and namespace crossing.

1) Rename the newly introduced struct xfrm_filter to avoid a
   conflict with iproute2. From Nicolas Dichtel.

2) Introduce xfrm_input_afinfo to access the address family
   dependent tunnel callback functions properly.

3) Add and use a IPsec protocol multiplexer for ipv6.

4) Remove dst_entry caching. vti can lookup multiple different
   dst entries, dependent of the configured xfrm states. Therefore
   it does not make to cache a dst_entry.

5) Remove caching of flow informations. vti6 does not use the the
   tunnel endpoint addresses to do route and xfrm lookups.

6) Update the vti6 to use its own receive hook.

7) Remove the now unused xfrm_tunnel_notifier. This was used from vti
   and is replaced by the IPsec protocol multiplexer hooks.

8) Support inter address family tunneling for vti6.

9) Check if the tunnel endpoints of the xfrm state and the vti interface
   are matching and return an error otherwise.

10) Enable namespace crossing for vti devices.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

995dca4c

net/i40e: Avoid double setting of NETIF_F_SG for the HW encapsulation feature mask · d70e941b

Or Gerlitz authored Mar 18, 2014

The networking core does it for the driver during registration time.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d70e941b

i40evf: Rename i40e_ptype_lookup i40evf_ptype_lookup · 37a622c1

Eric W Biederman authored Mar 18, 2014

When compiling the i40e and the i40evf driver into the same kernel I get:
LD      drivers/net/ethernet/intel/built-in.o
drivers/net/ethernet/intel/i40evf/built-in.o:(.data+0x300): multiple definition of `i40e_ptype_lookup'
drivers/net/ethernet/intel/i40e/built-in.o:(.data+0x780): first defined here
make[3]: *** [drivers/net/ethernet/intel/built-in.o] Error 1
make[2]: *** [drivers/net/ethernet/intel] Error 2
make[1]: *** [drivers/net/ethernet/] Error 2
make: *** [sub-make] Error 2

Fix this by renaming the i40evf version of this structure from
i40e_ptype_lookup to i40evf_ptype_lookup.

This build failure was introduced in:
  commit 206812b5
  Author: Jesse Brandeburg <jesse.brandeburg@intel.com>
  i40e/i40evf: i40e implementation for skb_set_hash

Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Eric W Biederman <ebiederm@xmission.com>
Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

37a622c1

e1000e: fix the build error when PM is disabled · 72f72dcc

Kevin Hao authored Mar 18, 2014

The commit 28002099 (e1000e: Refactor PM flows) changed the
SET_SYSTEM_SLEEP_PM_OPS to open-coded assignment, but forgot to
protect them with CONFIG_PM_SLEEP. Then cause the following build
error when PM is disabled:
drivers/net/ethernet/intel/e1000e/netdev.c:7079:13:
error: 'e1000e_pm_suspend' undeclared here (not in a function)
  .suspend = e1000e_pm_suspend,
             ^
drivers/net/ethernet/intel/e1000e/netdev.c:7080:13:
error: 'e1000e_pm_resume' undeclared here (not in a function)
  .resume  = e1000e_pm_resume,
             ^
drivers/net/ethernet/intel/e1000e/netdev.c:7082:11:
error: 'e1000e_pm_thaw' undeclared here (not in a function)
  .thaw  = e1000e_pm_thaw,
           ^
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

72f72dcc

igb: remove references to long gone command line parameters · 406d4965

Fernando Luis Vazquez Cao authored Mar 18, 2014

Command line parameters QueuePairs, Node, EEE, DMAC and InterruptThrottleRate
do not exist these days. Remove all references to them in the Documentation
folder and update code comments.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

406d4965

Merge branch 'altera_tse' · 33125df3

David S. Miller authored Mar 17, 2014

Vince Bridgers says:

====================
Altera Triple Speed Ethernet (TSE) Driver

This is the version 6 submission for the Altera Triple Speed Ethernet (TSE)
driver. All comments received during the version 2, 3, 4, and 5 submissions
have been accepted. Please find the change log and a description of the
submission below.

If you find the submission acceptable, please consider this patch set for
inclusion into the Linux kernel.

V6: Address comments from V5 review
    - add call to skb_tx_timestamp in the drivers transmit path
    - correct use of unsigned int where it was cast to pointer. Use types
      appropriate for intended and correct use to let the compiler warn us
      when type usage is incorrect.
    - use correct semantics for pointer arithmetic in same code path

V5: Address comments from V4 review
    - Add descriptions of statistics to driver documentation. The statstics
      supported by the driver/controller map to IEEE and RFC statistics, and
      the names and mappings are described in the user documentation.
    - Change "unsigned int" to u32 in device structure definitions
    - Change used of netdev_warn to netif_warn in altera_sgdma.c
    - Change stat name rx_fifo_drops to ether_drops to match the event
      actually counted by the hardware.

V4: Address comments from V3 review
    - Change statistics names in ethtool module to follow common use in
      other ethernet drivers.
    - remove an unnecessary case in ethtool module
    - change logging to use netdev_* where possible instead of dev_*
    - remove logging for OOM errors since those are already logged

V3: Address comments from V2 review
    - Reorder patch submission so that net/ethernet Makefile and Kconfig
      are committed last, thus not breaking bisect
    - Use of_get_mac_address instead of of_get_property
    - Change supplemental and hash configuration bindings to boolean/empty,
      and more meaningful names
    - Add check for failure from calls to of_phy_connect and
      connect_local_phy
    - Correct code to find mdio child node
    - Update bindings document
    - Remove cast to u64 when not necessary
    - add use of const for statistics strings

V2: Address comments from initial RFC review.
    - The driver files were broken up by major sections of functionality.
      These include MSGDMA, SGDMA, Misc, and Main.
    - Add patch for MAINTAINERS file, add the maintainer for this submission
    - Use 32-bit lower/upper physical address accessor functions so the driver
      is 64-bit ready.
    - Use standard bindings where applicable. Especially phy-addr, and change
      "altr,rx-fifo-depth" to "rx-fifo-depth" and "altr,tx-fifo-depth" to
      "tx-fifo-depth".
    - Add use of max-frame-size property
    - Update bindings documents accordingly
    - Correct interrupt handler to use budget parameter in the convential way
    - Use macros consistently to define bit fields across files
    - Correct include exclusion macro in altera_msgdmahw.h (typo)
    - Remove use of barriers, these were not necessary since the DMA APIs
      ensure memory & buffer consistency
    - Remove use of netif_carrier_off in driver
    - move probing of phy from the open function to the probe function
    - use of_get_phy_mode instead of custom function
    - Use the .data field in the device structure to obtain a pointer
      to SGDMA or MSGDMA device specific properties and functions.
    - remove custom function to access devicetree since Altera specific
      bindings requiring it's use have been deprecated in favor of
      standard bindings.

The Altera TSE is a 10/100/1000 Mbps Ethernet soft IP component that can be
configured and synthesized using Quartus, and programmed into Altera FPGAs.
Two types of soft DMA IP components are supported by this driver - the Altera
SGDMA and the MSGDMA. The MSGDMA DMA component is preferred over the SGDMA,
since the SGDMA will be deprecated in favor of the MSGDMA. Software supporting
both is provided for customers still using the SGDMA and to demonstrate how
multiple types of DMA engines may be supported by the TSE driver in the event
customers wish to develop their own custom soft DMA engine for particular
applications.

The design has been tested on Altera's Cyclone 4, 5, and Cyclone 5 SOC
development kits using an ARM A9 processor and an Altera NIOS2 processor.
Differences in CPU/DMA coherency management and address alignment are
addressed by proper use of driver APIs and semantics.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

33125df3

net: ethernet: Change Ethernet Makefile and Kconfig for Altera TSE driver · f7b18249

Vince Bridgers authored Mar 17, 2014

This patch changes the Ethernet Makefile and Kconfig files to add the Altera
Ethernet driver component.
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f7b18249

MAINTAINERS: Add entry for Altera Triple Speed Ethernet Driver · 16b8b922

Vince Bridgers authored Mar 17, 2014

Add a MAINTAINERS entry covering the Altera Triple Speed
Ethernet Driver, with support for the MSGDMA and SGDMA
soft DMA IP components.
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

16b8b922

Altera TSE: Add Altera Ethernet Driver Makefile and Kconfig · ed33ef64

Vince Bridgers authored Mar 17, 2014

This patch adds the Altera Triple Speed Ethernet Makfile and
Kconfig file.
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ed33ef64

Altera TSE: Add main and header file for Altera Ethernet Driver · bbd2190c

Vince Bridgers authored Mar 17, 2014

This patch adds the main driver and header file for the Altera Triple
Speed Ethernet driver.
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bbd2190c

Altera TSE: Add Miscellaneous Files for Altera Ethernet Driver · 6c3324a9

Vince Bridgers authored Mar 17, 2014

This patch adds miscellaneous files for the Altera Ethernet Driver,
including ethtool support.
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6c3324a9

Altera TSE: Add Altera Ethernet Driver SGDMA file components · f64f8808

Vince Bridgers authored Mar 17, 2014

This patch adds the SGDMA soft IP support for the Altera Triple
Speed Ethernet driver.
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f64f8808

Altera TSE: Add Altera Ethernet Driver MSGDMA File Components · 94fb0ef4

Vince Bridgers authored Mar 17, 2014

This patch adds the MSGDMA soft IP support for the Altera Triple
Speed Ethernet driver.
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

94fb0ef4

Documentation: networking: Add Altera Ethernet (TSE) Documentation · 04add4ab

Vince Bridgers authored Mar 17, 2014

This patch adds a bindings description for the Altera Triple Speed Ethernet
(TSE) driver. The bindings support the legacy SGDMA soft IP as well as the
preferred MSGDMA soft IP. The TSE can be configured and synthesized in soft
logic using Altera's Quartus toolchain. Please consult the bindings document
for supported options.
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

04add4ab

dts: Add bindings for the Altera Triple Speed Ethernet driver · d6da06fc

Vince Bridgers authored Mar 17, 2014

This patch adds a bindings description for the Altera Triple Speed Ethernet
(TSE) driver. The bindings support the legacy SGDMA soft IP as well as the
preferred MSGDMA soft IP. The TSE can be configured and synthesized in soft
logic using Altera's Quartus toolchain. Please consult the bindings document
for supported options.
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d6da06fc

17 Mar, 2014 25 commits

netfilter: conntrack: Fix UP builds · d5d20912

Eric Dumazet authored Mar 17, 2014

ARRAY_SIZE(nf_conntrack_locks) is undefined if spinlock_t is an
empty structure. Replace it by CONNTRACK_LOCKS

Fixes: 93bb0ceb ("netfilter: conntrack: remove central spinlock nf_conntrack_lock")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

d5d20912

Merge branch 'at86rf230' · e73087af

David S. Miller authored Mar 17, 2014

Alexander Aring says:

====================
at86rf230: various fixes and devicetree support

this patch series fix some bugs with the at86rf231 chip and cleaup some code.
Also add devicetree support for the at86rf230 driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

e73087af

at86rf230: add support for devicetree · fa2d3e94

Alexander Aring authored Mar 15, 2014

This patch adds devicetree support for the at86rf230 driver.

Possible gpios to configure are "reset-gpio" and "sleep-gpio".
Also add support to configure the "irq-type" for the irq polarity
register.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fa2d3e94

at86rf230: make reset pin optionally · 3fa27571

Alexander Aring authored Mar 15, 2014

This patch make the reset pin optionally. Some devices like the atben
from qi-hardware don't have a reset pin externally. The usually way is
to turn power off/on for the atben device to initiate a device reset.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3fa27571

at86rf230: change reset timings · 56f023fb

Alexander Aring authored Mar 15, 2014

While checkpatch another patch I got a:

"WARNING: msleep < 20ms can sleep for up to 20ms"

The datasheet of at86rf231 and at86rf212 says a minimum delay for reset
pulse width and spi access latency after reset is 625 nanoseconds.

This patch removes the 1 milliseconds sleep and replace it with a 1
microseconds udelay which should be also okay for the reset pulse width.

To change the state from RESET -> TRX_OFF the at86rf230 device needs 120
microseconds, this is a worst case of all at86rf* chips.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

56f023fb

at86rf230: move locking state in xmit · 7e814618

Alexander Aring authored Mar 15, 2014

There is no need to lock the clearing of IRQ_TRX_END in status.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7e814618

at86rf230: fix unexpected state change · 7332fcb8

Alexander Aring authored Mar 15, 2014

This patch fix a unexpected state change for the at86rf231 chip.
We can't change into STATE_FORCE_TX_ON while the chip is in one of
SLEEP, P_ON, RESET, TRX_OFF, and all *_NOCLK states.

In this case we are in the TRX_OFF state. See datasheet [1] page 71 for
more information.

Without this patch you will get the following message on a at86rf231 device:

[   20.065218] unexpected state change: 8, asked for 4
[   20.070527] ------------[ cut here ]------------
[   20.075414] WARNING: CPU: 0 PID: 160 at net/mac802154/ieee802154_dev.c:43 mac802154_slave_open+0x70/0xb8()
[   20.085594] Modules linked in: autofs4
[   20.089667] CPU: 0 PID: 160 Comm: ifconfig Not tainted 3.14.0-20140108-1-00993-g905c192 #162
[   20.098612] [<c00127b8>] (unwind_backtrace) from [<c0010b1c>] (show_stack+0x10/0x14)
[   20.106819] [<c0010b1c>] (show_stack) from [<c0033838>] (warn_slowpath_common+0x60/0x80)
[   20.115311] [<c0033838>] (warn_slowpath_common) from [<c00338e8>] (warn_slowpath_null+0x18/0x20)
[   20.124590] [<c00338e8>] (warn_slowpath_null) from [<c057b7e8>] (mac802154_slave_open+0x70/0xb8)
[   20.133880] [<c057b7e8>] (mac802154_slave_open) from [<c0488a58>] (__dev_open+0xa8/0x108)
[   20.142553] [<c0488a58>] (__dev_open) from [<c0488cb0>] (__dev_change_flags+0x8c/0x148)
[   20.151051] [<c0488cb0>] (__dev_change_flags) from [<c0488d84>] (dev_change_flags+0x18/0x48)
[   20.159968] [<c0488d84>] (dev_change_flags) from [<c04e2e9c>] (devinet_ioctl+0x2b0/0x63c)
[   20.168623] [<c04e2e9c>] (devinet_ioctl) from [<c04712e4>] (sock_ioctl+0x23c/0x29c)
[   20.176727] [<c04712e4>] (sock_ioctl) from [<c00e3cb8>] (do_vfs_ioctl+0x4a8/0x578)
[   20.184671] [<c00e3cb8>] (do_vfs_ioctl) from [<c00e3dd4>] (SyS_ioctl+0x4c/0x78)
[   20.192402] [<c00e3dd4>] (SyS_ioctl) from [<c000da00>] (ret_fast_syscall+0x0/0x48)
[   20.200392] ---[ end trace 9a34542f4ea08e47 ]---

This patch was tested on at86rf231 and at86rf212.

[1] http://www.atmel.com/images/doc8111.pdfSigned-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7332fcb8

Merge branch 'sh_eth' · d5c065e3

David S. Miller authored Mar 17, 2014

Sergei Shtylyov says:

====================
Beautify 'sh_eth' driver's messages

This patchset converts te driver to using netdev_*() and netif_*() to print out
its messages whenever possible.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

d5c065e3

sh_eth: fold netif_msg_*() and netdev_*() calls into netif_*() invocations · 8d5009f6

Sergei Shtylyov authored Mar 15, 2014

Now that we call netdev_*() under netif_msg_*() checks, we can fold these into
netif_*() macro invocations.
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8d5009f6

sh_eth: convert dev_*() to netdev_*() calls · da246855

Sergei Shtylyov authored Mar 15, 2014

Convert dev_*(&ndev->dev, ...) to netdev_*(ndev, ...) calls since they are a bit
shorter and at the same time give more information on a device.
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

da246855

sh_eth: convert pr_*() to netdev_*() calls · f75f14ec

Sergei Shtylyov authored Mar 15, 2014

Convert pr_*() to netdev_*() calls as the latter provide info on a device.
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f75f14ec

sh_eth: exit probe with unknown register layout · 264be2f5

Sergei Shtylyov authored Mar 15, 2014

Exit the driver's probe() method when the register layout is unknown as the
driver would cause kernel oops in this case anyway.

While at it, move the corresponding error message printout and convert it from
pr_err() to dev_err().
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

264be2f5

Merge branch 'netpoll-next' · 7c169445

David S. Miller authored Mar 17, 2014

Eric W. Biederman says:

====================
netpoll: Cleanup received packet processing

This is the long-winded, careful, and polite version of removing the netpoll
receive packet processing.

First I untangle the code in small steps.  Then I modify the code to not
force reception and dropping of packets when we are transmiting a packet
with netpoll.  Finally I move all of the packet reception under
CONFIG_NETPOLL_TRAP and delete CONFIG_NETPOLL_TRAP.

If someone wants to do a stable backport of these patches, it would
require backporting the first 18 patches that handle the budget == 0 in
the networking drivers, and the first 6 of these patches.

If anyone wants to resurrect netpoll packet reception someday it should
just be a matter of reverting the last patch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

7c169445

netpoll: Remove dead packet receive code (CONFIG_NETPOLL_TRAP) · 9c62a68d

Eric W. Biederman authored Mar 14, 2014

The netpoll packet receive code only becomes active if the netpoll
rx_skb_hook is implemented, and there is not a single implementation
of the netpoll rx_skb_hook in the kernel.

All of the out of tree implementations I have found all call
netpoll_poll which was removed from the kernel in 2011, so this
change should not add any additional breakage.

There are problems with the netpoll packet receive code.  __netpoll_rx
does not call dev_kfree_skb_irq or dev_kfree_skb_any in hard irq
context.  netpoll_neigh_reply leaks every skb it receives.  Reception
of packets does not work successfully on stacked devices (aka bonding,
team, bridge, and vlans).

Given that the netpoll packet receive code is buggy, there are no
out of tree users that will be merged soon, and the code has
not been used for in tree for a decade let's just remove it.

Reverting this commit can server as a starting point for anyone
who wants to resurrect netpoll packet reception support.
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9c62a68d

netpoll: Move all receive processing under CONFIG_NETPOLL_TRAP · e1bd4d3d

Eric W. Biederman authored Mar 14, 2014

Make rx_skb_hook, and rx in struct netpoll depend on
CONFIG_NETPOLL_TRAP Make rx_lock, rx_np, and neigh_tx in struct
netpoll_info depend on CONFIG_NETPOLL_TRAP

Make the functions netpoll_rx_on, netpoll_rx, and netpoll_receive_skb
no-ops when CONFIG_NETPOLL_TRAP is not set.

Only build netpoll_neigh_reply, checksum_udp service_neigh_queue,
pkt_is_ns, and __netpoll_rx when CONFIG_NETPOLL_TRAP is defined.

Add helper functions netpoll_trap_setup, netpoll_trap_setup_info,
netpoll_trap_cleanup, and netpoll_trap_cleanup_info that initialize
and cleanup the struct netpoll and struct netpoll_info receive
specific fields when CONFIG_NETPOLL_TRAP is enabled and do nothing
otherwise.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e1bd4d3d

netpoll: Consolidate neigh_tx processing in service_neigh_queue · 18b37535

Eric W. Biederman authored Mar 14, 2014

Move the bond slave device neigh_tx handling into service_neigh_queue.

In connection with neigh_tx processing remove unnecessary tests of
a NULL netpoll_info.  As the netpoll_poll_dev has already used
and thus verified the existince of the netpoll_info.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

18b37535

netpoll: Move netpoll_trap under CONFIG_NETPOLL_TRAP · ad8d4752

Eric W. Biederman authored Mar 14, 2014

Now that we no longer need to receive packets to safely drain the
network drivers receive queue move netpoll_trap and netpoll_set_trap
under CONFIG_NETPOLL_TRAP

Making netpoll_trap and netpoll_set_trap noop inline functions
when CONFIG_NETPOLL_TRAP is not set.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ad8d4752

netpoll: Don't drop all received packets. · b6bacd55

Eric W. Biederman authored Mar 14, 2014

Change the strategy of netpoll from dropping all packets received
during netpoll_poll_dev to calling napi poll with a budget of 0
(to avoid processing drivers rx queue), and to ignore packets received
with netif_rx (those will safely be placed on the backlog queue).

All of the netpoll supporting drivers have been reviewed to ensure
either thay use netif_rx or that a budget of 0 is supported by their
napi poll routine and that a budget of 0 will not process the drivers
rx queues.

Not dropping packets makes NETPOLL_RX_DROP unnecesary so it is removed.

npinfo->rx_flags is removed  as rx_flags with just the NETPOLL_RX_ENABLED
flag becomes just a redundant mirror of list_empty(&npinfo->rx_np).
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b6bacd55

netpoll: Add netpoll_rx_processing · ff607631

Eric W. Biederman authored Mar 14, 2014

Add a helper netpoll_rx_processing that reports when netpoll has
receive side processing to perform.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ff607631

netpoll: Warn if more packets are processed than are budgeted · e97dc3fc

Eric W. Biederman authored Mar 14, 2014

There is already a warning for this case in the normal netpoll path,
but put a copy here in case how netpoll calls the poll functions
causes a differenet result.

netpoll will shortly call the napi poll routine with a budget 0 to
avoid any rx packets being processed.  As nothing does that today
we may encounter drivers that have problems so a netpoll specific
warning seems desirable.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e97dc3fc

netpoll: Visit all napi handlers in poll_napi · eb8143b4

Eric W. Biederman authored Mar 14, 2014

In poll_napi loop through all of the napi handlers even when the
budget falls to 0 to ensure that we process all of the tx_queues, and
so that we continue to call into drivers when our initial budget is 0.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

eb8143b4

netpoll: Pass budget into poll_napi · 9852fbec

Eric W. Biederman authored Mar 14, 2014

This moves the control logic to the top level in netpoll_poll_dev
instead of having it dispersed throughout netpoll_poll_dev,
poll_napi and poll_one_napi.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9852fbec

netpoll: move setting of NETPOLL_RX_DROP into netpoll_poll_dev · b249b51b

Eric W. Biederman authored Mar 14, 2014

Today netpoll depends on setting NETPOLL_RX_DROP before networking
drivers receive packets in interrupt context so that the packets can
be dropped. Move this setting into netpoll_poll_dev from
poll_one_napi so that if ndo_poll_controller happens to receive
packets we will drop the packets on the floor instead of letting the
packets bounce through the networking stack and potentially cause problems.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b249b51b

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · e86e180b

David S. Miller authored Mar 17, 2014

Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for net-next,
most relevantly they are:

* cleanup to remove double semicolon from stephen hemminger.

* calm down sparse warning in xt_ipcomp, from Fan Du.

* nf_ct_labels support for nf_tables, from Florian Westphal.

* new macros to simplify rcu dereferences in the scope of nfnetlink
  and nf_tables, from Patrick McHardy.

* Accept queue and drop (including reason for drop) to verdict
  parsing in nf_tables, also from Patrick.

* Remove unused random seed initialization in nfnetlink_log, from
  Florian Westphal.

* Allow to attach user-specific information to nf_tables rules, useful
  to attach user comments to rule, from me.

* Return errors in ipset according to the manpage documentation, from
  Jozsef Kadlecsik.

* Fix coccinelle warnings related to incorrect bool type usage for ipset,
  from Fengguang Wu.

* Add hash:ip,mark set type to ipset, from Vytas Dauksa.

* Fix message for each spotted by ipset for each netns that is created,
  from Ilia Mirkin.

* Add forceadd option to ipset, which evicts a random entry from the set
  if it becomes full, from Josh Hunt.

* Minor IPVS cleanups and fixes from Andi Kleen and Tingwei Liu.

* Improve conntrack scalability by removing a central spinlock, original
  work from Eric Dumazet. Jesper Dangaard Brouer took them over to address
  remaining issues. Several patches to prepare this change come in first
  place.

* Rework nft_hash to resolve bugs (leaking chain, missing rcu synchronization
  on element removal, etc. from Patrick McHardy.

* Restore context in the rule deletion path, as we now release rule objects
  synchronously, from Patrick McHardy. This gets back event notification for
  anonymous sets.

* Fix NAT family validation in nft_nat, also from Patrick.

* Improve scalability of xt_connlimit by using an array of spinlocks and
  by introducing a rb-tree of hashtables for faster lookup of accounted
  objects per network. This patch was preceded by several patches and
  refactorizations to accomodate this change including the use of kmem_cache,
  from Florian Westphal.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

e86e180b

netfilter: connlimit: use rbtree for per-host conntrack obj storage · 7d084877

Florian Westphal authored Mar 12, 2014

With current match design every invocation of the connlimit_match
function means we have to perform (number_of_conntracks % 256) lookups
in the conntrack table [ to perform GC/delete stale entries ].
This is also the reason why ____nf_conntrack_find() in perf top has
> 20% cpu time per core.

This patch changes the storage to rbtree which cuts down the number of
ct objects that need testing.

When looking up a new tuple, we only test the connections of the host
objects we visit while searching for the wanted host/network (or
the leaf we need to insert at).

The slot count is reduced to 32.  Increasing slot count doesn't
speed up things much because of rbtree nature.

before patch (50kpps rx, 10kpps tx):
+  20.95%  ksoftirqd/0  [nf_conntrack] [k] ____nf_conntrack_find
+  20.50%  ksoftirqd/1  [nf_conntrack] [k] ____nf_conntrack_find
+  20.27%  ksoftirqd/2  [nf_conntrack] [k] ____nf_conntrack_find
+   5.76%  ksoftirqd/1  [nf_conntrack] [k] hash_conntrack_raw
+   5.39%  ksoftirqd/2  [nf_conntrack] [k] hash_conntrack_raw
+   5.35%  ksoftirqd/0  [nf_conntrack] [k] hash_conntrack_raw

after (90kpps, 51kpps tx):
+  17.24%       swapper  [nf_conntrack]    [k] ____nf_conntrack_find
+   6.60%   ksoftirqd/2  [nf_conntrack]    [k] ____nf_conntrack_find
+   2.73%       swapper  [nf_conntrack]    [k] hash_conntrack_raw
+   2.36%       swapper  [xt_connlimit]    [k] count_tree

Obvious disadvantages to previous version are the increase in code
complexity and the increased memory cost.

Partially based on Eric Dumazets fq scheduler.
Reviewed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

7d084877