- 08 Oct, 2015 40 commits
-
Neerav Parikh authored
This patch adds capability to query and store the CEE DCBX DesiredCfg and RemoteCfg data from the LLDP MIB. Added new member "desired_dcbx_config" in the i40e_hw data structure to hold CEE only DesiredCfg data. Change-ID: I19c550369594384eaff4cc63e690ca740231195d Signed-off-by: Neerav Parikh <neerav.parikh@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Neerav Parikh authored
This patch adds parsing for CEE DCBX TLVs from the LLDP MIB. While the driver gets the DCB CEE operational configuration from firmware using the "Get CEE DCBX Oper Config" AQ command, there is also a need to get the CEE DesiredCfg transmitted by firmware and the DCB configuration received from the peer, for debug and other application purposes. Change-ID: I9140edf1a25a2852c7eff805d81e5eff6266178d Signed-off-by: Neerav Parikh <neerav.parikh@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Mitch Williams authored
Under certain circumstances, the device may not have enough resources to enable all of the VFs that it advertises in config space. Although the number of supported VFs is reported upon driver init, it is not obvious when this is different from the number reported in config space. To eliminate this confusion, add an error message explaining the problem. Additionally, move the 'Allocating VFs' message down below the error checks so as to prevent further confusion. Change-ID: I45b7efca53a7aebf7777be33a8bc9d615ae48ea1 Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jesse Brandeburg authored
The interrupt enable function can be inlined by moving it to the header file, which decreases the function call overhead for a frequently called function. Change-ID: I3214cc99593725768642680e7b8ce7e9bba7e44d Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
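A minimal sketch of the pattern described above, with placeholder names rather than the real i40e symbols: a tiny register-write helper defined as static inline in a header, so hot callers pay no function-call overhead.

#include <linux/io.h>

/* hypothetical hardware handle and register layout, for illustration only */
struct nic_hw {
    void __iomem *hw_addr;
};

#define NIC_INT_ENA_REG   0x38400   /* placeholder offset */
#define NIC_INT_ENA_MASK  0x1

/* lives in the header: each call compiles down to a single MMIO write */
static inline void nic_irq_enable(struct nic_hw *hw)
{
    writel(NIC_INT_ENA_MASK, hw->hw_addr + NIC_INT_ENA_REG);
}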
-
Jesse Brandeburg authored
The driver was issuing a WARN_ON during ring size changes because the code was cloning the rx_ring struct but not zeroing out the pointers before allocating new memory. Zero out the pointers in the cloned copy before allocating new memory for them. In this case the code was correctly avoiding memory leaks but still triggering the warning. Change-ID: I186dd493948e9b7254ab0593d4aad8b68808918d Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
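A rough sketch of the fix described above, using illustrative structure and helper names rather than the actual i40e ones: the clone inherits the old ring's pointers via the struct copy, so they must be cleared before the setup path allocates fresh memory.

#include <linux/types.h>

struct rx_ring_sketch {
    void *desc;     /* descriptor memory */
    void *rx_bi;    /* per-buffer info array */
    int   count;
};

/* hypothetical helper that allocates desc and rx_bi for a ring */
int setup_rx_ring_sketch(struct rx_ring_sketch *ring);

static int clone_for_resize(struct rx_ring_sketch *old,
                            struct rx_ring_sketch *clone, int new_count)
{
    *clone = *old;         /* struct copy also copies the pointers...   */
    clone->desc  = NULL;   /* ...so clear them, otherwise the setup     */
    clone->rx_bi = NULL;   /* code sees stale pointers and WARNs        */
    clone->count = new_count;

    return setup_rx_ring_sketch(clone);
}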
-
David S. Miller authored
Daniel Borkmann says: ==================== BPF/random32 updates BPF update to split the prandom state apart, and to move the *once helpers to the core. For details, please see individual patches. Given the changes and since it's in the tree for quite some time, net-next is a better choice in our opinion. v1 -> v2: - Make DO_ONCE() type-safe, remove the kvec helper. Credits go to Alexei Starovoitov for the __VA_ARGS__ hint, thanks! - Add a comment to the DO_ONCE() helper as suggested by Alexei. - Rework prandom_init_once() helper to the new API. - Keep Alexei's Acked-by on the last patch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
While recently arguing in a seccomp discussion that raw prandom_u32() access shouldn't be exposed to unprivileged user space, I forgot the fact that the SKF_AD_RANDOM extension actually already does it for some time in cBPF via commit 4cd3675e ("filter: added BPF random opcode"). Since prandom_u32() is being used in a lot of critical networking code, let's be more conservative and split their states. Furthermore, consolidate the eBPF and cBPF prandom handlers to use the new internal PRNG. For eBPF, bpf_get_prandom_u32() was only accessible to privileged users, but should that change one day, we also don't want to leak raw sequences through things like eBPF maps. One thought was also to have per-bpf_prog states, but due to ABI reasons this is not easily possible, i.e. the program code currently cannot access bpf_prog itself, and copying the rnd_state to/from the stack scratch space whenever a program uses the PRNG does not really seem worth the trouble and seems too hacky. If needed, taus113 could in such cases be implemented within eBPF using a map entry to keep the state space, or get_random_bytes() could become a second helper in cases where performance would not be critical. Both sides can trigger a one-time late init via prandom_init_once() on the shared state. Performance-wise, there should even be a tiny gain as bpf_user_rnd_u32() saves one function call. The PRNG needs to live inside the BPF core since kernels could have a NET-less config as well. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Chema Gonzalez <chema@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
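A hedged sketch of the split-state idea (the names below are illustrative, not necessarily the final kernel symbols): BPF draws from its own per-CPU taus113 state, seeded lazily via prandom_init_once(), instead of touching the shared prandom_u32() state.

#include <linux/random.h>
#include <linux/percpu.h>

static DEFINE_PER_CPU(struct rnd_state, bpf_rnd_state);

static void bpf_rnd_init_once(void)
{
    prandom_init_once(&bpf_rnd_state);      /* one-time, late seeding */
}

static u32 bpf_rnd_u32(void)
{
    struct rnd_state *state = &get_cpu_var(bpf_rnd_state);
    u32 res = prandom_u32_state(state);     /* same PRNG, separate state */

    put_cpu_var(bpf_rnd_state);
    return res;
}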
-
Daniel Borkmann authored
Add a prandom_init_once() facility that works on the rnd_state, so that users that are keeping their own state independent from prandom_u32() can initialize their taus113 per cpu states. The motivation here is similar to net_get_random_once(): initialize the state as late as possible in the hope that enough entropy has been collected for the seeding. prandom_init_once() makes use of the recently introduced prandom_seed_full_state() helper and is generic enough so that it could also be used on fast-paths due to the DO_ONCE(). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
Factor out the full reseed handling code that populates the state through get_random_bytes() and runs prandom_warmup(). The resulting prandom_seed_full_state() will be used later on in more than the current __prandom_reseed() user. Fix also two minor whitespace issues along the way. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
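A simplified sketch of what such a full-state seeding helper looks like; this is not the exact kernel code (the real helper enforces the taus113 per-word minimum seeds via __seed() and also runs a warm-up pass), but it shows the shape: every CPU's state words are filled from get_random_bytes().

#include <linux/random.h>
#include <linux/percpu.h>

static void seed_full_state_sketch(struct rnd_state __percpu *pcpu_state)
{
    int cpu;

    for_each_possible_cpu(cpu) {
        struct rnd_state *state = per_cpu_ptr(pcpu_state, cpu);
        u32 seeds[4];

        get_random_bytes(seeds, sizeof(seeds));

        /* keep each word above the taus113 minimum seed value */
        state->s1 = seeds[0] | 0x02;
        state->s2 = seeds[1] | 0x08;
        state->s3 = seeds[2] | 0x10;
        state->s4 = seeds[3] | 0x80;
    }
}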
-
Hannes Frederic Sowa authored
Make the get_random_once() helper generic enough, so that functions in general would only be called once, where one user of this is then net_get_random_once(). The only implementation specific call is to get_random_bytes(), all the rest of this *_once() facility would be duplicated among different subsystems otherwise. The new DO_ONCE() helper will be used by prandom() later on, but might also be useful for other scenarios/subsystems as well where a one-time initialization in often-called, possibly fast path code could occur. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
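A simplified illustration of the DO_ONCE() idea (the real helper in include/linux/once.h additionally uses a static key so the check is patched out after the first run, and saves/restores IRQ flags; this sketch only shows the gist):

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(once_sketch_lock);

#define DO_ONCE_SKETCH(func, ...)                       \
    ({                                                  \
        static bool __already_done;                     \
        bool __ran = false;                             \
                                                        \
        spin_lock(&once_sketch_lock);                   \
        if (!__already_done) {                          \
            __already_done = true;                      \
            __ran = true;                               \
            func(__VA_ARGS__);                          \
        }                                               \
        spin_unlock(&once_sketch_lock);                 \
        __ran;                                          \
    })

Used as DO_ONCE_SKETCH(seed_fn, &state), the wrapped function runs at most once per call site, no matter how often the surrounding fast path is executed.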
-
Hannes Frederic Sowa authored
There's no good reason why users outside of networking should not be using this facility, f.e. for initializing their seeds. Therefore, make it accessible from there as get_random_once(). Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
Commit deaa0a6a ("net: Lookup actual route when oif is VRF device") exposed a bug in __ip_route_output_key_hash for VRF devices: on FIB lookup failure, if the oif is specified the current logic drops to make_route on the assumption that the route tables are wrong. For VRF/L3 master devices this leads to wrong dst entries and route lookups. For example:

    $ ip route ls table vrf-red
    unreachable default
    broadcast 10.2.1.0 dev eth1 proto kernel scope link src 10.2.1.2
    10.2.1.0/24 dev eth1 proto kernel scope link src 10.2.1.2
    local 10.2.1.2 dev eth1 proto kernel scope host src 10.2.1.2
    broadcast 10.2.1.255 dev eth1 proto kernel scope link src 10.2.1.2

    $ ip route get oif vrf-red 1.1.1.1
    1.1.1.1 dev vrf-red src 10.0.0.2
        cache

With this patch:

    $ ip route get oif vrf-red 1.1.1.1
    RTNETLINK answers: No route to host

which is the correct response based on the default route.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
Similar to commit c29390c6 ("xps: must clear sender_cpu before forwarding"), we also need to clear the skb->sender_cpu when moving from RX to TX via skb_do_redirect() due to the shared location of napi_id (used on RX) and sender_cpu (used on TX). Fixes: 27b29f63 ("bpf: add bpf_redirect() helper") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arnd Bergmann authored
The recently added hns driver causes a build warning in ARM allmodconfig builds:

    drivers/net/ethernet/hisilicon/hns/hnae.c: In function 'handles_show':
    drivers/net/ethernet/hisilicon/hns/hnae.c:452:13: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
         j, (u64)h->qs[i]->io_base);
                 ^

This removes the pointless cast and prints the pointer address using the "%p" format string in all three locations. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
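For reference, a sketch of the fix pattern (device and field names are placeholders): print the pointer with "%p" rather than casting it to u64, which is what trips the warning on 32-bit builds.

#include <linux/device.h>

static void show_io_base(struct device *dev, int idx, void *io_base)
{
    /* old, warns on 32-bit:
     * dev_info(dev, "queue %d base 0x%llx\n", idx, (u64)io_base);
     */
    dev_info(dev, "queue %d base %p\n", idx, io_base);
}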
-
Jon Ringle authored
This Ethernet driver supports the Microchip enc424j600/624j600 Ethernet controller over an SPI bus interface. The driver makes use of the regmap API to optimize register access by caching registers where possible. Datasheet: http://ww1.microchip.com/downloads/en/DeviceDoc/39935b.pdf Signed-off-by: Jon Ringle <jringle@gridpoint.com> Signed-off-by: David S. Miller <davem@davemloft.net>
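A hedged sketch of what "use regmap to cache registers" typically looks like for an SPI-attached controller; the register widths, cache type and the volatile-register range below are illustrative placeholders, not the actual driver values.

#include <linux/err.h>
#include <linux/regmap.h>
#include <linux/spi/spi.h>

/* status/interrupt registers must always hit the hardware */
static bool chip_volatile_reg(struct device *dev, unsigned int reg)
{
    return reg >= 0x70 && reg <= 0x7f;     /* placeholder range */
}

static const struct regmap_config chip_regmap_cfg = {
    .reg_bits     = 8,
    .val_bits     = 16,
    .max_register = 0xff,
    .cache_type   = REGCACHE_RBTREE,       /* everything else is cached */
    .volatile_reg = chip_volatile_reg,
};

static int chip_probe(struct spi_device *spi)
{
    struct regmap *map = devm_regmap_init_spi(spi, &chip_regmap_cfg);

    if (IS_ERR(map))
        return PTR_ERR(map);

    /* regmap_read()/regmap_write() now serve cached values when safe */
    return 0;
}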
-
David S. Miller authored
Arun Parameswaran says: ==================== Add support for Broadcom's iProc MDIO and Cygnus Ethernet PHY This patchset adds support for the iProc MDIO interface and the Broadcom Cygnus SoC's internal Ethernet PHY. The internal Ethernet PHYs in the Cygnus SoCs are accessed via the MDIO interface found in most of the iProc-based chips. The patchset also consolidates the common APIs used by the Broadcom PHYs into a common library. Existing Broadcom PHY drivers have been modified to use the common library APIs. This patch series is based on Linux v4.3-rc1 and is available at: https://github.com/Broadcom/cygnus-linux/tree/cygnus-net-phy-mdio-v3 The Ethernet driver for the iProc family will be submitted soon, as will the device tree configurations for the different iProc family SoCs. Changes from v2: - Modified drivers/net/phy/Kconfig so that the BCM_CYGNUS_PHY driver 'depends on MDIO_BCM_IPROC' instead of using 'select'. - Added the github branch to the cover letter Changes from v1: - Updated the device tree documentation for the iProc MDIO driver based on Florian's feedback. - Moved the core register defines from the Cygnus PHY driver to 'include/linux/brcmphy.h' based on Florian's feedback. - Created a new patch/commit to modify the bcm7xxx PHY driver to use the new core register defines. - Modified the Kconfig entry for the Broadcom PHY library to 'tristate' instead of 'bool' ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arun Parameswaran authored
Modified the bcm7xxx PHY driver to remove its local core register defines and use the common ones from "include/linux/brcmphy.h". Signed-off-by: Arun Parameswaran <arunp@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arun Parameswaran authored
Add support for the Broadcom Cygnus SoCs' internal PHYs. The PHYs are 1000M/100M/10M capable with support for 'EEE' and 'APD' (Auto Power Down). This driver supports the following Broadcom Cygnus SoCs:
- BCM583XX (BCM58300, BCM58302, BCM58303, BCM58305)
- BCM113XX (BCM11300, BCM11320, BCM11350, BCM11360)
The PHYs on these SoCs require some workarounds for stable operation, both at configuration time and during suspend/resume. This driver handles the application of those workarounds. Signed-off-by: Arun Parameswaran <arunp@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arun Parameswaran authored
This patch adds a Broadcom PHY library to consolidate common interfaces shared by Broadcom PHYs. Moved the common interfaces to 'bcm-phy-lib.c' and updated the Broadcom PHY drivers to use the new APIs. Signed-off-by: Arun Parameswaran <arunp@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arun Parameswaran authored
This patch adds support for the Broadcom iProc MDIO bus interface. The MDIO interface can be found in the Broadcom iProc family of SoCs. The MDIO bus is accessed using a combination of command and data registers. This MDIO driver provides access to the Ethernet GPHYs connected to the MDIO bus. Signed-off-by: Arun Parameswaran <arunp@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
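An illustrative sketch of the command-register/data-register access pattern the commit describes; the offsets, bit layout and timeouts below are placeholders, not the actual iProc MDIO programming model.

#include <linux/io.h>
#include <linux/iopoll.h>
#include <linux/bitops.h>

#define CMD_REG     0x00            /* placeholder offsets */
#define DATA_REG    0x04
#define CMD_BUSY    BIT(31)
#define CMD_READ    BIT(30)

static int mdio_sketch_read(void __iomem *base, int phy_id, int regnum)
{
    u32 cmd = CMD_BUSY | CMD_READ | (phy_id << 21) | (regnum << 16);
    u32 val;
    int ret;

    writel(cmd, base + CMD_REG);

    /* wait for the controller to clear the busy bit */
    ret = readl_poll_timeout(base + CMD_REG, val,
                             !(val & CMD_BUSY), 10, 10000);
    if (ret)
        return ret;

    return readl(base + DATA_REG) & 0xffff;
}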
-
Arun Parameswaran authored
Add device tree binding documentation for the Broadcom iProc MDIO bus driver. Signed-off-by: Arun Parameswaran <arunp@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored (merge of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux)
Santosh Shilimkar says: ==================== RDS: connection scalability and performance improvements [v4] Re-sending the same patches from v3 again since my repost of patch 05/14 from v3 was whitespace damaged. [v3] Updated patch "[PATCH v2 05/14] RDS: defer the over_batch work to send worker" as per David Miller's comment [4] to avoid the magic value usage. The patch now makes use of the already available but unused send_batch_count module parameter. The rest of the patches are the same as the earlier version v2 [3] [v2]: Dropped "[PATCH 05/15] RDS: increase size of hash-table to 8K" from the earlier version [1]. I plan to address the hash table scalability using re-sizable hash tables as suggested by David Laight and David Miller [2] This series addresses RDS connection bottlenecks on massive workloads and improves RDMA performance by almost 3X. RDS TCP also gets a small gain of about 12%. RDS is being used in massive systems with high scalability where several hundred thousand end points and tens of thousands of local processes are operating over tens of thousands of sockets. Being RC (reliable connection), socket bind and release happen very often, and any inefficiency in bind hash lookups hurts overall system performance. The RDS bind hash table uses a global spinlock, which is the biggest bottleneck. To make matters worse, it uses RCU inside the global lock for the hash buckets. This is addressed by simply using a per-bucket rwlock, which makes the locking simple and very efficient. The hash table size is still an issue and I plan to address it by using re-sizable hash tables as suggested on the list. For the RDS RDMA improvement, the completion handling is revamped so that we can do batch completions. Both send and receive completion handlers are split logically to achieve the same. With 8K messages being one of the key RDS use cases, the MR pool is adapted to hold 8K MRs along with the default 1M MRs. While doing this, a few fixes and a couple of bottlenecks seen with rds_sendmsg() are also addressed. The series applies against 4.3-rc1 as well as net-next. It is tested on Oracle hardware with an IB fabric for both bcopy and RDMA mode. RDS TCP is tested with an iXGB NIC. Like last time, the iWARP transport is untested with these changes. The patchset is also available at the git repo below: git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux.git net/rds/4.3-v3 As a side note, the IB HCA driver I used for testing is missing at least 3 important patches upstream needed to see the full IB performance, and I am hoping to get those into mainline with their help. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
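A rough sketch of the per-bucket locking idea described above; structure and function names are illustrative, not the RDS code. Each bucket carries its own rwlock, so unrelated bind/lookup operations no longer serialize on a single global lock.

#include <linux/spinlock.h>
#include <linux/list.h>
#include <linux/jhash.h>

#define BIND_HASH_SIZE 1024

struct bind_bucket {
    rwlock_t          lock;
    struct hlist_head head;
};

static struct bind_bucket bind_hash[BIND_HASH_SIZE];

struct bound_entry {
    struct hlist_node node;
    u32               key;
};

static void bind_hash_init(void)
{
    int i;

    for (i = 0; i < BIND_HASH_SIZE; i++) {
        rwlock_init(&bind_hash[i].lock);
        INIT_HLIST_HEAD(&bind_hash[i].head);
    }
}

static struct bind_bucket *hash_to_bucket(u32 key)
{
    return &bind_hash[jhash_1word(key, 0) & (BIND_HASH_SIZE - 1)];
}

static void bind_insert(struct bound_entry *e)
{
    struct bind_bucket *b = hash_to_bucket(e->key);

    write_lock(&b->lock);                 /* only this bucket is held */
    hlist_add_head(&e->node, &b->head);
    write_unlock(&b->lock);
}

static struct bound_entry *bind_lookup(u32 key)
{
    struct bind_bucket *b = hash_to_bucket(key);
    struct bound_entry *e;

    read_lock(&b->lock);                  /* readers don't block each other */
    hlist_for_each_entry(e, &b->head, node)
        if (e->key == key)
            break;
    /* a real implementation would take a reference before unlocking */
    read_unlock(&b->lock);
    return e;
}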
-
David S. Miller authored
Eric W. Biederman says: ==================== net: Pass net through the output path v2 This is the next installment of my work to pass struct net through the output path so the code does not need to guess how to figure out which network namespace it is in, and ultimately routes can have output devices in another network namespace. The first patch in this series is a fix for a bug that came in when sk was passed through the functions in the output path, and as such is probably a candidate for net. At the same time my later patches depend on it, so sending the fix separately would be confusing. The second patch in this series is another fix, for an issue that came in when sk was passed through the output path. I don't think it needs a backport as I don't think anyone uses the path where the code was incorrect. The rest of the patchset focuses on the path from xxx_local_out to dst_output and in the end succeeds in passing sock_net(sk) from the socket a packet locally originates on to the dst->output function. Given the size reduction in the code I think this counts as a cleanup as much as feature work. There remain a number of helper functions (like ip option processing) to take care of before the network stack can support destination devices in other network namespaces, but with this set of changes the backbone of the work is done. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric W. Biederman authored
The network namespace is already passed into dst_output; pass it into dst->output, lwt->output and friends. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
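A hedged sketch of where this lands (consult the actual headers for the final form): the dst->output method takes the struct net explicitly, and the dispatcher simply forwards the namespace it was given instead of each implementation re-deriving it.

#include <net/dst.h>
#include <linux/skbuff.h>

/* roughly, after the change the method is
 *   int (*output)(struct net *net, struct sock *sk, struct sk_buff *skb);
 */
static inline int dst_output_sketch(struct net *net, struct sock *sk,
                                    struct sk_buff *skb)
{
    return skb_dst(skb)->output(net, sk, skb);
}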
-
Eric W. Biederman authored
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric W. Biederman authored
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric W. Biederman authored
Compute net once in ipvlan_process_v4_outbound and ipvlan_process_v6_outbound and store it in a variable so that net does not need to be recomputed next time it is used. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
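A minimal illustration of this "compute it once" cleanup, with generic names: derive the namespace a single time at the top of the function and reuse the local variable instead of calling dev_net() at every use site.

#include <linux/netdevice.h>

static void xmit_sketch(struct net_device *dev, struct sk_buff *skb)
{
    struct net *net = dev_net(dev);   /* computed once */

    /* ... later code references 'net' rather than dev_net(dev) ... */
    (void)net;                        /* silence unused warnings in this sketch */
    (void)skb;
}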
-
Eric W. Biederman authored
Compute net and store it in a variable in pptp_xmit, so that the value can be reused the next time it is needed. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric W. Biederman authored
Compute net and store it in a variable in the functions ip_build_and_send_pkt and ip_queue_xmit so that it does not need to be recomputed next time it is needed. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric W. Biederman authored
Store net in a variable in ip_tunnel_xmit so it does not need to be recomputed when it is used again. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric W. Biederman authored
Stop hiding the sk parameter with an inline helper function and make all of the callers pass it, so that it is clear what the function is doing. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric W. Biederman authored
Only __ip6_local_out_sk has callers, so rename __ip6_local_out_sk to __ip6_local_out and remove the previous __ip6_local_out. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric W. Biederman authored
It is confusing and silly to hide a parameter, so modify all of the callers to pass in the appropriate socket, or skb->sk if no socket is known. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric W. Biederman authored
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric W. Biederman authored
For consistency with the other similar methods in the kernel, pass a struct sock into the dst_ops .local_out method. Simplifying the socket-passing case is a needed prerequisite to passing a struct net reference into .local_out. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric W. Biederman authored
Replace dst_output_okfn with dst_output. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric W. Biederman authored
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric W. Biederman authored
After a packet has been encapsulated by a tunnel, we should use the tunnel socket's local multicast loopback flag to control whether the encapsulated packet should be looped back locally. Pass sk into ip_local_out_sk so that in the rare case we are dealing with a tunneled packet whose tunnel destination address is a multicast address, the kernel properly decides whether to loop back this packet. In practice I don't think this matters, as ip_queue_xmit is used by TCP, L2TP and SCTP, none of which I am aware of using IP-level multicasting, as they are all point-to-point communication protocols. Let's fix this before someone uses ip_queue_xmit for a tunnel protocol that does use multicast. Fixes: aad88724 ("ipv4: add a sock pointer to dst->output() path.") Fixes: b0270e91 ("ipv4: add a sock pointer to ip_queue_xmit()") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric W. Biederman authored
In the rare case where sk != skb->sk, ip_local_out_sk arranges to call dst->output differently depending on whether the skb is queued or not. This is a bug. Fix it by passing the sk parameter of ip_local_out_sk through to __ip_local_out_sk (skipping __ip_local_out). Fixes: 7026b1dd ("netfilter: Pass socket pointer down through okfn().") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored (merge of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue)
Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2015-10-07 This series contains updates to i40e and i40evf only. Paul updates i40e to simply increase the amount of time we wait for a reset to complete, since we have seen in some rare occasions that the reset can take longer to complete. Shannon updates the driver to turn on Wake-on-LAN by default if it is enabled in the hardware config to begin with, rather than always disabling it and waiting for the user to expressly turn it on. Added new device IDs and support for future devices. Fixed a possible type-comparison problem between a size and a possibly negative number. Also fixed a shift value that was wrong, which ended up producing a bad bitmask. Did general housecleaning of the driver to clean up several pieces of low-hanging fruit. Fixed an issue where new unicast addresses would be added to the VSI list and then immediately removed, and would never actually make it down to the hardware. Resolved the issue by removing the separation of unicast and multicast in the search for filters to be deleted. Mitch fixes an issue where the hardware would continue to access the memory formerly used by the rings for a VF which has been removed, causing memory corruption or DMAR errors. To relieve this condition, explicitly stop all rings associated with each VF before releasing its resources. Also fixed a panic if the driver is unable to enable MSI-X or is unable to acquire enough vectors, by propagating interrupt allocation failure information to the calling function. Cleaned up an opcode that is not required. Carolyn extends the size of the text available for the interrupt names so that all the descriptive data available for the Flow Director interrupts is not truncated. Catherine fixes an issue where there was a possibility of speed getting set to 0 if advertised is set to 0 (which is the case when autoneg is disabled). Jesse fixes the checksum on big-endian machines by adding code to swap it correctly. Also fixed a bug in the return from get_link_status() where only true or false was being returned, but false could mean multiple things; allow the caller to get all the return values in the call chain bubbled back to the source so that the reason for the failure does not get lost. Anjali adds statistics to keep track of how many times we ask the stack to linearize the SKB because the hardware cannot handle SKBs with more than 8 frags per segment/single packet. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-