Commits · e5de25dce9243a3d29b5ebc131cc9d59008f39f7 · Kirill Smelkov / linux

11 Jul, 2016 15 commits

drivers/net: fixup comments after "Future-proof tunnel offload handlers" · e5de25dc

Sabrina Dubroca authored Jul 11, 2016

Some comments weren't updated to reflect the renaming of ndo's and the
change of arguments.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e5de25dc

MAINTAINERS: release Scott from being a rocker maintainer · 752e6d5d

Jiri Pirko authored Jul 10, 2016

As requested by Scott, removing him.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

752e6d5d

tunnels: correct conditional build of MPLS and IPv6 · aa9667e7

Simon Horman authored Jul 10, 2016

Using a combination of #if conditionals and goto labels to unwind
tunnel4_init seems unwieldy. This patch takes a simpler approach of
directly unregistering previously registered protocols when an error
occurs.

This fixes a number of problems with the current implementation
including the potential presence of labels when they are unused
and the potential absence of unregister code when it is needed.

Fixes: 8afe97e5 ("tunnels: support MPLS over IPv4 tunnels")
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
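
The unwinding approach described above can be sketched in user space as follows. This is only an illustration, not the patched tunnel4_init: proto_register(), proto_unregister(), the WITH_IPV6/WITH_MPLS macros and the fail_on switch are hypothetical stand-ins for the kernel's registration calls and config options.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the optional build-time features. */
#define WITH_IPV6 1
#define WITH_MPLS 1

enum { PROTO_IPIP, PROTO_IPV6, PROTO_MPLS, PROTO_MAX };

static bool registered[PROTO_MAX];
static int fail_on = -1;	/* protocol whose registration should fail */

/* Hypothetical registration helper; returns 0 on success. */
static int proto_register(int p)
{
	if (p == fail_on)
		return -1;
	registered[p] = true;
	return 0;
}

static void proto_unregister(int p)
{
	registered[p] = false;
}

/*
 * On failure, directly unregister whatever was registered before the
 * failing step, so no conditionally compiled goto labels are needed.
 */
int tunnel_init_sketch(void)
{
	if (proto_register(PROTO_IPIP))
		goto err;
#ifdef WITH_IPV6
	if (proto_register(PROTO_IPV6)) {
		proto_unregister(PROTO_IPIP);
		goto err;
	}
#endif
#ifdef WITH_MPLS
	if (proto_register(PROTO_MPLS)) {
		proto_unregister(PROTO_IPIP);
#ifdef WITH_IPV6
		proto_unregister(PROTO_IPV6);
#endif
		goto err;
	}
#endif
	return 0;
err:
	fprintf(stderr, "can't add protocol\n");
	return -1;
}
```

With a single error label that is always used, an unused-label warning cannot occur, and each failure branch carries its own cleanup regardless of which options are compiled in.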

aa9667e7

Merge branch 'sctp-rfc7496-support' · 4154cb40

David S. Miller authored Jul 11, 2016

Xin Long says:

====================
sctp: implement rfc7496 in sctp

This patchset implements "Additional Policies for the Partially Reliable
Stream Control Transmission Protocol Extension" described in RFC7496.

The Partially Reliable SCTP (PR-SCTP) extension defined in [RFC3758]
provides a generic method for senders to abandon user messages. The
decision to abandon a user message is sender side only, and the exact
condition is called a "PR-SCTP policy". This patchset implements 3
policies:

 1. Timed Reliability:  This allows the sender to specify a timeout for
    a user message after which the SCTP stack abandons the user message.

 2. Limited Retransmission Policy:  Allows limitation of the number of
    retransmissions.

 3. Priority Policy:  Allows removal of lower-priority messages if space
    for higher-priority messages is needed in the send buffer.

Patch 1-3 add some sockopts in sctp to set/get pr_sctp policy status.
Patch 4-6 implement these 3 policies one by one.
====================
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4154cb40

sctp: implement prsctp PRIO policy · 8dbdf1f5

Xin Long authored Jul 09, 2016

prsctp PRIO policy is a policy to abandon lower priority chunks when
asoc doesn't have enough snd buffer, so that the current chunk with
higher priority can be queued successfully.

Similar to TTL/RTX policy, we will set the priority of the chunk to
prsctp_param with sinfo->sinfo_timetolive in sctp_set_prsctp_policy().
So if PRIO policy is enabled, msg->expires_at won't work.

asoc->sent_cnt_removable records how many chunks are candidates for
removal. If the priority policy is enabled, sent_cnt_removable is
increased when a chunk is queued into the out_queue, and decreased when
the chunk is moved to the abandon_queue or is dequeued and freed.

In sctp_sendmsg, we check whether there is enough snd buffer for the
current msg and whether sent_cnt_removable is not 0. If the buffer is
short and chunks are removable, sctp_prune_prsctp tries to abandon chunks
from the retransmit/transmitted queue and to free chunks from out_queue,
in the right order, until the abandon+free size > msg_len - sctp_wfree.
For the abandoned size, we have to wait until the FORWARD TSN is sent,
the sack is received and the chunks are really freed.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
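
A rough sketch of the pruning idea, reduced to a single queue. The struct, the function name and the convention that a larger prio value means lower priority are assumptions made for illustration; the real code walks several SCTP queues and waits for FORWARD TSN/SACK before abandoned space is actually reclaimed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a queued chunk. */
struct sketch_chunk {
	size_t len;		/* payload bytes held in the send buffer */
	unsigned int prio;	/* assumption: larger value = lower priority */
	bool abandoned;
};

/*
 * Abandon queued chunks with lower priority than new_prio until at
 * least `needed` bytes have been reclaimed. Returns the bytes reclaimed,
 * which may be less than `needed` if too few chunks qualify.
 */
size_t prune_lower_prio(struct sketch_chunk *q, size_t n,
			unsigned int new_prio, size_t needed)
{
	size_t freed = 0;

	for (size_t i = 0; i < n && freed < needed; i++) {
		if (q[i].abandoned || q[i].prio <= new_prio)
			continue;	/* same or higher priority: keep it */
		q[i].abandoned = true;
		freed += q[i].len;
	}
	return freed;
}
```

The caller would compare the returned size against the shortfall (msg_len minus the free send buffer space) to decide whether the new message can be queued.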

8dbdf1f5

sctp: implement prsctp RTX policy · 01aadb3a

Xin Long authored Jul 09, 2016

prsctp RTX policy is a policy to abandon chunks when they are
retransmitted beyond the max count.

This patch uses sent_count to count how many times one chunk has
been sent, and prsctp_param is the max rtx count, which is from
sinfo->sinfo_timetolive in sctp_set_prsctp_policy(). So similar
to TTL policy, if RTX policy is enabled, msg->expires_at won't
work.

Then in sctp_chunk_abandoned, this patch checks if chunk->sent_count
is bigger than chunk->prsctp_param to abandon this chunk.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
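
The abandon check described above amounts to a single comparison. A minimal sketch, assuming a simplified chunk structure (the struct and the rtx_enabled flag are hypothetical; only the sent_count and prsctp_param names come from the commit message):

```c
#include <stdbool.h>

/* Hypothetical, simplified chunk; not the kernel's struct sctp_chunk. */
struct sketch_chunk {
	unsigned int sent_count;	/* how many times the chunk was sent */
	unsigned int prsctp_param;	/* max rtx count under the RTX policy */
	bool rtx_enabled;		/* whether the RTX policy applies */
};

/* A chunk is abandoned once it has been sent more than the limit. */
bool chunk_abandoned_rtx(const struct sketch_chunk *c)
{
	return c->rtx_enabled && c->sent_count > c->prsctp_param;
}
```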

01aadb3a

sctp: implement prsctp TTL policy · a6c2f792

Xin Long authored Jul 09, 2016

prsctp TTL policy is a policy to abandon chunks when they expire
at a specific time in the local stack. It's similar to expires_at
in struct sctp_datamsg.

This patch uses sinfo->sinfo_timetolive to set the specific time for
TTL policy. sinfo->sinfo_timetolive is also used for msg->expires_at.
So if prsctp_enable or TTL policy is not enabled, msg->expires_at
still works as before.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a6c2f792

sctp: add SCTP_PR_ASSOC_STATUS on sctp sockopt · 826d253d

Xin Long authored Jul 09, 2016

This patch adds SCTP_PR_ASSOC_STATUS to sctp sockopt, which is used
to dump the prsctp statistics info from the asoc. The prsctp statistics
include abandoned_sent/unsent from the asoc. abandoned_sent is the
count of the packets we drop from the retransmit/transmitted queue,
and abandoned_unsent is the count of the packets we drop from out_queue
according to the policy.

Note: another option for prsctp statistics dump described in the rfc is
SCTP_PR_STREAM_STATUS, which is used to dump the prsctp statistics
info from each stream. But for now, linux doesn't yet have per stream
statistics info; that needs rfc6525 to be implemented. As the prsctp
statistics for each stream have to be based on per stream statistics,
we will delay it until rfc6525 is done in linux.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

826d253d

sctp: add SCTP_DEFAULT_PRINFO into sctp sockopt · f959fb44

Xin Long authored Jul 09, 2016

This patch adds SCTP_DEFAULT_PRINFO to sctp sockopt. It is used
to set/get sctp Partially Reliable Policies' default params,
which include 3 policies (ttl, rtx, prio) and their values.

Still, if we set policy params in sndinfo, we will use the params
of sndinfo against chunks, instead of the default params.

In this patch, we will use bits 5-8 of sp/asoc->default_flags
to store prsctp policies, and reuse asoc->default_timetolive
to store their values. It means that if we enable and set a prsctp
policy, the prior ttl timeout in sctp will not work any more.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
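
Storing the policy in a few bits of a flags word can be sketched like this. The SK_* names are hypothetical, and the numeric values are an assumption about how the policies are encoded (they follow my reading of RFC 7496 and should be checked against the uapi header before being relied on).

```c
#include <stdint.h>

/* Assumed policy encodings; names are local to this sketch. */
#define SK_PR_SCTP_NONE 0x0000
#define SK_PR_SCTP_TTL  0x0010
#define SK_PR_SCTP_RTX  0x0020
#define SK_PR_SCTP_PRIO 0x0030
#define SK_PR_SCTP_MASK 0x0030

/* Replace the policy bits in flags, leaving all other bits untouched. */
uint16_t set_policy(uint16_t flags, uint16_t policy)
{
	return (uint16_t)((flags & ~SK_PR_SCTP_MASK) |
			  (policy & SK_PR_SCTP_MASK));
}

/* Extract the policy bits from flags. */
uint16_t get_policy(uint16_t flags)
{
	return flags & SK_PR_SCTP_MASK;
}
```

Because the policy shares the flags word and its value shares default_timetolive, only one interpretation of that value can be active at a time, which is why the plain ttl timeout stops applying once a policy is set.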

f959fb44

sctp: add SCTP_PR_SUPPORTED on sctp sockopt · 28aa4c26

Xin Long authored Jul 09, 2016

According to section 4.5 of rfc7496, prsctp_enable should be per asoc.
We will add prsctp_enable to both asoc and ep, and replace the places
that used net.sctp->prsctp_enable with asoc->prsctp_enable.

ep->prsctp_enable will be initialized with net.sctp->prsctp_enable, and
asoc->prsctp_enable will be initialized with ep->prsctp_enable. We can
also modify its value through sockopt SCTP_PR_SUPPORTED.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

28aa4c26

Revert "net: ethernet: bcmgenet: use phy_ethtool_{get|set}_link_ksettings" · bac65c4b

Philippe Reynes authored Jul 09, 2016

This reverts commit 4386f566 ("net: ethernet: bcmgenet: use
phy_ethtool_{get|set}_link_ksettings")

This patch is wrong: the functions phy_ethtool_{get|set}_link_ksettings
don't check if the device is running, but the bcmgenet driver needs this
check.

The functions {get|set}_settings need to access the mdio bus, and this
bus may only be used when the device is running. Otherwise, the clock
is disabled and an mdio access will fail.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bac65c4b

Merge branch 'b53-nsp-switch' · ddeec083

David S. Miller authored Jul 11, 2016

Florian Fainelli says:

====================
net: dsa: b53: Add Broadcom NSP switch support

This patch series updates the B53 driver to support Broadcom's Northstar Plus
SoC integrated switch.

Unlike the version of the core present in BCM5301x/Northstar, we cannot read the
full chip id of the switch, so we need to get the information about our switch
id from Device Tree.

Other than that, this is a regular Broadcom Ethernet switch which is register
compatible for all practical purposes with the existing switch driver.

Since DSA requires a working CPU Ethernet MAC driver this depends on Jon
Mason's AMAC/BGMAC driver changes to support NSP. Board specific changes depend
on patches present in Broadcom's ARM SoC branches and will be posted in a short
while.
====================
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

ddeec083

net: dsa: b53: Add support for BCM585xx/586xx/88312 integrated switch · 991a36bb

Florian Fainelli authored Jul 08, 2016

Update the SRAB, core driver and binding document to support the
BCM585xx/586xx/88312 integrated switch (Northstar Plus SoCs family).
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

991a36bb

net: dsa: b53: Allow SRAB driver to specify platform data · fefae690

Florian Fainelli authored Jul 08, 2016

For Northstar Plus SoCs, we cannot detect the switch because only the
revision information is provided in the Management page. Instead, rely on
Device Tree to tell us the chip id, and pass it down using the
b53_platform_data structure.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fefae690

net: ethernet: Add TSE PCS support to dwmac-socfpga · fb3bbdb8

Tien Hock Loh authored Jul 07, 2016

This adds support for TSE PCS that uses SGMII adapter when the phy-mode of
the dwmac is set to sgmii.
Signed-off-by: Tien Hock Loh <thloh@altera.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

fb3bbdb8

09 Jul, 2016 25 commits

ipv6: do not abuse GFP_ATOMIC in inet6_netconf_notify_devconf() · 927265bc

Eric Dumazet authored Jul 08, 2016

All inet6_netconf_notify_devconf() callers are in process context,
so we can use GFP_KERNEL allocations if we take care of not holding
a rwlock while not needed in ip6mr (we hold RTNL there)

Fixes: d67b8c61 ("netconf: advertise mc_forwarding status")
Fixes: f3a1bfb1 ("rtnl/ipv6: use netconf msg to advertise forwarding status")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

927265bc

ipv4: do not abuse GFP_ATOMIC in inet_netconf_notify_devconf() · fa17806c

Eric Dumazet authored Jul 08, 2016

inet_forward_change() runs with RTNL held.
We are allowed to sleep if required.

If we use __in_dev_get_rtnl() instead of __in_dev_get_rcu(),
we no longer have to use GFP_ATOMIC allocations in
inet_netconf_notify_devconf(), meaning we are less likely to miss
notifications under memory pressure, and won't touch precious memory
reserves either and risk dropping incoming packets.

inet_netconf_get_devconf() can also use GFP_KERNEL allocation.

Fixes: edc9e748 ("rtnl/ipv4: use netconf msg to advertise forwarding status")
Fixes: 9e551110 ("rtnl/ipv4: add support of RTM_GETNETCONF")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fa17806c

Merge branch 'bgmac-platform-device' · db9a1ba5

David S. Miller authored Jul 09, 2016

Jon Mason says:

====================
net: ethernet: bgmac: Add platform device support

David Miller, Please consider including patches 1-5 in net-next

Florian Fainelli, Please consider including patches 6 & 7 in
  devicetree/next

Changes in v2:
* Made device tree binding changes suggested by Sergei Shtylyov,
  Ray Jui, Rob Herring, Florian Fainelli, and Arnd Bergmann
* Removed devm_* error paths in the bgmac_platform.c suggested by
  Florian Fainelli
* Added Arnd Bergmann's Acked-by to the first 5 (there were changes
  outlined in the bullets above, but I believe them to be minor enough
  for him to not revoke his acks)

This patch series adds support for other, non-bcma iProc SoC's to the
bgmac driver.  This series only adds NSP support, but we are interested
in adding support for the Cygnus and NS2 families (with more possible
down the road).

To support non-bcma enabled SoCs, we need to add the standard device
tree "platform device" support.  Unfortunately, this driver is very
tightly coupled with the bcma bus and much unwinding is needed.  I tried
to break this up into a number of patches to make it more obvious what
was being done to add platform device support.  I was able to verify
that the bcma code still works using a 53012K board (NS SoC), and that
the platform code works using a 58625K board (NSP SoC).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

db9a1ba5

net: ethernet: bgmac: Add platform device support · f6a95a24

Jon Mason authored Jul 07, 2016

The bcma portion of the driver has been split off into a bcma specific
driver.  This has been mirrored for the platform driver.  The last
references to the bcma core struct have been changed into a generic
function call.  These function calls are wrappers to either the original
bcma code or new platform functions that access the same areas via MMIO.
This necessitated adding function pointers for both platform and bcma to
hide which backend is being used from the generic bgmac code.
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
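
The function-pointer indirection described above can be illustrated with a small user-space sketch. Everything here is hypothetical (struct sketch_mac, the read/write callbacks, the fake register array); it only shows how generic code can stay unaware of which backend it runs on.

```c
#include <stdint.h>

/* Hypothetical device structure; not the driver's struct bgmac. */
struct sketch_mac {
	uint32_t (*read)(struct sketch_mac *mac, uint16_t offset);
	void (*write)(struct sketch_mac *mac, uint16_t offset, uint32_t value);
	uint32_t regs[16];		/* stand-in for the register space */
	unsigned int bus_accesses;	/* counted by the bus backend only */
};

/* "Platform" backend: direct access to the register space. */
static uint32_t platform_read(struct sketch_mac *mac, uint16_t offset)
{
	return mac->regs[offset / 4];
}

static void platform_write(struct sketch_mac *mac, uint16_t offset,
			   uint32_t value)
{
	mac->regs[offset / 4] = value;
}

/* "Bus" backend: same registers, reached through a bus layer. */
static uint32_t bus_read(struct sketch_mac *mac, uint16_t offset)
{
	mac->bus_accesses++;
	return mac->regs[offset / 4];
}

static void bus_write(struct sketch_mac *mac, uint16_t offset,
		      uint32_t value)
{
	mac->bus_accesses++;
	mac->regs[offset / 4] = value;
}

void sketch_init_platform(struct sketch_mac *mac)
{
	mac->read = platform_read;
	mac->write = platform_write;
}

void sketch_init_bus(struct sketch_mac *mac)
{
	mac->read = bus_read;
	mac->write = bus_write;
}

/* Generic code: works with either backend through the callbacks. */
void sketch_set_bits(struct sketch_mac *mac, uint16_t offset, uint32_t bits)
{
	mac->write(mac, offset, mac->read(mac, offset) | bits);
}
```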

f6a95a24

net: ethernet: bgmac: convert to feature flags · db791eb2

Jon Mason authored Jul 07, 2016

The bgmac driver is using the bcma-provided device ID and revision, as
well as the SoC ID and package, to determine which features are
necessary to enable, reset, etc in the driver.   In anticipation of
removing the bcma requirement for this driver, these must be changed to
not reference that struct.  In place of that, each "feature" has been
given a flag, and the flags are enabled for their respective device and
SoC.
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

db791eb2

net: ethernet: bgmac: move BCMA MDIO Phy code into a separate file · 55954f3b

Jon Mason authored Jul 07, 2016

Move the BCMA MDIO phy into a separate file, as it is very tightly
coupled with the BCMA bus.  This will help with the upcoming BCMA
removal from the bgmac driver.  Optimally, this should be moved into
phy drivers, but it is too tightly coupled with the bgmac driver to
effectively move it without more changes to the driver.

Note: the phy_reset was intentionally removed, as the mdio phy subsystem
automatically resets the phy if a reset function pointer is present.  In
addition to the moving of the driver, this reset function is added.
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

55954f3b

net: ethernet: bgmac: add dma_dev pointer · a0b68486

Jon Mason authored Jul 07, 2016

The dma buffer allocation, etc references a dma_dev device pointer from
the bcma core.  In anticipation of removing the bcma requirement for
this driver, these must be changed to not reference that struct.  Add a
dma_dev device pointer to the bgmac struct and reference that instead.
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a0b68486

net: ethernet: bgmac: change bgmac_* prints to dev_* prints · d00a8281

Jon Mason authored Jul 07, 2016

The bgmac_* print wrappers call dev_* prints with the dev pointer from
the bcma core.  In anticipation of removing the bcma requirement for
this driver, these must be changed to not reference that struct.  So,
simply change all of the bgmac_* prints to their dev_* counterparts.  In
some cases netdev_* prints are more appropriate, so change those as
well.
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d00a8281

net: tracepoint napi:napi_poll add work and budget · 1db19db7

Jesper Dangaard Brouer authored Jul 07, 2016

An important piece of information for the napi_poll tracepoint is knowing
the work done (packets processed) by the napi_poll() call. Add
both the work done and budget, as they are related.

Handle the trace_napi_poll() param change in dropwatch/drop_monitor
and in the python perf script netdev-times.py in a backward compatible
way, as python fortunately supports optional parameter handling.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1db19db7

Merge branch 'r8152-next' · 89141e1c

David S. Miller authored Jul 09, 2016

Hayes Wang says:

====================
r8152: remove the redundant code

Remove the unnecessary code.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

89141e1c

r8152: remove cancel_delayed_work_sync in rtl8152_set_speed · c23d86ae

hayeswang authored Jul 07, 2016

There is no conflict between the work_queue function and
rtl8152_set_speed(), so we don't have to cancel the delayed work in
rtl8152_set_speed().
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c23d86ae

r8152: remove a netif_carrier_off in rtl8152_open function · c79262f3

hayeswang authored Jul 07, 2016

After commit 90186af4 ("r8152: fix lockup when runtime PM is enabled"),
the autoresume wouldn't start the device before rtl8152_open() is finished.
Therefore, we don't have to reset the linking status before and after
autoresume. That is, one of netif_carrier_off() in rtl8152_open() could be
removed.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c79262f3

r8152: remove rtl_phy_reset function · b1648066

hayeswang authored Jul 07, 2016

In rtl_hw_phy_work_func_t(), the flag of PHY_RESET is set in
rtl_ops.hw_phy_cfg() and cleared in rtl8152_set_speed(). Therefore,
the rtl_phy_reset() is never run and is unnecessary.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b1648066

Merge branch 'mpls-in-ipv4-and-udp' · fb577316

David S. Miller authored Jul 09, 2016

Simon Horman says:

====================
net: support MPLS in IPv4 and UDP

This short series provides support for MPLS in IPv4 (RFC4023), and by
virtue of FOU, MPLS in UDP (RFC7510).

The changes are as follows:
1. Teach tunnel4.c about AF_MPLS, it already understands AF_INET and
   AF_INET6
2. Enhance IPIP and SIT to handle MPLS. Both already handle IPv4.
   SIT also already handles IPv6.
3. Trivially enhance MPLS to allow routes over SIT and IPIP tunnels.

A corresponding patch set for iproute2 has also been provided.

Changes since v1
* Correct inverted IPIP protocol logic in SIT patch
* Provide usage example below

Sample configuration follows:

* The following creates a tunnel and routes MPLS packets whose outermost
  label is 100 over it. The forwarded packets will have the outermost label
  stack entry, 100, removed and two label stack entries added, the
  outermost having label 200 and the next having label 300.

  The local end-point for the tunnel is 10.0.99.192 and the remote
  endpoint is 10.0.99.193.

  The local address for encapsulated packets is 10.0.98.192 and the
  remote address is 10.0.98.193.

  # Create an MPLS over IPv4 tunnel using the IPIP driver
  ip link add name tun1 type ipip remote 10.0.99.193 local 10.0.99.192 \
	ttl 225 mode mplsip

  # Bring the tunnel up and add an IPv4 address and route
  ip link set up dev tun1
  ip addr add 10.0.98.192/24 dev tun1

  # Set MPLS route
  # Allow MPLS forwarding of packets received on eth0
  echo 1 > /proc/sys/net/mpls/conf/eth0/input
  # Larger than label to be routed (100)
  echo 101 > /proc/sys/net/mpls/platform_labels
  ip -f mpls route add 100 as 200/300 via inet 10.0.98.193

* For FOU (in this case MPLS over UDP) a tunnel may be created using:

  # Packets received on UDP port 6635 are MPLS over UDP (IP proto 137)
  ip fou add port 6635 ipproto 137
  # Create the tunnel netdev
  ip link add name tun1 type ipip remote 10.0.99.193 local 10.0.99.192 \
	ttl 225 mode mplsip encap fou encap-sport auto encap-dport 6635

  IPv4 address, link and route, and MPLS routing commands are as per
  the MPLS over IPv4 example

* To use the SIT driver instead of the IPIP driver "sit" may be substituted
  for "ipip" in the above examples.

* To create a tunnel that forwards and receives all supported
  inner-protocols "any" may be substituted for "mplsip" in the above
  examples.

  For the IPIP driver this configures both IPv4 and MPLS over IPv4.
  For the SIT driver this configures IPv6, IPv4 and MPLS over IPv4.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

fb577316

mpls: allow routes on ipip and sit devices · 407f31be

Simon Horman authored Jul 07, 2016

Allow MPLS routes on IPIP and SIT devices now that they
support forwarding MPLS packets.
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Dinan Gunawardena <dinan.gunawardena@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

407f31be

ipip: support MPLS over IPv4 · 1b69e7e6

Simon Horman authored Jul 07, 2016

Extend the IPIP driver to support MPLS over IPv4. The implementation is an
extension of existing support for IPv4 over IPv4 and is based on multiple
inner-protocol support for the SIT driver.
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Dinan Gunawardena <dinan.gunawardena@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1b69e7e6

sit: support MPLS over IPv4 · 49dbe7ae

Simon Horman authored Jul 07, 2016

Extend the SIT driver to support MPLS over IPv4. This implementation
extends existing support for IPv6 over IPv4 and IPv4 over IPv4.
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Dinan Gunawardena <dinan.gunawardena@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

49dbe7ae

tunnels: support MPLS over IPv4 tunnels · 8afe97e5

Simon Horman authored Jul 07, 2016

Extend tunnel support to MPLS over IPv4.  The implementation extends the
existing differentiation between IPIP and IPv6 over IPv4 to also cover MPLS
over IPv4.
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Dinan Gunawardena <dinan.gunawardena@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8afe97e5

net: bridge: extend MLD/IGMP query stats · a65056ec

Nikolay Aleksandrov authored Jul 06, 2016

As was suggested this patch adds support for the different versions of MLD
and IGMP query types. Since the user visible structure is still in net-next
we can augment it instead of adding netlink attributes.
The distinction between the different IGMP/MLD query types is done as
suggested in Section 7.1, RFC 3376 [1] and Section 8.1, RFC 3810 [2] based
on query payload size and code for IGMP. Since all IGMP packets go through
multicast_rcv() and it uses ip_mc_check_igmp/ipv6_mc_check_mld we can be
sure that at least the ip/ipv6 header can be directly used.

[1] https://tools.ietf.org/html/rfc3376#section-7
[2] https://tools.ietf.org/html/rfc3810#section-8.1
Suggested-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

a65056ec

sctp: fix panic when sending auth chunks · f1533cce

Marcelo Ricardo Leitner authored Jul 07, 2016

When we introduced GSO support, if using auth the auth chunk was being
left queued on the packet even after the final segment was generated.
Later on, sctp_transmit_packet calls sctp_packet_reset, which zeroed
the packet len while not accounting for this left-over. This caused more
space to be used in the next packet due to the chunk still being queued,
but space which wasn't allocated as its size wasn't accounted for.

The fix is to only queue it back when we know that we are going to
generate another segment.

Fixes: 90017acc ("sctp: Add GSO support")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f1533cce

bnxt: fix a condition · 09a7636a

Dan Carpenter authored Jul 07, 2016

This code generates a static checker warning because htons(ETH_P_IPV6)
is always true.  From the context it looks like the && was intended to
be !=.

Fixes: 94758f8d ('bnxt_en: Add GRO logic for BCM5731X chips.')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
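
To see why the checker complains, consider the following sketch. The exact shape of the original condition is an assumption (the commit message only says that && should have been !=), and the SK_* constants and function names are hypothetical; htons() is the standard POSIX call.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* Ethertype values for IPv4 and IPv6, named locally for this sketch. */
#define SK_ETH_P_IP   0x0800
#define SK_ETH_P_IPV6 0x86DD

/*
 * Assumed shape of the bug: htons(SK_ETH_P_IPV6) is a non-zero constant,
 * so the last operand is always true and IPv6 is never recognised.
 */
bool unsupported_buggy(uint16_t proto)
{
	return proto != htons(SK_ETH_P_IP) &&
	       proto && htons(SK_ETH_P_IPV6);
}

/* Intended check: neither IPv4 nor IPv6. */
bool unsupported_fixed(uint16_t proto)
{
	return proto != htons(SK_ETH_P_IP) &&
	       proto != htons(SK_ETH_P_IPV6);
}
```

The two versions agree for IPv4 and for unrelated protocols and differ only for IPv6 frames, which is the kind of bug that is easy to miss in testing and easy for a static checker to spot.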

09a7636a

bpf: introduce bpf_get_current_task() helper · 606274c5

Alexei Starovoitov authored Jul 06, 2016

Over time there were multiple requests to access different data
structures and fields of task_struct current, so finally add
the helper to access 'current' as-is. Tracing bpf programs will do
the rest of walking the pointers via bpf_probe_read().
Note that current can be null and the bpf program has to deal with it,
but even naively passing null into bpf_probe_read() is still safe.
Suggested-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

606274c5

net: dsa: initialize the routing table · d390238c

Vivien Didelot authored Jul 06, 2016

The routing table of every switch in a tree is currently initialized to
all zeros. This is an issue since 0 is a valid port number.

Add a DSA_RTABLE_NONE=-1 constant to initialize the signed values of the
routing table pointing to other switches.

This fixes the device mapping of the mv88e6xxx driver where the port
pointing to the switch itself and to non-existent switches was wrongly
configured to be 0. It is now set to the expected 0xf value.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
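
The problem with zero-initialization can be shown with a small sketch. The SK_* constants, struct sketch_switch and both functions are hypothetical; only the idea of a -1 "none" marker and the 0xf result come from the commit message.

```c
#include <stdint.h>

#define SK_MAX_SWITCHES 4
#define SK_RTABLE_NONE  (-1)	/* no route to that switch */
#define SK_PORT_INVALID 0xf	/* value written to hardware for "none" */

/* Hypothetical switch with one routing entry per switch in the tree. */
struct sketch_switch {
	int8_t rtable[SK_MAX_SWITCHES];
};

/* Mark every entry as "none" instead of leaving it at port 0. */
void sketch_rtable_init(struct sketch_switch *sw)
{
	for (int i = 0; i < SK_MAX_SWITCHES; i++)
		sw->rtable[i] = SK_RTABLE_NONE;
}

/* Port used to reach `target`, or the invalid marker if none is set. */
unsigned int sketch_device_mapping(const struct sketch_switch *sw, int target)
{
	int8_t port = sw->rtable[target];

	return port == SK_RTABLE_NONE ? SK_PORT_INVALID : (unsigned int)port;
}
```

With a signed sentinel, a route through port 0 and a missing route are distinguishable, which is not possible when the table starts out as all zeros.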

d390238c

tun: Don't assume type tun in tun_device_event · 86dfb4ac

Craig Gallek authored Jul 06, 2016

The referenced change added a netlink notifier for processing
device queue size events.  These events are fired for all devices
but the registered callback assumed they only occurred for tun
devices.  This fix adds a check (borrowed from macvtap.c) to discard
non-tun device events.

For reference, this fixes the following splat:
[   71.505935] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[   71.513870] IP: [<ffffffff8153c1a0>] tun_device_event+0x110/0x340
[   71.519906] PGD 3f41f56067 PUD 3f264b7067 PMD 0
[   71.524497] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
[   71.529374] gsmi: Log Shutdown Reason 0x03
[   71.533417] Modules linked in:[   71.533826] mlx4_en: eth1: Link Up

[   71.539616]  bonding w1_therm wire cdc_acm ehci_pci ehci_hcd mlx4_en ib_uverbs mlx4_ib ib_core mlx4_core
[   71.549282] CPU: 12 PID: 7915 Comm: set.ixion-haswe Not tainted 4.7.0-dbx-DEV #8
[   71.556586] Hardware name: Intel Grantley,Wellsburg/Ixion_IT_15, BIOS 2.58.0 05/03/2016
[   71.564495] task: ffff887f00bb20c0 ti: ffff887f00798000 task.ti: ffff887f00798000
[   71.571894] RIP: 0010:[<ffffffff8153c1a0>]  [<ffffffff8153c1a0>] tun_device_event+0x110/0x340
[   71.580327] RSP: 0018:ffff887f0079bbd8  EFLAGS: 00010202
[   71.585576] RAX: fffffffffffffae8 RBX: ffff887ef6d03378 RCX: 0000000000000000
[   71.592624] RDX: 0000000000000000 RSI: 0000000000000028 RDI: 0000000000000000
[   71.599675] RBP: ffff887f0079bc48 R08: 0000000000000000 R09: 0000000000000001
[   71.606730] R10: 0000000000000004 R11: 0000000000000000 R12: 0000000000000010
[   71.613780] R13: 0000000000000000 R14: 0000000000000001 R15: ffff887f0079bd00
[   71.620832] FS:  00007f5cdc581700(0000) GS:ffff883f7f700000(0000) knlGS:0000000000000000
[   71.628826] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   71.634500] CR2: 0000000000000010 CR3: 0000003f3eb62000 CR4: 00000000001406e0
[   71.641549] Stack:
[   71.643533]  ffff887f0079bc08 0000000000000246 000000000000001e ffff887ef6d00000
[   71.650871]  ffff887f0079bd00 0000000000000000 0000000000000000 ffffffff00000000
[   71.658210]  ffff887f0079bc48 ffffffff81d24070 00000000fffffff9 ffffffff81cec7a0
[   71.665549] Call Trace:
[   71.667975]  [<ffffffff810eeb0d>] notifier_call_chain+0x5d/0x80
[   71.673823]  [<ffffffff816365d0>] ? show_tx_maxrate+0x30/0x30
[   71.679502]  [<ffffffff810eeb3e>] __raw_notifier_call_chain+0xe/0x10
[   71.685778]  [<ffffffff810eeb56>] raw_notifier_call_chain+0x16/0x20
[   71.691976]  [<ffffffff8160eb30>] call_netdevice_notifiers_info+0x40/0x70
[   71.698681]  [<ffffffff8160ec36>] call_netdevice_notifiers+0x16/0x20
[   71.704956]  [<ffffffff81636636>] change_tx_queue_len+0x66/0x90
[   71.710807]  [<ffffffff816381ef>] netdev_store.isra.5+0xbf/0xd0
[   71.716658]  [<ffffffff81638350>] tx_queue_len_store+0x50/0x60
[   71.722431]  [<ffffffff814a6798>] dev_attr_store+0x18/0x30
[   71.727857]  [<ffffffff812ea3ff>] sysfs_kf_write+0x4f/0x70
[   71.733274]  [<ffffffff812e9507>] kernfs_fop_write+0x147/0x1d0
[   71.739045]  [<ffffffff81134a4f>] ? rcu_read_lock_sched_held+0x8f/0xa0
[   71.745499]  [<ffffffff8125a108>] __vfs_write+0x28/0x120
[   71.750748]  [<ffffffff8111b137>] ? percpu_down_read+0x57/0x90
[   71.756516]  [<ffffffff8125d7d8>] ? __sb_start_write+0xc8/0xe0
[   71.762278]  [<ffffffff8125d7d8>] ? __sb_start_write+0xc8/0xe0
[   71.768038]  [<ffffffff8125bd5e>] vfs_write+0xbe/0x1b0
[   71.773113]  [<ffffffff8125c092>] SyS_write+0x52/0xa0
[   71.778110]  [<ffffffff817528e5>] entry_SYSCALL_64_fastpath+0x18/0xa8
[   71.784472] Code: 45 31 f6 48 8b 93 78 33 00 00 48 81 c3 78 33 00 00 48 39 d3 48 8d 82 e8 fa ff ff 74 25 48 8d b0 40 05 00 00 49 63 d6 41 83 c6 01 <49> 89 34 d4 48 8b 90 18 05 00 00 48 39 d3 48 8d 82 e8 fa ff ff
[   71.803655] RIP  [<ffffffff8153c1a0>] tun_device_event+0x110/0x340
[   71.809769]  RSP <ffff887f0079bbd8>
[   71.813213] CR2: 0000000000000010
[   71.816512] ---[ end trace 4db6449606319f73 ]---

Fixes: 1576d986 ("tun: switch to use skb array for tx")
Signed-off-by: Craig Gallek <kraig@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
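
The shape of such a check can be sketched in user space. All types and names below are hypothetical simplifications; the sketch only shows a notifier callback that returns early for devices it does not own before touching driver-private data.

```c
#include <stddef.h>

#define SK_NOTIFY_DONE 0	/* event ignored */
#define SK_NOTIFY_OK   1	/* event handled */

/* Hypothetical stand-ins for link ops and a network device. */
struct sketch_link_ops {
	const char *kind;
};

struct sketch_netdev {
	const struct sketch_link_ops *link_ops;
	void *priv;		/* driver-private data, type depends on ops */
};

struct sketch_tun_priv {
	unsigned int resized;	/* counts handled queue-size events */
};

const struct sketch_link_ops sketch_tun_ops = { "tun" };
const struct sketch_link_ops sketch_other_ops = { "other" };

/*
 * Events arrive for every device. Only interpret priv as tun data
 * after confirming that the device really is a tun device.
 */
int sketch_tun_device_event(struct sketch_netdev *dev)
{
	struct sketch_tun_priv *tun;

	if (dev->link_ops != &sketch_tun_ops)
		return SK_NOTIFY_DONE;

	tun = dev->priv;
	tun->resized++;
	return SK_NOTIFY_OK;
}
```

Without the early return, the callback would treat another driver's private area (or a NULL pointer) as its own structure, which matches the NULL pointer dereference shown in the splat.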

86dfb4ac

Merge tag 'rxrpc-rewrite-20160706' of... · cc3baecb

David S. Miller authored Jul 08, 2016

Merge tag 'rxrpc-rewrite-20160706' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Improve conn/call lookup and fix call number generation [ver #3]

I've fixed a couple of patch descriptions and excised the patch that
duplicated the connections list for reconsideration at a later date.

For reference, the excised patch is sitting on the rxrpc-experimental
branch of my git tree, based on top of the rxrpc-rewrite branch.  Diffing
it against yesterday's tag shows no differences.

Would you prefer the patch set to be emailed afresh instead of a git-pull
request?

David
---
Here's the next part of the AF_RXRPC rewrite.  The two main purposes of
this set are to fix the call number handling and to make use of RCU when
looking up the connection or call to pass a received packet to.

Important changes in this set include:

 (1) Avoidance of placing stack data into SG lists in rxkad so that kernel
     stacks can become vmalloc'd (Herbert Xu).

 (2) Calls cease pinning the connection they used as soon as possible,
     which allows the connection to be discarded sooner and allows the call
     channel on that connection to be reused earlier.

 (3) Make each call channel on a connection have a separate and independent
     call number space rather than having a shared number space for the
     connection.  Call numbers should increment monotonically per channel
     on the client, and the server should ignore a call with a lower call
     number for that channel than the latest it has seen.  The RESPONSE
     packet sets the minimum values of each call ID counter on a
     connection.

 (4) Look up calls by indexing the channel array on a connection rather
     than by keeping calls in an rbtree on that connection.  Also look up
     calls using the channel array rather than using a hashtable.

     The call hashtable can then be removed.

 (5) Call terminal statuses are cached in the channel array for the last
     call.  It is assumed that if we, the server, have seen call N, then
     the client no longer cares about call N-1 on the same channel.

     This will allow retransmission of the terminal status in future
     without the need to keep the rxrpc_call struct around.

 (6) Peer lookups are moved out of common connection handling code and into
     service connection handling code as client connections (a) must point
     to a peer before they can be used and (b) are looked up by a
     machine-unique connection ID directly, so we only need to look up the
     peer first if we're going to deal with a service call.

 (7) The reference count on a connection is held elevated by 1 whilst it is
     alive (i.e. idle unused connections have a refcount of 1).  The reaper
     will attempt to change the refcount from 1->0 and skip if this cannot
     be done, whilst lookups only increment the refcount if it's non-zero.

     This makes the implementation of RCU lookups easier as we don't have
     to get a ref on the connection or a lock on the connection list to
     prevent a connection being reaped whilst we're contemplating queueing
     a packet that initiates a new service call upon it.

     If we need to get a connection, but there's a dead connection in the
     tree, we use rb_replace_node() to replace the dead one with a new one.

 (8) Use a seqlock to validate the walk over the service connection rbtree
     attached to a peer when it's being walked in RCU mode.

 (9) Make the incoming call/connection packet handling code use RCU mode
     and locks and make it only take a reference if the call/connection
     gets queued on a workqueue.
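
As an illustration of item (3) above, the per-channel call number rule
might be sketched in userspace C as follows.  This is not the AF_RXRPC
code: the struct, the function names and the channel count of 4 are
assumptions made for the sketch, and call number wraparound is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CHANNELS 4	/* assumed channel count for this sketch */

/* Hypothetical per-connection state: one call counter per channel. */
struct chan_state {
	uint32_t call_counter[NR_CHANNELS];
};

/* Client side: call numbers increase monotonically per channel. */
static uint32_t client_next_call(struct chan_state *s, unsigned int chan)
{
	assert(chan < NR_CHANNELS);
	return ++s->call_counter[chan];
}

/*
 * Server side: ignore a call whose number is lower than the latest seen
 * on that channel; otherwise record it as the latest.
 */
static bool server_accept_call(struct chan_state *s, unsigned int chan,
			       uint32_t call_id)
{
	assert(chan < NR_CHANNELS);
	if (call_id < s->call_counter[chan])
		return false;
	s->call_counter[chan] = call_id;
	return true;
}
```

Because each channel has its own counter, a new call on one channel does
not advance the numbers used on any other channel.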

The intention is that the next set will introduce the connection lifetime
management and capacity limits to prevent clients from overloading the
server.
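
For the connection refcount rule in item (7) above, a minimal userspace
sketch using C11 atomics could look like this.  The struct and function
names are hypothetical and the kernel code uses its own atomic
primitives; only the 1->0 reaping rule and the "increment only if
non-zero" lookup rule are taken from the description.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a connection; not the kernel's struct. */
struct conn {
	atomic_int usage;	/* 1 while alive and idle, 0 once reaped */
};

static void conn_init(struct conn *c)
{
	atomic_init(&c->usage, 1);
}

/* Lookup side: take a ref only if the count is still non-zero. */
static bool conn_get_maybe(struct conn *c)
{
	int old = atomic_load(&c->usage);

	while (old != 0) {
		/* On failure, old is updated to the current value. */
		if (atomic_compare_exchange_weak(&c->usage, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a ref taken by conn_get_maybe(); the base ref is the reaper's. */
static void conn_put(struct conn *c)
{
	int old = atomic_fetch_sub(&c->usage, 1);

	assert(old > 1);
}

/* Reaper side: succeed only on the 1 -> 0 transition, else skip. */
static bool conn_try_reap(struct conn *c)
{
	int expected = 1;

	return atomic_compare_exchange_strong(&c->usage, &expected, 0);
}
```

With this scheme a lookup that races with the reaper either gets its
reference before the count reaches 0, in which case the reaper skips the
connection, or sees 0 and treats the connection as dead.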

There are some fixes too:

 (1) Verify that a packet coming in to a client connection came from the
     expected source.

 (2) Fix handling of connection failure in client call creation.  We don't
     reinitialise the list linkage block, so a second attempt to unlink
     the failed connection oopses.  We also don't set the state correctly,
     which causes an assertion failure.

 (3) New service calls were being added to the socket's accept queue under
     the wrong lock.
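
The list linkage problem in fix (2) above can be illustrated with a
small userspace sketch in the spirit of the kernel's list_del_init().
The node type and helper names here are hypothetical, and this is not
the code changed by the patch; it only shows why reinitialising the
linkage makes a second unlink harmless.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list node (hypothetical). */
struct list_node {
	struct list_node *next, *prev;
};

/* An initialised, unlinked node points at itself. */
static void node_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

static bool node_unlinked(const struct list_node *n)
{
	return n->next == n;
}

/* Insert n directly after head; n must not already be on a list. */
static void node_add(struct list_node *n, struct list_node *head)
{
	assert(node_unlinked(n));
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink and reinitialise, so that a repeated unlink is a no-op. */
static void node_del_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}
```

Without the node_init() call at the end of node_del_init(), the node
would keep stale next/prev pointers, and unlinking it a second time
would write through them into whatever the old neighbours have become.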

Changes:

 (V2) In rxrpc_find_service_conn_rcu() initialised the sequence number to 0.

      Fixed the RCU handling in conn_service.c by introducing and using
      rb_replace_node_rcu() as an RCU-safe alternative in
      rxrpc_publish_service_conn().

      Modified and used rcu_dereference_raw() to avoid RCU sparse warnings
      in rxrpc_find_service_conn_rcu().

      Added in some missing RCU dereference wrappers.  It seems to be
      necessary to turn on CONFIG_PROVE_RCU_REPEATEDLY as well as
      CONFIG_SPARSE_RCU_POINTER to get the static __rcu annotation checking
      to happen.

      Fixed some other sparse warnings, including a missing ntohs() in
      jumbo packet processing.

 (V3) Fixed some commit descriptions.

      Excised the patch that duplicated the connection list to separate out
      the procfs list for reconsideration at a later date.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

cc3baecb