Commits · 5425077d73e0c8e7e9267ca8397cc0e2293c1fb9 · nexedi / linux

13 Mar, 2017 8 commits

net: ipv6: Add early demux handler for UDP unicast · 5425077d

subashab@codeaurora.org authored Mar 08, 2017

While running a single stream UDPv6 test, we observed that amount
of CPU spent in NET_RX softirq was much greater than UDPv4 for an
equivalent receive rate. The test here was run on an ARM64 based
Android system. On further analysis with perf, we found that UDPv6
was spending significant time in the statistics netfilter targets
which did socket lookup per packet. These statistics rules perform
a lookup when there is no socket associated with the skb. Since
there are multiple instances of these rules based on UID, there
will be equal number of lookups per skb.

By introducing early demux for UDPv6, we avoid the redundant lookups.
This also helped to improve the performance (800Mbps -> 870Mbps) on a
CPU limited system in a single stream UDPv6 receive test with 1450
byte sized datagrams using iperf.

v1->v2: Use IPv6 cookie to validate dst instead of 0 as suggested
by Eric
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

5425077d

net: sched: make default fifo qdiscs appear in the dump · 49b49971

Jiri Kosina authored Mar 08, 2017

The original reason [1] for having hidden qdiscs (potential scalability
issues in qdisc_match_from_root() with single linked list in case of large
amount of qdiscs) has been invalidated by 59cc1f61 ("net: sched: convert
qdisc linked list to hashtable").

This allows us for bringing more clarity and determinism into the dump by
making default pfifo qdiscs visible.

We're not turning this on by default though, at it was deemed [2] too
intrusive / unnecessary change of default behavior towards userspace.
Instead, TCA_DUMP_INVISIBLE netlink attribute is introduced, which allows
applications to request complete qdisc hierarchy dump, including the
ones that have always been implicit/invisible.

Singleton noop_qdisc stays invisible, as teaching the whole infrastructure
about singletons would require quite some surgery with very little gain
(seeing no qdisc or seeing noop qdisc in the dump is probably setting
the same user expectation).

[1] http://lkml.kernel.org/r/1460732328.10638.74.camel@edumazet-glaptop3.roam.corp.google.com
[2] http://lkml.kernel.org/r/20161021.105935.1907696543877061916.davem@davemloft.netSigned-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>

49b49971

net: hyperv: use new api ethtool_{get|set}_link_ksettings · 5e8456fd

Philippe Reynes authored Mar 08, 2017

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

As I don't have the hardware, I'd be very pleased if
someone may test this patch.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Tested-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5e8456fd

net: fjes: use new api ethtool_{get|set}_link_ksettings · 031e5144

Philippe Reynes authored Mar 08, 2017

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

As I don't have the hardware, I'd be very pleased if
someone may test this patch.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

031e5144

net: via: via-velocity: use new api ethtool_{get|set}_link_ksettings · 92f4a13e

Philippe Reynes authored Mar 08, 2017

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

As I don't have the hardware, I'd be very pleased if
someone may test this patch.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

92f4a13e

net: via: via-rhine: use new api ethtool_{get|set}_link_ksettings · f918b986

Philippe Reynes authored Mar 07, 2017

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

As I don't have the hardware, I'd be very pleased if
someone may test this patch.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f918b986

net: intel: ixgbe: use new api ethtool_{get|set}_link_ksettings · 8704f21c

Philippe Reynes authored Mar 07, 2017

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

As I don't have the hardware, I'd be very pleased if
someone may test this patch.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8704f21c

net: tundra: tsi108: use new api ethtool_{get|set}_link_ksettings · 99bdb400

Philippe Reynes authored Mar 06, 2017

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

As I don't have the hardware, I'd be very pleased if
someone may test this patch.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

99bdb400

10 Mar, 2017 32 commits

Merge branch 'fib-notifications-cleanup' · 01461abe

David S. Miller authored Mar 10, 2017

Jiri Pirko says:

====================
ipv4: fib: FIB notifications cleanup

Ido says:

The first patch moves the core FIB notification code to a separate file,
so that code related to FIB rules is placed in fib_rules.c and not
fib_trie.c. The reason for the change will become even more apparent in
follow-up patchset where we extend the FIB rules notifications.

Second patch removes a redundant argument.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

01461abe

ipv4: fib: Remove redundant argument · d05f7a7d

Ido Schimmel authored Mar 10, 2017

We always pass the same event type to fib_notify() and
fib_rules_notify(), so we can safely drop this argument.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d05f7a7d

ipv4: fib: Move FIB notification code to a separate file · c0243892

Ido Schimmel authored Mar 10, 2017

Most of the code concerned with the FIB notification chain currently
resides in fib_trie.c, but this isn't really appropriate, as the FIB
notification chain is also used for FIB rules.

Therefore, it makes sense to move the common FIB notification code to a
separate file and have it export the relevant functions, which can be
invoked by its different users (e.g., fib_trie.c, fib_rules.c).
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c0243892

Merge branch 'mlxsw-vrf-offload-prep' · 71e8727e

David S. Miller authored Mar 10, 2017

Jiri Pirko says:

====================
mlxsw: Preparations for VRF offload

Ido says:

This patchset aims to prepare the mlxsw driver for VRF offload. The
follow-up patchsets that introduce VRF support can be found here:
https://github.com/idosch/linux/tree/idosch-next

The first four patches are mainly concerned with the netdevice
notification block. There are no functional changes, but merely
restructuring to more easily integrate VRF enslavement.

Patches 5-10 remove various assumptions throughout the code about a
single virtual router (VR) and also restructure the internal data
structures to more accurately represent the device's operation.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

71e8727e

mlxsw: spectrum_router: Make abort mechanism VR-aware · b5d90e6d

Ido Schimmel authored Mar 10, 2017

When the abort mechanism is invoked it binds the first virtual router
(VR) to an LPM tree and inserts a default route to direct packets to the
CPU.

With VRFs, we can have router interfaces (RIFs) bound to multiple VRs,
so we need to make sure packets are trapped from all VRs and not just
the first one.

Upon abort invocation, bind all active VRs to the same LPM tree and
insert a default route in each.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b5d90e6d

mlxsw: spectrum_router: Explicitly Associate RIFs with VRs · 6913229e

Ido Schimmel authored Mar 10, 2017

Up until now we implicitly associated all the router interfaces (RIFs)
with the first virtual router (VR). This must be changed in order to
enable VRF offload. Otherwise, a packet received via a VRF slave would
do a FIB lookup in the same table used by other VRFs.

Instead, bind the RIF to a VR according to the table where FIB lookup
should be performed for packets received via the RIF.

Currently, we only care about the MAIN and LOCAL tables (which we squash
together).
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6913229e

mlxsw: spectrum_router: Refactor virtual router handling · 76610ebb

Ido Schimmel authored Mar 10, 2017

A virtual router (VR) is an entity within the device to which routing
tables and interfaces can be bound to. It can be used to implement VRFs.

In the initial implementation we associated the VR with a specific
protocol (e.g., IPv4) and an LPM tree. However, this isn't really
accurate, as the same VR can be used for both IPv4 and IPv6 traffic, by
binding a different LPM tree to a {VR, Proto} pair.

This patch aims to restructure the VR code according to the above logic,
so that VRs are more accurately represented by the driver's data
structures. The main motivation behind this change is to prepare the
driver for VRF offload.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

76610ebb

mlxsw: spectrum_router: Simplify LPM tree allocation · 382dbb40

Ido Schimmel authored Mar 10, 2017

When looking for a new LPM tree we should always consider all the unused
trees. It doesn't matter if the new tree is required due to changes in
currently used prefixes inside an existing routing table or because a
route was inserted into an empty table.

Both cases are functionally identical and therefore should be treated
the same.

When looking for a new LPM tree, consider all unused trees and don't
reserve trees for specific cases.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

382dbb40

mlxsw: spectrum_router: Place RIF related code with router code · 4724ba56

Ido Schimmel authored Mar 10, 2017

The inetaddr notification block is currently implemented in the main
driver file, but this isn't really appropriate, as it mainly creates and
destroys router interfaces (RIFs) which belong with the rest of the
router code.

This will become even more apparent later on when we'll need to bind
these RIFs to virtual routers according to the VRF's table.

Structure the driver better and prevent unnecessary function exports by
moving the RIF related code with the rest of the router code.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4724ba56

mlxsw: spectrum_router: Allow more route types to be programmed · 97989ee0

Ido Schimmel authored Mar 10, 2017

Allow 'unreachable', 'blackhole' and 'prohibit' route types to be
programmed into the device by sending any packet hitting them to the
CPU.

This is needed so that users will be able to program a default route
into the VRF's table, thereby preventing lookup from leaking to other
tables.

Audit the code paths to make sure we don't rely on the presence of a
nexthop netdev, as it doesn't exist for above mentioned route types.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

97989ee0

mlxsw: spectrum: Destroy RIFs based on last removed address · f4a761d2

Ido Schimmel authored Mar 10, 2017

We only use the RIF reference count to determine when the last IP
address was removed, but instead we can just test 'in_dev->ifa_list'.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f4a761d2

mlxsw: spectrum: Associate PVID vPort with appropriate netdev · 186962eb

Ido Schimmel authored Mar 10, 2017

When a VLAN device is configured on top of a LAG device (f.e.,
bond0.10), a vPort is created on top of each of the LAG's slaves and its
'dev' pointer is set to the VLAN device.

This is in contrast to the implicit PVID vPort (representing 'bond0'),
whose 'dev' pointer keeps pointing to the port netdev itself (f.e.,
'sw1p1').

Make both cases consistent by setting their 'dev' pointer to the actual
netdev they represent. Either the LAG device itself (in the case of the
PVID vPort) or the VLAN device on top of it.

This will later allow us to more easily understand for which netdev we
should create the router interface (RIF) upon enslavement to a VRF
master.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

186962eb

mlxsw: spectrum: Don't assume upper device's type · 1f88061e

Ido Schimmel authored Mar 10, 2017

When an upper device is configured on top of a vPort we make sure it's a
bridge master during PRECHANGEUPPER and fail otherwise. Therefore, when
CHANGEUPPER is later received we don't bother checking the upper's type.

Make the code more extendable in preparation for VRF uppers, by checking
the upper's type.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1f88061e

mlxsw: spectrum: Sanitize bridge's upper devices · b414970e

Ido Schimmel authored Mar 10, 2017

We're going to allow bridges stacked on top of port netdevs to be
enslaved to a VRF, but for now, only VLAN uppers of the VLAN-aware
bridge are supported.

Sanitize any other bridge upper. This is consistent with the way we
sanitize port netdevs' uppers.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b414970e

Merge branch 'mlxsw-VLAN-offload-cls_flower' · 1015d743

David S. Miller authored Mar 09, 2017

Jiri Pirko says:

====================
mlxsw: spectrum: Add support for VLAN offload for cls_flower

This patchset adds support to offload VLAN modify TC action and adds support
to offload cls_flower rules that include VID and PCP matching.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

1015d743

mlxsw: spectrum: Add support for flower matches on VLAN ID, PCP · 9caab08a

Petr Machata authored Mar 09, 2017

Introduce MLXSW_AFK_ELEMENT_VID, PCP and declare them in afk_element
infos that contain them.  Use the elements when VLAD ID or priority are
used in the flow.

Also add MLXSW_AFK_ELEMENT_VID, PCP to mlxsw_sp_acl_tcam_pattern_ipv4.
Both items are included in mlxsw_sp_afk_element_info_l2_dmac,
resp. _smac, and both MLXSW_AFK_ELEMENT_SMAC and _DMAC are already in
the pattern.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9caab08a

mlxsw: spectrum: Add support for vlan modify TC action · a150201a

Petr Machata authored Mar 09, 2017

Add VLAN action offloading. Invoke it from Spectrum flower handler for
"vlan modify" actions.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a150201a

tcp: rename *_sequence_number() to *_seq_and_tsoff() · a30aad50

Alexey Kodanev authored Mar 09, 2017

The functions that are returning tcp sequence number also setup
TS offset value, so rename them to better describe their purpose.

No functional changes in this patch.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a30aad50

net: ks8851: Added support for half-duplex SPI · 9efd3831

Sergey Shcherbakov authored Mar 09, 2017

In original driver was implemented support for half-
and full-duplex modes, but it was not enabled. Instead
of it ks8851_rx_1msg method always returns "true" that
means "full-duplex" mode.

This patch replaces hard-coded functionality with
flexible solution that supports both SPI modes.
Signed-off-by: Sergey Shcherbakov <shchers@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9efd3831

Merge branch 'bonding-winter-cleanup' · 825d2c50

David S. Miller authored Mar 09, 2017

Mahesh Bandewar says:

====================
bonding: winter cleanup

Few cleanup patches that I have accumulated over some time now.

(a) First two patches are basically to move the work-queue initialization
    from every ndo_open / bond_open operation to once at the beginning while
    port creation. Work-queue initialization is an unnecessary operation
    for every 'ifup' operation. However we have some mode-specific work-queues
    and mode can change anytime after port creation. So the second patch is
    to ensure the correct work-handler is called based on the mode.

(b) Third patch is simple and straightforward that removes hard-coded value
    that was added into the initial commit and replaces it with the default
    value configured.

(c) The final patch in the series removes the unimplemented "port-moved" state
    from the LACP state machine. This state is defined but never set so
    removing from the state machine logic makes code little cleaner.

(d) Reduce scope of some global variables to local.

Note: None of these patches are making any functional changes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

825d2c50

bonding: reduce scope of some global variables · dc9c4d0f

Mahesh Bandewar authored Mar 08, 2017

Many of the bond param variables are declared global while it's not
really necessary for these variables to be global. So moving them to
the location these are used.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dc9c4d0f

bonding: remove "port-moved" state that was never implemented · ec891c8b

Mahesh Bandewar authored Mar 08, 2017

LACP state-machine defines "port-moved" state when the same ActorSystemID
and Port are seen in a LACPDU received on different port. The state is
never set since it's not implemented. However the state-machine attempts
to clear that state occasionally. LACP state machine is already complicated
and since this state is not implemented, removing it's checks makes the
state-machine little simpler.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ec891c8b

bonding: remove hardcoded value · 8b426dc5

Mahesh Bandewar authored Mar 08, 2017

Eliminate hard-coded value and use the default that is set.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8b426dc5

bonding: initialize work-queues during creation of bond · 4493b81b

Mahesh Bandewar authored Mar 08, 2017

Initializing work-queues every time ifup operation performed is unnecessary
and can be performed only once when the port is created.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4493b81b

bonding: restructure arp-monitor · d5e73f7b

Mahesh Bandewar authored Mar 08, 2017

In preparation to move the work-queue initialization to port creation
from current port_open phase. Work-queue initialization does not make
sense every time we do 'ifup/ifdown'. So moving to port creation phase.

Arp monitoring work depends on the bonding mode and that is not tied
to the port creation and can change anytime during the life after port
creation. So this restructuring allows us to move the initialization
at creation without losing the ability to arm the correct work for the
mode user has selected.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d5e73f7b

Merge branch 'nfp-crc32-rss-hash-port-name-reporting-and-misc-fastpath-cleanups' · 96d7552e

David S. Miller authored Mar 09, 2017

Jakub Kicinski says:

====================
nfp: CRC32 RSS hash, port name reporting and misc fastpath cleanups

This series adds support for CRC32 RSS hash function to kernel API
of which NFP driver immediately makes use.  There is also a
.ndo_get_phys_port_name() implementation conforming to switchdev
name format.  Small patch takes advantage of napi_complete_done()'s
return code.  Simon provides a fix for potentially trusting values
returned from FW too much.

A handful of unassuming fast path adjustments is also thrown in to make
the upcoming xdp_adjust_head() series easier to review.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

96d7552e

nfp: prevent theoretical buffer overrun in nfp_eth_read_ports · 5692dbb5

Simon Horman authored Mar 08, 2017

Prevent theoretical buffer overrun by returning an error if
the number of entries returned by the firmware does not match those
present.

Also use a common handling error path.

Found by inspection.
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Tested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5692dbb5

nfp: add metadata format bit · b9dcf88a

Jakub Kicinski authored Mar 08, 2017

We only need FW version in the first cache line of adapter struct
because we need to know the metadata format.  To save space add a
metadata format bit.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b9dcf88a

nfp: avoid rearming the interrupts when in busy poll · 7de5f115

Jakub Kicinski authored Mar 08, 2017

Make use of return code from napi_complete_done() to avoid rearming
interrupts when busy polling is on.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7de5f115

nfp: store device pointer for the fastpath · fa43d2a8

Jakub Kicinski authored Mar 08, 2017

We really only need the device pointer on the fast path, stash it at
the beginning of the adapter structure and move pci_dev pointer down.
This saves up a few lines of code.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fa43d2a8

nfp: reorder variables in nfp_net_tx() · bef6b1b7

Jakub Kicinski authored Mar 08, 2017

Reorder variables longest to shortest to comply with netdev coding style.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bef6b1b7

nfp: move more ring debug info to debugfs · 43860c12

Jakub Kicinski authored Mar 08, 2017

We already print most of ring configuration including descriptors
in debugfs, add the few missing pieces and remove debug prints.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

43860c12